Tuesday, November 21, 2017

Db2 12 SQL Enhancement: Temporal Logical Transactions

Temporal data support, introduced with Db2 10 for z/OS, is one of the more interesting SQL-related Db2 enhancements delivered in recent releases of the DBMS. Temporal data support comes in two flavors (which can both be utilized for a single table): business-time temporal and system-time temporal. With business-time temporal support enabled for a table, an organization can put future changes into the table (e.g., price changes for products or services that will not go into effect until some future date) without affecting programs that, by default, access data rows holding information that is currently in effect (among the use cases for business-time temporal: profitability forecasts, utilizing queries that access price values that will be in effect six months from now).

In contrast to business-time temporal, system-time temporal enables a look back, as opposed to forward, regarding in-effect data values. By that I mean that system-time temporal allows an application or a user to see data rows that were current at a time in the past. The mechanism through which this capability is provided is conceptually pretty simple: when system-time temporal support is enabled for a table (referred to as the base table), a logically equivalent history table is associated with the base table ("logically equivalent" means that the history table has the same columns as the base table: same names, same order, same data types - I've pointed out that physical equivalence is NOT a requirement). Subsequently, when a row is made non-current in the base table by way of an UPDATE or a DELETE operation, the "before" image of the updated or deleted row is stored - automatically by Db2 - in the associated history table. Db2 knows, thanks to a couple of timestamp columns in the base table (and the history table) that are maintained by Db2, when a row became current (i.e., when it was either inserted into the base table, or when it was updated) and when it became non-current (i.e., when it was deleted from the base table, or replaced in the base table via an UPDATE). With Db2 having that information, it can respond to a query that contains a temporal predicate, which would be of the form FOR SYSTEM TIME AS OF timestamp-value (or BETWEEN timestamp-value1 AND timestamp-value2 or FROM timestamp-value1 TO timestamp-value2).

OK, with that background information in mind, consider this scenario: a program inserts a row into base table T1, which has been enabled for system-time temporal (one could also say that T1 is enabled for "row versioning"), and then, in the same unit of work, updates the just-inserted row. Then the program commits. You might expect to find, following the completion of this unit of work, the "before" image of the row that was changed by the program's UPDATE statement; but, that row is nowhere to be found in T1_HIST (or whatever you decided to name T1's history table). Why is that so? It's so because the update of the row occurred in the same unit of work as the insert of the row. What can you do about that, if you want to see, in T1_HIST, the "before" image of the row changed by the aforementioned UPDATE? Well, you could break the single unit of work into two units of work, with the row-insert in one and the UPDATE of the row in the other. That might not be feasible, and it could be a big hassle even if feasible. Here's what likely would be a more attractive option: have two temporal logical transactions in the one unit of work.

Two what?

Temporal logical transactions, or TLTs, are a new capability introduced with Db2 12 for z/OS (at function level V12R1M500). How would you put this functionality to work in a program? It's pretty easy: first, you have the program issue the following SQL statement:

SET TEMPORAL_LOGICAL_TRANSACTIONS = 1

That statement, which references a new (with Db2 12) special register, tells Db2 to allow (for your session) multiple TLTs in one unit if work.

Next, the program would issue a SQL statement with this form (and note that timestamp-value1 could be CURRENT TIMESTAMP):

SET TEMPORAL_LOGICAL_TRANSACTION_TIME = timestamp-value1

Then the program would perform the row INSERT into T1 as it had before. In that row, the "start" timestamp value (the value interpreted by Db2 as the time at which the row became current) would come from the value supplied via the TEMPORAL_LOGICAL_TRANSACTION_TIME special register, which was previously set by the program.

After performing the INSERT operation, the program would again issue the statement below (and again, timestamp-value2 could be CURRENT TIMESTAMP - it would just need to be a little "ahead" of the timestamp-value1 previously used, and as the temporal timestamp value goes to the picosecond level, CURRENT TIMESTAMP, used again, would take care of that "little bit ahead" requirement):

SET TEMPORAL_LOGICAL_TRANSACTION_TIME = timestamp-value2

The program would then update the just-inserted row, as it had before. In a similar way as for the INSERT, the "start" timestamp value for the updated row in the base table would be timestamp-value2, based on the value to which the TEMPORAL_LOGICAL_TRANSACTION_TIME special register was set just prior to the UPDATE.

Then the program would issue a commit, as it had before, and now what would one see in T1_HIST? One would see the "before" image of the row updated within the just-completed unit of work, even though the UPDATE targeted a row that had been inserted into T1 in the same unit of work. That's temporal logical transaction functionality in action: you have inserted multiple TLTs into one physical transaction, giving you the row-change-history capability you want even for a row acted on more than once in a unit of work, without having to change the commit scope of the unit of work.

AND, that's not all - TLT functionality can also work in the other direction, allowing you to incorporate several physical transactions into one temporal logical transaction. How is that done? It's not hard: just have your program issue the statement below (keeping in mind that, as previously noted, timestamp-value can be CURRENT TIMESTAMP):

SET TEMPORAL_LOGICAL_TRANSACTION_TIME = timestamp-value

Following issuance of that statement, the program could update a row in system-time-enabled base table T1, then commit, then update the same row, then commit again. What you'd see in T1 and T1_HIST in this case (sticking with the same base and history table names used above) would be what you'd see if the updates had been made in one physical transaction versus the actual pair of physical transactions: one "before" row image captured in T1_HIST, showing the target row as it appeared prior to the first update performed by the program, and the row in T1 as it appeared after the second update performed by the program. Why no second row in T1_HIST, reflecting the change made by the second UPDATE operation? Because you told Db2 (through specifying a value for the TEMPORAL_LOGICAL_TRANSACTION_TIME special register) that you wanted the two physical transactions treated as one from a system-time temporal perspective.

And there's your overview of temporal logical transaction functionality. Having that Db2 12 capability on-hand in your shop could well open up new use cases for system-time temporal data support. Give it some thought.

Friday, October 27, 2017

Db2 12 for z/OS: Enhanced Lock Avoidance in Data Sharing Environments

In the last entry posted to this blog, I described the mechanisms (including PUNC bits and commit log sequence numbers, aka CLSNs) used by Db2 for z/OS for lock avoidance purposes (lock avoidance here refers to a capability whereby Db2 can, in many cases, verify the committed state of a data value without having to get an S-lock - i.e., a read lock - on the page or row in which the value is located). As lock and unlock requests consume CPU cycles, lock avoidance is a CPU efficiency booster for Db2 application workloads.

Db2 data sharing (a technology through which multiple Db2 subsystems in a Parallel Sysplex mainframe cluster share concurrent read/write access to one database) has long delivered unmatched levels of system scalability and availability. An enhancement delivered with Db2 12 for z/OS can reduce the CPU cost of data sharing by increasing the degree to which locks can be avoided in a data sharing environment.

In the aforementioned blog entry on Db2 lock avoidance, I described how commit log sequence numbers (CLSNs), which are maintained for each and every Db2 page set (or partition, in the case of a partitioned object), are used in support of lock avoidance. When Db2 data sharing technology was introduced in the mid-1990s (with Db2 Version 4), its use had the effect of reducing lock avoidance because a single CLSN, called the global CLSN (or GCLSN), was used for ALL group buffer pool-dependent data sets (i.e., data sets that are the object of inter-Db2 read/write interest in a data sharing group - quite likely the majority of open data sets in the system). The GCLSN value is the log point at which the currently-still-outstanding-longest-running-data-changing unit of work  got started. The implication: if ONE data-changing unit if work is running for a long time, lock avoidance for ALL group-buffer-pool-dependent data sets is negatively impacted, because in that case the GCLSN value will go a long time without being positively incremented, and in THAT case more S-locks on pages and/or rows will have to be requested to verify the committed state of data values.

Along comes Db2 12 for z/OS, and with it an important change in how CLSNs affect lock avoidance in a data sharing environment. The change: instead of a single GCLSN value reflecting the start time of the longest running still-out-there data-changing unit of work in the whole data sharing system, Db2 12 caches, in memory and in the shared communications area (SCA) in a coupling facility LPAR, the 500 oldest CLSNs in the data sharing system. What does that mean with regard to lock avoidance? Well, let's say that you're an application program accessing data in table TABLE_A, which is in table space TSPACE_A, which is group buffer pool-dependent and has a CLSN that is one of the aforementioned 500 oldest in the system. For you, lock avoidance is just as it would be in a non-data sharing system, because Db2 is using your table space's CLSN value to determine when it can retrieve committed data values from TABLE_A without having to issue S-lock requests for pages (or rows, as the case may be) in the table.

Now, let's say you're an application program accessing data in TABLE_X, which is in table space TSPACE_X, which is group buffer pool-dependent and has a CLSN that is not one of the 500 oldest in the system. What's the deal in that case? Here's the deal: for you, Db2 will use the CLSN that is the newest of the 500 oldest that it's cached. Is that going to be good with regard to the level of lock avoidance you get? It should be really good. Think about it. In a Db2 11 (or earlier) system, Db2 would be using, for you, for lock avoidance purposes, the oldest CLSN in the system. What if that CLSN is associated with a unit of work that's been chugging along for an hour, changing data in TABLE_Y (in TSPACE_Y), without committing? That really old CLSN (the global CLSN for the system) could mean a lot of lock requests not avoided for you. In a Db2 12 environment, the CLSN associated with the unit of work that started changing data in TABLE_Y an hour ago (and hasn't committed since) affects lock avoidance only for TSPACE_Y, not for TSPACE_X, which you're accessing. Does the next-oldest CLSN in the system, which is (let's suppose) 45 minutes old, affect the level of lock avoidance you're getting? Nope. How about the 10th oldest CLSN? Nope. The 100th oldest? Nope. The 352nd oldest? Uh-uh. Your level of lock avoidance is impacted by the 500th oldest CLSN in the system. Maybe that one is associated with a unit of work that has gone only a few seconds (or less) without a commit. Think you'll get more lock avoidance with a few-seconds-old CLSN versus an hour-old CLSN? Odds of that are real good, I'd say.

And there you have it - just one of multiple ways in which Db2 12 delivers improved performance versus its predecessor. Over the next several months in this blog I'll get to some of the other performance-boosting features of Db2 12. Lots to like there, I can tell you now.


Friday, September 29, 2017

Db2 for z/OS: Lock Avoidance

Not long ago, I served as part of the teach team for a Db2 12 for z/OS Technology Workshop that IBM offered in a city in the USA. During that workshop, I covered, among other things, a Db2 12 enhancement that can improve the CPU efficiency of a Db2 data sharing system by reducing locking activity in that environment. During a break, one of the workshop attendees told me (in words to this effect), "That lock avoidance enhancement is good news, but I think a fair number of Db2 for z/OS people are a little uncertain as to what lock avoidance is - and how it works - in the first place." I think this person was right on that score, and that's why I'm writing this blog entry: to facilitate understanding of the fundamentals of Db2 lock avoidance. In a follow-on entry that I'll try to post within the next couple of weeks, I'll cover the aforementioned Db2 12 data sharing-related lock avoidance enhancement.

OK, let's start with the "why" of Db2 lock avoidance. In support of a large application workload, Db2 does a lot of stuff. One thing that Db2 does about as much as anything else is issue lock and unlock requests. To see what I mean, take a look at a statistics long (aka statistics detail) report, generated by your Db2 monitor, covering a period of high activity on the system (or check the monitor's online display of lock activity). I looked at a report just now, reflecting activity in a large real-world production Db2 for z/OS subsystem, and what does it show? 106 million lock requests and 23 million unlock requests in one hour. That's about 30,000 lock requests per second, and about 6400 unlock requests per second, on average (there are more lock than unlock requests because multiple X-locks, acquired by an application process in the course of modifying data values, are released at a commit point with one unlock request). Now, the workload associated with all that lock activity is a big one, driving in excess of 700 million SQL data manipulation statements during the hour between the report's FROM and TO times, but that kind of volume is by no means unusual for a large Db2 site. Why the big lock request number? Two words: data integrity. Locks acquired on data pages or rows (depending on the granularity of locking in effect for a table space) help to ensure that data consistency is maintained in a Db2 for z/OS system, and that programs retrieve only committed data from the database (i.e., values that have been "hardened" in the database in the wake of data-change operations).

One lock request consumes very little in the way of CPU cycles, but tens of thousands of lock requests per second? That's another matter. When the execution volume for a given Db2 operation is really high, there is CPU-efficiency value in reducing the frequency of said operation. The Db2 development team had that in mind when, more than two decades ago, they delivered a capability called lock avoidance. Essentially, lock avoidance enables Db2 to reduce lock activity for an application workload by issuing S-lock requests (i.e., data-read locks associated with execution of queries) only when such locks are needed to ensure retrieval of committed data values.

How does Db2 know when an S-lock on.a page or row is needed to guarantee retrieval of committed data values (i.e., to avoid returning uncommitted data changes to a program)? It knows through the use of two lock-avoidance indicators: CLSNs and PUNC bits.

  • CLSNs - This acronym stands for "commit log sequence number." In every page of every Db2 page set, Db2 records the log point corresponding to the most recent update of the page's contents. Additionally, for every page set or partition (for partitioned objects), Db2 keeps track of the starting log point of the oldest still-in-flight unit of work that is changing data in the page set or partition. That latter log point is the commit log sequence number of the page set or partition. When going to retrieve data in a page, Db2 can check the page's last-updated log point and compare that to the CLSN of the page set or partition. Using simple numbers for example, suppose that a page's last-updated log point (information that, again, is stored in the page) is 20, and the CLSN of the page set or partition in which the page is located is 30. Because the starting log point of the oldest still-in-flight unit of work affecting the page set or partition is higher than (meaning, is later in time than) the log point of the last update of the page, Db2 knows that all the data in the page is in a committed state, so guaranteed-committed data can be retrieved from the page with no need for S-lock acquisition on the page (or on rows therein). When the  oldest still-in-flight data-changing unit of work affecting a page set or partition commits, the CLSN of the page set or partition moves up to the starting log point of what had previously been the next oldest still-in-flight data-changing unit of work affecting the page set or partition. Even if (using simple numbers as before) the last-updated log point of a page is seen to be 50, and the CLSN of the associated page set or partition is 40 (meaning that data in the page might have been updated by a still-in-flight data-changing unit of work), lock avoidance may still be possible for a data retrieval operation targeting the page, thanks to PUNC bits.  
  • PUNC bits - PUNC is short for "possibly uncommitted." Included in the control information on every Db2 data page and every index page are bits that indicate whether or not a row contains possibly uncommitted data (in the case of an index page, the PUNC bits are associated with the RIDs, or row IDs, that point to rows in the underlying Db2 table). When a row is changed, its PUNC bit is set. That being the case, when Db2 examines a row (or its RID in an index) and sees that the PUNC bit is NOT set, Db2 knows that the data in the row is committed, and the data can be retrieved without the need for an S-lock on the data page or row to ensure data committed-ness. So, why is this indicator called the "possibly" uncommitted bit, as opposed to the "for sure uncommitted" bit? Because Db2 does not synchronously reset a "turned on" PUNC bit when the data change that caused the bit to be set is committed - doing that would have an unacceptable overhead cost. Instead, PUNC bits that are turned on as a result of data-change activity are reset asynchronously, in the background, when certain events happen (one such event is execution of the REORG utility for a table space; another is when more than 25% of the rows in a page have their PUNC bits turned on and the page set's or partition's CLSN advances). Because of the asynchronous nature of PUNC bit resetting, relative to the turning on of a PUNC bit, the turned-on PUNC bit setting can only be interpreted as meaning, "Maybe the data in this row is committed and the PUNC bit hasn't been reset, or maybe the data is in fact not committed." When an application program wants only committed data values to be retrieved by queries (i.e., when isolation level UR, short for "uncommitted read," is not in effect for the program), "maybe" isn't good enough, and Db2 will request an S-lock on the row or page to ensure the committed state of data values (successful acquisition of an S-lock means that the page or row is not X-locked, and that means the row or page contains only committed data).

I want to impart now a couple more items of information pertaining to Db2 for z/OS lock avoidance. First, lock avoidance can be utilized to the maximum extent possible when a Db2 package is bound with ISOLATION(CS) - short for cursor stability - and CURRENTDATA(NO). Second, what I have described in this entry is lock avoidance as it occurs in a non-data sharing Db2 system. Lock avoidance in a Db2 data sharing environment - including the Db2 12 enhancement referenced at the start of this entry - will be the subject of my next post to this blog. I hope to have that written within the next two weeks, so check back in if you use - or are interested in - Db2 data sharing.

Wednesday, August 30, 2017

Db2 12 for z/OS SQL Enhancements: Advanced Triggers

In this, the fourth of a set of four entries covering SQL enhancements introduced with Db2 12 for z/OS, I will describe new trigger capabilities delivered by way of what are called "advanced triggers" (the kind of trigger that you could define prior to Db2 12 -- and which you can still create in a Db2 12 system -- is now referred to as a "basic trigger"). Before getting to the details of advanced trigger functionality, I want to emphasize that there are considerably more than four SQL-related enhancements delivered through Db2 12 -- I've just selected my four favorite of these enhancements for highlighting in this blog (the previous three entries in this series covered piece-wise DELETE, result set pagination, and the much-improved MERGE). A good source of information on the other SQL enhancements provided by Db2 12 is the "Application enablement" section of the "What's new" part of the Db2 12 Knowledge Center on the Web.

Note that advanced trigger functionality is available in a Db2 12 system when the activated function level is V12R1M500 or above.

OK, advanced triggers: the most important thing to know about this new kind of trigger is that it can contain (in the CREATE TRIGGER statement) a compound SQL statement. Basically, that means that you can define a trigger using SQL PL (aka SQL procedure language -- the same language that enables the creation of native SQL procedures and "native" SQL user-defined functions).

Before getting into the implications of SQL PL in the body of a trigger, I want to do a little level-setting. A trigger, for those who don't know, is a mechanism by which a data-changing action (INSERT, UPDATE, or DELETE) targeting one table can "trigger" the automatic execution of some other SQL action. A trigger can be "fired" before or after the "triggering" SQL statement has been executed. A very simple example: an AFTER UPDATE trigger can cause an UPDATE that changes column C1 of table T1 to drive an INSERT of some information into table C2.

Suppose you want the action taken when a trigger gets fired to be somewhat involved versus really simple. In that case, prior to Db2 12 it was often necessary to have the trigger call a stored procedure. That can be kind of clunky from a coding perspective. With the ability to code SQL PL in the body of an advanced trigger, you can drive a fairly sophisticated action when a triggering SQL statement executes, without having to put a stored procedure call in the trigger body (in essence, you can use SQL PL to put the equivalent of a native SQL procedure in an advanced trigger).

An example of an advanced trigger appears below, followed by some color-coded comments (this BEFORE INSERT trigger examines start and end times for classes in records to be inserted into a Db2 table, sets the end time to one hour after the start if the end time value is NULL, and returns an error if the class end time is after 9 PM):

CREATE TRIGGER MYTRIG01
BEFORE INSERT ON MYTAB
REFERENCING NEW AS N
FOR EACH ROW
ALLOW DEBUG MODE
QUALIFIER ADMF001
WHEN(N.ending IS NULL OR n.ending > '21:00')
L1: BEGIN ATOMIC
 IF (N.ending IS NULL) THEN
    SET N.ending = N.starting + 1 HOUR;
 END IF;
 IF (N.ending > '21:00') THEN
    SIGNAL SQLSTATE '80000'
    SET MESSAGE_TEXT = 'Class ending time is beyond 9 pm';
 END IF;
 SET GLOBAL_VAR = NEW.C1;
END L1#

Things to note about this advanced trigger:
  • You have some new options -- Because an advanced trigger can include SQL PL statements, you can debug it, just as you can debug a native SQL procedure or a compiled SQL scalar function (Data Studio is particularly handy for debugging SQL PL routines). Another new option for advanced triggers: you can provide a high-level qualifier to be used with unqualified objects referenced in the body of the trigger.
  • You can include SQL control statements (i.e., logic flow control statements) in the body of the trigger -- IF (shown in the example) is one such statement. Among the others are ITERATE, LOOP, REPEAT, and WHILE. These SQL PL statements enable the coding of a trigger that has pretty sophisticated functionality.
  • There are new possibilities for the SET statement -- With an advanced trigger, SET is not restricted to transition variables -- it can also be used with global variables and SQL variables (the latter term refers to variables declared in the body of the trigger).
And, there's something that's notable by its absence in the example CREATE TRIGGER statement -- namely, the phrase MODE DB2SQL. It is, in fact, the absence of MODE DB2SQL in a CREATE TRIGGER statement that indicates that the trigger will be an advanced trigger, as opposed to a basic trigger.

Besides providing advanced functionality versus basic triggers, advanced triggers eliminate what had been a really vexing problem encountered by many users of (what are now called basic) triggers -- a problem best illustrated by example. Suppose you create three basic triggers on a table, all of which are "fired" when a particular type of statement targets the table (e.g., an UPDATE of a certain column in the table), and all of which "fire" in the same relative time period with regard to the execution of a triggering SQL statement (e.g., all three are AFTER triggers). In that case, the order in which the triggers will fire is determined by the order in which they were created: if trigger A was created first, then trigger B and then trigger C, they will fire in that order (A then B then C) upon the execution of a triggering SQL statement. Let's say that this A-B-C trigger firing sequence is important to you. Now, suppose that that trigger A has to be modified. Only way to get that done with a basic trigger is DROP and re-CREATE. But wait! That re-CREATE will make trigger A the last of the three triggers in this example to be created, resulting in an undesirable firing order of B then C then A when a triggering SQL statement is executed. How do you change trigger A and preserve the desired A-B-C firing order? Here's how: you DROP all three triggers, then re-CREATE all three in A-B-C order. What a hassle!

Along comes advanced trigger functionality, and this problem is solved -- and not only solved, but fixed by your choice of three options. All three of these hassle-free trigger modification options are made possible by the fact that advanced triggers, like native SQL procedures and compiled SQL scalar functions, have versions. So, back to the example of the preceding paragraph: you have triggers A, B, and C, and you want to change A while maintaining the A-B-C firing sequence. If they are now advanced triggers, no problem! Here are your three -- count 'em: three -- options for changing advanced trigger A without messing up the firing sequence of the triggers (I'll assume that the current version of trigger A is V2, and I'll highlight syntax that is new with Db2 12 in red):

  • CREATE OR REPLACE TRIGGER A VERSION V2…
  • ALTER TRIGGER A REPLACE VERSION V2…
  • ALTER TRIGGER A ADD VERSION V3… followed by ALTER TRIGGER A ACTIVATE VERSION V3
Choose any of the above options, and you successfully modify trigger A while maintaining the A-B-C firing sequence of the three triggers, without having to DROP and re-CREATE triggers B and C. And the crowd goes wild!

Well, there you have it: another incentive to get to Db2 12 for z/OS (and to activate function level V12R1M500 or later). Enjoy those Db2 12 SQL enhancements, folks!

Thursday, August 17, 2017

Db2 12 for z/OS SQL Enhancements: a Better MERGE

Greetings, and welcome to the third in a series of entries covering my favorite Db2 12 for z/OS SQL enhancements. The first two posts in the series described piece-wise DELETE and result set pagination. This entry focuses on the new and very much improved MERGE functionality delivered via Db2 12 (usable at function level V12R1M500 and above).

The MERGE statement, introduced with Db2 9 for z/OS, is sometimes referred to as the "upsert" statement, because it enabled, via a combination of update and/or insert operations, the "merging" of one "table" into another (I'll explain the quotes around "table" momentarily): in a MERGE statement, you'd indicate what constitutes a match between an input "table" "row" (again, quote marks to be explained) and a target table row, and where a match exists the target table row is updated with information in the matching input "table" "row," and when a match doesn't exists the input "table" "row" is inserted into the target table. That functionality sounds pretty useful, but the initial Db2 for z/OS implementation of MERGE left a good bit to be desired (read on to see what I mean).

What's great about the new versus the old Db2 for z/OS MERGE statement is best shown by way of example. Here is what a pre-Db2 12 MERGE statement might look like, with color-coded complaints following:

MERGE INTO ACCOUNT AS A
USING (VALUES (:hv_id, :hv_amount)
FOR 3 ROWS)
AS T (ID, AMOUNT)
ON (A.ID = T.ID)
WHEN MATCHED
 THEN UPDATE SET BALANCE = A.BALANCE + T.AMOUNT
WHEN NOT MATCHED THEN INSERT (ID, BALANCE)
VALUES (T.ID, T.AMOUNT)
NOT ATOMIC CONTINUE ON SQLEXCEPTION;

Complaint: the input "table" has to be represented as a series of host variable arrays -- one for each "column" of the "table" -- Thus the quotation marks I've been putting around input "table" and "row." It's not really a Db2 table that's used for MERGE-input purposes. It's a clunky representation of a table. What a hassle.

Complaint: only a very simple row-match qualification can be specified, and only a simple pair of actions are possible: when a match is found, do this update, and when a match is not found, do this insert -- Not a lot of sophistication or flexibility here.

Complaint: the NOT ATOMIC CONTINUE ON SQLEXCEPTION clause, which indicates that input "rows" (if there are several of them) are processed separately, with processing continuing in the event that an error is encountered for a given row, is required -- You might not want that behavior.

And, on top of all that, a target table row can be operated on more than once in a single execution of a pre-Db2 12 MERGE statement (it could be inserted and subsequently updated) – Again, you might not want that behavior.

Along comes Db2 12 (at function level V12R1M500 or above, as previously noted), and boy, is MERGE ever better than it was before. I'll stick with the color-coded-example approach to show how MERGE has been enhanced. A Db2 12 MERGE might look like this:

MERGE INTO RECORDS AR
USING (SELECT ACTIVITY, DESCRIPTION, DATE, LAST_MODIFIED
FROM ACTIVITIES_GROUPA) AC
ON (AR.ACTIVITY = AC.ACTIVITY) AND AR.GROUP = ’A’
WHEN MATCHED AND AC.DATE IS NULL THEN SIGNAL SQLSTATE ’70001’
SET MESSAGE_TEXT =
AC.ACTIVITY CONCAT ’ CANNOT BE MODIFIED. REASON: DATE IS NOT KNOWN’
WHEN MATCHED AND AC.DATE < CURRENT DATE THEN DELETE
WHEN MATCHED AND AR.LAST_MODIFIED < AC.LAST_MODIFIED THEN
UPDATE SET
(DESCRIPTION, DATE, LAST_MODIFIED) = (AC.DESCRIPTION, AC.DATE, DEFAULT)
WHEN NOT MATCHED AND AC.DATE IS NULL THEN SIGNAL SQLSTATE ’70002’
SET MESSAGE_TEXT =
AC.ACTIVITY CONCAT ’ CANNOT BE INSERTED. REASON: DATE IS NOT KNOWN’
WHEN NOT MATCHED AND AC.DATE >= CURRENT DATE THEN
INSERT (GROUP, ACTIVITY, DESCRIPTION, DATE)
VALUES (’A’, AC.ACTIVITY, AC.DESCRIPTION, AC.DATE)
ELSE IGNORE;

Love it: with Db2 12's new-and-improved MERGE, you can actually merge an honest-to-goodness Db2 table with another table; or, you can merge a view with a table; or (as shown in this example), you can merge the result set of a SELECT statement with a table -- So long, host variable arrays. Can't say I'll miss you.

Love it: in one MERGE statement, you can have a multitude of “when matched” and “when not matched” clauses, differentiated through various additional predicates -- Benefit: a number of update, insert, and/or delete actions can be driven via execution of a single MERGE statement (and, yes, the ability to drive DELETE operations -- in addition to UPDATE and DELETE -- via a MERGE statement is another part of the "new-and-improved" story).

Love it: you can use the SQL statement SIGNAL to provide customized error codes and messages -- Maybe you could have some fun with that, like, SET MESSAGE TEXT = 'BETTER LUCK NEXT TIME'

Love it: there's a new IGNORE option -- MERGE input rows probably hate to be ignored, but sometimes it's just necessary.

And on top of all that, with the new-and-improved MERGE, a target table row can be operated on (via INSERT, UPDATE, or DELETE) only once -- Personally, I kind of like that.

And one more thing: if an error is encountered during execution of a new-and-improved MERGE, the whole statement is rolled back -- It's all or nuthin'.

Now, if you find yourself getting all sentimental and missing the pre-Db2 12 MERGE statement, you can always bring it back: just include in your coding of the statement the NOT ATOMIC CONTINUE ON SQLEXCEPTION clause. Do that, and MERGE behavior will be as it was prior to Db2 12, right down to the requirement that input "table" "rows" be in the form of host variable arrays.

Me? I'll take the new MERGE.

Friday, July 21, 2017

Db2 12 for z/OS SQL Enhancements: Result Set Pagination

In this second of a series of entries on my favorite SQL-related enhancements provided via Db2 12 for z/OS (and "Db2," versus "DB2," is officially the new form of the product name), I will describe useful functionality pertaining to result set pagination (my first post in this series covered piece-wise DELETE).

"Result set pagination" refers to retrieval of a query result set in "pages," with each page including a sub-set of the result set's rows. A typical scenario might involve presentation to a user of employee names (with, perhaps, accompanying information), starting with a particular name and going forward from there in ascending sequence based on last name followed by first name. Let's suppose that such a request has been initiated by a user, with the supplied "starting point" name being MARY ERIKSON.

Prior to Db2 12 for z/OS, the programmatic way of dealing with such a request was not entirely developer-friendly, in a couple of respects. First, there's the matter of generating the desired result set. That required a predicate with the following syntax:

WHERE (LASTNAME = ‘ERIKSON’ AND FIRSTNAME >= ‘MARY’) OR (LASTNAME > ‘ERIKSON’)

A little clunky; and, it gets clunkier as more columns are examined (as would be the case if, for example, the request were to display names in ascending order by last name, then first name, then middle name, beginning with MARY TAYLOR ERIKSON).

The second not-totally-developer-friendly issue arises when the second page (and subsequent pages) of the result set has to be retrieved. Getting the first page is easy: you just fetch the first, say, 20 rows of the result set (if rows are to be displayed 20 to a page). How about getting the second page of 20 rows? Generally speaking, you could retrieve page n+1 of the result set by taking the last row in page n and plugging values from that row into a query predicate that would get you to the start of the next page's rows (and then fetch forward from there), utilizing the rather cumbersome syntax shown above. Referring to the example I've been using, if page 1 of the name-ordered result set ends with SAMUEL FIGGINS, page 2 could be retrieved via a query coded with the following predicate:

WHERE (LASTNAME = ‘FIGGINS’ AND FIRSTNAME > ‘SAMUEL’) OR (LASTNAME > ‘FIGGINS’)

Again, this gets clunkier if more than two columns are to be considered for row ordering.

Alternatively, you could use ordinal positioning within the result set to generate succeeding pages, perhaps using a query with the OLAP specification ROW_NUMBER; or, you could just issue your initial result-set-building query, and fetch the first 40 rows and throw away the first 20 of those to display rows 21-40 in page 2 (an approach that gets progressively more wasteful, overhead-wise, as you progress through the result set's pages). Maybe you could use a scrollable cursor (though in some cases it might not be feasible to leave a cursor open for the purpose of retrieving succeeding pages of rows). In any case, you're looking at a level of coding complexity that is probably greater than you'd like.

Things get easier all around with Db2 12 (at function level V12R1M500 or later). For starters, initial result set generation is easier because of a newly supported predicate syntax. Want rows in last name, first name sequence, starting with ERIKSON, MARY? Just use this predicate:

WHERE (LASTNAME, FIRSTNAME) => (‘ERIKSON’, ‘MARY’)

And that simple syntax stays simple if more columns (e.g., middle name) are considered for row ordering.

Retrieval of succeeding result set pages also gets easier with Db2 12. You actually have a couple of alternative approaches, both simpler versus the pre-Db2 12 options. One option would be to take the last name of page n of the result set and use it to get to the starting point of page n+1, utilizing the simple syntax shown above. If, for example, FIGGINS, SAMUEL is in the last row of page 1 of our example result set, we can get to the starting point of page 2 by way of this predicate:

WHERE (LASTNAME, FIRSTNAME) => (‘FIGGINS’, ‘SAMUEL’)

The second of the two new options also involves the use of SQL syntax previously unavailable with Db2 for z/OS: to get (keeping with the same example) the second page of 20 result set rows, just re-issue the query initially used to generate the result set, with this clause added (the new-with-Db2 12 part is underlined):

OFFSET 20 ROWS FETCH FIRST 20 ROWS ONLY

To get page 3, you'd issue the initial query with the clause OFFSET 40 ROWS FETCH FIRST 20 ROWS ONLY, and so on for subsequent pages, each time bumping the value "n" in OFFSET n ROWS up by 20, if you want 20 rows per page (and for the sake of consistency, you might issue the query used to get page 1 with the clause OFFSET 0 ROWS FETCH FIRST 20 ROWS ONLY). Note that both the OFFSET value and the FETCH FIRST value can be provided via variables, so the clause could look like this:

OFFSET :hv1 ROWS FETCH FIRST :hv2 ROWS ONLY

That way, the initial query can be used again and again as a user pages through the result set, with values provided as needed to get to the starting row of page n (and this approach also allows for the number of displayed rows to change, if desired, as pages beyond the first, or beyond the first few, are displayed).

Of the two new (with Db2 12) SQL pagination options, which should you use? The OFFSET n ROWS FETCH FIRST m ROWS approach delivers maximum coding simplicity. The other approach (plug values from the last row of page n into a query predicate to get to the starting point of page n+1, utilizing the simplified syntax shown previously) can provide a performance edge, as it takes you directly to the start of page n+1, whereas the OFFSET n ROWS option requires skipping over the first "n" rows of the initial result set -- an action that has some overhead cost (the performance difference could increase as the value of "n" in the OFFSET n ROWS clause gets larger). You can decide whether you want to go the maximally simple route, or the not-quite-as-simple-but-still-pretty-simple route that can be more efficient.

Whichever choice you make, it's a better one, from a developer's perspective, than the options available in a pre-Db2 12 environment.

More of my favorite Db2 12 SQL-related enhancements will be covered in subsequent posts to this blog, so check back in.

Friday, June 30, 2017

DB2 12 for z/OS SQL Enhancements: Piece-Wise DELETE

DB2 12 for z/OS, which became generally available in October of 2016, delivered a number of key enhancements pertaining to SQL statements. Over the course of a few blog entries, I'll describe some of my favorites among these enhancements. This first post in the series will cover piece-wise DELETE.

First, the backdrop: suppose you have a need to remove a large number of rows from a table, using SQL (I say, "using SQL," because a utility-based alternative, REORG with DISCARD, is not always feasible or desirable). How might you do that? Well, if the requirement is to remove all rows from table T1 that have a value greater than 100 in column C1, you could simply execute this statement:

DELETE FROM T1 WHERE C1 > 100;

Done, right? No so fast. How many rows are qualified by the C1 > 100 predicate? Suppose T1 has 500 million rows, and 50 million of those rows have a value greater than 100 in column C1? In that case, what happens when the DELETE statement shown above is executed? For one thing, the application process associated with the DELETE will potentially acquire a LOT of page or row locks (depending on whether LOCKSIZE PAGE or LOCKSIZE ROW is in effect for the table space in which T1 resides). That, in turn, could lead to the deleting program getting an error SQL code if the limit on the number of locks that can be acquired by one process (NUMLKUS in ZPARM) is reached. If that limit is not reached because lock escalation occurs (as it might, depending on the NUMLKTS value in ZPARM and the LOCKMAX setting in effect for T1's table space), other programs will not be able to access T1 until the DELETE statement completes and a COMMIT is issued by the deleting program. If the value of NUMLKUS in ZPARM is high enough so as to not be hit when the DELETE executes (or if NUMLKUS is set to 0, meaning there is no limit on the number of locks that can be held at one time by one process), and if lock escalation does not occur (because NUMLKTS is set to 0 or to a very high number, or if the table space's LOCKMAX value is 0 or a very high number), IRLM could run short on storage because of all the lock control blocks (and, in a DB2 data sharing environment, the lock list portion of the lock structure could fill up) -- not good for system stability. Oh, and one more thing: suppose that 50-million-row-impacting DELETE statement gets through 49 million rows and then encounters a problem that causes it to fail. What are you looking at then? One BIG rollback operation.

OK, so removing lots and lots of rows from a table with a single DELETE statement can be problematic. Is there a SQL alternative? Yes, but the alternative available in a pre-DB2 12 environment is not one about which an application developer would be enthused. I'm talking here about a cursor-based DELETE approach. That would involve writing a program that declares a cursor through which the to-be-deleted rows would be identified (using the preceding example, row identification would be done via a SELECT from T1 with the WHERE C1 > 100 predicate), with the DECLARE CURSOR statement including the WITH HOLD option to maintain cursor position through commits (the FOR UPDATE clause might also be required for the DECLARE CURSOR statement). The program would then OPEN the cursor, FETCH a row, and perform a DELETE WHERE CURRENT OF operation. Then FETCH another row, do another DELETE WHERE CURRENT OF, and so on, with commits performed periodically so as to release acquired locks, until all to-be-removed rows have been deleted from the target (a variant of this approach involves deleting blocks of rows using a cursor declared WITH ROWSET POSITIONING). Do-able, yes, but kind of clunky from a programming perspective. Wouldn't it be nice if there were a SQL mechanism through which large-scale DELETEs could be accomplished in a way that would be programmer-friendly while at the same time avoiding the problems associated with massive accumulation of locks (or with lock escalation)? Well, thanks to DB2 12 for z/OS (at function level V12R1M500 or above), that mechanism is here. It's called piece-wise DELETE.

Using piece-wise DELETE, how easy is it to perform a large-scale row-removal operation that does not gum up the works, concurrency-wise? This easy (and here I'll assume that we want to remove rows from the target table in chunks of 500):

  1. Issue DELETE FROM T1 WHERE C1 > 100 FETCH FIRST 500 ROWS ONLY;
  2. Issue a COMMIT
  3. Repeat steps 1 and 2 until all to-be-removed rows have been deleted
The part of the DELETE statement highlighted in red above is syntax that's newly supported with DB2 12 (again, at function level V12R1M500 or above): FETCH FIRST n ROWS ONLY can be coded in a DELETE statement, and doing that makes the DELETE of the piece-wise variety. Every execution of the statement (as coded above) will remove 500 rows from the target table.

What if you sometimes want to use your piece-wise DELETE program to remove rows in chunks of 50, or 1000, instead of 500? No prob. You can use a variable to specify the to-be-deleted quantity in the DELETE statement's FETCH FIRST clause. And, if you're using a variable to provide the FETCH FIRST value, you can change the row-removal "chunk" size as a large-scale DELETE operation is executing.

There you have it -- an easy-to-program means of breaking up a DELETE operation into bite-sized pieces (or bite-sized units of work, in more technical parlance). In blog posts to come I'll cover others of my favorite DB2 12 SQL enhancements. I hope you'll check those out.

Wednesday, May 24, 2017

DB2 for z/OS and Data-as-a-Service and Database-as-a-Service

______-as-a-service is all the rage in IT these days, and understandably so -- the "service-ization" of information technology capabilities and interfaces will be transformational; indeed, transformations have already occurred or are underway at many organizations and within many application development and deployment environments. In this blog entry I want to highlight a couple of ______-as-a-service concepts, data-as-a-service (aka DaaS) and database-as-a-service (aka DBaaS), and examine their relevance to DB2 for z/OS. First, a little level-setting terminology clarification is in order.

Data-as-a-service and database-as-a-service -- NOT the same thing

Some folks use the terms "data-as-a-service" and "database-as-a-service" interchangeably, but they do not have the same meaning. The distinction is indicated by the different first words of the terms. When your're talking database-as-a-service, you're talking about the functionality of a database management system, provided as a service. So, what does that mean? Well, it can mean that the database management system in question is deployed in an off-premise cloud environment, but database-as-a-service can also apply to a DBMS that is deployed in an on-premise fashion, and that brings me to what I feel is a very important point: "cloud" doesn't necessarily mean "off-premise," and "off-premise" doesn't necessarily mean "cloud." Take the latter part of this point I just made. For a LONG time, numerous organizations have contracted with various companies (IBM among them) to manage their IT operations. IT outsourcing company X might run organization Y's IT operations out of a data center owned by X. That's off-premise from Y's perspective, but there may be nothing cloudy about the way in which Y's IT operations are managed by X. Similarly, an organization could manage some or all of its IT resources in a cloudy way, even with all the IT resources managed in that way being on-premise. Cloud is about consumption of ______ as a service. To those who consume the service(s) provided, the location of the mechanism of service provision is (or should be) irrelevant.

OK, if database-as-a-service can be implemented in an on-premise way (and again, it can), what kind of services are we talking about, and who consumes those services? Quite often, the consumers are application developers, and the services provided are things like a new schema, or a new database instance, or a data backup or restore operation. More on this a little later on.

While database-as-a-service is concerned with the requisitioning and provisioning of database management system functionality, data-as-a-service is about the programmatic interface to a data server. "Database" is not part of the term because a user, who could be a developer writing a client application, has no need to know (and likely has no desire to know) whether or not a database management system is on the server side of a data request. Many application developers -- even those with strong SQL skills -- see the mechanism by which a data request is serviced (which could be a relational database management system such as DB2 for z/OS) as being part of back-end "plumbing," and plumbing is not their concern. They want a straightforward and consistent interface through which CRUD data services (create, read, update, delete) can be invoked -- an interface that abstracts virtually all of the particulars of the service-providing system. REST (short for REpresentational State Transfer) is one such interface, and in recent years it has become a very popular means of enabling data-as-a-service application architecture.

Data-as-a-service in a DB2 for z/OS context

Responding to the growing preference on the part of application developers for as-a-service interaction with data-providing systems, IBM built a native REST interface into DB2 12 for z/OS, and retrofitted that interface to DB2 11 (via the fix for APAR PI66828). DB2's native REST interface is an extension of the DB2 distributed data facility (DDF), so it leverages existing DDF capabilities such as thread pooling, profile monitoring, workload classification, and accounting and statistics tracing. Another benefit of the DDF tie-in: a SQL statement invoked by way of DB2's REST interface will execute under a preemptible SRB in the DDF address space, making the statement zIIP-eligible (up to 60%).

By way of DB2's REST interface, a single static SQL statement can be exposed in the form of a REST API. Before you start thinking of that "single statement" thing as overly limiting, consider that the single statement can be a call to a DB2 stored procedure (if you decide to make a stored procedure a RESTful service, I'd recommend going with a native SQL procedure, because a native SQL procedure is up to 60% zIIP-eligible when it executes under a DDF preemptible SRB).

In addition to enabling invocation of SQL statements via REST calls, DB2's REST-related capabilities support creation, discovery, management, and securing of SQL-based RESTful services. That functionality becomes more extensive (and more user-friendly) when z/OS Connect front-ends DB2's REST interface (i.e., when DB2 is a REST provider to z/OS Connect); and, z/OS Connect makes all kinds of z/OS-based programmatic assets available to REST clients: not only SQL statements (and stored procedures), but CICS transactions, IMS transactions, WebSphere Application Server for z/OS transactions, and batch jobs, as well.

Database-as-a-service in a DB2 for z/OS context

Quite a lot of work has been going on behind the scenes to provide database-as-a-service capabilities for DB2 for z/OS, and the first fruits of those efforts recently showed up in the form of a pair of DB2 APARs -- more on that in a moment.

Recall my mentioning, early on in this blog entry, that a raison d'etre for an on-premise database-as-a-service implementation could be easy and speedy requisitioning and provisioning of DBMS functions to support the work of application developers. The team within IBM DB2 for z/OS development that is leading the database-as-a-service charge has been focused on precisely that. And so it is that we have new DB2 for z/OS database-as-a-service capabilities that have recently been introduced via the fixes for APARs PI73168 (for DB2 11) and PI73492 (for DB2 12). These APARs leverage the functionality of zOSMF (the z/OS Management Facility), a component of the z/OS operating system that can drive the execution of "workflows" (roughly analogous to what are referred to as "scripts" in the distributed systems world) in response to REST calls. The APARs make available zOSMF workflows that can provide three database functions in an as-a-service manner: creation of a new database, creation of a new schema, and creation of a copy of an existing schema. The latter function, referred to as "schema like" (as in, "create a new schema like that one"), involves using, in addition to z/OSMF, the capabilities of the IBM DB2 Change Management Solution Pack for z/OS, which itself combines the functionality of the IBM DB2 Administration Tool and the IBM Object Comparison Tool.

More DB2 for z/OS database-as-a-service capabilities are being worked on, so watch this space.

All of this is good news for people who support DB2 for z/OS. Making data managed by DB2, and making database functionality provided by DB2, more easily consumable will help development teams to be more agile and more productive in building applications that interact with DB2. Additionally, more fully automating the requisitioning and provisioning of DB2 database management system resources will free DBAs from lots of "blocking and tackling" work around things like setting up new databases and schemas in development environments, and that should enable them to spend more time engaged in higher-value activities like working with developers on the design and architecture of new DB2-accessing applications.

Sunday, April 30, 2017

DB2 for z/OS: Something You Might Not Know About Large Page Frames in a Post-DB2 10 Environment

I well remember the first time I saw it. August, 2016. I was looking at output from the DB2 for z/OS command -DISPLAY BUFFERPOOL(ACTIVE) DETAIL, issued on a DB2 11 system, that a client had sent to me for review. I appreciated the fact that I could see in this command output information about DB2's use of large real storage page frames for page-fixed buffer pools (in a DB2 10 environment, one had to issue -DISPLAY BUFFERPOOL -- for an individual pool, versus for all active pools -- with an odd syntax to get this information). I saw pools for which PGFIX(YES) had been specified, and observed that, as expected, DB2 showed that the preferred real storage frame size for these pools was 1 MB. Then I noticed something I couldn't explain: there was a buffer pool, with a PGFIX(YES) specification, and DB2 was telling me, 

DSNB546I  - PREFERRED FRAME SIZE 4K

Huh? Why would DB2 prefer 4 KB-sized page frames for that pool? PGFIX(YES) buffer pools are supposed to be backed by 1 MB page frames, right? The indicated preference for 4 KB page frames was not the result of the pool being smaller than 1 MB: the pool's VPSIZE value was several thousand, and all it takes is 256 buffers of 4 KB apiece to fill a 1 MB frame. I continued through the -DISPLAY BUFFERPOOL output, and got even more confused. There were some PGFIX(YES) pools for which 1 MB was the indicated page frame size preference, and other PGFIX(YES) pools for which 4 KB was seen to be the preferred real storage page frame size. There did not appear to me to be any rhyme or reason for this variance. I moved on, and ultimately forgot about the strange-looking DISPLAY BUFFERPOOL output.

Fast forward about seven months, and I'm talking to a group of people about DB2 for z/OS buffer pools and large real storage page frames. I told my audience that DB2 will automatically seek to allocate a PGFIX(YES) buffer pool using 1 MB page frames. "That's not true," said a person in the room. I was pretty sure that this guy was wrong on that point, but in the interest of time I told him, "Let's talk about this offline." Over the next couple of days, this individual and I exchanged e-mails on the topic, and he made a very good argument in support of his contention and I tried some things on an IBM-internal DB2 for z/OS subsystem and, by golly, the man was right.

Here's the deal: in a DB2 10 for z/OS environment, 1 MB page frames will be preferred for a buffer pools for which PGFIX(YES) has been specified. In a DB2 11 (or later) system, that may be the case. Why "may," instead of "will?" Because DB2 11 introduced the FRAMESIZE option for the -ALTER BUFFERPOOL command. That option lets you tell DB2 of your preference for page frame size for a buffer pool, and for a PGFIX(YES) pool your preference becomes DB2's preference (assuming that there are enough buffers, per the pool's VPSIZE value, to fill at least one of the page frames of the size indicated via the FRAMESIZE specification). As DB2 10 did not have a FRAMESIZE option for ALTER BUFFERPOOL, that specification was, in essence, implicitly conjoined with the PGFIX specification -- when you requested, in a DB2 10 system, that a pool's buffers be fixed in the z/OS LPAR's real storage, you were also implicitly requesting allocation of the pool's buffers in 1 MB real storage page frames (subject to the availability of such page frames in the LPAR).

DB2 11 for z/OS introduced a choice for real storage page frame size preference for a page-fixed buffer pool: 1 MB or 2 GB. The FRAMESIZE option was added to the -ALTER BUFFERPOOL command to enable explicit specification of the preferred page frame size for a pool. It was made pretty clear that use of 2 GB page frames for a buffer pool required a FRAMESIZE(2G) specification for the pool. Lots of people (myself included) assumed that specifying FRAMESIZE(1M) was not required if one wanted a page-fixed buffer pool to be allocated using 1 MB page frames. We in this camp believed that FRAMESIZE(1M) was automatically specified "under the covers" in a DB2 11 system when an -ALTER BUFFERPOOL command with PGFIX(YES) was issued. As it turns out, that was a FALSE ASSUMPTION on our part. The facts are these: first, in a DB2 11 (or later) system, 1 MB will be the preferred real storage page frame size, even without a FRAMESIZE(1M) request, for a pool that was page-fixed and in-use in the DB2 10 environment, prior to the migration of the system to DB2 11. Second, for a pool newly allocated in a DB2 11 (or later) environment, an explicit FRAMESIZE(1M) specification is required if one wants 1 MB to be the preferred real storage page frame size for a pool.

To put this another way: as part of the migration of a DB2 10 system to DB2 11, existing page-fixed buffer pools were "migrated" to the DB2 11 environment with an automatic FRAMESIZE(1M) specification added to the pools' definition. That makes sense, because it preserves the behavior of the pools in the DB2 11 environment that was in effect in the DB2 10 system. For new pools allocated in a DB2 11 (or later) system, because requesting a page frame size preference is a separate action from requesting page-fixed buffers, you have to explicitly specify FRAMESIZE(1M) or FRAMESIZE(2G) in order make the preferred frame size 1 MB or 2 GB, respectively.

Here are your take-aways:

  1. If your DB2 for z/OS environment is at Version 11 (or later), issue the command DISPLAY BUFFERPOOL(ACTIVE) DETAIL, and examine the output for each pool. Do you see any pools for which the PGFIX attribute is YES, and the indicated PREFERRED FRAME SIZE is 4K? If yes, it is likely that those pools were in use before the system was migrated to DB2 11. If you want 1 MB page frames to be used for the pools, alter them with a FRAMESIZE(1M) specification.
  2. Even if all of the PREFERRED FRAME SIZE attributes for your buffer pools are as you want them to be, give some thought to altering each pool (or at least each pool for which PGFIX(YES) is an attribute) with the appropriate FRAMESIZE specification. I think that there is value in having an explicit FRAMESIZE specification for each of your buffer pools (or, again, at least each of the page-fixed pools), even if that explicit frame size preference is the same as the one that is in effect by default for a pool. This serves to get you into the "make it clear by making it explicit" mind set, and that can be a sound aspect of your buffer pool management strategy.
As always, thanks for visiting the blog.

Sunday, March 26, 2017

DB2 for z/OS: Running REORG to Reclaim Disk Space

Think of why you run the DB2 for z/OS REORG utility, and a number of reasons are likely to come quickly to mind: to restore row order per a table's clustering key; to reestablish free space (for inserts and/or for updates); to remove the AREO* status set for a table space following (for example) an ALTER TABLE ADD COLUMN operation; or to materialize a pending DDL change such as an enlargement of a table space's DSSIZE. How about disk space reclamation? If that REORG motivation has not previously occurred to you, perhaps it should.

Recently, a DBA at a large DB2 for z/OS site communicated to me the success that his team has had in reclaiming substantial amounts of disk space through online reorganization of certain table spaces. He also asked for a recommendation with regard to identifying table spaces for which a REORG could potentially deliver significantly reduced disk space consumption. In this blog entry, I'll describe the disk space reclamation scenario reported by the referenced DBA, I'll explain why there was space to be reclaimed in some of the table spaces administered by the DBA, and I'll provide the "reclamation indicator" metric that I suggested to the DBA as a means of identifying table spaces that could be reorganized in order to free up disk space.

First, the scenario. At the DBA's site, there are some tables, in segmented table spaces ("traditional" segmented table spaces, as opposed to universal table spaces, which also happen to be segmented), that have these key characteristics: they are clustered by a continuously-ascending key (so that "new" rows go to the "end" of the table), and the number of inserts into the table is roughly equaled by the number of rows that are deleted from the table over a period of time.

The DB2 DBA knew that for table spaces with the above-described characteristics, REORGs were not needed to maintain "clusteredness," because of the continuously-ascending clustering key that sent new rows to the end of the table (at least, clustering would remain in good shape until the table space reached its size limit -- more on this in a moment). For the same reason, free space for inserts in "interior" pages of the table space was not a concern. Still, with DB2 real-time statistics showing a very large number of inserts since the last REORG of a couple of these table spaces, the DBA determined that reorganizations might be in order. Online REORGs of the table spaces were executed, and the result was a freeing up of 64 GB of disk space: one table space went from 21 to 4 data sets of 2 GB apiece, and the other went from 17 data sets to 2 (a DB2 segmented table space is comprised of up to 32 data sets of 2 GB apiece, and that is why its size limit is 64 GB).

Why was there so much unused space in these table spaces? Because the continuously-ascending clustering key kept pushing the "end" of the table spaces "outward." Why would that happen? Why would DB2 grow these table spaces as a result of inserts, given the like number of row-delete operations that were targeting the associated tables? Shouldn't DB2 have been using the space freed up by deletes to accommodate new inserts, without growing the table space's size? Actually, DB2 was working as designed. It's true that, given a continuously-ascending clustering key and some deletes of older rows from the "front" of a table space, DB2 can "wrap" to the front and start inserting new rows in space cleared by deletes, but that will only happen if DB2 cannot extend the table space (i.e., if DB2 cannot make the table space larger). If DB2 can extend a segmented table space, it will in order to preserve a table's row-clustering order; so, in advance of hitting the 64 GB size limit for a segmented table space, DB2 would keep making the table space larger so that it could keep adding rows to the end of a table (assuming a continuously-ascending clustering key), and deletes of older rows would result in ever-larger amounts of available-but-unused space in the table space. That's why the disk footprint of the two table spaces became so much smaller following reorganization.

[It is important to keep in mind that, given a continuously-ascending clustering key and at least some row-delete operations, DB2 will insert new rows in the "front" of a segmented table space, using space freed up by DELETEs, if the table space cannot be made any larger (either because of reaching the 64 GB limit or as a result of running into a maximum-extents or a maximum-volumes situation). In that case, "wrapping to the front" for new inserts is better than failing the inserts.]

Having achieved this disk space reclamation success through REORGs, the aforementioned DBA wanted to add "potential for significant disk space reclamation" to the criteria used at his site for identifying table spaces that should be reorganized (a good proactice -- REORG table spaces when you have a good reason for doing so, not just "because it's been X amount of time since the last time this table space was REORGed"). How could he and his colleagues spot table spaces with large amounts of unused space? My recommendation: use for this purpose the ratio of disk space occupied by the table space to space in the table space occupied by rows in the table space. For the numerator, I'd use the SPACE value in the row for the table space in the SYSTABLESPACE catalog table. That value is updated when the STOSPACE utility is executed, so you would want to run STOSPACE on a regular basis (that should not be a big deal -- STOSPACE should be a very inexpensive utility to execute, CPU-wise). For the denominator, I would use the product of TOTALROWS from SYSTABLESPACESTATS (set by REORG and updated when INSERTs and DELETEs are executed) and AVGROWLEN in SYSTABLESPACE (updated by RUNSTATS, or by in-line statistics collected during REORG or LOAD). You can decide when that ratio would prompt you to run REORG to reclaim space. Would you do that when disk-space-to-row-space hits 2 (i.e., when the size of the table space is 2X the space occupied by rows)? When it hits 3? When it hits 4? One of those values might be reasonable for your environment.

One more thing: I have focused on traditional segmented table spaces in this blog entry because that is the table space type to which space reclamation via REORG is most relevant. For a range-partitioned table space, a given partition's size limit is determined by the DSSIZE specification, and the same is true for a partition-by-growth table space. Yes, you could see a partition-by-growth table space come to contain a high percentage of unused space given the combination of a continuously-ascending clustering key and a good deal of DELETE activity, but you could put a limit on that growth by way of a not-larger-than-needed MAXPARTITIONS value. With that said, even with range-partitioned and partition-by-growth table spaces you could see situations in which the ratio of table space size to space occupied by rows (the ratio described in the preceding paragraph) could get to be high enough to make a REORG attractive from a disk space reclamation perspective. And here there's some good news: starting with DB2 11 for z/OS, you can tell DB2 to drop partitions of a partition-by-growth table space made empty by a REORG or the entire table space (that functionality is enabled via the REORG_DROP_PBG_PARTS parameter in ZPARM).

So, add disk space reclamation to your reasons for running REORG (if you have not already done so), and consider using the ratio I've provided to look for candidate table spaces.