Tuesday, June 28, 2016

DB2 for z/OS: Indexes and Foreign Keys

The functionality of DB2 for z/OS is enhanced, in ways small and big, on a regular basis. There's plenty to keep up with, and it is understandable that some product improvements could fly under the radar of a database administrator and escape notice. I encountered this situation a few weeks ago. During a presentation on new features and functions of DB2 11, delivered to an audience composed mostly of DB2 DBAs, I was asked, "When will we see a lifting of the restriction that requires an index on a foreign key to be defined on just that key in order to be useful for improving referential integrity performance? When will an index that begins with a foreign key's columns, but includes one or more additional columns, be usable as an RI performance booster?"

Do you know the answer to that question? It's not actually answerable in an "It will happen..." way, because the dropping of the referenced restriction already happened -- and it happened some time ago, at that.

Before going further with this, a little level-setting is in order. Referential integrity (RI) is, of course, a data integrity protection mechanism. Its aim is to prevent "orphan" rows from showing up in the tables of a relational database management system like DB2; so, if you have a CUSTOMERS table and an ORDERS table, and CUST_NUMBER is a column of both tables, RI helps to avoid a situation in which the CUST_NUMBER column of ORDERS contains a value that does not appear in the corresponding column of the CUSTOMERS table. In this example, CUSTOMERS is the parent table in the RI relationship, and ORDERS is the child table (also referred to as a dependent table). CUST_NUMBER in ORDERS is a foreign key. CUST_NUMBER in CUSTOMERS would be a unique key (and maybe that table's primary key).
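To make that concrete, here's a sketch of DDL for the relationship just described (the data types, index name, and ORDER_NUMBER column are hypothetical stand-ins, not anything prescribed):

  CREATE TABLE CUSTOMERS
    (CUST_NUMBER INTEGER NOT NULL,
     CUST_NAME   VARCHAR(40),
     PRIMARY KEY (CUST_NUMBER));

  -- a unique index is needed to enforce the primary key (DB2 creates one
  -- automatically if the table space was created implicitly)
  CREATE UNIQUE INDEX CUSTIX1 ON CUSTOMERS (CUST_NUMBER);

  CREATE TABLE ORDERS
    (ORDER_NUMBER INTEGER NOT NULL,
     CUST_NUMBER  INTEGER NOT NULL,
     ORDER_AMOUNT DECIMAL(9,2),
     FOREIGN KEY (CUST_NUMBER) REFERENCES CUSTOMERS
       ON DELETE RESTRICT);

With ON DELETE RESTRICT in effect, a DELETE targeting a CUSTOMERS row will fail if any ORDERS row contains that row's CUST_NUMBER value.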

DB2-managed RI does what it does in part by preventing deletion of a row in the CUSTOMERS table (continuing with the example introduced previously) if that row's CUST_NUMBER value appears in the CUST_NUMBER column of the ORDERS table (assuming that the behavior specified for the referential constraint is DELETE RESTRICT versus DELETE CASCADE). The "child table check" required to enforce a DELETE RESTRICT rule will perform much better if DB2 can verify existence of a matching foreign key value using an index on the child table, as opposed to having to scan the child table itself. And here is where the aforementioned enhancement comes in. For a long time, the usability of an index on a foreign key for "child check" efficiency depended on the index being defined ONLY on the foreign key. In other words, if you defined on the ORDERS table an index on (CUST_NUMBER, ORDER_AMOUNT), that index would not be usable for improving the performance of a CUST_NUMBER "child check" associated with a DELETE targeting a row in the parent CUSTOMERS table. Here's the wording that appeared in the DB2 for z/OS Version 8 Application Programming and SQL Guide, under the heading, "Indexes on foreign keys":

To let an index on the foreign key be used on the dependent table for a delete operation on a parent table, the columns of the index on the foreign key must be identical to and in the same order as the columns in the foreign key.

That restriction caused some organizations to have more indexes on certain tables than they wanted. If you had, for example, a table T1, with a foreign key COL1, and you needed an index on COL1, COL2 and also wanted to have an index that would improve the performance of the "child check" of a COL1 value associated with a requested DELETE of a row in the parent table of the RI relationship, you needed an index on COL1 (exact match of the foreign key) as well as the index on COL1, COL2. Kind of a bummer, because every index you define on a table makes all DELETEs and all INSERTs involving the table more expensive, and ups the cost of index-processing utilities such as REORG, RUNSTATS, and LOAD (and of course ups disk space usage).

Lo and behold, starting with DB2 9 for z/OS (yeah, that far back), things changed in a good way regarding indexes and foreign keys. Here's the wording from the DB2 9 Application Programming and SQL Guide, under the heading, "Defining a foreign key":

The index on the foreign key can be used on the dependent table for delete operations on a parent table. For the index to qualify, the leading columns of the index must be identical to and in the same order as all columns in the foreign key. The index can include additional columns, but the leading columns match the definition of the foreign key.

And here's a link to a page containing the same information in the DB2 11 Knowledge Center on the Web: http://www.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/com.ibm.db2z11.doc.perf/src/tpc/db2z_createindexri.html

If this DB2 enhancement made the scene without you realizing it, don't worry -- you're not alone. What's important is what you do with this information, if it's news to you. First, for a new table to be defined in your DB2 for z/OS environment, note the foreign keys (if any). Then, in determining what indexes should be defined on the table, keep in mind that an index on (COL1, COL2, COL3, ...) will be effective as a performance booster for "child checks" for foreign key COL1 (or foreign key COL1, COL2 -- what matters is that the leading column or columns of the index match the column or columns of the foreign key in question) -- you don't have to define a separate index just on the foreign key to get efficient "child checking."
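Returning to the hypothetical ORDERS example, the net of the change is that one index can pull double duty -- something like this (the index name is just a placeholder):

  -- Serves queries on (CUST_NUMBER, ORDER_AMOUNT) AND, because its leading
  -- column matches the foreign key, the RI "child check" on CUST_NUMBER
  CREATE INDEX ORDIX1 ON ORDERS (CUST_NUMBER, ORDER_AMOUNT);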

For existing tables, this could be an opportunity for some index pruning that would save disk space and enhance the CPU efficiency of INSERT and DELETE and utility operations (and UPDATEs of indexed columns). Note a table's foreign keys, see if there are indexes on the table that exactly match those foreign keys, and see if there are other indexes on the table with leading columns that match a foreign key. If you find such a situation, consider dropping the index that exactly matches the foreign key if due diligence shows that doing so would not cause any problems; so, if table T1 has a foreign key COL1, COL2 and an index on (COL1, COL2) and another index on (COL1, COL2, COL3), do some analysis. See if any packages are dependent on the index on (COL1, COL2). If any are, could those packages use the index on (COL1, COL2, COL3) and still get good performance? Is there any other reason why the "exact match" index on the foreign key COL1, COL2 is needed? If not, get rid of it and realize CPU and disk space savings. You might well find that this situation exists for a number of your tables that were created when the "exact match" rule regarding indexes on foreign keys was still in effect.
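Part of that due diligence can be handled with a catalog query along these lines (the index schema and name shown are placeholders; in SYSPACKDEP, a BTYPE value of 'I' indicates a dependency on an index):

  -- Which packages depend on the "exact match" index on (COL1, COL2)?
  SELECT DCOLLID, DNAME
  FROM SYSIBM.SYSPACKDEP
  WHERE BQUALIFIER = 'MYSCHEMA'
    AND BNAME = 'T1_FK_IX'
    AND BTYPE = 'I';

If that query returns no rows, and nothing else argues for keeping the index, DROP INDEX could deliver the CPU and disk space savings mentioned above.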

Bottom line: the indexes you have should be the indexes you need. What's "needed" can change over time. Take that knowledge and use it.

Wednesday, May 25, 2016

DB2 for z/OS: When Did You Last Check Your Checkpoint Frequency?

I have written in this blog a few entries about various DB2 for z/OS ZPARM parameters, including a post on DSMAX and another on EDM_SKELETON_POOL. Time for another such entry, this time focused on the parameters that determine when a DB2 subsystem will perform checkpoint processing.

Checkpointing is an availability-enhancing mechanism that is related to DB2's deferred write functionality. DB2 externalizes changed pages to disk in a deferred manner so as to boost the efficiency of insert, update, and delete operations (the aim is to allow multiple changes to be made to a given page in memory before that page is written to a table space or index on disk). That's great for scalability, but it has an availability implication: if the DB2 subsystem terminates abnormally (i.e., "crashes"), pages in memory that were updated (and associated with committed data changes) and not yet externalized at the time of the crash will have to be identified (using data in the DB2 log) and processed in the course of DB2 restart (meaning, the corresponding table and index pages on disk will have to be updated accordingly) so that the data on disk will be in a consistent state before the subsystem resumes serving requests from applications. The more of these "pending writes" there are at the time of a DB2 failure, the longer it will take to complete the subsequent restart of the subsystem; thus, there is goodness in getting changed pages externalized to disk in a timely manner.

This is where checkpointing comes in. At regular intervals (based on a user-supplied criterion or criteria) DB2 will identify changed-but-not-externalized pages in the buffer pools and will schedule those pages for externalization via asynchronous write actions. At the time of the next checkpoint, DB2 will determine whether any of the updated-but-not-externalized pages identified at the time of the previous checkpoint are still un-externalized. Those pages (if any) will be written synchronously to disk. And so it goes, one checkpoint after another after another.

What makes for a "good" DB2 checkpoint frequency? That would be one that strikes a proper balance between greater CPU efficiency on the one hand -- allow pages to be updated at least a few times before being externalized to disk, and keep the consumption of resources expended in checkpointing at a reasonable level -- and faster restart on the other hand.

OK, how do you effect a desired frequency of checkpoint processing? For a long time, the only way to influence DB2 checkpoint frequency was to specify (by way of a ZPARM parameter) the number of records that would be written to the DB2 transaction log between checkpoints. The default value for this parameter used to be (if memory serves me correctly) 50,000, and if you went with that value for a subsystem then DB2 would go through checkpoint processing every time 50,000 records had been written to the log since the time of the last checkpoint. For many organizations, this default log-record setting resulted in a pretty reasonable checkpoint frequency.

As time went by, there came to be some problems both with the default value for log-records-between-checkpoints and with the whole log-based means of regulating checkpoint frequency. The problem with the 50,000 log records threshold is that it came to be way too low at a growing number of sites -- a result of DB2 data-change workloads getting dramatically larger (sometimes thousands of insert/update/delete operations per second). At 50,000 log records between checkpoints, some systems would have DB2 checkpoints occurring every few seconds -- not where you want to be with respect to achieving a good balance between CPU efficiency and quick DB2 restart-ability (checkpointing every few seconds would be pretty expensive). The other problem with the log write-based checkpoint triggering threshold was variability -- sometimes extreme variability. At a given DB2 for z/OS site there could be times of the day when a read-dominated workload would lower log write activity to a point that caused lots of time to go by between checkpoints, and other times when high-volume insert jobs (for example) would drive log write levels to the point that checkpointing occurred with too much frequency.

DB2 for z/OS Version 7 delivered a very useful enhancement that enabled the management of checkpoint frequency based on an explicitly time-based threshold: you could tell a DB2 subsystem to checkpoint every X number of minutes, regardless of the volume of log write operations. DB2 10 for z/OS allowed one to specify that checkpointing is to occur after X minutes have elapsed, or after Y records have been written to the DB2 log, since the last checkpoint -- the threshold that's reached first triggers the next checkpoint.

So, where does all this leave us? First, you want to know how many checkpoints are driven on your DB2 subsystems (especially the production subsystems) during busy hours of the day and/or night. This can be easily done using a statistics long report generated by your DB2 monitor. I'd suggest creating a report for a busy daytime period of 1 or 2 hours, and a report for a busy nighttime period of 1 or 2 hours. In the report, find the section under the heading SUBSYSTEM SERVICES (or something similar -- different DB2 monitors might use slightly different wording for headings and fields). In that report section, find the field with a label like SYSTEM EVENT CHECKPOINT, and see how many checkpoints occurred during the time period for which the report was generated. Generally speaking, what you want to see is a checkpoint every 2 to 5 minutes (there is not a "right" or "wrong" place to be in that range -- closer to 5 minutes would make for a somewhat more CPU-efficient DB2 environment, and closer to 2 minutes would enable somewhat faster DB2 restart following an abnormal subsystem termination). If you're out of that range (on either side), an adjustment would likely be warranted.

While you can still manage DB2 checkpointing on a log-records-between-checkpoints basis, and also using both a number of log records and a number of minutes between checkpoints (whichever occurs first since the last checkpoint), my preference is to control checkpoint frequency solely by way of a minutes-between-checkpoints threshold -- it's simple and straightforward, and I like a fixed regularity of checkpoints per hour throughout the day. If you go with a minutes-between-checkpoints threshold, you'll have SINGLE for the value of the CHKTYPE parameter in ZPARM, and the desired number of minutes (again, something in the range of 2 to 5, inclusive, is recommended) specified as the value of the CHKFREQ parameter.

By the way, if you go this route -- say, with CHKTYPE=SINGLE and CHKFREQ=5 if you want a DB2 checkpoint every 5 minutes -- then you will have a value of NOTUSED for the CHKLOGR and CHKMINS parameters in ZPARM, as these parameters are used only when checkpoint frequency is managed on both a minutes-between and a log-records-between basis (i.e., when you have CHKTYPE=BOTH). Also, if you see a value of something like 100,000 for the CHKFREQ parameter in ZPARM, that does not mean a checkpoint every 100,000 minutes -- it means a checkpoint after every 100,000 records have been written to the active log (when CHKTYPE is set to SINGLE, a CHKFREQ value between 1 and 60, inclusive, means that checkpoint frequency is time-based, and a CHKFREQ value between 1,000 and 16,000,000, inclusive, means that checkpoint frequency is based on that number of records being written to the DB2 log between checkpoints).
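Putting that into concrete terms, here's what the set-up I favor might look like (the 4-minute value is just an example from the recommended 2-to-5 range; and, as I understand it, a change made via the -SET LOG command lasts only until the ZPARM value again takes effect at subsystem restart):

  ZPARM:  CHKTYPE=SINGLE
          CHKFREQ=4     (with CHKTYPE=SINGLE, a value of 1 to 60 means minutes between checkpoints)

  To change the frequency dynamically:   -SET LOG CHKTIME(4)
  To see the current checkpoint set-up:  -DISPLAY LOG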

So, check your DB2 system's checkpoint frequency if you haven't in a while, and if checkpoints are occurring more often than once every 2 minutes or less often than once every 5 minutes (i.e., more than 30 or fewer than 12 checkpoints per hour), consider making a change to get checkpoint frequency into the 2- to 5-minute range. Yes, it's technically possible to use log-records-between-checkpoints as a checkpointing threshold, but I'd prefer to make checkpoint frequency a fixed-number-of-minutes-between-checkpoints thing. I like simple, I like straightforward, and I like regularity.

Friday, April 29, 2016

DB2 for z/OS: Should You Do Some Buffer Pool Consolidation?

19, 19, 23, 26.

Know what these numbers have in common? They indicate the number of 4K buffer pools allocated for four different production DB2 for z/OS subsystems that I analyzed over the past couple of years. I believe that there are quite a few sites where a plethora of 4K buffer pools have been defined for a given DB2 subsystem. That's not illegal or anything (DB2 allows you to have as many as 50 different 4K pools for a single subsystem), but it does make the task of managing and monitoring a buffer pool configuration more complex and time-consuming. When I'm in a meeting with an organization's DB2 for z/OS DBAs, and we're looking over information for a DB2 environment, and I see lots of 4K buffer pools defined and I ask about that, I might see sheepish grins and a few eye-rolls, and hear words like, "Well, you know, we just added a buffer pool here and a buffer pool there over the years, for this reason and that, and we ended up with what you see."

Recently, I've been pleased to see DBAs at more than one company engaged in consolidation of 4K buffer pools (the number of 8K, 16K, and 32K buffer pools is usually not a matter of concern -- you can only have as many as 10 of each, and most systems I've seen have fewer than that number allocated). These DBAs are reassigning database objects (table spaces and indexes) from lots of smaller 4K pools to a smaller number of larger 4K pools (moving objects from one buffer pool to another got easier with DB2 10, especially for organizations running DB2 in data sharing mode, as I pointed out in a blog entry on the topic I wrote a few years ago).

At one site I visited earlier this year, they've taken an interesting approach to managing the size of some of the larger 4K pools they are using for consolidation purposes: they are allocating buffers for these pools in chunks of 524,288. Why? Because that's the number of 4K buffers that can fit into a 2 GB page frame (DB2 11 allows 2 GB page frames to be requested for pools defined with PGFIX(YES), with those very large frames being made available through a specification for the LFAREA parameter in the IEASYSxx member of a z/OS system's SYS1.PARMLIB data set). When the DBAs enlarge one of these pools, it will be enlarged by 524,288 buffers (or a multiple thereof), so as to get one more 2 GB page frame for the pool (or a few more, depending on the size increase).

So, if you're looking to rationalize and simplify the 4K buffer pool arrangement for a DB2 subsystem (or if your organization is just getting started with DB2 for z/OS and you're looking for a reasonable initial set-up -- I got this question a few days ago from a DBA at such a site), what kind of 4K buffer pool configuration might make sense for you? Here are my thoughts on the subject (a sketch of corresponding commands follows the list):

  • Ideally, the only database objects (table spaces and indexes) in buffer pool BP0 will be those associated with the DB2 catalog and directory.
  • You should dedicate a 4K buffer pool to the 4K table spaces in the work file database (just as you should dedicate a 32K buffer pool to the 32K table spaces in the work file database). Organizations very often use BP7 for this purpose, because (in a non-data sharing DB2 system) the work file database is named DSNDB07. The "7" in BP7 is then a handy reminder of the role of this buffer pool.
  • You should have a default 4K buffer pool for table spaces used by applications, and a default pool for indexes defined on those table spaces.
  • You might want to have a 4K buffer pool that you would use to "pin" objects in memory (i.e., cache objects in memory in their entirety). You might decide to do this for certain table spaces and indexes that a) are accessed very frequently by important programs and b) are not particularly large (it would take a lot of buffers to pin a really big table space or index in memory). Note that a "pinning" buffer pool should be defined with PGSTEAL(NONE), so as to let DB2 know that you want to use the pool to completely cache objects assigned to it.
  • You might want to have a 4K pool that you would use for monitoring and diagnostic purposes. Suppose, for example, that this pool is BP12, and you have a table space for which you want access activity information. You could then assign that table space temporarily to BP12 and know that the associated information provided by your DB2 monitor or via the DB2 command -DISPLAY BUFFERPOOL(BP12) DETAIL pertains to that one table space. That is a cheaper (in terms of CPU cost) and easier way to get pretty detailed object-specific access activity information versus turning on one or more DB2 performance trace classes.
  • Beyond this basic set-up, you could also consider assigning table spaces that are accessed in a mostly random fashion to a buffer pool that's different from a pool used for table spaces that see a lot of large page scans (and associated indexes would likewise go in two different buffer pools). This would depend, of course, on your being able to determine this division of objects based on predominant access patterns.
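Command-wise, a consolidated set-up along these lines might look something like the sketch below. The pool numbers other than BP0 and BP7, and all of the sizes, are hypothetical choices for illustration, not recommendations (and PGFIX(YES), shown for the busiest pools, assumes you have the real storage to back page-fixed buffers):

  -ALTER BUFFERPOOL(BP1)  VPSIZE(200000) PGFIX(YES)    -- default 4K pool for table spaces
  -ALTER BUFFERPOOL(BP2)  VPSIZE(150000) PGFIX(YES)    -- default 4K pool for indexes
  -ALTER BUFFERPOOL(BP7)  VPSIZE(40000)                -- 4K work file table spaces
  -ALTER BUFFERPOOL(BP8)  VPSIZE(20000) PGSTEAL(NONE)  -- "pinning" pool, fully caching its objects
  -ALTER BUFFERPOOL(BP12) VPSIZE(5000)                 -- monitoring/diagnostic pool

  ALTER TABLESPACE DBNAME1.TSNAME1 BUFFERPOOL BP12;    -- temporarily reassign an object for monitoring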

All told, you might end up with something like 6 to 8 different 4K buffer pools. I'd prefer that to having 15 or 20 or 25 or more 4K pools. There is often goodness in simplicity.

Thursday, March 31, 2016

Observations on DB2 for z/OS Address Space CPU Utilization

Have you ever looked at a DB2 Statistics Long report generated by your DB2 monitor? If so, you might have seen a section containing information about the CPU consumption of the various DB2 address spaces. It would look something like the example below, which reflects activity over a one-hour period of time (similar information might be available as well via an online display provided by your DB2 monitor). Note that I’ve abbreviated address space names and reduced the precision of CPU time figures (from microseconds to milliseconds) to enable the information to fit within the width of this blog space.

CPU TIMES  TCB TIME  PREEMPT SRB  NONPREEMPT SRB  PREEMPT IIP SRB
---------  --------  -----------  --------------  ---------------
SYS SVCS     23.203    13:18.791          10.464           55.127
DB SVCS      58.468     1:26.687          17.225        13:49.714
IRLM          0.006        0.000        6:00.679            0.000
DDF           5.741     9:36.358           2.596        12:23.363


An address space's total CPU time for the reporting interval is the sum of the numbers in the corresponding row of the tabular display; so, for example, the total CPU consumption for the DB2 database services address space, based on data shown above, is:

58.468 + 1:26.687 + 17.225 + 13:49.714 = 16:32.094 (16 minutes, 32.094 seconds)

The first column of numbers, labeled TCB TIME, shows the CPU consumption of address space processes represented in the system by TCBs, or task control blocks. TCB CPU time is always consumed on general-purpose processors (aka general-purpose "engines"). The second column, labeled PREEMPT SRB, shows the CPU time, for address space processes represented by preemptible SRBs (Service Request Blocks), that was consumed on general-purpose engines. Work done under preemptible SRBs is generally zIIP-eligible (i.e., eligible for execution by a zIIP engine - a specialty processor that provides relatively lower-cost MIPS), to varying degrees depending on the type of work being done. Preemptible-SRB work that is not executed by zIIP engines is executed by general-purpose engines. The third column, labeled NONPREEMPT SRB, shows CPU consumption of address space processes represented by non-preemptible SRBs. This work, like work done under TCBs, is always executed by general-purpose engines. The fourth column, labeled PREEMPT IIP SRB, shows the CPU time, associated with processes running under preemptible SRBs, that was consumed on zIIP engines.

In this blog entry, I want to point you towards some observations about DB2 address space CPU utilization figures that you might see for DB2 subsystems at your site.

IRLM - lean and mean

The IRLM address space - responsible for lock management in a DB2 for z/OS environment - typically consumes a very small amount of CPU resource, even when the volume of lock and unlock request activity is very high. For the system from which the CPU times shown above came, the six minutes of IRLM CPU time fueled execution of 128 million lock and unlock requests during the one-hour reporting interval. This great CPU efficiency is a key reason why you shouldn't hesitate to put IRLM where it belongs, priority-wise, in a z/OS LPAR's WLM policy: in the super-high-priority SYSSTC service class. IRLM doesn't use much CPU, but when it needs an engine it needs one RIGHT AWAY; otherwise, lock acquisition and release actions are delayed, and the whole DB2 workload slows down. [By the way, I would not recommend assigning DB2 address spaces other than IRLM to the SYSSTC service class. The other DB2 address spaces - MSTR, DBM1, DIST, and any stored procedure address spaces - should all be given the same priority, and that priority should be below SYSSTC and above address spaces, such as CICS regions, in which application programs run.]


System services (aka MSTR) - the thread factory

The DB2 system services address space handles various functions. Of these, a major driver of CPU utilization is thread creation and termination. On the DB2 system for which address space CPU times are shown above, the dominant component of the workload during the reporting interval was CICS-DB2 (a little over 500 transactions per second, on average, during the one-hour time period). For this DB2 workload component, it so happens that there was very little in the way of CICS-DB2 thread reuse (the thread reuse rate was about 2%). That being the case, with the high transaction rate the MSTR address space was kept pretty busy creating and terminating hundreds of CICS-DB2 threads per second. If the CICS-DB2 thread reuse rate were made considerably higher through the use of a few protected entry threads for the most frequently executed transactions, it's likely that the CPU time for the DB2 system services address space, which was 14 minutes and 47.585 seconds for the system portrayed in the report snippet above, would be considerably smaller.

Database services (aka DBM1) - readin' and writin'

Like the system services address space, the DB2 database services address space performs a variety of functions. Two functions that account for a lot of the address space's CPU consumption are prefetch reads and database writes. Prefetch read operations (referring to the combined total of sequential, list, and dynamic prefetch reads) often greatly outnumber database writes - sometimes by 10 to 1, sometimes by 20 to 1 - in a transactional application environment, as transactional work is often read-heavy (batch workloads are sometimes relatively more write-heavy), so your main leverage point in reducing DBM1 CPU consumption will typically be reducing the rate of prefetch reads in the system. That goal, in turn, is generally achieved via enlargement of buffer pools that see a lot of prefetch reads. Note that the bulk of DBM1's CPU consumption is associated with zIIP processors (in the report snippet above, that's the 13 minutes and 49.714 seconds seen for the database services address space in the column with the heading PREEMPT IIP SRB - about 84% of DBM1's total CPU time). This is so because, starting with DB2 10 for z/OS, prefetch read and database write operations became 100% zIIP-eligible. Since reducing prefetch reads will therefore mostly reduce zIIP (rather than general-purpose) engine utilization, does that make it unimportant? Not at all: reducing zIIP engine utilization can be important, especially as a means of avoiding zIIP engine contention problems.

DDF - SQL-driven

The DDF address space (also known as DIST) is interesting in that its CPU consumption is largely driven by execution of SQL statements that get to DB2 by way of DDF. Referring to the report snippet above, the CPU times for DDF in the TCB TIME and NONPREEMPT SRB columns - about 8 seconds of the address space's total of a little over 22 minutes of CPU time - reflect activity performed by DDF "system" tasks. The rest of the DDF CPU time, consumed by processes represented by preemptible SRBs, is associated with execution of SQL statements issued by network-attached DB2-accessing applications (and that includes sending query result sets back to clients). The more SQL that gets to DB2 through DDF, the higher DDF's CPU consumption will be (just as the CPU time of a CICS region is affected by the cost of executing SQL statements that get to DB2 via that address space). Here's something else to note: the CPU time split between general-purpose engines and zIIP engines for DDF work done under preemptible SRBs. Using numbers from the report snippet above, you can see that this split is about 56% zIIP and 44% general-purpose-engine time (the figure for the zIIP offload percentage for the DDF address space is the time under the PREEMPT IIP SRB column for DDF divided by the sum of the times in the PREEMPT SRB and PREEMPT IIP SRB columns). Execution of SQL statements running under preemptible SRBs in the DDF address space is up to 60% zIIP-offload-able, and I'd say that a zIIP offload percentage in the 55-60% range is good. If you see a split such that the DDF CPU time associated with work done under preemptible SRBs is less than 55% zIIP time (i.e., if PREEMPT IIP SRB time for DDF divided by the sum of PREEMPT SRB and PREEMPT IIP SRB time is less than 55%), check to see if you have a zIIP engine contention issue (a topic I've covered in an earlier blog entry).
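To show the arithmetic explicitly, using the DDF figures from the report snippet:

12:23.363 / (9:36.358 + 12:23.363) = 743.363 / 1319.721 = 0.563 (about 56% zIIP)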

So, looked lately at DB2 address space CPU times in your environment? Check 'em out, and see what conclusions you can draw. I hope that the information provided via this blog entry will be useful to you.

Sunday, February 28, 2016

DB2 for z/OS: When Did You Last Check Your Subsystem's DSMAX Value?

As a DB2 for z/OS system grows and evolves, the value of certain ZPARM parameters should be reviewed to ensure that they are still appropriate in the context of the current environment. Failure to do this can result in a situation in which the operational efficiency (or some other characteristic) of the system is negatively impacted. One of these ZPARMs that should be periodically revisited is DSMAX.

DSMAX specifies the maximum number of data sets (these would be associated with table spaces and indexes) that can be open and allocated to DB2 at one time. For a long time, the maximum value you could specify for DSMAX was 10,000. That changed, about 15 years ago, with DB2 Version 6, which allowed the DSMAX specification to be as high as 32,767. The upper bound of DSMAX was further increased, to 65,041, with DB2 Version 8. It went to 100,000 with DB2 9; and to 200,000 with DB2 10. The default value of DSMAX has also steadily increased: it was 3000 with DB2 Version 7; went to 9960 with DB2 Version 8; and to 20,000 with DB2 10.

The rise in maximum and default DSMAX values was warranted by the increase in the number of data sets associated with a typical DB2 system, and THAT phenomenon owes to a number of factors, including growth over time in the size of DB2 for z/OS-managed databases, more table spaces for a given number of tables (a universal table space can hold one and only one table), increasing use of DB2 for z/OS as a database management system for vendor-supplied applications (several of which are characterized by databases that contain a great many objects), more table space partitions (DB2 Version 8 introduced table-controlled partitioning, which allows a table to have up to 4096 partitions, versus a maximum of 254 partitions for an index-controlled partitioned table space), and more index partitions (DB2 Version 8 table-controlled partitioning enabled partitioning of more than one index on a partitioned table).

Here's why you should periodically check your DSMAX value and its effect on a DB2 subsystem's operational efficiency: when the number of data sets open and allocated to a DB2 subsystem reaches 99% of that subsystem's DSMAX specification, DB2 will physically close some open data sets (300 of them, or 3% of DSMAX, whichever is smaller). Which data sets will be closed? Well, DB2 will first look for data sets belonging to objects defined with CLOSE YES, and will select some of those for closing on a least-recently-used basis. If these data set close operations do not sufficiently reduce the number of open data sets, DB2 will close data sets belonging to objects defined with CLOSE NO, again making selections on a least-recently-used basis. Some level of this data set close activity is typically not a problem, but too much can negatively impact system performance. I've seen a number of situations in which an overly-high level of DB2 data set close activity has been overlooked (this can lead to more data set open activity than you'd like, as data sets closed for DSMAX reasons have to be re-opened when next accessed).

Where can you go to get a handle on data set close activity in your DB2 environment? That's easy: just get your hands on a DB2-monitor-generated statistics long report (ideally, one that captures activity for a particularly busy hour of the day) or an online display of DB2 subsystem statistics, and locate the information pertaining to "open/close activity." There you will find a number of useful fields. Among these is the high-water mark for the number of open DB2 data sets. If you see there a figure at or near the DSMAX value for the DB2 subsystem, that's telling you something. More important, take a look at the field labeled DS CLOSED-THRESH REACHED (or something similar -- different DB2 monitor products can have slightly different labels for the same field). That shows you the number of times that DB2 data sets were physically closed due to the DSMAX threshold being reached. If that figure indicates a data set close rate of more than 1 per second (e.g., if you see that data sets were closed due to the DSMAX threshold being reached more than 3600 times in a one-hour elapsed time interval), my recommendation would be to increase the value of DSMAX.

How high should the DSMAX value be for a DB2 subsystem? Well, a value high enough to get the rate of data set close operations to the single digits per minute (or even to zero) is nice, but there is such a thing as too high; see, the increases in DSMAX maximum and default values that I mentioned previously were enabled largely by z/OS enhancements that reduced the amount of below-the-line (referring to the 16 MB level) virtual storage required for each open data set. The below-the-line virtual storage needed for an open data set, while smaller than before, is not zero. The 200,000 maximum DSMAX value allowed in a DB2 11 system might best be thought of as a theoretical limit -- the actual maximum value that would work in your environment would likely be considerably less than 200,000. If your current DSMAX value is resulting in a higher-than-desired rate of data set close operations, consider bumping the value up by 1 or 2 thousand to see if that gets the close rate down to where you want it to be. Generally speaking, I'd say that a value of up to 30,000 for DSMAX should not be problematic. If you want to go beyond that figure, you probably first ought to get a good idea as to the availability of virtual storage below the 16 MB line for the DB2 system of interest. More information on calculating an appropriate DSMAX value can be found in the DB2 for z/OS Installation and Migration Guide.

To sum up the message of this blog entry, I'd urge you to first check, using your DB2 monitor (I particularly like to use monitor-generated statistics long reports), the rate of data set close activity resulting from the DSMAX threshold being reached for particular DB2 subsystems at your site. If that close rate exceeds 1 per second, bump the value of DSMAX up by something in the range of 1000 to 2000. If you feel a need to take DSMAX above 30,000, proceed with caution and with, ideally, good knowledge of the availability of below-the-line virtual storage in the target system.

Stay on top of this going forward. Don't let "data set creep" cause an overly high level of data set close activity for your DB2 subsystems.

Sunday, February 21, 2016

Statement-Level Control of DB2 for z/OS Query Parallelization

I'm writing about this now because of the increased interest I've seen, at various DB2 for z/OS sites, in using DB2 query parallelization as a means of boosting zIIP engine utilization when -- and this often happens during nighttime hours -- workloads that typically drive consumption of zIIP MIPS (e.g., DB2 DDF transactions, or Java programs running in WebSphere Application Server for z/OS) diminish. During periods of heavy overnight batch processing, people might observe very high utilization levels for general-purpose engines in a z/OS LPAR, while zIIP engines in the system are not at all busy. It might be nice, in such cases, if one could shift some batch work from general-purpose to zIIP engines, but how might that be accomplished (and here I'm talking about batch programs written in a language such as COBOL, versus Java -- Java batch programs, like any Java programs in a z/OS environment, would be zIIP-eligible)? Getting some DB2 query parallelization going for batch programs would be a means of moving a portion of their processing from general-purpose to zIIP engines, as the "child tasks" associated with a parallelized query can have up to 80% of their processing handled by zIIP engines. Problem was, getting this done required, until somewhat recently, that a batch program's DB2 package be bound with DEGREE(ANY). Binding a package with that option makes ALL of the program's queries candidates for parallelization. What if you only want a particular statement in the package to be parallelize-able?

People contemplating utilization of query parallelization for dynamic queries faced a similar conundrum. They could set the value of the DB2 ZPARM parameter CDSSRDEF, which establishes the default value of the CURRENT DEGREE special register, to ANY, and that would enable parallelism for dynamic queries, but it would make ALL dynamic queries candidates for parallelization by default. That might be OK for a DB2 subsystem dedicated to an analytics workload, but it probably isn't the situation you want for a DB2 subsystem that supports high-volume transaction and/or batch applications. True, setting the value of CURRENT DEGREE to ANY prior to issuance of a dynamic query by a program, and then changing CURRENT DEGREE back to 1, would be a way to limit parallelization to a particular query, but changing the CURRENT DEGREE special register value is not always going to be feasible. Suppose a query is generated by an analytics tool? How would CURRENT DEGREE be set in that case? What if the dynamic query of interest is issued by an application purchased from a vendor? That's not a scenario that would lend itself to changing a special register like CURRENT DEGREE.
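To be clear about what that special-register approach looks like when a program can issue it, here's a sketch (the query itself is just a placeholder):

  SET CURRENT DEGREE = 'ANY';   -- the next dynamic query is a parallelism candidate
  SELECT ... ;                  -- the one query you want to be parallelize-able
  SET CURRENT DEGREE = '1';     -- restore single-task execution for what follows

The point is that, for tool-generated queries and vendor-package queries, you don't get to issue those SET statements.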

And I've just been talking here about making a query parallelize-able by DB2. What about the degree of parallelization? The PARAMDEG parameter in ZPARM allows you to control the degree to which a parallelized query can be split, but that's at the DB2 subsystem level. A low degree of parallelization (e.g., 4) might be appropriate if the aim of parallelization is getting some work shifted from general-purpose to zIIP engines, and there is a desire to keep the number of parallel tasks for split queries from getting very high, but would that be a good degree of parallelization if the objective is to get a query generated and submitted via an analytics tool to complete as quickly as possible? Would not a higher degree of parallelization better support that aim?

These formerly sticky problems -- being selective about parallelization for both static and dynamic queries, and having a right-for-the-moment degree of parallelization -- were very nicely addressed via a new catalog table, SYSQUERYOPTS, and related enhancements that were delivered with DB2 10 for z/OS. Although DB2 10 has been out there for more than five years, I've found that SYSQUERYOPTS escaped the notice of many a DB2 DBA. I'm telling you now: notice this. It gives you statement-level control over both parallelization and degree of parallelization.

How is this done? It's pretty easy: you create, for your user ID, a DSN_USERQUERY_TABLE (the DDL for this table, and descriptions of the table's columns, can be found online in the Knowledge Center for DB2 10 and DB2 11). Then you insert into that table a row for the query for which you want to provide values for parallelization (i.e., whether the query is a candidate for parallelization, and the degree of parallelization). This inserted row will include the text of the query that is to be parallelized. You could include the query text in the insert statement that populates DSN_USERQUERY_TABLE, but it's recommended that, for static SQL, you instead pull the statement text from the SYSPACKSTMT catalog table (an example of an INSERT statement that targets DSN_USERQUERY_TABLE and gets a query's text from SYSPACKSTMT can be found in the Knowledge Center for DB2 10 and DB2 11). Similarly, for a dynamic query for which you want to provide parallelism parameters, it's best to get the statement text from the DB2 dynamic statement cache (this text, for each statement in the cache, is placed in the DSN_STATEMENT_CACHE_TABLE when you issue EXPLAIN STMTCACHE ALL).

In addition to the query text, the row you insert into DSN_USERQUERY_TABLE will contain the value 'ANY' in the DEF_CURR_DEGREE column (indicating that the query is a candidate for parallelization by DB2) and some integer value in the MAX_PAR_DEGREE column (this is your specification of the maximum degree of parallelism for the query). You can also indicate that these parallelism specifications (i.e., that this query is a candidate for parallelism, and, if parallelized, can be split into as many as X number of child queries) apply to the statement whenever it is issued, or only when it is issued in association with a certain package in a certain collection.

After inserting this row into your DSN_USERQUERY_TABLE, you would issue the BIND QUERY command, and that would cause rows to be inserted into the SYSQUERY and SYSQUERYOPTS catalog tables (another catalog table introduced with DB2 10, SYSQUERYPLAN, can also be populated via execution of the BIND QUERY command, but that's done when you want to provide an access path for a query, and that's not the subject of this blog entry).
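Here's a sketch of what that might look like for a static query, with hypothetical schema, collection, package, and degree values, and with the statement text elided (again, you'd ideally pull that text from SYSPACKSTMT); check the Knowledge Center for the full column list:

  INSERT INTO MYID.DSN_USERQUERY_TABLE
    (QUERYNO, SCHEMA, HINT_SCOPE, QUERY_TEXT,
     COLLECTION, PACKAGE, DEF_CURR_DEGREE, MAX_PAR_DEGREE)
  VALUES
    (1,          -- an identifier for this row
     'PRODID',   -- hypothetical default schema for the query
     1,          -- as I understand it, 1 = package-level scope, 0 = system-wide
     '...',      -- the statement text, pulled from SYSPACKSTMT
     'MYCOLL',   -- hypothetical collection
     'MYPKG',    -- hypothetical package
     'ANY',      -- make the query a candidate for parallelization
     4);         -- maximum degree of parallelism for the query

  BIND QUERY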

After you've executed the BIND QUERY command (and you might issue the command a second time, with the LOOKUP(YES) option specified, to verify that the first issuance of the command populated SYSQUERY and SYSQUERYOPTS as expected), the parallelism options you selected for the target query will go into effect when the query is next prepared for execution. That will happen, for a static query, when the package of which the statement is a part is rebound, and for a dynamic query when the statement next goes into the dynamic statement cache.

A couple more comments. First, for all this to work, the value of the OPTHINTS parameter in ZPARM has to be YES (the default value is NO). Second, in addition to providing a means of activating and controlling query parallelism at the statement level, the SYSIBM.SYSQUERYOPTS table (and SYSIBM.SYSQUERY and DSN_USERQUERY_TABLE and the BIND QUERY command) enables specification of the REOPT bind option and activation of star join functionality at the statement level.

That's it in a nutshell. Through the mechanism I've described herein, a static SELECT issued by a nighttime batch program might be parallelized and split 4 ways, while on the same DB2 system a particular dynamic query generated by an analytics tool is parallelized and split 20 ways when it is executed in the middle of the online day. Such fine-grained control of query parallelism -- determining the individual statements for which it can be used, and the associated degree of parallel processing -- could open up new use cases for parallelism in your environment. Give it some thought.