Monday, November 24, 2014

DB2 for z/OS: How Low Can Synchronous Read Wait Time Go?

Short answer: lower than I'd previously thought.

Short explanation: solid-state drives.

OK, backstory: I was recently analyzing some performance information related to a client's production DB2 for z/OS environment. Included in the documentation provided by the client were some Accounting Long reports generated by their DB2 monitor (these reports format data contained in DB2 accounting trace records). As I typically do when reviewing such reports, I checked out the wait time per synchronous read I/O operation: I located the "Class 3 wait time" heading in the report (data therein comes from DB2 accounting trace class 3), found the line for "Database I/O" under "Sync I/O," noted the "Average time" (the average time, per accounting record, spent waiting on single-page, on-demand read I/Os to complete), and divided that figure by the "Average event" number in the same line (that's the average number, in this case, of synchronous read I/O operations per accounting record). The quotient thus obtained gives you wait time per synchronous read. On seeing this number, I did a double take.

Back in the mid-1990s, at an IBM storage systems conference, I delivered a presentation that provided a DB2 for z/OS perspective on I/O performance. In that presentation, I mentioned that getting average wait time per synchronous read below 30 milliseconds was a good thing, and that getting this time to 20 milliseconds was a great thing. Not long after that, large cache memory sizes became the norm for enterprise storage subsystems, and synchronous read wait times dropped sharply at sites where this technology was put to use (I recall being surprised when I first saw, in the late 1990s, an average wait time per synchronous read number that was below 5 milliseconds). As time passed, more performance-boosting technologies appeared on the mainframe storage systems scene, including faster channels, parallel access volumes (a z/OS feature that enabled multiple I/O operations targeting a single disk volume to execute concurrently), and increasingly sophisticated algorithms for managing disk controller cache memory resources. By around 2010, I was seeing average wait time per DB2 for z/OS synchronous read go as low as 1 millisecond.

The wait time per synchronous read that I saw in the data I reviewed, as previously mentioned, last week? 0.25 milliseconds. That's 250 microseconds. "Your DB2 synchronous read I/Os are screaming. How do you get numbers like that?" I asked one of the client's DB2 for z/OS DBAs. "Oh," he said, "that would be the solid-state drives."

So there you have it. I certainly knew that solid-state drives (SSDs) were on the market (IBM is one of the vendors of SSDs), but I hadn't yet seen the effect of a wide-scale deployment of the technology at a DB2 for z/OS site (in some cases SSDs are used in a niche fashion for a relatively small percentage of an organization's data assets). Since SSDs don't have the actuators ("arms") associated with traditional spinning disk drives, so-called seek-time delay (the time it takes to move a read/write "head" to a particular track on a disk) is largely eliminated, and that contributes to substantially improved read I/O performance, especially for transactional workloads characterized by random reads.

It's all part of a big picture that continues to change (and for the better). Making synchronous reads go as quickly as possible is still, of course, an important aspect of optimizing DB2 for z/OS I/O performance. In that regard, utilization of SSDs is becoming a more attractive option as the technology becomes more cost-competitive relative to spinning-disk storage systems. The other key to driving down I/O wait times is to eliminate them altogether via larger DB2 buffer pools. And folks, when I say larger, I mean MUCH larger than a lot of you have been thinking. It's time to go big -- really big -- in the buffer pool department. I'll have more to say on that topic in an upcoming blog entry, so stay tuned.

Wednesday, October 29, 2014

DB2 for z/OS: Thoughts on Table Space and Index Page Size Selection

The other day I got a note, from a DB2 for z/OS DBA, in which I was asked for my thoughts regarding the selection of page size for a table space. I typed out what I hoped would be a useful response, and found myself thinking, as I sometimes do, "Hey -- I could turn this into a blog entry." And so I have. I'll start here by offering some guidelines on page size selection for table spaces, and then I'll shift to the index perspective on the matter.

Table spaces

A table space (or index) gets its page size from the buffer pool to which it is assigned; so, a table space assigned to a buffer pool with buffers of 8K bytes in size will have pages of 8K bytes in size.

First, and obviously, the row length of a table is an important factor in selecting the page size for the associated table space. Row length determines the minimum page size that you can use for a table space. If the rows of a table are 7000 bytes in length, the pages of the associated table space will have to be at least 8K bytes in size, because a table row cannot span pages of a table space (an exception to this rule: a long LOB or XML value would span multiple pages in a LOB table space or an XML table space).

Once you've determined the minimum possible size for the pages of a table space, you can think about whether or not a page size larger than the minimum would be a good choice. Why might you want to go with a larger-than-minimum page size? One reason would be optimization of disk and server memory space utilization. Suppose that the row length for a table is 5000 bytes. The minimum page size for the associated table space would be 8K bytes; however, as rows can't span pages, that page size would waste quite a bit of space on disk and in memory (in each page you'd have 5K bytes occupied by a table row, and the remaining 3K bytes of space would be empty). In this case, a 16K page size would boost space utilization considerably: 15K bytes in each page could be occupied by 3 rows, leaving only 1K bytes of unused space). If you're wondering whether data compression would change this calculus, the answer is probably not -- that is to say, the right page size for an uncompressed table space is probably the right page size to use if you want to compress the table space; so, base page size selection on the uncompressed length of a table's rows.

Here's another reason to consider a page size that is larger than the minimum possible size for a table space (which, again, is determined by uncompressed row length): data access patterns. If data in a table space is typically accessed sequentially (i.e., if sequential scans are the rule for a table space, versus random single-page access), a larger page size could mean fewer GETPAGEs for those sequential scans, and that could boost CPU efficiency for applications that drive the sequential scans. Note that "sequential scan" doesn't necessarily mean a table space scan. It could be a matching index scan that would drive dynamic prefetch.

Just as access patterns for data in a table could lead you to select a larger-than-minimum page size for a table space, so they could push you in the other direction. When access to data in a table space is dominated by random, single-page reads, going with the minimum possible page size is probably the right move, so as to maximize buffer pool effectiveness -- in particular when SELECT statements targeting a table tend to have single-row result sets, you don't want to use an overly-large page size, because larger pages would bring into memory many rows that are not needed by a query, in addition to the one row that is needed (exception: if singleton SELECTs are executed via a "do loop" in a batch program, each time issued using a search value obtained from a file that is ordered in such a way that access to the target table is in fact sequential, dynamic prefetch would be expected to kick in and a larger page size could therefore be an efficiency-booster).

One more consideration: if you want to utilize LOB in-lining for a table (a DB2 10 for z/OS new-function mode feature), that could affect your table space page-size decision. Suppose a table has a LOB column, and the length of the table's rows without the LOB values would make 4K bytes a good page size for the associated base table space. If you determined that a high percentage of the table's LOB values could be completely in-lined with (for example), a 16K page, and if that inlining would deliver performance benefits for you, you might go with that 16K page instead of the minimum 4K page. You can read more about LOB inlining in an entry that I posted to this blog a couple of years ago.


Prior to DB2 9 for z/OS, there was one choice for the page size of an index: 4K bytes. Starting with DB2 9 in new-function mode, the page size for an index could be 4K, 8K, 16K, or 32K bytes. Why might you go with a larger-than-4K page size for an index on a DB2 table? One reason would be to enable index compression -- something that requires the use of a greater-than-4K index page size (as noted in an entry I posted a few years ago to the blog that I maintained while working as an independent DB2 consultant).

Compression isn't the only reason to consider a larger-that-4K page size for an index. Another motivator could be a desire to reduce index page split activity. If an index is defined on a key whose values are not continuously-ascending (i.e., if key values for rows newly inserted into the underlying table are likely to be less than the key's current maximum value), throughput for high-volume insert operations could be constrained by index page split activity. [Unlike rows in a table, entries in an index MUST be physically ordered by key value, and that leads to index page splits when new key values are inserted into the "middle" of an index, as will be the case for an index defined on a non-continuously-ascending key -- if the page into which a new index entry must go is full, the page will be split, and a portion of the page's entries will be moved to a previously empty page in the index.] Index page splits can have a particularly significant impact on insert throughput in a DB2 data sharing environment, owing to an attendant increase in log write activity. When index pages are larger, index page split activity tends to decrease; so, if a primary concern is optimized throughput for high-volume insert operations, an index page size of 16K or 32K bytes could be preferred over the traditional 4K size.

Larger index page sizes can also improve the performance of index scans by reducing associated GETPAGE activity. On top of that, larger page sizes could reduce the number of levels for an index (referring to the root page level on top (logically speaking), the leaf page level on the bottom, and one or more non-leaf levels in-between in the B-tree index structure), and that would in turn result in reduced GETPAGE activity for index probes.  

On the other hand, if compression is not desired for an index, and if the primary performance concern is optimization of data retrieval involving index access, and if access to entries in an index tends to be random (versus sequential) in nature, a 4K page size is probably your best choice.

A note on changing an object's page size

If you implement a table space or an index with a certain page size and you later determine that a different page size would deliver better performance, you can make that change by way of ALTER TABLESPACE or ALTER INDEX, with a buffer pool specification that would reassign the object to a buffer pool with larger- or smaller-sized buffers, as desired. Starting with DB2 10 for z/OS running in new-function mode, a change to an object's page size is a pending DDL operation that can be non-disruptively accomplished with an ALTER (as mentioned above) followed by an online REORG of the target object. Note, however, that page-size changing via pending DDL is only possible if the table space being altered (or the underlying table space on which a to-be-altered index is defined) is of the universal variety -- one of several incentives to get to universal table spaces (prior to DB2 10, a table space could not be altered to have a different page size, and altering an index to change the page size would place the index in rebuild-pending status).

In conclusion...

Put some thought into your table space and index page-size decisions, and know that with universal table spaces (and indexes defined thereon), you can pretty easily change your mind down the road.

Monday, October 20, 2014

DB2 for z/OS Stored Procedures: Native SQL, or Java?

Recently, a mainframe DB2 DBA put a question to me: when should DB2 for z/OS native SQL procedures be used versus Java stored procedures, and vice versa?

Why he asked: the DBA worked with a group of developers who, collectively, could code both of these DB2 stored procedure types. Within this group, individuals tended to go with what they were used to. In other words, the decision to code a native SQL or a Java stored procedure tended to be based more on momentum (e.g., "I used a Java stored procedure last time, so I'll go that route again this time") than on consideration of factors that might make one type of stored procedure more appropriate than the other for a given situation. The DBA wanted to provide the developers with guidelines that would help them to make more informed decisions regarding the use of native SQL versus Java stored procedures. In response to this request, I communicated my take on the matter, and via this blog post I'll share those thoughts with you.

I'll start out with this statement: if I needed to create a DB2 for z/OS stored procedure, I'd go with a native SQL procedure unless I needed to do something that couldn't be done (or couldn't be done in a practical sense) with that type of stored procedure. If I did have a requirement to use an external stored procedure (i.e., one with an external-to-DB2 executable that would run in a stored procedure address space), I'd look at Java as a good choice for the associated programming language, unless I needed to do something that would best be done with a COBOL stored procedure program (more on this momentarily).

Why my "native-first" mind set? Three reasons:

1) Native SQL procedures can be developed (and maintained) by lots of different people. You don't have to be proficient in a programming language such as Java (or COBOL) to create a native SQL procedure. If you know SQL, you can quickly pick up the "control" SQL statements (referring to logic flow control -- statements such as ITERATE, LOOP, GOTO, etc.) that enable coding of a SQL PL routine such as a native SQL procedure ("SQL PL" is short for SQL procedure language, the language used in a DB2 environment to write stored procedures and user-defined functions using only SQL). On top of SQL PL being a pretty easy means of writing DB2 stored procedures, the free and downloadable IBM Data Studio product provides a user-friendly GUI tool for native SQL procedure development and debugging.

2) Native SQL procedures, when called by DRDA requesters (i.e., through DDF), are zIIP-offloadable. More specifically, zIIP-offloadable to the tune of about 60%. That's because they run under the caller's task and a DRDA caller's task is an enclave SRB in the DDF address space. A Java stored procedure, like any external stored procedure, will always run under its own TCB in a stored procedure address space. Yes, execution of that stored procedure's Java code will be zAAP- or zIIP-eligible (the latter via what's called zAAP-on-zIIP, which you get when zAAP-eligible work runs on a system that has zIIP engines but no zAAP engines), but execution of the SQL statements issued by the Java stored procedure will not be zIIP-eligible (you'd actually get a little zIIP offload for these SQL statements, but not what you'd get with a native SQL procedure).

3) Native SQL procedures are where you're likely to see the bulk of DB2 for z/OS stored procedure functionality enhancements. Example: with DB2 10, the XML data type can be used for a stored procedure input or output parameter, but only for a native SQL procedure. Another example: DB2 11 introduced the autonomous stored procedure (it has a separate unit of work from that of the calling process, so that a data change made by an autonomous procedure will persist even if the calling transaction subsequently fails and is rolled back by DB2). Only a native SQL procedure can be autonomous. Yet another example: DB2 11 delivered support for array-type input and output parameters for a stored procedure, but only if it's a native SQL procedure.

When might I use an external stored procedure (such as a Java stored procedure)? I'd go this route if I needed to do something that couldn't be done (either at all, or in a practical sense) with a native SQL procedure. Maybe there is a need for a level of sophistication in the logic of the stored procedure that I can't implement with SQL PL (though in that case I might ask, "How sophisticated does the logic in a data-layer program have to be?"). I might go with an external stored procedure (such as a Java stored procedure) if there were a need to access resources (data or services) outside of DB2 (a native SQL procedure can issue SQL statements, period). Note that even in that case, there would be capabilities available in a native SQL procedure that might provide the functionality required. Keep in mind that native SQL procedures can utilize DB2 functions, which include things like MQRECEIVE and MQSEND if there is a need to interact with WebSphere MQ, and SOAPHTTPNV if I need to access a Web service. I might take the Java path if I wanted to use the same (or nearly the same) Java stored procedure program in a DB2 for z/OS and a non-DB2 relational database environment -- a Java stored procedure likely wouldn't have to be much different in the two environments, whereas a SQL PL routine created for DB2 would have to be re-created in the SQL programming language of a different relational database server in order to function properly in that non-DB2 environment (a native SQL procedure provides portability within the DB2 family of relational database systems).

If I wanted to use an external stored procedure, Java would be a fine choice for the programming language. There was a time, not so long ago, when I was not a fan of Java code running on mainframe servers, but now I'm a proponent of Java in a z/OS environment, as I noted in an entry recently posted to this blog. As pointed out in said blog entry, DB2 11 for z/OS delivered some important enhancements that boost scalability and CPU efficiency for Java stored procedures.

Are there circumstances that would lead me to choose COBOL instead of Java for an external stored procedure? Yes. One scenario that comes to my mind involves invocation of an existing COBOL subroutine. That could be done very efficiently via a COBOL CALL from a COBOL stored procedure (I wrote of a real-world example of this scenario in a blog entry I posted this past spring). I might also lean towards COBOL over Java if, in my organization, there were COBOL-knowledgeable developers but not Java-knowledgeable developers available at the time I needed a DB2 external stored procedure program to be coded.

The bottom-line message is this: SQL PL (for native SQL procedures) and Java (especially in a DB2 11 for z/OS system) are both very viable choices when it comes to development of DB2 stored procedures. Rather than taking a "what we're used to" approach to choosing one versus the other, consider the relative strengths of each language in light of the objectives you have for a given stored procedure. A thoughtful and informed choice is likely to be the right one for you.

Tuesday, September 30, 2014

DB2 for z/OS: DDF, Stored Procedures, and SET CURRENT PACKAGESET

Recently, I was talking with people from a company's IT department about modernizing their mainframe application architecture.Among other things, we discussed increasing the company's use of stored procedures for DB2 for z/OS data access. That prompted some questions around the use of the DB2 SQL statement SET CURRENT PACKAGESET in stored procedures called by DRDA requesters (these being calls that would get to DB2 via the distributed data facility, or DDF). Why did this issue come up? Because the organization's DB2 database had been divided into several schemas (sets of tables) along customer lines, with information pertaining to one set of customers being stored in tables in schema A (e.g., in a table called A.ORDERS), and data for another customer set going into schema B tables (such as B.ORDERS), and data for yet another customer set going into schema C tables, etc. This is not an unusual approach in environments characterized by very large amounts of data that can be relatively easily separated according to some identifier such as customer or account number. Such a mode of data organization can deliver some scalability and availability benefits, and if the number of database "instances" (referring to the different schemas, distinguished by high-level qualifier) doesn't get to be too large (I would want it to be in the single digits), the cost in terms of operational complexity is generally not too high. Does the presence of "n" database instances -- same tables, structurally, in each instance, but with different data -- require the coding of "n" versions of each database-accessing program? No. Programs are coded using unqualified table names in SQL statements, and then the packages associated with the programs are bound into "n" collections, one for each database instance, with the appropriate high-level qualifier provided via the QUALIFIER option of the BIND PACKAGE command. At execution time, application programs can then very easily and efficiently navigate between database instances by pointing to an instance's package collection using the SQL statement SET CURRENT PACKAGESET.

Nothing new here -- this kind of thing has been done for years at plenty of sites. The concern raised during the aforementioned meeting had to do with the use of SET CURRENT PACKAGESET in DB2 for z/OS stored procedures called through DDF. One of the DBAs with the company to which I've referred expressed doubt that SET CURRENT PACKAGESET could be used in this situation. The reason for this DBA's skepticism? He was pretty sure that the value used in a SET CURRENT PACKAGESET statement had to be a collection name in a plan's package list (often called the PKLIST). For DDF-using applications, the plan name always defaults to DISTSERV, and that plan doesn't have a package list; therefore (the DBA believed), SET CURRENT PACKAGESET, when issued from a stored procedure called by a DRDA requester, will fail.

This DBA even had what appeared to be solid evidence backing his contention: he created a stored procedure (it happened to be of the native SQL variety -- more on that to come), called ST_PROC_X, through which a SET CURRENT PACKAGESET = 'COLL_A' statement was issued, and when that stored procedure was called through DDF at DB2 location DB2LOCN, the SET CURRENT PACKAGESET failed with a -805 SQL error code (I've of course changed location, collection, and other names from the actual values used at the DBA's site):


There, see? "Not found in plan DISTSERV." That means the program's DBRM wasn't bound directly into the DISTSERV plan, nor was it in a collection in DISTSERV's package list. And of course that must be the case, because a DBRM can't be bound directly into a plan in a DB2 10 (or later) environment (and even before DB2 10 you couldn't bind any DBRM's into plan DISTSERV), and you can't define any package collections for DISTSERV; therefore, SET CURRENT PACKAGESET is doomed to fail for DRDA requesters. Case closed, right?

Umm, no. I will say, however, that the DBA's conclusion was very understandable, given the text that accompanied the -805 SQL error code. This is one of those cases in which the words associated with a DB2 error code are a bit misleading. The error code text suggests that a -805 is always caused by a DBRM not being found in a plan, or a package not being found in a collection named in a plan's PKLIST. In fact, that is one -- not the only -- cause of a -805. If you look up the -805 in the DB2 for z/OS Codes manual, you'll see another reason for the error that's possible when the '02' reason code is indicated:

"The CURRENT PACKAGESET special register was not set correctly by the application."

I figured that was the problem, and might be so because the stored procedure in question was a native SQL procedure. How's that? Well, an external stored procedure's package can be bound into any collection via the COLLID option of the CREATE PROCEDURE statement. For a native SQL procedure, COLLID is not an option for the CREATE PROCEDURE statement. Where, then, does a native SQL procedure's package go? It goes into a collection that has the same name as the native SQL procedure's schema. The native SQL procedure ST_PROC_X referenced above was created with a high-level qualifier of HLQ_X (again, not the real identifier), so it's package was bound into collection HLQ_X. The stored procedure issued SET CURRENT PACKAGESET = 'COLL_A'. That would cause DB2 to look in collection COLL_A for package ST_PROC_X (same name as the stored procedure) to execute subsequent SQL statements issued by the stored procedure. Package ST_PROC_X was not in collection COLL_A, and that's why the -805 was issued, right?

Wrong on my part. The DBA told me that the ST_PROC_X package had, in fact, been bound into each of the database instance-aligned collections. OK, how had that been accomplished? Via BIND PACKAGE with the DEPLOY option, said the DBA.

And, BINGO, that was the problem. See, the DEPLOY option of BIND PACKAGE is typically used to migrate a native SQL procedure from one DB2 environment to another (e.g., from a test to a production environment). To get a native SQL procedure's package into various collections in the same DB2 environment as the package's "root" collection (i.e., the collection that has the same name as the native SQL procedure's schema), one should use BIND PACKAGE with the COPY option, not the DEPLOY option. When BIND PACKAGE was executed with DEPLOY by the DBA with whom I was working, new consistency tokens for the packages were generated by DB2; thus, the search for package ST_PROC_X in collection COLL_A failed at the consistency token level (the fourth-level qualifier of the fully qualified package name: location.collection.package.token). The DBA freed the packages that had been added to the database instance-aligned collections via BIND PACKAGE with DEPLOY, then copied the ST_PROC_X package from its root collection to the database instance-aligned collections using BIND PACKAGE with COPY, then called the stored procedure, and presto: everything worked like a champ.

So, SET CURRENT PACKAGESET is absolutely valid in a stored procedure (native or external) that is called by a DRDA requester. Just make sure that the stored procedure's package was placed in any collection named by SET CURRENT PACKAGESET, and put the stored procedure in that collection (or collections) the right way -- and that means via BIND PACKAGE with COPY when you're replicating a package across several collections within a given DB2 environment.

Friday, September 26, 2014

DB2 for z/OS: Avoiding zIIP Engine Contention Issues

In an entry posted a few months ago to this blog, I touched on several matters pertaining to mainframe zIIP engines and the use of these processors by DB2 for z/OS. Based on recent experience, I feel that it's important to further highlight a particular zIIP-related issue: zIIP engine contention. Through this post, I want to explain what zIIP contention is, why you want to avoid it, how you can spot it, and what you can do about it.

As, probably, most mainframers know by now, zIIPs (System z Integrated Information Processors) are "specialty engines" aimed at reducing the cost of mainframe computing. They do this in two ways: 1) they cost less than general-purpose engines, and 2) they do not factor into the pricing of mainframe software. Over time, as IBM has made more of the work done on System z servers zIIP-eligible, and as existing zIIP-eligible workloads have grown (particularly those associated with applications that access DB2 for z/OS data through network connections -- commonly referred to as DDF or DRDA workloads), zIIP engine utilization rates have climbed at many sites. That's a good thing, but only up to a certain point. If zIIP engines become too busy, DB2 performance can be negatively impacted. I'll first expand on that point.

What zIIP contention is, and why you want to avoid it

Consider a z/OS LPAR configured with one or more zIIPs. If a zIIP engine is not available when an item of zIIP-eligible work in the system is ready for dispatch (because the zIIPs are busy at that time with other work), that piece of work will be dispatched to a general-purpose engine. There is a slight delay involved in dispatching zIIP-eligible work to a general-purpose engine. That doesn't matter much if only a small amount of zIIP-eligible work ends up being redirected to general-purpose engines. If, however, the degree of what I call "zIIP spill-over" reaches a too-high level, application performance degradation can result.

Indicators of performance degradation caused by zIIP contention

zIIPs have been around for quite a few years now (since the early 2000s, I believe). Why is the issue of zIIP contention only now coming to the fore? Two reasons: 1) it took a while for zIIP utilization at many sites to reach a level at which contention can occur, and 2) the type of work that can utilize zIIP resources recently changed in an important way. This latter point refers to the introduction, with DB2 10 for z/OS, of zIIP eligibility for prefetch read and database write I/Os. Here's why that's important: it can be a pretty good chunk of work (often accounting for the large majority of CPU time charged to the DB2 database services address space, aka DBM1), and it's a type of system task that is very important for the performance of some DB2 applications. If zIIP contention causes delays in the dispatching of prefetch read tasks, run times for prefetch-intensive workloads (such as batch jobs and data warehousing/business intelligence applications) can become elongated. If that were to happen, you might see the effect as a significant increase in a class 3 wait time (so called because the information comes from records generated when DB2 accounting trace class 3 is active) that is labeled as suspension due to "other read I/O" in an IBM OMEGAMON for DB2 accounting long report (the field might be somewhat differently labeled in accounting reports generated by other vendors' DB2 monitor products).

Prefetch slowdown isn't the only thing to look out for. Lately, I've seen data that suggests a link between higher levels of "zIIP spill-over" and higher-than-desired in-DB2 not-accounted-for time for DB2-accessing applications. Not-accounted-for time is a DB2 class 2 time (it's captured in DB2 accounting trace class 2 records) that is probably calculated for you in an accounting long report generated by your DB2 monitor. If your monitor does not calculate this number for you, you can easily figure it yourself: just take average in-DB2 (i.e., class 2) elapsed time for an application, and subtract from that the average in-DB2 CPU time (keeping in mind that this is two numbers: in-DB2 general-purpose CPU time and in-DB2 specialty engine CPU time). Next, subtract out total class 3 suspend time. What's left is in-DB2 not-accounted-for time. Generally speaking, a not-accounted-for time that is less than 10% of total in-DB2 elapsed time for a higher-priority transactional workload (such as many DDF and CICS workloads) does not concern me (a higher level of not-accounted-for time for a batch workload is usually not such a big deal). In times past, a higher-than-desired in-DB2 not-accounted-for time (i.e., more than 10% of in-DB2 elapsed time for a higher-priority transactional workload) was usually an indication that a z/OS LPAR's general-purpose engines were over-committed (as might be the case if they are routinely running at utilization levels of 95% or more). Nowadays, it appears that higher-than-desired in-DB2 not-accounted-for time might also be caused by elevated levels of zIIP contention.

An early-warning indicator

Obviously, you'd like to spot and head off a potential zIIP contention-related performance problem. How could you do that? First and foremost, I'd keep an eye on what I call the zIIP spill-over rate. That's the percentage of zIIP-eligible work that ends up running on general-purpose engines (as mentioned previously, this spill-over occurs when zIIP engines are busy when zIIP-eligible work is ready for dispatch). Depending on the DB2 monitor that you use, this can be easily done. In an IBM OMEGAMON for DB2 accounting long report, you'll see these three class 1 CPU times for a workload (I've condensed them here -- they wouldn't appear one on top of the other in an actual report):

------------  ---------- 
CP CPU TIME     0.012106 

 SECP CPU       0.000166  (A)
SE CPU TIME     0.018054  (B)

The top-most number is the average general-purpose CPU time for the workload. The bottom number is the average zIIP engine CPU time. The middle number is the average amount of general-purpose CPU time consumed in doing zIIP-eligible work. If you label the second and third numbers A and B as I did above, the zIIP spill-over rate is calculated as follows:

A / (A + B)

The figures above show a zIIP spill-over rate of less than 1%. Such a small non-zero value would not cause me concern. What would cause me some concern is a rate that is over 5%. With that much zIIP-eligible work running on general-purpose engines, you could see the beginnings of a negative impact on application performance.

So, at what level of zIIP engine utilization would you see a potentially performance-impacting zIIP spill-over rate? That depends in large part on the number of zIIP engines configured in a z/OS LPAR. As a very rough rule of thumb, I'd say that for an LPAR with two or three zIIP engines, my preference would be to keep the zIIP engine utilization rate below 60%. You could probably go somewhat higher than that for an LPAR with a larger number of zIIP engines. If an LPAR has only a single zIIP engine, I'd want the utilization rate of the zIIP to be 50% or less.

Now, some of you may have heard that zIIP contention can occur at a zIIP utilization rate of about 30%. What that may mean is that zIIP spill-over has been observed on systems with zIIP utilization rates as low as 30% or so. First, I'm thinking that such observations may be associated with systems that have a single zIIP engine. Second, keep in mind that a small but non-zero zIIP spill-over rate may not materially affect application performance (as stated previously, I do not see a zIIP spill-over rate of less than 5% as being a cause for concern).

Doing something about zIIP contention

If you observe a higher-than-desired level of zIIP spill-over in one of your z/OS LPARs, what can you do about it? One obvious response would be to add zIIP capacity to the system (keep in mind that EC12 and BC12 servers can have up to two zIIPs for every general-purpose engine). If that is not feasible in the near term, see if you can take some steps to reduce zIIP utilization. An example of such a step would be a significant enlargement of one or more DB2 buffer pools (assuming that you have sufficient memory to back larger pools). This could reduce zIIP utilization by reducing synchronous reads associated with a DDF workload, and by reducing prefetch read activity (something that became zIIP-relevant starting with DB2 10). Using high-performance DBATs could reduce the overall CPU cost (and thus the zIIP consumption) of DDF-connected applications.

Summing it up

The moral of this story is pretty simple: do NOT think that you can run zIIPs as "hot" as you can general-purpose mainframe engines without encountering application performance issues. In doing capacity planning, anticipate growth in zIIP-eligible work on your z/OS systems (not only because of overall workload growth, but also because new types of work become zIIP-eligible through new releases -- and sometimes new maintenance levels -- of subsystem and operating system software), and aim to keep zIIP capacity ahead of zIIP demand so as to keep the rate of zIIP spill-over low. Calculate the zIIP spill-over rate for your production DB2 for z/OS systems, and track that number -- make it one of your key metrics for monitoring the performance of mainframe DB2 data servers.

Driving up zIIP utilization can be an effective way of enhancing the price performance of a z/OS system -- just don't take it so far as to be counterproductive.

Friday, August 29, 2014

DB2 10 for z/OS, 1 MB Page Frames, and the Number 6656

I recently received an interesting message from a mainframe DB2 person I've known for a long time. At his DB2 10 for z/OS site, 1 MB page frames had recently been made available on the System z servers, and my friend moved to leverage this resource via page-fixed DB2 buffer pools. Nothing unusual there. What's interesting is the information he saw in the output of -DISPLAY BUFFERPOOL commands subsequently entered for the page-fixed pools. I'll get to that strange-looking data in a moment, but first I'll provide a bit of background information for level-set purposes.

Fixing the buffers of a buffer pool in real storage so that they can't be paged out to auxiliary storage is an option that was introduced with DB2 V8 for z/OS -- I first blogged on the topic about six years ago, in an entry I posted to the blog that I maintained while working as an independent DB2 consultant (prior to rejoining IBM in 2010). In DB2 V8 and V9 environments, specifying PGFIX(YES) for a high-I/O buffer pool would improve a system's processing efficiency by reducing the CPU cost of I/O operations associated with the pool. That same benefit -- fewer CPU cycles consumed in reading and writing DB2 table space and index pages from and to disk (and from and to group buffer pools in a data sharing system) -- is realized as well in DB2 10 (and 11) systems when buffer pools are page-fixed, but another opportunity for CPU savings is added: starting with DB2 10, a buffer pool defined with PGFIX(YES) will be backed by 1 MB real storage page frames, versus traditional 4 KB frames, if 1 MB frames are available in the z/OS LPAR. The 1 MB page frames enhance performance by making translation of virtual storage addresses to real storage addresses more efficient (this because the larger frames lead to more "hits" in the translation lookaside buffer). How are 1 MB page frames made available in a z/OS LPAR? That's done by way of the LFAREA parameter in the IEASYSxx member of SYS1.PARMLIB (the ability to have some portion of a z/OS LPAR's memory resource managed in 1 MB frames was introduced with the IBM z10 mainframes and z/OS 1.10). With this exploitation by DB2 of 1 MB page frames, PGFIX(YES) can boost CPU efficiency even for buffer pools that have little in the way of I/O activity.

Back now to my friend the DB2 systems programmer. As I mentioned, he issued -DISPLAY BUFFERPOOL commands for his page-fixed pools to get some information on their allocation. Now, this wasn't the standard-issue -DISPLAY BUFFERPOOL command. What was issued was a particular (and not extensively documented) form of the command, shown below with BP0 used as an example:


When -DISPLAY BUFFERPOOL is used with that SERVICE(4) option in a DB2 10 environment (the story is different for DB2 11 -- I'll get to that later), the output will include DSNB999I messages showing the allocation of buffers in a pool with regard to the type of real storage frame backing the buffers (don't bother looking for DSNB999I in the DB2 10 Messages manual, because it's not there). For a 4K buffer pool with VPSIZE(9000) and PGFIX(YES), this is what my DB2 friend saw in the DSNB999I part of the -DISPLAY BUFFERPOOL output:

DSNB999I  -DBP1 4K PAGES 2344
DSNB999I  -DBP1 1M PAGES 6656

That 6656 number looked odd to the DB2 systems programmer, and to me, as well. Why would DB2 allocate 6656 of the pool's buffers with 1 MB page frames, and the other 2344 buffers (a little over 9 MB) with 4 KB frames? It wasn't because of a shortage of 1 MB page frames -- there were plenty of those available in the LPAR (if there are not enough 1 MB frames to fully back a PGFIX(YES) buffer pool, DB2 10 or 11 will use what 1 MB frames there are and then use 4 KB frames for the rest of the pool's buffers). My friend looked at -DISPLAY BUFFERPOOL output for his other page-fixed pools, and he saw, in each case, that the number of buffers allocated with 1 MB page frames was a multiple of 6656 (and this was true regardless of the size of a pool's buffers -- he saw the same thing for 4K and  32K pools). The buffers beyond the multiple of 6656 for a page-fixed pool were allocated with 4 KB frames. So, a 4K pool with VPSIZE(30000) and PGFIX(YES) was seen to have 26,624 buffers (4 X 6656) allocated with 1 MB frames and 3376 buffers allocated with 4 KB frames. A 32K pool with VPSIZE(10000) and PGFIX(YES) had 6656 buffers allocated with 1 MB frames and 3344 buffers allocated with 4 KB frames.

In researching this strange-looking situation, the DB2 systems programmer and I came across a relatively recent APAR, PI12512, and that (plus a little more digging on my part) gave us our answer. DB2 uses control blocks to track buffers in a pool, and DB2 10 initially allocated control blocks for a PGFIX(YES) pool in such a way that it made sense to allocate the control blocks in 1 MB chunks and to back with 1 MB page frames the number of buffers in the pool covered by these 1 MB chunks of control blocks. It so happens that 1 MB of buffer pool control blocks can cover 6656 buffers, and that's why buffers in a PGFIX(YES) pool were allocated in multiples of 6656 with 1 MB page frames, with any buffers left over (i.e., any beyond the largest multiple of 6656 that was less than a pool's VPSIZE) backed by 4 KB frames.

When the fix for APAR PI12512 is applied to a DB2 10 subsystem, things work differently: buffers in a PGFIX(YES) pool are allocated in 1 MB chunks with 1 MB page frames, and any buffers left over, after as many 1 MB frames as possible have been filled, are allocated to 4 KB frames. The text of APAR PI12512 points up one advantage of this new arrangement: page-fixed buffer pools with a VPSIZE value of less than 6656 can be backed by 1 MB page frames. That is indeed a good thing, but there is a benefit for larger pools, as well. That benefit is best illustrated by example. Recall that my DB2 systems programmer friend saw, for a 4K pool with VPSIZE(9000) and PGFIX(YES), that 6656 buffers (about 74% of the total) were allocated with 1 MB page frames and 2344 buffers were allocated with 4 KB frames. If the fix for APAR PI12512 were applied to the DB2 system (and if the pool were deallocated and reallocated, which of course would be a consequence of the DB2 stop and restart required to activate the maintenance), for that same buffer pool one would expect to see 8960 buffers (99.5% of the total -- 35 chunks of 256 four kilobyte buffers) allocated with 1 MB frames, leaving only 40 buffers to be allocated with 4 KB frames. The greater the percentage of a page-fixed pool's buffers that can be allocated with 1 MB page frames, the better, in terms of performance, because the efficiency gained through decreased translation lookaside buffer misses will be maximized.

I should point out here that the change in the allocation of control blocks for page-fixed buffer pools that made possible the 1 MB page frame benefit maximization delivered by APAR PI12512 was accomplished through the fix for another APAR, PM85944 (and that fix was ultimately provided via APAR PM90486).

OK, now for the DB2 11 perspective. APAR PI12512 is a DB2 10-only APAR (and that's also the case for APARs PM85944 and PM90486). DB2 11 already has the functionality that these APARs provided for DB2 10; furthermore, with DB2 11 you don't have to issue -DISPLAY BUFFERPOOL with that secret-code-looking SERVICE(4) option to see the allocation of a pool's buffers with regard to different page frame sizes. And, with DB2 11 you won't see DSNB999I (the message you can't find in the DB2 for z/OS Messages manual) in -DISPLAY BUFFERPOOL output. Issue the simple and familiar command shown below in a DB2 11 environment (and here I'm using BP1 as an example):


and you'll see that the output includes something that looks like this:


The "preferred frame size" indicates what was specified for the FRAMESIZE option of -ALTER BUFFERPOOL for the pool. That's a new option introduced with DB2 11 for z/OS. For a PGFIX(NO) pool, DB2 will automatically allocate the pool's buffers with 4 KB page frames. If there are 1 MB page frames available in a z/OS LPAR, and PGFIX(YES) is specified for a buffer pool, DB2 will automatically prefer 1 MB page frames for the pool (even so, in that case you might want to consider an explicit specification of FRAMESIZE(1M), just to be intentional about it). You can't use 2 GB frames for a buffer pool (new with DB2 11) unless the LFAREA specification provides some 2 GB frames in the z/OS LPAR, and the pool is defined with PGFIX(YES), and DB2 is running in a zEC12 server (you probably wouldn't see much, if any, performance benefit with a FRAMESIZE(2G) specification for a pool -- that configuration option is likely to be more advantageous when z/OS LPAR memory sizes are considerably larger than what is typically seen these days).

Below the "preferred frame size" line of the DSNB546I message (and that message can be found in the DB2 11 for z/OS Messages manual), you see information on the actual allocation of a pool's buffers with respect to different page frame sizes. There is one DSNB546I message for each different page frame size used in the allocation of a pool's buffers (in my example above, the buffer pool had a VPSIZE of 4000, and all of the pool's buffers were allocated with 4 KB page frames).

There you have it. That 6656 number (or a multiple thereof) will continue to show up in -DISPLAY BUFFERPOOL(BPn) SERVICE(4) output in DB2 10 for z/OS systems (for page-fixed pools in z/OS LPARs that have 1 MB page frames), until the fix for APAR PI12512 is applied. It won't show up in DB2 11 environments. Someday, we DB2 old-timers will reminisce about the strange days of buffers being allocated with 1 MB page frames in 6656-buffer chunks, same as we look back now on old and odd-seeming things like index subpages and hiperpools in expanded storage. And young whippersnappers will roll their eyes.

Monday, August 25, 2014

DB2 for z/OS: Can a Pooled DBAT Hold Locks?

The answer to the question in the title of this post is, "No, it can't -- unless it can." How's that for a hook?

OK, a little background. A DBAT is a database access thread. It's the kind of thread that a network-attached application process uses to interact with a DB2 for z/OS system (such a process, also referred to as a DRDA requester, issues SQL statements that get to DB2 by way of the distributed data facility, aka DDF). When a transaction using a "regular" DBAT completes, the DBAT goes into a pool so that it can be used in the execution of another transaction (a high-performance DBAT does not go into the DBAT pool upon transaction completion; instead, it remains dedicated to the connection through which it was instantiated until it has been reused 200 times). 

Last month, a DB2 for z/OS DBA asked me if a pooled DBAT can hold locks. Why did he ask that question? Because of some wording he saw in the DB2 for z/OS Installation and Migration Guide. The text to which he referred, in the description of the ZPARM parameter POOLINAC, is as follows (emphasis added by me):

"A database access thread in the pool counts as an active thread against MAX
REMOTE ACTIVE and can hold locks."

That "and can hold locks" bit looked funny to me. How could a pooled DBAT hold locks when a prerequisite for going into the pool is a "clean commit" of the transaction using the DBAT -- something that implies, among other things, a release of all locks owned by the thread? [Again, we're talking about "regular" -- not high-performance -- DBATs here, and that means RELEASE(COMMIT) packages, and that means all locks are released by a clean commit.]

I ran this by a colleague in the DB2 for z/OS development organization, and he said that the "and can hold locks" wording in the Installation and Migration Guide is probably outdated -- a leftover from the days, relatively early in the life of DB2 for z/OS V6, when RELEASE(DEALLOCATE) was honored for packages executed via DBATs. Of course, with that behavior in effect, a transaction using a DBAT could end with a "clean commit" (i.e., with no "hanging resources" such as WITH HOLD cursors, not-dropped declared global temporary tables, and held LOB locators) -- thereby allowing the DBAT to be pooled -- and still the thread could hold table space locks because those are retained until thread deallocation when RELEASE(DEALLOCATE) is in effect for a package. In fact, this aspect of RELEASE(DEALLOCATE) is a key reason why a DB2 V6 APAR was created to make DB2 stop honoring that bind option for packages executed via DBATs: DBATs can last a long time, and having table space locks retained for such long periods of time could lead to contention problems in the system. With the fix to this DB2 V6 APAR applied, DB2 treated a RELEASE(DEALLOCATE) package executed via a DBAT as though it had been bound with RELEASE(COMMIT).

So, I was right: a pooled DBAT can't hold locks, because only non-high-performance DBATs can be pooled, and non-high-performance DBATs are associated only with RELEASE(COMMIT) packages, and with a RELEASE(COMMIT) package you get a release of all locks when a transaction completes with a clean commit (something that has to happen before a DBAT can be pooled).

Here's the deal, though: I was right only briefly. Just a short time ago, a fix for DB2 APAR PI20352 came out, and with that fix applied in a DB2 10 or 11 for z/OS system, a pooled DBAT can hold locks. How so? Pretty simple: PI20352 allows a high-performance DBAT to be pooled, and a high-performance DBAT can hold locks (specifically, table space locks) after a transaction using the thread completes with a clean commit, because that's what happens for a RELEASE(DEALLOCATE) package, and a high-performance DBAT will be associated with at least one RELEASE(DEALLOCATE) package -- instantiation of a high-performance DBAT depends on a DBAT being used in the execution of such a package.

But why would you want a high-performance DBAT to be pooled? What's wrong with the original high-performance DBAT behavior, which eschews pooling in favor of staying dedicated to the instantiating connection for 200 units of work? Well, that behavior is fine when DDF-using applications stay connected to a DB2 for z/OS system for long periods of time, but it can be sub-optimal for environments in which applications stay connected to DB2 for shorter periods of time. That situation means more disconnect and reconnect activity. Prior to APAR PI20352, if an application connected to DB2 and executed a RELEASE(DEALLOCATE) package -- thereby instantiating a high-performance DBAT -- and then disconnected before the high-performance DBAT had been reused 200 times, the high-performance DBAT would have to be terminated because it couldn't be pooled. That would end up causing a good bit of DBAT create and terminate activity, and that is CPU overhead you'd rather avoid.

With the fix for APAR PI20352 applied, when an application using a high-performance DBAT disconnects from DB2 before the thread has been used for 200 units of work, the high-performance DBAT will be pooled rather than terminated. The pooled high-performance DBAT can then be used by any DDF connection needing a thread to process a request. Once it's used to service a request from a connection, the high-performance DBAT that had been pooled will do as high-performance DBATs do: it will stay dedicated to that connection so that it can be reused (a CPU efficiency booster when paired with RELEASE(DEALLOCATE) packages). Note that a particular high-performance DBAT can be used by a maximum of 200 units of work. That re-use limit is not reset when a thread is pooled, so if a high performance DBAT has been used 50 times and the associated application disconnects from DB2, resulting in the pooling of the DBAT, it can be reused 150 more times before being terminated (that termination, which is normal for high-performance DBATs, is done to free up resources allocated to the DBAT).

This is good stuff, but you might be thinking of a potential fly in the ointment: what if POOLINAC (the ZPARM that specifies the amount of time that a pooled DBAT can sit around waiting to be used -- exceeding that time results in DBAT termination) at a DB2 site has been set to zero, which would disable termination of pooled DBATs due to inactivity? Couldn't that lead to pooled high-performance DBATs sitting around indefinitely, holding table space locks (and holding RELEASE(DEALLOCATE) packages)? As it turns out, this is not something about which you need to be concerned. The DB2 developers anticipated that situation, and when the fix for APAR PI20352 is applied, a POOLINAC setting of 0 will be changed to the default POOLINAC value of 120 seconds.

So, to return to the question in this post's title: can a pooled DBAT hold locks? The answer depends on whether or not you've applied to fix to APAR PI20352 to your system. If you haven't applied that fix, the answer to the question is, "no." If you have applied that fix, the answer is, "yes," but it's a qualified yes. Only a pooled high-performance DBAT can hold locks; and these will only be table space-level locks (which are almost always of the non-exclusive "intent" variety), versus page or row locks. Keep in mind that it's good that a high-performance DBAT can be pooled if the application using it disconnects from DB2 before the thread has been used by 200 transactions (this cuts down on DBAT termination and creation activity that would otherwise occur). And remember the POOLINAC reset that will change a specification of 0 to the default of 120 seconds -- a nice enhancement that will keep people from shooting themselves in the foot.

Pooled DBATs have long been a mainstay of DB2 client-server efficiency and scalability. With APAR PI20352, high-performance DBATs get their invitation to the DBAT pool party. Seems appropriate that this was a summertime APAR (speaking from a northern hemisphere perspective).