Friday, April 27, 2012

DB2 for z/OS: Trading Memory for MIPS (Part 3)

In part 1 of this three-part series on leveraging Big Memory (i.e., a large z/OS LPAR real storage resource) to enhance DB2 workload CPU efficiency, I focused on getting the most out of a DB2 buffer pool configuration without increasing its size (this by adjusting certain threshold settings and redistributing buffers between low-I/O and high-I/O pools). In part 2, I offered guidance on being bold -- but not reckless -- in asking for more memory for a DB2 subsystem's buffer pool configuration. In this, the concluding entry of the series, I'll look at memory-for-MIPS moves that involve DB2 pools other than the buffer pools -- starting with the EDM pool.

Actually, that "starting with the EDM pool" bit is an "it depends" thing, depending on the version of DB2 for z/OS you're using. What I want to talk about here is improving CPU efficiency via the use of the RELEASE(DEALLOCATE) parameter of the BIND (and REBIND) PACKAGE command. RELEASE(DEALLOCATE), when used for an SQL-issuing program that is executed via a thread that persists through commits, saves CPU cycles by causing DB2 to retain certain items associated with package execution, such as table space locks and thread copies of package sections, until thread deallocation, as opposed to releasing these resources at each commit (and very likely having to re-acquire them as the application process continues to execute). Threads that persist through commits include CICS-DB2 protected entry threads, threads used by batch jobs, and -- starting with DB2 10 -- high-performance database access threads used for DRDA clients (I blogged about high-performance DBATs last year). The less SQL work an application process performs between commits, the greater the proportional CPU cost of at-commit resource release and re-acquisition that occurs when RELEASE(COMMIT) is in effect. It is for such application processes -- high volume, relatively simple transactions, and batch jobs that issue lots of commits -- that RELEASE(DEALLOCATE) can deliver significant CPU savings (sometimes on the order of 10%). There are operational considerations associated with RELEASE(DEALLOCATE) and persistent threads. These include utility execution scheduling (a utility might not be able to execute if an application process holds a lock -- even a non-exclusive, intent-type lock -- on the target table space, and with RELEASE(DEALLOCATE) that table space -level lock will be held until thread deallocation) and package bind and DDL activity scheduling (same reason -- you could end up having to wait for deallocation of persistent threads used for the execution of packages bound with RELEASE(DEALLOCATE)). Keep these considerations in mind. It's likely that you can deal with them -- lots of folks have (and the PKGREL option of the -MODIFY DDF command can help in this regard when you use high-performance DBATs).

Another operational consideration associated with the use (or increased use) of the RELEASE(DEALLOCATE) bind option has to do with increased virtual and real storage utilization, and that's where the "it depends" at the start of the preceding paragraph comes in. There's no "it depends" with respect to increased memory utilization -- that WILL happen when you use RELEASE(DEALLOCATE) with persistent threads. The dependency is on WHERE that increased memory utilization happens in the DB2 database services address space (aka DBM1). See, one of the thread-allocated resources that DB2 will retain until thread deallocation when RELEASE(DEALLOCATE) is in effect for a package is the copies of package sections associated with the SQL statements issued by the program. Here's how that works: when a program issues an SQL statement, the section of the associated package containing the bound statement (if it's static) will be copied from the "skeleton" part of the EDM pool to another part of DBM1 (if it's a dynamic SQL statement, the relevant package could be, for example, one of the IBM Data Server Driver packages used for dynamic SQL statements issued by network-attached client-server applications via JDBC calls). WHERE in DBM1 will that thread copy of the package be stored? THAT'S the version-dependent part of this equation. Prior to DB2 10, the thread's copy of a package section (or a plan section, in the case of a DBRM bound directly into a plan -- something that is not supported in a DB2 10 environment) was stored in a part of the EDM pool. With DB2 10, the package section allocated to a thread is copied to an agent local pool that is exclusive to the thread in question (and there's some plan-related thread storage, too, even in a DB2 10 system -- it's mostly package storage, but not all). This DB2 10 change is a good thing, in that it eliminates latching that occurred when EDM pool space was used for copies of packages allocated to threads. It also moved this storage above the 2 GB bar for packages bound in a DB2 10 system, and THAT gives you a WHOLE lot more virtual storage head room for using RELEASE(DEALLOCATE) to reduce the CPU cost of DB2-accessing programs. [In a DB2 Version 8 system, none of this virtual storage used for thread copies of package and plan sections went above the 2 GB bar, and with DB2 9 only some of it went above the 2 GB bar.]

The bottom line, then, is that RELEASE(DEALLOCATE) can do you more good, in terms of boosting CPU efficiency, in a DB2 10 environment because you almost certainly will be able to use this bind option more extensively than before. Now, this doesn't mean that you can or should bind ALL of your packages with RELEASE(DEALLOCATE) in a DB2 10 system, because real storage resources, though potentially extensive, are nonetheless finite. If your DB2 monitor supports DB2 10, it will report to you (in a statistics detail report or an online display) the amount of storage -- above and below the 2 GB bar -- used for plans and packages (these are copies associated with threads -- storage used for skeleton package and cursor table purposes is still in the EDM pool, and is reported as such). Keep an eye on that storage utilization, and of course keep an eye on the demand paging rate for the z/OS LPAR in which the DB2 subsystem is running: if that's in the single digits or less per second, on average, during busy periods of the processing day, real storage is not being pressured to an undesirable extent. In a DB2 9 or Version 8 system, keep an eye on the EDM pool storage used for PT and CT, and consider enlarging that part of the pool (or cutting back on your use of RELEASE(DEALLOCATE)) if the number of free pages there is indicated to be less than 10% of the total number of pages in that part of the pool (and that's actually two parts in a DB2 9 environment -- one above the 2 GB bar and one below).

The other EDM pool-related memory-for-MIPS play has to do with dynamic statement caching. With respect to the global (versus the local) statement cache, this has been above-the-bar storage since DB2 V8 (the global statement cache, allocated when the ZPARM parameter CACHEDYN is set to YES, is very commonly used; local statement caching, associated with the KEEPDYNAMIC(YES) package bind option, is less commonly used). Your DB2 monitor will show you the hit ratio for the dynamic statement cache. The larger this section of the EDM pool, the greater the hit ratio is likely to be, and the larger that hit ratio, the more you save through avoidance of full prepares of dynamic SQL statements. With this said, there's probably a point of diminishing returns that you could hit in your system with regard to enlarging the dynamic statement cache (accomplished by updating the EDMSTMTC parameter in ZPARM). If the hit ratio for the dynamic statement cache is north of 90% in your environment, making the cache larger may not move the needle much further to the right.
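For reference, an EDMSTMTC change is a standard ZPARM update -- and the same mechanics apply to the MAXRBLK, SRTPOOL, and MAXTEMPS_RID parameters discussed below. A hypothetical sketch (the 200000 value, specified in KB, is purely illustrative and not a recommendation):

   EDMSTMTC=200000

Update the value in your subsystem parameter job (typically DSNTIJUZ), assemble and link-edit the new DSNZPARM module, and then activate it:

   -SET SYSPARM RELOAD

That assumes the parameter is online-changeable in your DB2 version; otherwise, the new value takes effect at the next start of the DB2 subsystem.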

Turning now to the RID pool, used by DB2 for RID list processing (done for things like index ANDing and ORing, hybrid join, and list prefetch): what you want is for as many of these operations as practically possible to complete in memory, and a larger RID pool will help in accomplishing that objective. Your DB2 monitor will show you the number of times that RID processing operations were terminated because of a lack of storage for such processing, and a RID pool enlargement can be accomplished by increasing the value of the MAXRBLK parameter in ZPARM (the RID pool has been allocated in above-the-bar virtual storage since DB2 V8). DB2 10 delivered a couple of important changes related to RID list processing: first, the default size of the RID pool in a DB2 10 system is 400 MB, versus 8 MB with DB2 9 and DB2 V8 (this is one of a number of reasons why you can expect DB2 real storage utilization to increase when you migrate to DB2 10). Important change number 2 concerns what happens when RID pool space is insufficient for the completion of a RID list processing operation. Prior to DB2 10, such a storage-related RID list processing failure would cause DB2 to abandon RID list processing for the SQL statement in question and go instead with a table space scan for the object being accessed. With DB2 10, insufficient RID pool space for a RID list processing operation causes DB2 to keep on trucking with the operation, using 32K work file space for the RID list. That's likely to be a performance-benefiting approach versus abandoning RID list processing for a table space scan, and even though processing a RID list using work file space is not quite as CPU-efficient as getting it all done in the RID pool, the overhead of overflow is mitigated if you have a large buffer pool dedicated to your 32K work file table spaces (fewer I/Os means less CPU consumption). Note that the new DB2 10 ZPARM parameter MAXTEMPS_RID can be used to limit the amount of space in the work file database that DB2 can use for a single RID list processing operation.

Lastly, a word about the DB2 sort pool. This is space in memory (above the bar since DB2 V8) that DB2 uses for SQL-related (versus utility-related) sort operations. The larger the sort pool, the more CPU-efficient DB2 SQL-related sorts tend to be. If you're considering a sort pool enlargement, keep a couple of things in mind: first, this pool (sized via the SRTPOOL parameter in ZPARM) is the maximum size of the sort work area that DB2 will allocate for each concurrent sort user (so a sort pool size of X MB could result in DB2 using 20X MB of virtual storage for sort pool space if there are 20 concurrent sort operations in progress). Second, the default size of the sort pool went from 2 MB in DB2 V8 and DB2 9 systems to 10 MB with DB2 10 (again, one of multiple reasons for DB2 10's greater use of memory versus DB2 9 and DB2 V8). If you want to increase the size of the sort pool on your system (the upper limit is 128 MB, but the largest value I've seen on a DB2 system is 30 MB), keep that per-concurrent-sort-user multiplier in mind, and as always keep your eye (perhaps with the help of a z/OS systems programmer) on the z/OS LPAR's demand paging rate. I'm fine with a rate in the low single digits per second or less. A demand paging rate in the higher single digits per second is yellow-light territory in my book, and a rate in excess of 10 per second would tell me that the z/OS LPAR's memory resource may be over-taxed, and I might take steps to reduce DB2's use of that resource (via buffer pool, EDM pool, RID pool, or sort pool adjustments, or some combination of these).

Modern mainframe systems tend to have a lot of real storage, and the size of that resource for the average z/OS LPAR keeps getting larger, based on what I'm seeing. I'd like for you to use System z memory to make your DB2 workload more CPU-efficient, and I hope that the information in this three-part blog entry will help you to do that.

Thursday, April 12, 2012

DB2 for z/OS: Trading Memory for MIPS (Part 2)

In part one of this three-part entry on leveraging System z memory to reduce CPU consumption for a DB2-accessing application workload, I wrote about getting your DB2 buffer pool configuration house in order by doing things to improve CPU efficiency without increasing the total amount of virtual storage allocated for page caching (e.g., adjusting certain thresholds for work file-dedicated pools, making appropriate use of buffer page-fixing, choosing the right buffer-steal algorithm for a pool, and judiciously shifting buffers from low-I/O to high-I/O pools). After taking these actions you might find that you still have high read I/O rates for some of your buffer pools. That's when you'd take the next step (if feasible, and I'll get to that momentarily) and increase the size of your buffer pool configuration to drive down disk read activity and associated CPU consumption. In a nutshell, that is the subject of this part two post. Within the next couple of weeks I'll post my part three entry, in which I'll look at memory-for-MIPS actions that involve DB2 virtual storage pools other than the buffer pools (e.g., the EDM pool, the RID pool, and the sort pool).

Before going further, I'd like to address a topic that's come up a number of times in conversations I've had with DB2 for z/OS DBAs and systems programmers -- that being the difference between GETPAGE requests and DB2 read I/O requests. Plenty of people have the idea that a bigger buffer pool means fewer GETPAGEs. In fact, that is generally NOT the case. Driving down GETPAGEs is a worthwhile objective, since GETPAGE requests -- more than anything else, in my experience -- are the key determinant of the CPU cost of SQL statement execution; however, growing your buffer pool configuration is not the way to reduce GETPAGE activity. Instead, you'd attack that objective through means such as physical database design changes (adjusting table clustering, adding or modifying indexes, etc.) and SQL statement coding (e.g., changing a non-indexable predicate to an indexable predicate). Enlarging your buffer pool configuration is about reducing I/O requests (primarily read requests, which tend to greatly outnumber write I/O requests) and the CPU cycles consumed in driving those requests. Is there a connection between GETPAGE requests and read I/O activity for a buffer pool? Of course there is: when DB2 needs to examine a page from a table space or an index -- usually on behalf of an application program -- it will issue a GETPAGE request for that page. If the page in question is NOT already in the buffer pool to which the table space or index is assigned, the GETPAGE request will result in DB2 initiating a read I/O operation to bring the page into memory (unless the page is to be read into memory via an already-underway prefetch I/O, in which case the requesting program will wait for the completion of said prefetch I/O). A bigger buffer pool, then, doesn't mean fewer GETPAGEs, as DB2 will still need to look at the same number of table and/or index pages in executing the same SQL statement (all other things being equal). What a bigger buffer pool DOES mean is that fewer of the GETPAGE requests will result in read I/O operations, since the likelihood of a requested page already being in memory will tend to increase as a buffer pool's size increases.

There is an exception to this rule: if a buffer pool is way too small relative to the number of pages belonging to objects assigned to the pool and the volume of requests for those pages, DB2 could end up hitting the data manager threshold for the pool. That threshold, also known by the acronym DMTH, is reached when 95% of a pool's buffers are non-stealable (either currently in use or updated and not yet externalized to disk or to a coupling facility in a DB2 data sharing system). When that happens, DB2 will issue a GETPAGE for every row retrieved from a page, so that, for a page from which 20 rows are retrieved for a program, 20 GETPAGE requests will be issued, versus one GETPAGE had DMTH not been hit for the pool. Obviously, a big spike in GETPAGE activity will cause a spike in CPU consumption, so keep an eye on DMTH and make sure that you're not hitting that threshold for any of your buffer pools (your DB2 monitor will provide this information, as will the output of the DB2 command -DISPLAY BUFFERPOOL(xxx) DETAIL, where "xxx" could be the name of an individual pool, such as BP4, or the word ACTIVE if you want to see information for all active buffer pools). If DMTH is being hit for a pool, it's likely that the pool is significantly undersized.
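As an illustration, the single-pool form of the check might look like this (BP4 is just an example pool name):

   -DISPLAY BUFFERPOOL(BP4) DETAIL

In the DETAIL output, the buffer pool statistics should include a count of the times the data manager threshold was reached for the pool (a DMTH HIT field, if memory serves) -- you want to see zero there.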

On now to making a buffer pool configuration larger. While there's no need to apologize for that (I'll expand on this point in a moment), you want to do it for a reason, and the reason, as just pointed out, is to reduce disk read I/O activity (which in turn reduces CPU consumption). So, your first order of business is to check on the rate of read I/O activity for each of your buffer pools. As pointed out near the end of my part one memory-for-MIPS post, what you're looking for is the TOTAL rate of read I/O requests (synchronous and asynchronous) per second for your buffer pools. Also as pointed out in the part one post, you can get that figure from a DB2 monitor statistics detail report or online display of buffer pool activity, or from the output of the DB2 command -DISPLAY BUFFERPOOL DETAIL (and I describe in the part one post how to use -DISPLAY BUFFERPOOL DETAIL output to get the desired information). If the total read I/O rate for a pool is less than 100 per second, making that pool larger won't hurt anything (assuming the presence of sufficient real storage to back that enlargement, about which I'll have something to say in a bit), but it likely won't move the CPU consumption needle much (though you might want a zero-read-I/O situation associated with "pinning" a DB2 object in memory -- something made more straightforward and effective in a DB2 10 environment via the new PGSTEAL(NONE) buffer pool option, described in the part one post). What should draw your attention are any pools for which the read I/O rate is in the hundreds -- or even thousands -- per second (the highest I've seen with my own eyes at a DB2 for z/OS site is 9000 read I/Os per second for a pool). Taking a big chunk out of these numbers through buffer pool enlargement can result in a significant improvement in CPU efficiency for your DB2 workload.
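Here's a sketch of the arithmetic, using made-up numbers. Issue the DETAIL form of the command, issue it again an hour later (with the default DETAIL(INTERVAL) form, the second display should reflect activity since the first), add up the synchronous and prefetch read I/Os reported for a pool, and divide by the number of seconds in the interval:

   synchronous read I/Os (random + sequential):          864,000
   prefetch read I/Os (sequential + list + dynamic):   1,296,000
   total read I/Os:                                     2,160,000
   interval:                                        3,600 seconds

   total read I/O rate = 2,160,000 / 3,600 = 600 per second

A pool showing a rate like that -- in the hundreds per second -- would be a prime candidate for enlargement.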

Now, taking a "big chunk" out of a high rate of disk read I/Os is not likely to happen via itty-bitty steps, by which I mean that you probably won't see much of an impact from adding 1000 buffers to an 80,000-buffer pool. Big-step buffer pool enlargements will often involve increasing a pool's size by 25% or even 50% (a pool that's currently quite small -- say, 10,000 4K buffers in a system with a lot of real storage -- might be doubled in size -- or more -- in a single step, depending on the particular circumstances of the situation). I tend to think big, in terms of buffer pool enlargement, because I so often see buffer pools with high read I/O rates that are way smaller than they could be, given the abundance of memory available on the associated z/OS LPAR. Here's the deal: since 64-bit addressing was introduced in the System z environment some years ago, and the cost of memory was reduced for new mainframe models, it's become more and more common for production z/OS LPARs to be configured with 20 or 40 GB or more of real storage (and sometimes you see a z/OS LPAR with north of 100 GB of real storage -- a z196 server can be configured with up to 3 TB of memory); yet, in these big lakes of memory you find production DB2 subsystems with a total aggregate buffer pool size (meaning, the sum of the sizes of each allocated pool) of a measly 1 GB -- or even less. HELLO! IS ANYONE UNDER THE IMPRESSION THAT YOU GET BONUS POINTS FOR NOT USING A HARDWARE RESOURCE -- MAINFRAME MEMORY -- FOR WHICH YOUR ORGANIZATION PAID MONEY? Come on, folks. That stuff is there to enable your system to perform better, and no subsystem will use more memory to better effect than DB2.

That brings us to the "don't apologize" part. Some DB2 for z/OS people -- especially those who've worked with DB2 for 15 or 20 years or more -- seem to have some kind of a guilt complex over their subsystem's use of mainframe memory resources. "Golly, I'm using a GIGABYTE just for my buffer pools. The other subsystems on the production LPAR don't use that much memory. I should be satisfied with what I have." WRONG. Other z/OS subsystems that handle things like transaction management and security and networking don't NEED the memory that a relational database management system needs, and -- more importantly -- they don't benefit from getting lots more memory the way DB2 does. DB2 for z/OS was architected, from the get-go, to perform better with a large real storage resource. Give DB2 more memory, and -- unless your system is storage-constrained, and I'll get to that -- throughput goes up and CPU consumption (most likely) goes down (though if more memory for DB2 lets you get a lot more work done in a unit of time, CPU utilization could increase, but that's usually OK if it means that, for example, a critical batch run completes in much less time than before). Some DB2 sites -- and more as time goes on -- have figured this out. They've leveraged big mainframe memory resources to greatly increase DB2 buffer pool sizes, with very positive effects on application performance. The biggest buffer pool configuration I've seen for a single DB2 subsystem is approximately 42 GB. Are the people at that shop crazy? Of course not. They have a z/OS system with a large real storage resource (way bigger than 42 GB), and they are taking advantage of that to deliver excellent performance for a high-volume, mission-critical application that accesses a very large database. The key is to go big when there's big memory available, without going too big.

And that's the rub, you might think. How big is too big, in terms of mainframe memory allocated to a DB2 subsystem? Actually, that's not too hard to monitor. The main thing here is to avoid a situation in which mainframe memory is over-committed, and the metric to keep an eye on in that regard is the z/OS LPAR's demand paging rate. This figure, available via a z/OS monitor, is the rate (usually reported as a per-second number) at which z/OS has to bring into real storage, on demand, pages that were paged out to auxiliary storage (often referred to as page data sets, on disk). What should that rate be? I'll tell you one thing: it shouldn't be zero -- not in my book. If the z/OS system's demand paging rate is zero, that tells me you have an underutilized resource that could be used to improve the performance of your DB2 workload, and you can change that "underutilized" situation by enlarging buffer pools that have high read I/O rates (or by pinning some really performance-critical DB2 objects in memory). Now, while I think that a demand paging rate that's greater than zero is generally a good thing, you don't want that number to go too high. I'm personally very comfortable with a demand paging rate that's between 1 and 10 per second during busy times (i.e., times of high-volume workload processing). Low single digits I like more than high single digits, but I wouldn't be concerned about the memory resource being under more-than-desired pressure unless the demand paging rate gets into 10+ per second territory (and here I'm not counting the occasional memory-intensive event, such as a dump, that could cause the demand paging rate to temporarily spike to a higher level). Should your aim be a demand paging rate of 10 per second? No. I'm just saying that a rate of less than 1 per second (quite common these days) means that the system's real storage resource is not under any kind of significant pressure, and there is "head room" available for increasing virtual storage usage -- through enlarging buffer pools, for example -- without negatively impacting non-DB2 work on the system.

So, if the demand paging rate on your system is less than 1 per second and you have high read I/O buffer pools and/or you want to pin some DB2 objects in memory, what should you do? How big should you go, buffer pool-wise? Here's a rule of thumb that's worked for me: in my experience, when a single production DB2 subsystem is running on a z/OS LPAR, that LPAR's memory resource is very unlikely to be under an undesirable level of pressure if the DB2 subsystem's buffer pool configuration size is up to 25% of the z/OS LPAR's real storage size. So, if you have a z/OS LPAR with 20 GB of real storage, a production DB2 subsystem buffer pool configuration size of 5 GB (again, meaning the aggregate size of all of the subsystem's active buffer pools) should pose no problem in terms of leaving plenty of memory for address spaces other than DB2's, such as those associated with TSO, CICS, batch, RACF, etc. (keep in mind that when the maximum real storage size was 2 GB, not too many years ago, it was not uncommon for sites to use 50% or even 75% of that for DB2 buffer pools). If you're well south of that 25% level, you should be able to grow your DB2 buffer pool configuration considerably without hurting overall system performance -- just keep an eye, as I've said, on that demand paging rate, and see that it stays in the aforementioned single-digits-per-second range. Already have a DB2 buffer pool configuration size that's 25% (or even more) of the z/OS LPAR's real storage size? That doesn't mean you have to stop there. Just be deliberate about further buffer pool size increases, and don't overextend the memory resource of the system. Also, don't overdo it with respect to the use of PGFIX(YES) for your buffer pools, as I noted in my part one memory-for-MIPS post. And one more thing: if you have more than one production DB2 subsystem on an LPAR (test and development subsystems usually have buffer pool configurations that are smaller than that of a production subsystem), be aware of the TOTAL buffer pool configuration size of ALL the subsystems. Taking this to an extreme, if you have 4 production DB2 subsystems on one z/OS LPAR, don't size the buffer pool configuration of each to be 25% of the LPAR's memory -- that would likely result in a demand paging problem. You might want the TOTAL size of ALL of the subsystems' buffer pool configurations to be no more than half of the amount of real storage assigned to the LPAR.

If you make a buffer pool larger, check that read I/O rate again. If it's gone down, good. If it hasn't, you might want to make the pool larger still, but if you do that and the read I/O rate still doesn't budge, you may want to think in terms of taking some of the more performance-critical objects in that pool and assigning them to a new pool.

Finally, if you use DB2 in data sharing mode, keep in mind that enlarging a local pool on the member subsystems may necessitate an enlargement of the corresponding group buffer pool (GBP) in the coupling facility. What you want to do here is keep the GBP large enough, relative to the aggregate size of the corresponding local pools, to prevent page invalidations due to directory entry reclaims, as such invalidations require member DB2 subsystems to re-read into memory pages that were likely "clean" when invalidated, and that's a drag on system performance (output from the DB2 command -DISPLAY GROUPBUFFERPOOL GDETAIL will show you whether such invalidations are occurring or not -- look for the field labeled "CROSS INVALIDATIONS DUE TO DIRECTORY RECLAIMS"). To properly size GBPs in this regard, you can make use of the IBM CFSizer tool, available on the Web at http://www-947.ibm.com/systems/support/z/cfsizer/index.html. There's also a quick sizing formula for 4K group buffer pools with the default 5:1 ratio of directory entries to data entries that has long worked well for me (I used it effectively at a DB2 data sharing site just last month): add up the sizes of the local pools and divide that by three, and you have a good GBP size. For example: if you have two member subsystems and BP4 has 15,000 buffers, or 60 MB, on each member, a GBP4 sized at (60 MB + 60 MB) / 3 = 40 MB should not cause page invalidations due to directory entry reclaims (I worked out that formula years ago for 4K GBPs, and haven't worked on a similar one for 8K or 16K or 32K GBPs, but I could do that for those GBP types without too much difficulty).
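To show the check and the formula with a different set of numbers (a hypothetical three-member data sharing group in which each member's BP4 is 200 MB):

   -DISPLAY GROUPBUFFERPOOL(GBP4) GDETAIL

   aggregate local BP4 size:  200 MB x 3 members = 600 MB
   rule-of-thumb GBP4 size:   600 MB / 3 = 200 MB

Again, that shortcut assumes 4K pools with the default 5:1 ratio of directory entries to data entries; for other page sizes, CFSizer is the way to go.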

Whew. More words than usual for one of my blog entries, but I was on a roll and went with it. Hope the information is useful to you. As previously mentioned, I'll post part three of this three-part memory-for-MIPS entry within the next couple of weeks.