Tuesday, September 29, 2015

An Oft-Overlooked DB2 for z/OS Big Memory Exploitation Play

Over the past few years, I've done a lot of writing and presenting and talking about ways in which large-scale z Systems memory resources (which I refer to as Big Memory) can be leveraged so as to enhance the performance of DB2 for z/OS subsystems and related applications. While I very much like to see z/OS LPARs with memory sizes of 100 GB or more, I DON'T like to see situations in which organizations stop well short of exploiting Big Memory to maximum positive effect with regard to DB2 performance. Big buffer pools are a great way to exploit available real storage gigabytes, but too often DB2 people don't look much beyond that particular use of mainframe memory to boost workload throughput and CPU efficiency. In this blog entry I want to highlight a DB2 application of Big Memory that, in my experience, is often overlooked: the EDM skeleton pool.

The EDM pool (long name: environmental descriptor manager pool) occupies a portion of a DB2 subsystem's database services address space (also known as DBM1). Various DB2 elements supporting SQL statement execution are cached in different parts of the EDM pool. Among these elements are skeleton package tables, or SKPTs -- one SKPT for a given package. Here's how this space is used: when an application process needs to execute a package, DB2 checks to see if the package (actually, the requisite sections of the package) is already in memory. If it isn't, DB2 loads the package (sections) from the SPT01 table space in the directory into an SKPT in the skeleton pool part of the EDM pool. From there, the package, in being allocated to the thread through which it is to be executed, is copied into another part of the DBM1 address space (the thread-specific copy of the package goes into what is called agent local pool storage -- if a given package is allocated to 10 threads, there will be 10 copies of the package in agent local pool storage in DBM1). The more times a package that is to be allocated to a thread is found in the skeleton pool in the EDM pool, versus having to be loaded from the directory on disk, the better for the overall performance of the associated DB2 workload. Often, this is is not taken into account by people charged with optimizing the performance of a DB2 for z/OS environment.

Here's what you should do (if you haven't already): check on skeleton pool activity by way of your DB2 monitor. You can use an online display of EDM pool information for this purpose, but my preference is to use the EDM pool information found in a DB2 monitor-generated statistics long report (depending on the DB2 monitor product used at your site, this may be called a statistics detail report). In such a report, the EDM pool section would contain some fields like these (I've left out some lines to highlight the information of interest):

EDM POOL                     QUANTITY
---------------------------  --------

  HELD BY SKCT                  40.98
  HELD BY SKPT                3724.36
  FREE PAGES                   175.66

In many cases, a person looking at this information would see the 0.00 value for FAILS DUE TO POOL FULL, conclude that the skeleton pool is plenty big, and move on to something else. Now, it is true that having no program failures caused by DB2 not being able to load a package into the skeleton pool is a good thing; however, that's only part of the story. Look a little further down in the EDM pool section of the report, and you'll see numbers like these:

EDM POOL                     QUANTITY
---------------------------  --------

PT REQUESTS                   6446.1K
PT NOT FOUND                   193.0K 

Take note of the ratio of PT REQUESTS to PT NOT FOUND. The former value is the number of times that a package section was requested for allocation to a thread, and the latter is the number of times that a package request could not be satisfied from the skeleton pool (and so required a loading of the package section from the DB2 directory). The ratio calculated from the numbers above (which show activity in a real-world DB2 subsystem) is about 33:1. That might look OK to you, but you should be able to do a lot better than that on your system (unless you are really memory-constrained). Here are numbers from a different real-world DB2 subsystem:

EDM POOL                     QUANTITY
---------------------------  --------

PT REQUESTS                   1717.9K
PT NOT FOUND                   128.00 

The ratio of PT REQUESTS to PT NOT FOUND in this case is over 13,000 to 1. Looked at from an activity rate perspective, the rate of package section loads from SPT01 on disk associated with the first set of PT REQUESTS and NOT FOUND numbers is about 27 per second (193,000 requests in a 2-hour reporting interval), while the rate of package section loads associated with the second set of numbers is about 1 every 14 seconds (128 requests in a 30-minute reporting interval).

I'd rather have the second set of numbers. And you know what? It often doesn't take much memory to get to a skeleton pool size that yields a really high ratio of package section requests to package section loads from disk. Interestingly, though two systems (used by two different organizations) are represented by the numbers shown above, in both cases the rate of requests for package table sections was a little over 900 per second. An important difference between the two environments is a skeleton pool that is about 2.5 times larger for the DB2 subsystem with the really high ratio of requests to loads versus the skeleton pool that has the much lower ratio of package requests to loads. The size delta amounts to about 6000 pages (4K pages), and that's about 24 MB. The point I want to make here is this: for an investment of maybe a few tens of megabytes of memory, you might be able to dramatically increase the ratio of package section requests to package section loads in your DB2 environment, if that ratio is not already high (and I like to see at least hundreds to one, and ideally thousands to one). Unless your z/OS LPAR is quite storage-constrained, there is a good chance that you could boost the size of the skeleton pool (via the EDM_SKELETON_POOL parameter in ZPARM) by a few tens of megabytes, and still have a demand paging rate for the LPAR that is very low ("very low" is what you want for the demand paging rate in a production z/OS LPAR, and I define that as being a rate -- available from a z/OS monitor -- that is in the low single digits or less per second during busy processing periods). Note that there are also request-versus-load numbers pertaining to the DBD cache in the EDM pool -- I've focused on the skeleton pool numbers because I more often see lower-than-desired ratios there.

That's it. Keep in mind that using Big Memory effectively means using it for all kinds of DB2 performance-boosting purposes -- bigger buffer pools, sure, but don't stop there. Give the skeleton pool enough space to satisfy the vast majority of package section requests out of memory, thereby reducing the rate of package section loads from the DB2 directory on disk. Doing this will get you that much closer to optimal DB2 application performance.

Sunday, September 27, 2015

I Don't Worry About DB2 for z/OS Buffer Pool Hit Ratios

And neither should you.

It's true that for years, mine was among a chorus of DB2 specialists' voices that placed a lot of emphasis on monitoring DB2 buffer pool hit ratios (there are variations with respect to calculating this ratio, but the basic formula is (GETPAGEs - synchronous reads) / GETPAGEs). Some years ago, it dawned on me (and on others -- and for some folks earlier than for me) that I'd been focusing on the wrong buffer pool performance metric. Nowadays, I don't look at buffer pool hit ratios at all.

What I look at instead is the total read I/O rate for each and every buffer pool. Before getting more into the details of that performance indicator, I'll give you some reasons for my not giving consideration to buffer pool hit ratios:
  • They can give you a false sense of being "done" with respect to leveraging buffer pool resources to boost system performance. People can see a value of greater than 99% for a buffer pool hit ratio and conclude, "The performance impact of this pool is as good as it's going to get. I'll turn my attention to other pools." That conclusion can be very much incorrect, as I'll explain momentarily.
  • They can cause you to disregard asynchronous buffer pool read activity. Because a commonly used formula for calculating a buffer pool's hit ratio ignores asynchronous read activity, it can lead to, at best, a benign neglect of prefetch read activity, or worse, actions that overly diminish asynchronous read space in buffer pools for the purpose of "goosing" a synchronous read-based hit ratio. That can work against your overall DB2 performance goals. As I pointed out in an entry posted to this blog a couple of years ago, prefetch reads matter.
  • Depending on how they're calculated, they can have weird values that confuse people. To avoid leaving prefetch reads out of the picture, some performance monitoring products will (or at least have, in the past) use buffer pool hit ratio formulas that take asynchronous read activity into account. On occasion, given a particular level and type of prefetch activity, such formulas will yield odd-looking results, like negative buffer pool hit ratios. Values of this nature can be hard for users to interpret.

[Note that I do care about a particular kind of hit ratio that is relevant to group buffer pools in a DB2 data sharing environment -- something I call the XI GBP read hit ratio. I described this ratio in a blog entry I posted about two months ago.]

I've mentioned that the buffer pool performance metric that matters most to me is the total read I/O rate. For a given buffer pool, that value is calculated as follows:

Total read I/O rate = (synchronous reads/second) + (asynchronous reads/second)

Getting the numbers to plug into this formula is a pretty easy thing. If you can generate a statistics long report (sometimes called a statistics detail report) for a DB2 subsystem by way of your DB2 monitor, look in such a report for the section containing buffer pool activity information. For each active pool in the environment, you'll see the synchronous read rate, and you'll also see the three asynchronous read rates, for sequential prefetch, list prefetch, and dynamic prefetch. Your monitor might already have these rates in per-second form for you, and that makes it really easy: just sum those four numbers (synchronous reads per second, sequential prefetch reads per second, list prefetch reads per second, and dynamic prefetch reads per second). If your monitor reports activity in per-minute form, just divide the four relevant values by 60 to get per-second numbers, and do the previously mentioned addition. You could also get these synchronous and asynchronous read numbers from an online display of buffer pool activity provided by your DB2 monitor, but for ongoing performance tuning, I prefer DB2 monitor-generated reports over online displays.

An alternative means of getting buffer pool activity information is to use the output of the DB2 command -DISPLAY BUFFERPOOL. If you go the command route, my suggestion is to do the following:
  1. Issue the command -DISPLAY BUFFERPOOL(ACTIVE) DETAIL on the DB2 subsystem of interest. You can disregard the output of this issuance of the command.
  2. One hour after issuing the command the first time, issue -DISPLAY BUFFERPOOL(ACTIVE) DETAIL again. The output of this second issuance of the command will capture, for each active pool in the environment, activity for the one-hour period since the first issuance of the command (the command captures activity since the buffer pools were last allocated, or since the command was last issued -- whichever occurred most recently). You can then divide the activity numbers by 3600 to get per-second figures.

In the output of the second issuance of the -DISPLAY BUFFERPOOL command you'll see a timestamp value in a DSNB409I message (for example, "INCREMENTAL STATISTICS SINCE 10:00:32 SEP 1, 2015"). Check that timestamp to verify that the value is approximately one hour since the command was issued the first time. If you see a timestamp that is, say, 25 minutes prior to the time at which the command was issued the second time, it means that someone else got in there 25 minutes ago and issued the command.

Instead of adding together four numbers, as you'd do when using information from a DB2 monitor statistics long report (or from an online monitor display of buffer pool activity), you sum five numbers found in -DISPLAY BUFFERPOOL(ACTIVE) DETAIL command output. Why five instead of four? Because in the command output, synchronous read activity is reported in two "buckets": random synchronous reads and sequential synchronous reads. Total up, then, the random synchronous reads, the sequential synchronous reads, the sequential prefetch reads, the list prefetch reads, and the dynamic prefetch reads, and divide that sum by 3600 to get the total read I/O rate per second (assuming that you issued the command once and then again an hour later, and you're using the output of the second issuance of the command). At some sites, folks have created jobs that issue the -DISPLAY BUFFERPOOL(ACTIVE) DETAIL command at a certain time of day, then issue it again an hour later, and via REXX code parse the output of the second issuance of the command, pulling out the read activity values for each pool and totaling them and dividing the total by 3600 to get the total read I/O rate per second.

What I like to see, for each buffer pool, is a total read I/O rate of less than 1000 per second. A rate of less than 100 per second for a pool is really good, unless that pool is used to "pin" objects in memory, in which case you'd like to see a read I/O rate of zero (and "pinning" pools should be defined with PGSTEAL(NONE), as described in a blog entry I wrote a few years ago). Driving read I/Os down benefits system performance in two important ways: 1) it reduces elapsed time for DB2-accessing programs, and 2) it improves the CPU efficiency of a DB2 workload (every I/O operation consumes some CPU time, so reducing that activity lowers the CPU cost of DB2 data access). The best way to lower DB2 read I/O rates is to increase the size of buffer pools that have a high level of read I/O operations. Thus, monitoring buffer pool read I/O rates points you to the pools for which the potential for performance gains through read I/O reduction is the greatest.

Less than a week ago, I was reviewing buffer pool activity statistics for a DB2 subsystem that is a key component of an organization's IT infrastructure. One of the busier pools showed a hit ratio of 99.3%. That looks pretty good, doesn't it? Seeing that, might you turn your attention to another of the subsystem's buffer pools? You could, but that would be a mistake -- this same buffer pool had a total read I/O rate of over 2800 per second -- higher than that of any other pool belonging to the subsystem. THIS was seen to be the pool most in need of attention, per the marker provided by the read I/O rate. The pool, with approximately 600,000 4K buffers (about 2.4 GB of space), could stand to be at least doubled in size so as to substantially lower the high rate of read I/O activity. Could the organization make that move? Yes, because the LPAR housing the DB2 subsystem has over 100 GB of real storage (an increasingly common configuration these days), and much of that memory is unused. With another 2.4 GB of memory utilized for buffering DB2 pages (in the form of a twice-as-large buffer pool), the z/OS LPAR's demand paging rate (available via a z/OS monitor), currently zero, would likely remain at that level. If doubling the size of the high-I/O pool took the associated read I/O rate from 2800 per second to, say, 1200 per second, I'd make the pool significantly larger again to try to get the read I/O rate below 1000 per second -- while keeping an eye on the LPAR's demand paging rate (a demand paging rate in the low single digits per second or less during busy processing periods is an indicator that a system's real storage resource is not under too much pressure).

Here's the really important point that I want to make through this blog entry: at more and more sites, you find production DB2 for z/OS subsystems running in LPARs with lots of real storage (more than 100 GB -- sometimes several hundred gigabytes). That's a good thing, if you put that big memory resource to productive use. A great way to do that is to grow DB2 buffer pools in a performance-positive way, and the total read I/O rate for the various buffer pools -- NOT the hit ratio -- is your guide for doing this in an effective manner. So, look at the right buffer pool performance numbers, and act on them in the right way.

Monday, August 31, 2015

DB2 for z/OS: Which Buffer Pools Should You Page-Fix?

A few days ago, I delivered a presentation on DB2 for z/OS performance monitoring and tuning for the DBA team at a large financial institution. At one point during the session, an attendee asked this question: "Which of our buffer pools should we page-fix?" Because that's a good question, and because it's a question to which the answer has changed over the years, I decided to make it the subject of this blog entry.

The ability to fix a buffer pool in memory, accomplished through the PGFIX(YES) option of the -ALTER BUFFERPOOL command, was introduced with DB2 for z/OS Version 8. [Note that PGFIX(YES), unlike other -ALTER BUFFERPOOL specifications, does not take effect immediately for an already-allocated buffer pool. Actualizing this change requires deallocation and reallocation of the pool -- something typically accomplished by stopping and restarting the associated DB2 subsystem.] The benefit initially ascribed to page-fixed DB2 buffer pools was cheaper (in terms of CPU cost) I/O operations. How so? Well, if a DB2 buffer is not fixed in memory (and PGFIX(NO) is the default), DB2 has to request that z/OS fix the buffer in memory (i.e., make it non-pageable) every time a table space or index page is read into or written from the buffer (and that is true for reads and writes from and to group buffer pools in a DB2 data sharing environment, as well as reads and writes from and to disk volumes). When the I/O operation is complete, the buffer is "de-fixed" (i.e., made pageable again). Why is this done? It's done so that the buffer won't be stolen out from under DB2 by z/OS in the midst of the I/O operation. One pair of page-fix and page-release operations is pretty inexpensive, but when these happen at a rate of thousands per second, you're talking about a fairly significant consumption of processor resources (and keep in mind that a single prefetch read that brings, say, 32 pages of an object into memory will require 32 page-fix and 32 page-release actions). When the pages of a buffer pool are fixed in memory from the get-go, obviously page-fix and page-release actions are not required for every movement of a page into or out of a buffer, and thus I/O operations associated with a page-fixed buffer pool are more CPU-efficient than they otherwise would be. In-DB2 CPU time can be reduced, in some cases, by several percentage points for programs accessing data in a page-fixed DB2 buffer pool.

So, the original answer to the question, "Which of our buffer pools should we page-fix?" was, "The pools with the highest I/O rates." More specifically, I'd tell people to check on the total read I/O rate for each of the buffer pools of a production DB2 subsystem, and page-fix those pools having a rate of 1000 or more read I/Os per second, and perhaps as well pools with read I/O rates in the high hundreds per second. [The read I/O rate for a buffer pool is the number of synchronous reads plus the number of prefetch reads associated with the pool, expressed as a per-second figure. I/O activity information can be obtained from a DB2 monitor, or from the output of the DB2 command -DISPLAY BUFFERPOOL(ACTIVE) DETAIL. If the command is used, it should be issued once, then issued again an hour later. The statistics in the output of the second issuance of the command can then be divided by 3600 to get per-second figures.] In DB2 Version 8 and DB2 9 environments, specifying PGFIX(YES) for a low-I/O pool wouldn't do much good.

Along came DB2 10 for z/OS, and the answer to the "Which of our buffer pools should we page-fix?" question changed. It changed because of DB2 10's support for 1 MB real storage page frames (versus traditional 4 KB frames) for page-fixed buffer pools (1 MB page frames are themselves made available in a z/OS LPAR via the LFAREA parameter in the IEASYSxx member of SYS1.PARMLIB). When a buffer pool's pages are backed by 1 MB page frames, CPU efficiency is improved because the translation of virtual storage addresses to real storage addresses in that context is a less-costly process versus the 4 KB page frame situation (this thanks to a better hit ratio in what's called the translation lookaside buffer). That being the case, in DB2 10 (and 11) environments, high-I/O buffer pools are joined, as good choices for PGFIX(YES), by high-activity pools, and some in the latter category may not be in the former category. Sure, high-I/O pools are very likely to be high-activity pools (and I think that GETPAGEs per second is a good measure of activity), but a high-activity pool could very well be a low-I/O pool. In fact, a high-activity pool could be a no-I/O pool (no read I/Os, that is): consider a pool, defined with the DB2 10-introduced PGSTEAL(NONE) option, that is used to "pin" (that is, entirely cache) one or more database objects in memory. Ideally, you would see no read I/Os for such a pool once all of the pages belonging to table spaces or indexes assigned to the pool have been read into memory from disk (and for a PGSTEAL(NONE) buffer pool, DB2 will asynchronously read all of an associated object's pages into memory when the object is first referenced after pool allocation). If a "pinning" pool has a high GETPAGE rate, PGFIX(YES) could be a nice CPU-saving move.

The most recent development affecting the answer to the "Which buffer pools should we page-fix?" question is the advent of really large memory resources for z/OS LPARs. I like to see at least 20-40 GB of memory for each engine in a production z/OS LPAR (zIIP engines included, so I would want 160-320 GB -- or more -- of memory for a production z/OS LPAR with four general-purpose and four zIIP engines), and I'm seeing more LPARs that have this Big Memory characteristic. In a z/OS system with memory to burn, you might consider page-fixing ALL of the buffer pools for a production DB2 subsystem. [Here, I'm talking about a situation in which a z/OS LPAR holds a single production DB2 subsystem, and the size of that subsystem's buffer pool configuration is less than half of the LPAR's real storage size. If there are multiple DB2 subsystems in the LPAR, I'd want the combined size of those subsystems' buffer pool configurations to be not more than half of the LPAR's real storage size.] A set-up like this is not blue sky thinking on my part. It's reality. An organization with which I've worked a good bit over the past few years has a production DB2 subsystem in a z/OS LPAR with 9 engines (5 general-purpose, 4 zIIP) and 212 GB of real storage. The DB2 subsystem has a 90 GB buffer pool configuration, and every one of the buffer pools is defined with PGFIX(YES). The demand paging rate for that z/OS LPAR (my preferred measure of the pressure on a z/OS LPAR's memory resource) is zero -- even with 90 GB of the LPAR's real storage fenced off for exclusive use for buffering DB2 data, the remaining 122 GB is more than enough for other memory users in the system. Now, even if you have a z/OS LPAR with a ginormous amount of real storage, don't just change to PGFIX(YES) for all of a production DB2 subsystem's buffer pools at one time. Do that, instead, in stages, and keep an eye on the LPAR's demand paging rate (available from a z/OS monitor). You're good if that rate is zero. If it's non-zero but low (e.g., 1 or 2 per second), you're not in a bad situation, but you might want to put the brakes on additional use of PGFIX(YES) until you can get more memory on the system.

In conclusion, the answer to the question, "Which buffer pools should we page-fix?" depends on the nature of your DB2 environment:
  • If the z/OS LPAR's demand paging rate is on the high side (high single digits or more per second), you might want to stay away from PGFIX(YES) altogether until you can add to the LPAR's real storage resource.
  • If the z/OS LPAR's demand paging rate is low but non-zero (e.g., in the vicinity of 1 or 2 per second), consider page-fixing one or more of your buffer pools that have the highest total read I/O rates.
  • If the z/OS LPAR's demand paging rate is zero and you are running DB2 10 or 11, consider page-fixing the pools that have the highest read I/O rates and the pools that have the highest GETPAGE rates (assuming that you are managing some of the LPAR's real storage resource in 1 MB page frames).
  • If you have a z/OS LPAR with memory to burn (running z/OS on what I would call a muscle machine -- rev it up, baby), and you're running DB2 10 or 11, consider page-fixing all of the buffer pools for a production DB2 subsystem. [And one more thing: if you have one or more REALLY BIG buffer pools (meaning 20 GB or more for a single pool) in a DB2 11 environment, consider backing that pool with 2 GB page frames (if the system has such -- again, this is related to the LFAREA parameter in member IEASYSxx of SYS1.PARMLIB), by way of the new FRAMESIZE option of the -ALTER BUFFERPOOL command. For a pool that's not really big, 2 GB page frames won't make much of a difference, CPU efficiency-wise.]

Hope this information proves to be helpful for you. Survey your DB2 landscape, and leverage buffer pool page-fixing as it makes sense for your environment.

Wednesday, August 12, 2015

What Does a DB2 for z/OS System Look Like When You Have Memory to Burn?

I've long been an advocate of Big Memory (meaning, lots of real storage) for DB2 for z/OS systems. For years, I was disappointed at seeing one production DB2 subsystem after another running in a z/OS LPAR with 16 GB, 20 GB, maybe 40 GB of memory -- enough real storage to run decently, but not enough to let DB2 do its thing with maximum performance. It was like seeing a thoroughbred racehorse in a small coral. You want to see that beast run free in a big pasture, to see what it can do with some real space to work in.

Lately, I have noticed things changing for the better. Memory sizes for z/OS LPARs -- especially the ones that I really care about, which are those that house production DB2 subsystems -- are finally starting to get seriously large. That's large as in more than a hundred gigabytes -- sometimes several hundred gigabytes -- for one LPAR. This change is being fueled by multiple factors:
  • The cost of z Systems memory continues to go down on a per-GB basis. The z13 servers took us another big step in that direction, and if you get enough memory when upgrading your z196 or zEC12 mainframe to a z13 -- what our sales reps call "mega memory" -- then the discount versus the list price of the memory can be downright ginormous.
  • You can get -- and use -- a lot more mainframe memory than you could before. A single z13 server can be configured with as much as 10 TB of real storage (versus a previous max of 3 TB), and up to 4 TB of z13 memory can be used for a single z/OS LPAR (versus 1 TB previously), if you're running z/OS 2.2 -- soon to be available -- or z/OS 2.1 with some PTFs.
  • Organizations are finally paying attention to z Systems memory. Mainframe engines have become really powerful (about 1000 MIPS of processing capacity per CPU), and z Systems configurations were getting a little out of balance for a while, with big-time processing power being paired with too-small real storage resources. Memory sizes are now catching up with engine capacity. People are also increasingly waking up to the fact that Big Memory is somewhat analogous to zIIP engines: it boosts performance and throughput for DB2 workloads (and for Java applications, as well) without impacting the license cost of z/OS software.

Now, some DB2 for z/OS people might be thinking, "What would I do if I had a ton of memory to work with (let's say, at least a few hundred GB in a z/OS LPAR)?" How might I exploit that resource, and what would my DB2 system look like with that resource exploited in a major way?" Well, I'm glad you asked. I think your DB2 system would look something like this:
  • Your buffer pool configuration size is really big, and the total read I/O rate for each pool is really low. Size-wise, given at least a few hundred GB of memory in the LPAR, I'd say that your buffer pool configuration would be at least 100 GB (i.e., the aggregate size of all buffer pools allocated for the DB2 subsystem would be 100 GB or more). In general, when a z/OS LPAR houses a single production DB2 subsystem, I think that a buffer pool configuration size that is 30-40% of the LPAR's real storage size is very reasonable; so, if you have 400 GB in the LPAR, a buffer pool configuration of 120-160 GB should fit very nicely. With a buffer pool configuration of that size, you might see really low read I/O rates for each pool (the read I/O rate for a buffer pools is the rate of all read I/Os for the pool, synchronous plus asynchronous, per second). In my mind, a "really low" total read I/O rate for a given pool is less than 100 per second. That said, with a really big buffer pool configuration you might use some pools for "pinning" certain objects in memory (you'd use the PGSTEAL(NONE) option in that case), and for those pools your target read I/O rate would be zero. Also with a really big buffer pool configuration, you might have one or more individual pools sized at 20 GB or more, and for pools of that size 2 GB real storage page frames (usable for page-fixed buffer pools starting with DB2 11) could deliver additional CPU savings. Finally, with a whole lot of real storage on hand, you might decide to page-fix most, and maybe even all, of your buffer pools, for maximum CPU efficiency.
  • Your packages associated with frequently-executed transactions that re-use threads, and packages associated with batch jobs that issue frequent commits, are bound with RELEASE(DEALLOCATE). For packages bound or rebound in a DB2 10 or DB2 11 system, almost all of the virtual storage associated with those packages when they are allocated to threads for execution goes above the 2 GB bar in the DB2 DBM1 address space, and it uses agent local pool storage versus the EDM pool, so you don't need to worry about running out of space in a virtual storage sense (RELEASE(DEALLOCATE), in combination with threads that persist through commits, increases virtual and real storage utilization). CICS-DB2 thread re-use can be boosted through protected entry threads, IMS-DB2 thread re-use can be increased via pseudo-WFI and/or WFI regions, and DDF thread re-use can be achieved with high-performance DBATs. For relatively simple transactions (those with relatively low in-DB2 CPU times), the combination of RELEASE(DEALLOCATE) packages and thread re-use can reduce in-DB2 CPU time by 10% or more. For batch programs that issue lots of commits, RELEASE(DEALLOCATE) has the added benefit of making sequential prefetch and index lookaside more effective. Note that DB2 11 provided relief for the problem of some bind/re-bind, DDL, and utility operations being blocked by RELEASE(DEALLOCATE) packages executed via persistent LOCAL threads. [If you need to keep RELEASE(DEALLOCATE) packages associated with DDF work from blocking database administration tasks, you can turn off high-performance DBAT functionality via the command -MODIFY DDF PKGREL(COMMIT), and then later turn it back on with the command -MODIFY DDF PKGREL(BNDOPT).]
  • The hit ratio for your DB2 dynamic statement cache is north of 90%. More memory allows for a larger dynamic statement cache, and that means more cache hits and more avoidance of full PREPAREs.
  • All of the RID list processing operations performed on your system are completed using only RID pool space. Starting with DB2 10, two important things related to RID list processing occurred: 1) the default RID pool size went way up (to 400 MB, from 8 MB), and 2) RID list processing operations that can't complete using only RID pool space (because there's not enough of that resource) will continue, using space in 32K-page work file table spaces. Your DB2 monitor (if it supports DB2 10) will show you the extent to which work file space is used for the completion of RID list processing operations that ran out of RID pool space, and if you see such RID list processing "spill-over" activity, you make your RID pool larger (which you can do because the LPAR in which the DB2 subsystem is running has a whole lot of memory). That action allows RID list processing operations to complete in the RID pool, and that boosts performance (versus having to use work file space).
  • You have a really big DB2 sort pool, and that reduces use of work file space for SQL sorts, and that improves SQL sort performance. The sort pool (sized per the value specified for the SRTPOOL parameter in ZPARM) is the amount of in-memory work space that can be used for each concurrent SQL-related sort executing in your DB2 system (so, if the SRTPOOL value is Y, and there are 10 large SQL-related sorts executing concurrently on your system, you could have 10Y of space in the DB2 DBM1 address space used for in-memory processing of these sorts). The default value of SRTPOOL is 10 MB (up from 2 MB prior to DB2 10). Because you have lots of real storage in your z/OS LPAR, you have a larger SRTPOOL value (maybe 40 MB or more), and that means more SQL-related sort work gets done in memory, and that is good for performance.
  • If you run DB2 in data sharing mode, your group buffer pools are large enough so that you have zero directory entry reclaims AND high "XI" GBP read hit ratios. Here, I'm assuming that your coupling facility LPARs, as well as your z/OS LPARs, have lots and lots of memory. Don't know what the "XI" GBP read hit ratio is? Read about it here.
  • With all these big uses of Big Memory, your z/OS LPAR's demand paging rate is still zero, or close to zero. The demand paging rate, my preferred indicator of pressure (or lack thereof) on a z/OS LPAR's real storage resource, is the rate at which pages that had been moved out of memory to auxiliary storage by z/OS (to make room for other pages to be brought into memory) are brought back into server memory on-demand. With a way big real storage resource in your z/OS LPAR, you're able to have a really large buffer pool configuration, lots of RELEASE(DEALLOCATE) packages executed via persistent threads, a big dynamic statement cache, and lots of RID pool and sort pool space, while still leaving plenty of memory for other DB2 and non-DB2 uses. With enough memory to go around, demand paging should be nil or close to it.

So, does your system look like that? If not, why not? Do you not have hundreds of gigabytes of real storage in the LPARs in which you run production DB2 subsystems? If you don't, work on getting there. If you do have a z/OS LPAR with tons of memory and a production DB2 subsystem running therein, and you've not leverage that big memory resource, get to work on that; otherwise, you're missing out on optimal DB2 performance. DB2's a thoroughbred. Give it room to run.

Friday, July 31, 2015

DB2 for z/OS Group Buffer Pools: ALLOWAUTOALT, Directory Entries, and GBP Size

So, I've seen something interesting lately in reviewing some DB2 for z/OS data sharing systems, and as is often the case in such situations, when I see something interesting I want to write about it. Thus this blog entry. Hope you find it to be useful.

What has caught my eye recently in looking over DB2 data sharing configuration information? Big group buffer pool (GBP) directory to data entry ratios. Like 29:1. 30:1. 32:1. 34:1. The default ratio is 5:1. What's going on here?

I'll provide at this point a bit of background information. In a DB2 data sharing system, group buffer pools have a couple of functions. First, they are used to keep track of pages of GBP-dependent objects that are cached in members' local buffer pools, so that the system can mark the copy of page X cached in buffer pool BP3 on member DBP1 invalid if that page is changed by a process running on member DBP2. Second, they serve as a super-high-performance cache for changed data and index pages, so that when member DBP1 sees that the copy of page X that it had cached locally in BP3 has been marked invalid, it can pull the current image of the page from GBP3 (to which the page was written by DBP2 at the time of the commit of a change to the page effected on that member) in a few microseconds. There is usually more page registration activity than GBP page write activity in a data sharing system (not every page read is changed), so the ratio of 5 directory entries (used to register the local caching of pages belonging to GBP-dependent objects) for every 1 data entry (used for caching changed pages of GBP-dependent objects) is a sensible default setting.

There will be times when table spaces and/or indexes assigned to a given buffer pool have a mix of read versus update activity that makes a downward or upward adjustment of the default directory-to-data entry ratio, to, say, 3:1 or 8:1, in the corresponding GBP a reasonable performance tuning action. A ratio greater than 20:1 seems high to me.

On seeing, just a couple of weeks ago, a very high GBP directory-to-data entry ratio in analyzing system information at a mainframe DB2 site, I asked, "Why did you set that ratio to such a high value?" The response: "We didn't do that." I asked, "Is that GBP defined with ALLOWAUTOALT(YES) in the CFRM policy for this data sharing group?" "Yes." Ah, so.

ALLOWAUTOALT can be a useful specification for a coupling facility structure, as it allows the system to dynamically adjust some characteristics of the structure to enhance operational efficiency and availability. When the structure is a GBP, ALLOWAUTOALT can alter the GBP's directory-to-data entry ratio. Why would the system do that, maybe up to a very high value? That happens in an effort by the system to avoid what are called directory entry reclaims. Those can happen when the number of "slots" into which pages of objects assigned to a given buffer pool can go (and that number of "slots" would be all of the buffers in, for example, BP4 on each member of the data sharing group, plus all of the data entries in GBP4) exceeds the number of directory entries in the associated GBP. If BP4 has 50,000 buffers on member DBP1 and another 50,000 buffers on member DBP2, and GBP4 has 10,000 data entries, that's 110,000 "slots" into which, conceivably (though not probably), 110,000 different pages could be placed. If the GBP has 50,000 directory entries, that discrepancy could result in the system having to steal, or reclaim, an in-use directory entry in order to track a page being newly brought into a member's local buffer pool. When such a directory reclaim occurs, the page that WAS being tracked using the reclaimed directory entry is preemptively marked as invalid, wherever it's locally cached. When THAT happens, the page will likely have to be read back into memory from disk when next referenced, and the page read in from disk is likely to be identical to the one that is already cached in memory, but marked invalid, on the reading member (because the invalidation was caused by a directory entry reclaim and not by a page update on another member). Thus it is that directory entry reclaims drive disk reads that would otherwise be unnecessary. That's a drag on performance, and that's why you like to avoid directory entry reclaims (directory entry reclaim activity for GBPn can be checked via the output of the DB2 command -DISPLAY GROUPBUFFERPOOL(GBPn) GDETAIL).

I've seen that with ALLOWAUTOALT in effect for a GBP, the system will often increase the directory-to-data entry ratio for a GBP in an effort to avoid directory entry reclaims. If the ratio were made too high, you could end up with a number of data entries in a GBP that is so small that you get GBP write failures due to lack of storage (also trackable via -DISPLAY GROUPBUFFERPOOL(GNPn) GDETAIL command output). You really don't want GBP write failures to occur, because they can cause pages to land on the logical page list (LPL), and that can lead to failures of programs that try to access LPL pages. The good news here is that ALLOWAUTOALT functions with that admonition in mind (figuratively speaking): I tend NOT to see any GBP write failures due to lack of storage for GBPs that have had their directory-to-data entry ratio automatically adjusted upwards through ALLOWAUTOALT(YES).

That doesn't mean that a significant upwards adjustment of a GBP's directory-to-data entry ratio is completely without penalty, even when no GBP write failures occur. Because more directory entries means fewer data entries (absent an increase in the size of a GBP), a directory-to-data entry ratio increase will result in shorter GBP residency time for changed pages written to the GBP at commit time. That, in turn, leads to a degradation in the one GBP read hit ratio that matters to me: the hit ratio for GBP reads due to buffer invalidation (which I also call the "XI" GBP read hit ratio, for a reason that will be apparent). This read hit ratio can be calculated for a member of a DB2 data sharing group using numbers found in a DB2 monitor statistics long report, or an online monitor display of GBP activity (you can also find information on GBP reads due to buffer invalidations in the output of the DB2 command -DISPLAY GROUPBUFFERPOOL(GBPn) MDETAIL). In an IBM OMEGAMON for DB2 statistics long report, the numbers would look like this:

GROUP BP3                      QUANTITY
----------------------------   --------
GROUP BP R/W RATIO (%)            41.23


The overall GBP hit ratio (highlighted in blue above) is not important to me. Neither is the read hit ratio for GBP synchronous reads due to page not found, which one can calculate using the numbers highlighted in green (it's usually pure luck when a page not found in a local buffer pool is found in the corresponding GBP). The "XI" read hit ratio, calculated using the numbers highlighted in red in the OMEGAMON report snippet, matters. These figures indicate the result of synchronous GBP reads due to buffer invalidation (XI, for short). If there are no directory entry reclaims for a GBP, you KNOW that any local pool buffer invalidations occurred because of the updating of a page of a GBP-dependent page set or partition. Such an updated page would HAVE to be written by DBP2 (if that was the member on which the page-changing action occurred) to the appropriate GBP, and your hope is that the page is still in the GBP when member DBP1 goes there looking for it (and DBP1 will look for the page in the GBP when its locally cached copy of the page, invalidated via member DBP2's update action, is referenced). The longer a changed page stays in a GBP, the better the odds that it will be found there (versus having to be read from disk, which takes MUCH more time than a GBP read) when it is next referenced by a member with an old (since updated) copy of the page in its local buffer pool. More GBP data entries means longer GBP page residency time, and thus more hits for GBP reads due to buffer invalidation. Fewer GBP data entries means shorter GBP page residency times, and a lower "XI" GBP read hit ratio. The numbers in the OMEGAMON report information shown above (displaying activity in a real-world system) provide this "XI" GBP read hit ratio:

39,738.37 / (39,738.37 + 451.85) = 98.87%

That's a very good "XI" GBP read hit ratio. I often see "XI" GBP read hit ratios that are 90% or better (when GBPs are adequately sized), and I don't think that it's unreasonable for you to shoot for a similarly high ratio.

Here, then, is what I'd recommend: check the directory-to-data entry ratio for your production DB2 GBPs. Is it really high for a GBP (I'd say that 20 or more to one is quite high)? If so, how does the "XI" read hit ratio for that GBP look? If that read hit ratio is low (I'd say that below 70% is low, and below 40% is quite low), see if you have enough memory available in the coupling facility LPAR to significantly enlarge the GBP. If you can accomplish that GBP enlargement, see if that allows for a reduction in the directory-to-data entry ratio for the GBP that still provides enough directory entries to avoid directory entry reclaims (auto-alter can lower the ratio dynamically -- if you manually change a GBP's directory-to-data entry ratio via an -ALTER BUFFERPOOL command, that change will take effect the next time the GBP is allocated). Finally, see if those two actions (bigger GBP, lower directory-to-data entry ratio) yields enough additional data entries to substantially improve the "XI" read hit ratio for the GBP. If you do see that read hit ratio improve, you've enhanced the performance of your DB2 data sharing system.

ALLOWAUTOALT is useful in that it imbues coupling facility structures with some added flexibility and resiliency; however, in the case of GBPs, at least, ALLOWAUTOALT should not be thought of as a "set it and forget it" thing. You still need to monitor GBP configuration settings, and understand that when you see a really high directory-to-data entry ratio that was set through ALLOWAUTOALT action, that's the system trying to make the best use of a GBP that is probably smaller than it ought to be. Making good use of too little GBP space is good. Providing more GBP space, in that situation, is better still.

Thursday, July 9, 2015

How Will You Use DB2 for z/OS Global Variables?

About seven years ago, in an entry posted to the blog I maintained while working as an independent DB2 consultant (prior to rejoining IBM), I wrote of the evolution of DB2 for z/OS SQL from merely a data manipulation language to a programming language. That evolution continues with DB2 Version 11, in part through the introduction in that DB2 release of global variable functionality. If you want to know more about this new application enabling feature of DB2 for z/OS, read on.

As SQL (in a DB2 for z/OS context) has become an ever-richer language, the scenarios in which lines of code written in some other language (e.g., COBOL or C or Java) have to be interspersed with SQL statements to accomplish some objective have decreased in number. A few more such scenarios were eliminated thanks to global variables. In particular, consider the situation in which you want to use one query to obtain a value from a DB2 table, which you then plug into a predicate of a second query. You could certainly write a simple routine (in COBOL or some other programming language) that declares a host variable, populates it with a value retrieved by a query of a table, and then issues a second query with the host variable referenced in a predicate. The pseudo-code would look something like this:


Now, you might say, "I don't need COBOL (or Java or whatever) to do that. I can write a native SQL procedure that gets this done using nothing but SQL statements." True enough, but there are scenarios in which use of a SQL PL routine would not be appropriate -- or maybe not even possible. Suppose, for example, that you're working with a tool (Data Studio, perhaps, or the SPUFI interface that comes with DB2) through which one interactively issues SQL statements for execution by a DB2 subsystem. What then? That's where a DB2 11 global variable (if you're in a DB2 11 new-function mode environment) comes in handy. Here are some statements that I executed on one of our IBM test DB2 systems, by way of SPUFI:



FROM SYSIBM.SYSCOLUMNS                        
WHERE TBNAME = ROBSBLOG                       

And the result of that SELECT is:
NAME                   COLNO                  
PRODUCT_ID                 1                  
MONTH                      2                  
UNIT_SALES                 3            

Easy breezy. And, no coding required beyond the SQL statements.

Now, there are a few informational items of which you should be aware regarding the use of DB2 global variables:

The value of a global variable is associated with the thread of an application process; thus, if the value of a global variable is set by a transaction program, the global variable's value will be discarded when the transaction ends and its thread is deallocated. If a global variable's value is set in a batch job, the variable's value will persist across COMMITs issued by the batch job (though it can of course be subsequently changed by the batch job), and will be discarded when the job completes and its thread is deallocated. In the case of my SPUFI exercise mentioned above, to make those SQL statements work as desired I had to set AUTOCOMMIT in my SPUFI session to NO. Had I not done that then the auto-COMMIT after my SET statement (referring to the statement that assigned the value 'PRODUCT_SALES' to the global variable) would have resulted in the assigned value being discarded, and then the statement


would have returned the value 'HELLO' (the default value specified in my CREATE VARIABLE statement).

A DB2 global variable, once created, can be used by any authorization ID or role with the requisite privilege. An ID or role can be granted the READ or WRITE privilege (or both) for a given global variable. The owner of a global variable has, implicitly, all privileges on that variable.

Though a created global variable can be used by any authorization ID or role with the requisite privilege, a global variable's value does not extend beyond a given session (i.e., a given thread). So, if I create a global variable called VAR_X, and I set that variable to the value 'ABC' (assuming that it's a character string variable), another application process (with the right privilege) can use VAR_X, but the value 'ABC' that I assigned to VAR_X will not be picked up in any other session that references VAR_X; so, a global variable itself is globally available within a DB2 subsystem, but a value assigned to the global variable within a session does not extend to any other session.

Some global variables come with DB2 11 for z/OS. Besides global variables that you might create and use, there are several that are already there in a DB2 11 system. These built-in global variables are:


The first of these provides the IP address associated with a network-connected DB2 client application, or of a requester DB2 subsystem that establishes a connection to a remote DB2 host (for a local-to-DB2 application, the value of SYSIBM.CLIENT_IPADDR will be NULL). The other two built-in global variables are for use with the DB2-managed archiving (aka transparent archiving) feature of DB2 11.

There you have it. Yet another in a long line of DB2 for z/OS enhancements that have truly made SQL a programming language. When you get to DB2 11 in new-function mode (or if you're already there), give global variables a try-out, and think about how you might put them to work at your site.

Thursday, June 18, 2015

Are You Using DB2 for z/OS Real-Time Statistics? DB2 Is

DB2 for z/OS is making increased use of the information that it keeps in the real-time statistics tables in the DB2 catalog. You should be doing likewise. Read on to learn more.

For a long time, the only statistics to be found in DB2 "system" tables were those generated through execution of the RUNSTATS utility. Starting with DB2 Version 7 (around the 2000 time frame), DB2 could automatically and periodically (every 30 minutes, by default) write a variety of statistics about table spaces and index spaces to a pair of tables (one for table space statistics, and one for index space statistics), but only if you created those tables. With DB2 9 in new-function mode, the real-time statistics information was moved from user-created tables to a couple of new catalog tables, SYSIBM.SYSTABLESPACESTATS and SYSIBM.INDEXSPACESTATS (I wrote about that change a few years ago, in an entry posted to the blog that I maintained while working as an independent DB2 consultant, prior to re-joining IBM). In those days, real-time statistics were mainly used by DBAs, via vendor tools or by way of user-written programs (the latter often written in REXX, and sometimes involving invocation of the DB2-supplied stored procedure called DSNACCOX), to make more intelligent decisions regarding the execution of utilities such as REORG and COPY. That's still a very good use of DB2 real-time statistics, but what has interested me more of late -- and what I'm writing about here -- is DB2's increasing use of these statistics in support of new autonomic functions, and a number of newer columns in SYSIBM.SYSTABLESPACESTATS and SYSIBM.SYSINDEXSPACESTATS that serve useful purposes beyond smarter utility management. I'll cover these topics now, in that order.

Increased use of real-time statistics by DB2 itself

Rather than try to provide an exhaustive list of each and every usage by DB2 itself of its own real-time statistics, I'll highlight a few that are particularly interesting to me:
  • DB2-managed allocation of utility sort work data sets -- This was a terrific enhancement, in my opinion, and I blogged about it back in 2011. By way of this new feature, built into the DB2 10 base code and retrofitted to DB2 Versions 8 and 9 via PTFs, DB2 utilities such as REORG, LOAD, REBUILD INDEX, and RUNSTATS could dynamically, automatically, and intelligently (using real-time statistics) manage the allocation of the sort work data sets required for their execution. At a growing number of DB2 for z/OS sites, utility JCL and DB2 ZPARM parameter values have been changed to put the DB2 utilities in charge of sort work data set allocation, and every such site that has communicated with me about that change has reported positive results in terms of both utility job reliability (a higher percentage of utility jobs successfully run to completion) and efficiency. That said, at too many sites utility sort work data sets are still being allocated the old way. If that's true where you work, read that aforementioned blog entry and make the changes described therein. It would be a good investment of your time.
  • Automatic calculation by REORG of the appropriate fixed hash space for hash-organized table spaces. DB2 10 for z/OS introduced hash-organized tables -- a row-organizing mechanism that enables super-efficient data access for a query that issues a singleton SELECT containing an "equals" predicate that references a particular unique key of the target table. For hash organization of data to be effective, a table space's fixed hash area must be of a sufficient size; however, you don't want to overdo the sizing of this area, because that would waste space (on disk and in memory). A good way to get a "just right" fixed hash area for a hash-organized table space is to let DB2 figure that out, and that's just what the REORG utility will do -- using real-time statistics -- when you run REORG with AUTOESTSPACE YES (the default) for the table space in question.
  • DB2-calculated space to accommodate table row length increases caused by UPDATEs. Starting with DB2 11 for z/OS, you can set aside space in a table space's pages that will be reserved exclusively for the accommodation of row length-increasing UPDATE statements (previously, space in a table space's pages could be set aside only for accommodation of INSERTs). If in a CREATE TABLESPACE or ALTER TABLESPACE statement you specify PCTFREE FOR UPDATE -1, for the associated table space the initial amount of space set aside in each page (following a REORG or a LOAD REPLACE utility operation) for length-increasing UPDATEs will be 5%, and that amount will subsequently be adjusted by DB2 (for subsequent REORG or LOAD REPLACE jobs) based on real-time statistics information (setting the new ZPARM parameter PCTFREE_UPD to AUTO will make -1 the default value of PCTFREE FOR UPDATE in CREATE TABLESPACE and ALTER TABLESPACE statements).
  • Identification of indexes in need of pseudo-deleted index entry clean-up. Starting with DB2 11 for z/OS, DB2 gets much better at physically deleting index entries that had been marked as pseudo-deleted in the course of row-delete operations. To determine which indexes are most in need of pseudo-deleted index entry clean-up (done by a background process), DB2 periodically checks relevant values in the SYSIBM.SYSINDEXSPACESTATS catalog table. You can read more about this in section 4.2 of the IBM redbook titled, "IBM DB2 11 for z/OS Performance Topics" (downloadable at http://www.redbooks.ibm.com/abstracts/sg248222.html?Open).
As DB2 itself is making ever-greater use of real-time statistics, so should you -- and not just to better manage utility operations (though that, as mentioned, continues to be a good use of real-time stats). Below I'll call attention to some of my favorite columns in the two real-time statistics table spaces.

Some of my favorite SYSTABLESPACESTATS columns
  • DATASIZE -- The number of bytes occupied by rows in a table space.
  • REORGCLUSTERSENS -- Why REORG a table space to reestablish clustering sequence, if that isn't important for the table space? This column provides information to let you make that determination.
  • UPDATESIZE -- This column (new with DB2 11) lets you know of the extent to which a table space is growing (or shrinking) as a result of length-increasing (or length-decreasing) UPDATE operations.
  • LASTDATACHANGE -- When was data in the table space last modified?
  • REORGNEARINDREF and REORGFARINDREF -- If you set aside space in a table space's pages for length-increasing UPDATE operations (a new option, as previously noted, with DB2 11), did that action reduce the occurrence of indirect references that are the result of row relocations caused by update-related row-length increases?
Some of my favorite SYSINDEXSPACESTATS columns
  • LASTUSED -- A great way to identify indexes that aren't doing you any good, and could therefore be dropped (following some due diligence on your part) to free up disk space and to reduce the CPU cost of insert and delete (and some update) and utility operations.
  • REORGNUMLEVELS -- Are levels being added to, or removed from, any of your indexes?
  • DRIVETYPE -- Same information as the identically-named column in SYSTABLESPACESTATS, but for index spaces.
  • REORGLEAFFAR -- Are index page splits causing index leaf pages to be located far from where they optimally should be in an index? If so, would a larger page size make sense for this index?

I encourage you to look over all of the columns of SYSTABLESPACESTATS and SYSINDEXSPACESTATS. I'm pretty sure that you'll find some that will be helpful to you. Take advantage of this information. DB2 is.