Sunday, July 29, 2012

DB2 10 (and 9) for z/OS: I Need 128 GB of WHAT?

The other day, I got a note from a mainframe DB2 DBA who'd seen a system requirement for DB2 10 that had him concerned. This DBA, whose company was preparing for the migration of their DB2 9 subsystems to DB2 10, had read the following in the text of IBM APAR II14564 (an "info" APAR that documents some DB2 9-to-10 migration and fallback considerations):

REAL STORAGE CONSIDERATION
Ensure that you have defined a high enough value for HVSHARE... to satisfy all requests on this LPAR. DB2 Version 10 requires 128GB of contiguous 64-bit shared private storage above the 2GB bar for each DB2 subsystem... If not enough contiguous storage for... HVSHARE exists, the DB2 [subsystem] will abend with S0DC2 RC8A001721 at the first starting.

128 GB is a lot of space. That and the "for each subsystem" bit really jumped out at the DBA. As he pointed out in his note, "We have four subsystems on an LPAR and that would be half a terabyte."

I'll tell anyone who's concerned about this what I told the DBA who contacted me: RELAX. Most important point: this is VIRTUAL storage we're talking about, not real storage. Second most important point: this HVSHARE value (a parameter in the IEASYSxx member of PARMLIB) establishes the BOUNDARY of an area of z/OS virtual storage -- it demarcates the part of z/OS virtual storage that is available for use by subsystems in the form of something called shared private storage. How can something be both shared and private? I'll get to that in a moment.
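
By the way, HVSHARE itself is just a one-line specification in the IEASYSxx member. Here's a sketch of what an explicit setting might look like (the 510 TB value shown happens to match the system default, and is here purely for illustration; your installation may code a different value or omit the parameter altogether):

   HVSHARE=510T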

The 128 GB of HVSHARE space required by DB2 is, more than likely, just a small portion of the total HVSHARE space defined on your system: the DEFAULT value of HVSHARE is 510 TERABYTES. How can any system support 510 TB of HVSHARE virtual storage? Easy: shared private storage does not have to be backed by real storage when it's allocated -- it only has to be backed by real storage as it's USED. The amount of HVSHARE space used by DB2 and other subsystems will probably be a small fraction of what's allocated on the z/OS LPAR. You might see a few hundred megabytes, or maybe a gigabyte or two, of shared private storage used by a production DB2 subsystem -- nothing remotely close to 128 GB (DB2 10 uses more shared private storage than does DB2 9, for reasons noted a little further on in this entry). If you have the current version of IBM's Tivoli OMEGAMON XE for DB2 Performance Expert on z/OS monitor, you can see, in a statistics report or online display, the amount of shared private storage used by a DB2 subsystem. Are you concerned that the default of 510 TB of z/OS virtual storage available for use as shared private storage will leave little virtual storage space for other uses? No need to worry about that: 64-bit addressing enables access to 16 million terabytes of virtual storage. Setting aside 510 of those 16 million terabytes for shared private storage is not a big deal.
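
If you'd like to see how the HVSHARE value was established on your LPAR, and how much of that range has actually been allocated, the z/OS DISPLAY VIRTSTOR operator command will tell you (take this as a sketch; the exact lines in the response message vary a bit from one z/OS level to another):

   D VIRTSTOR,HVSHARE

The response indicates where the HVSHARE value came from (an IEASYSxx specification or the default), the total size of the high virtual shared area, and how much of that area has been allocated.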

Now, how does DB2 use shared private storage? That actually depends on the release of DB2 to which you're referring. You see, DB2 9 was the first DB2 for z/OS release to exploit shared private storage (a technology delivered with z/OS 1.5 in the mid-2000s). Here's the story: a lot of data is transferred between certain DB2 address spaces (for example, between the DDF and DBM1 address spaces, and between utility address spaces and DBM1). If said data can be exchanged by way of an area of virtual storage that is part of both of the participating address spaces, the CPU cost of data transfer operations is reduced. An option here would be to use common storage, which is part of every address space. One problem with that approach is that making common storage bigger to accommodate large inter-address space exchanges of data reduces the private portion of virtual storage for all address spaces. Why make the private part of every address space smaller when the data exchanges we're talking about involve just a few DB2 address spaces? Shared private storage offered a better solution: a section of virtual storage in which shared memory objects could be created -- these being areas of virtual storage that can be shared by address spaces, but only by address spaces that register to use a particular shared memory object (that's what makes the virtual storage both shared and private). Beginning with DB2 9, the DDF, DBM1, and MSTR address spaces (and, as mentioned, address spaces in which some DB2 utilities execute) are registered to use the DB2 shared memory object, which is created at DB2 start-up time in the portion of virtual storage reserved for this purpose by way of the HVSHARE specification.

So, if DB2 9 is the release that first exploited shared private storage, how is it that DB2 10 has more people talking about this z/OS resource? The answer, in a word, is threads. One of the big benefits provided by DB2 10 for z/OS is constraint relief as it pertains to virtual storage below the 2 GB "bar." This big free-up of space was achieved by moving almost all thread- and stack-related storage above the bar ("stack" storage is essentially working storage). Most of this thread and stack storage is of the shared private variety; so the amount of shared private storage used in a DB2 10 environment will depend largely on the number of threads that are concurrently active in the system. I'll point out here that you need to rebind packages in a DB2 10 environment to get this thread-related virtual storage constraint relief. When you do that, space used for package tables (aka PTs -- the sections of packages copied out of the skeleton package table for each package executed by a particular thread) is moved not only above the bar, but out of the EDM pool (the skeleton package table remains in the EDM pool). Thus is delivered the added benefit of eliminating contention caused by the serialization of updates to EDM pool control blocks used in the execution of packages allocated to threads (this contention in previous DB2 releases was associated with latch class 24).
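
As for that rebind: nothing exotic is required to get the storage relief; a plain REBIND of the packages does it, with no bind option changes needed. A sketch, using a hypothetical collection name (COLLID1):

   REBIND PACKAGE(COLLID1.*)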

There you have it. Yes, DB2 10 requires (as does DB2 9) 128 GB of HVSHARE space, so that the shared memory object used by DB2 address spaces can be created when DB2 starts up. No, you don't need to get worked up about this, because 1) the amount of shared private storage used by DB2 is going to be WAY less than 128 GB, and 2) a z/OS LPAR, by default, will have 510 terabytes of HVSHARE space. There are plenty of things in life you can worry about. This shouldn't be one of them.

Thursday, July 12, 2012

DB2 for z/OS: What's With DDF CPU Time?

Recently I got a question from a mainframer who was seeing rather high CPU times for the DB2 DDF address space. DDF, of course, is the distributed data facility -- the DB2 component that handles communications with client systems (usually application servers) that send requests to the DB2 for z/OS server using the DRDA protocol (DRDA being Distributed Relational Database Architecture). Anyway, the questioner wanted to know what he and his colleagues could do to reduce DDF CPU consumption.

This query prompted me to write a blog entry about DDF CPU utilization, because there remains today -- more than 20 years after DDF was introduced with DB2 V2.2 -- a good bit of misunderstanding concerning this subject. I believe that the confusion comes from people looking at the DDF address space as they do the other DB2 "system" address spaces (the database services address space, also known as DBM1; the system services address space, or MSTR; and the internal resource lock manager, or IRLM). Generally speaking, these address spaces show very little in the way of CPU consumption (this is particularly true of the IRLM and MSTR address spaces). Here is some information from a real-world mainframe DB2 environment, as seen in a DB2 monitor statistics detail report covering a busy two-hour time period:

                       TOTAL TIME
                     ------------
SYSTEM SERVICES       2:16.529524
DATABASE SERVICES    55:38.969257
IRLM                    21.249774

So, on this system (a big one, with multiple engines, running a large DB2 workload), the IRLM address space consumed 21 seconds of CPU time and the DB2 system services address space consumed just over 2 minutes and 16 seconds of CPU time. The DB2 database services address space consumed considerably more CPU time than these other two (a little more than 55 minutes' worth), but that's not unusual when there's a lot of prefetch and database write I/O activity on a system (true in this case), as the CPU time for such I/O operations is charged to DBM1. That the CPU consumption of these address spaces is relatively low on a system with a great deal of DB2 data access activity is not at all unusual: overall CPU consumption for a DB2 workload is associated primarily with SQL statement execution, and the vast majority of that time is charged to the address spaces (e.g., CICS regions and batch initiators) through which data access requests get to DB2.

Now, during the same two-hour period on the same system on which the numbers shown above were seen, CPU consumption of the DB2 DDF address space was as follows:

                       TOTAL TIME
                   --------------
DDF ADDRESS SPACE  3:16:34.642514

"Three hours and sixteen minutes?!? What's with that? I thought that the DB2 address spaces are supposed to show smaller numbers for CPU consumption! Is there a problem in my system? What should I do?!?"

First of all, you should relax. You can't equate DDF CPU consumption with that of the other DB2 "system" address spaces, because it really is a different animal. IRLM, MSTR, and DBM1 are about data access. DDF is about transaction management (specifically, client-server transactions) -- and that understanding should point you to the reason for sometimes-high DDF CPU utilization: it's driven by execution of SQL statements that get to DB2 through the DDF address space (typically, SQL statements issued from programs running in network-attached application servers, or from DB2 stored procedures called by such programs). Recall that I mentioned, a couple of paragraphs up, that the vast majority of the CPU time associated with SQL statement execution is charged to the "come from" address space. If the statement comes from a CICS transaction program, that time is charged to the CICS region in which the program executed. If the statement comes from a DRDA requester, the CPU time is going to be charged to the DDF address space (and stored procedure usage doesn't change this picture: CPU time consumed by a stored procedure is charged to the address space through which the stored procedure CALL got to DB2). So, in a system in which there is a lot of client-server DB2 activity, you'll see higher numbers for DDF CPU consumption (at plenty of sites, DDF-related activity is the fastest-growing component of the overall DB2 for z/OS workload, and at more and more locations it is already the largest component of the DB2 workload).

Back to the DB2 monitor statistics detail report referenced earlier: the breakdown of address space CPU time tells the story with respect to DDF CPU consumption. Here's the additional information from the DDF address space line that I didn't show before (I'm showing times down to the millisecond level instead of the microsecond level to save space):

              TCB TIME    PREEMPT SRB  NONPREEMPT SRB   TOTAL TIME
              --------    -----------  --------------  -----------
DDF           1:02.659    3:13:21.699        2:10.283  3:16:34.642

You can see that almost all of the DDF address space CPU time (about 98%) is in the "preemptible SRB" category. This is "user" CPU time, associated with the execution of SQL statements issued by DRDA requesters (work tied to database access threads -- aka DBATs -- is represented by preemptible SRBs, which you can think of as "dispatchable" SRBs). The other 2% of the DDF CPU time is "system" time, related to work done under DDF's own tasks on behalf of the aforementioned "user" tasks. So, DDF is in fact a very CPU-efficient manager of DB2-accessing client-server transactions.

Now, what would you do to reduce the CPU consumption of a DDF address space? You'd take actions to reduce the execution cost of SQL statements that get to DB2 through DDF. Those actions might include the following:
  • Convert external stored procedures to native SQL procedures (a sketch of one appears after this list). This is actually more about reducing the monetary cost of the CPU cycles consumed by a DB2 client-server workload than about reducing the number of cycles consumed. When a native SQL procedure (introduced with DB2 9 in new-function mode, and available in a DB2 10 new-function mode environment if you're migrating to DB2 10 from DB2 V8) is executed through DDF, around 60% of the associated CPU time will be zIIP eligible (the same is not true for external stored procedures). This zIIP offload will reduce the cost of your DB2 client-server workload, because zIIP MIPS are less expensive than general-purpose CPU MIPS.
  • Take greater advantage of DB2 dynamic statement caching. Often, a lot of dynamic SQL statements are executed through DDF. If you are getting a low hit ratio in your dynamic statement cache (check this via your DB2 monitor), consider making the dynamic statement cache larger (done via a ZPARM change). This assumes that you have enough real storage on your system to back a larger dynamic statement cache.
  • Go to DB2 10 (if you're not already there) and take advantage of high-performance DBATs. This is done by binding packages executed via database access threads with RELEASE(DEALLOCATE) (see the example following this list). I blogged on this topic last year.
  • Improve SQL statement CPU efficiency by tuning your DB2 buffer pool configuration. This, of course, would beneficially affect all SQL statements -- not just those executed via DDF. I have "part 1" and "part 2" blog entries on this subject.
  • Turn dynamic SQL statements executed via DDF into static SQL statements. We have a nice tool, IBM Optim pureQuery Runtime, that can help you to do this without having to change client-side program code. pureQuery Runtime can also be used to "parameterize" dynamic SQL statements that were coded with references to literal values in predicates -- something that will boost the effectiveness of the DB2 dynamic statement cache (there's an illustration of this after the list). I recently wrote about pureQuery Runtime on my DB2 for z/OS tools blog.
  • Tune your higher-cost DDF-related SQL statements. We have a tool that can help you to do this in a proactive, productive way. It's called the IBM InfoSphere Optim Query Workload Tuner. You can find out more about this product in an entry I posted to my DB2 for z/OS tools blog just last week.
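
Regarding the first item in that list, here's a minimal sketch of a native SQL procedure (the procedure, table, and column names are made up). The key point is what's missing: there is no FENCED or EXTERNAL clause and no external program object; the SQL PL body is the procedure, and it executes in the DBM1 address space.

   -- No FENCED or EXTERNAL clause: in new-function mode this creates a
   -- native SQL procedure (the SQL PL body below IS the procedure)
   CREATE PROCEDURE PRODSCHEMA.UPDATE_BALANCE
          (IN P_ACCT_ID INTEGER,
           IN P_AMOUNT  DECIMAL(11,2))
     LANGUAGE SQL
   BEGIN
     UPDATE ACCOUNTS
        SET BALANCE = BALANCE + P_AMOUNT
      WHERE ACCT_ID = P_ACCT_ID;
   END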
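
For the high-performance DBAT item, two actions are involved (the collection and package names below are hypothetical): bind or rebind the packages executed via database access threads with RELEASE(DEALLOCATE), and make sure DDF is set to honor that bind option for DBATs, which in DB2 10 is controlled by the PKGREL option of the -MODIFY DDF command. A sketch:

   REBIND PACKAGE(COLLID1.PKGNAME1) RELEASE(DEALLOCATE)

   -MODIFY DDF PKGREL(BNDOPT)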
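
And on the dynamic statement cache point: the value of parameterization lies in cache matching. With a literal in a predicate, every distinct value produces a distinct statement string, so repeated executions keep driving full PREPAREs; with a parameter marker, one cached prepared statement can be reused. A simple illustration (table and column names are made up):

   SELECT ORDER_ID, ORDER_TOTAL      -- literal in the predicate: each distinct
     FROM ORDERS                     -- customer number is a different statement
    WHERE CUST_ID = 1234567          -- string, so cache hits are unlikely

   SELECT ORDER_ID, ORDER_TOTAL      -- parameter marker: one cached prepared
     FROM ORDERS                     -- statement can be reused for executions
    WHERE CUST_ID = ?                -- with different customer numbers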

Now you know what's behind the DDF address space CPU times that you're seeing on your system, and what you can do to reduce DDF address space CPU consumption. I hope that this information will be useful to you.