Tuesday, July 29, 2014

Isolating DB2 for z/OS Accounting Data at the WebSphere Application Level

Back in the 1990s, almost all of the DB2 for z/OS-accessing transactional applications I encountered were of the CICS variety. Often, a given DB2 subsystem served as the database manager for multiple CICS-based applications. In such cases, isolation of DB2 accounting information (i.e., database activity figures obtained from DB2 accounting trace records) at the application level was not difficult. DB2 accounting trace records contain multiple identifier fields, and several of these are suitable for separating CICS-DB2 data access information along application lines. For example, one could set things up so that CICS-DB2 application A uses DB2 plan X, while application B uses plan Y. Thereafter, have your DB2 monitor generate an accounting long report (depending on the monitor used, this may be called an accounting detail report) with data aggregated (i.e., ordered or grouped -- again, the term will depend on the DB2 monitor used) at the DB2 plan level, and there's your application-specific view of database activity. You could also have application A run in CICS region X, and application B in region Y, and aggregate information in a DB2 monitor-generated accounting report by region ID (the corresponding DB2 accounting identifier might be referred to as connection identifier by your monitor). It's also easy to separate DB2 accounting information by CICS transaction name (your monitor might call this identifier a correlation name) and by DB2 authorization ID (so, CICS transactions associated with application A might use the DB2 authorization ID X, while transactions associated with application B use the DB2 authorization ID Y). These identifiers are generally usable with online DB2 monitor displays of thread activity as well as with batch-generated accounting reports, and they are as useful for CICS-DB2 applications today as they were 20 years ago.

While CICS-based applications are still a major component of the overall DB2 workload at many sites, lots of organizations have seen access to DB2 from applications running in WebSphere Application Server (WAS) increase at a rapid clip (for some companies, the large majority of DB2 for z/OS-accessing transactions are WAS-based). As is true of organizations that use CICS with DB2, organizations that use WAS with DB2 like to get an application-level view of database activity. How that can be accomplished is the subject of this blog entry.

First, let's look at the simplest situation -- one that's quite common: different WAS-based applications connect to a given DB2 for z/OS system using different authorization IDs. A master data management application might use DB2 authorization ID X, while a customer care application uses ID Y. In that case (as previously mentioned for CICS applications), you have your DB2 monitor generate an accounting long report with data ordered by primary authorization ID, and you're set. Easy. [You can further have your DB2 monitor, if you want, include or exclude data for an accounting report by an identifier such as authorization ID. Additionally, as pointed out already, authorization ID and other DB2 accounting identifiers are generally usable for differentiating data in online monitor displays as well as in monitor-generated reports.]

Sometimes, authorization ID isn't a granular-enough identifier for application-level isolation of DB2 accounting information. That is true when several WAS-based applications connect to a DB2 for z/OS system using the same authorization ID. This situation is not as unusual as you might suppose. More than a few organizations go this route as a means of simplifying security administration. How do these folks get application-specific DB2 accounting information? Let's consider some DB2 accounting data identifier possibilities:
  • Plan name -- This probably won't do it for you. If you're using the type 4 JDBC driver for your application (the one that is used for programs that access DB2 data via the DB2 for z/OS distributed data facility, aka DDF), all applications will be associated with the same DB2 plan: DISTSERV. If you use the type 2 JDBC driver (an option when WAS is running in the same z/OS LPAR as the target DB2 system), it is possible to provide a differentiating plan name for an application, but in my experience people tend to go with the default plan name of ?RRSAF for all applications that use the type 2 JDBC driver. [Note that an application running in WAS for z/OS and accessing a DB2 subsystem on the same z/OS LPAR can use either the type 2 or type 4 JDBC driver.]
  • Requesting location -- A possibility, yes, but not if your organization runs -- as plenty do -- multiple applications in one instance of WAS. Multi-application WAS instances are particularly common in a z/OS environment, because a) z/OS LPARs often have a very large amount of processing capacity, and b) the sophisticated workload management capabilities of z/OS facilitate the hosting of multiple applications in one LPAR.
  • Main DB2 package -- Probably not granular enough. WAS-based applications typically issue SQL statements in the form of JDBC calls, and when that's the case the main DB2 package for all applications will be one associated with the JDBC driver.
  • Transaction name -- More than likely, too granular.
  • End user ID -- Also too granular, and perhaps not a differentiator if the same end user utilizes several applications.

At this point you might be thinking, "So, what's left?" I'll tell you what identifier fits the bill for numerous organizations that use WAS together with DB2 for z/OS: workstation name. Truth be told, this did not initially occur to me when I pondered the database activity differentiation question in the context of WAS-based applications that use the same DB2 authorization ID. I got hung up on the term "workstation," and thought of that as being an identifier that would be tied somehow to an actual physical device. Silly me. As succinctly and plainly explained by a WAS-guru colleague of mine, "it's just a string" -- a label. And, it's a string that can easily be set for a WAS-based application, through several means:
  • Via the WAS administration console GUI (in which case it would be an extended property of an application's data source).
  • Via the IBM Data Server Driver for JDBC (the driver provides a JAR file that contains the DB2Connection class, and that class supports the Java API setDB2ClientWorkstation).
  • Via application code, for JDBC 4.0 and above (you would use the Java API setClientInfo).

[Note that with regard to the second and third options in the list above, option three (the setClientInfo Java API) is recommended over option two (the Data Server Driver method), because setDB2ClientWorkstation was deprecated with JDBC 4.0.]
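As a sketch of what setting the workstation name via connection properties could look like: the property name clientWorkstationName is the IBM Data Server Driver connection property that backs the APIs discussed above, but verify the exact name against your driver level's documentation, and note that the user, password, and application label values below are hypothetical.

```java
import java.util.Properties;

public class WorkstationName {
    // Build connection properties for a data source or DriverManager connection
    // so that every DB2 thread created through them carries an
    // application-identifying workstation name. clientWorkstationName is
    // assumed here to be the IBM Data Server Driver property behind
    // setDB2ClientWorkstation and the JDBC 4.0 setClientInfo API.
    public static Properties connectionProperties(String user, String password,
                                                  String appLabel) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("clientWorkstationName", appLabel);
        return props;
    }
}
```

With properties built this way for each application (one label per application), accounting data can be separated by workstation name even when all applications share one authorization ID.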

Once the workstation name has been set for your WAS-based applications, you can direct your DB2 monitor to generate accounting reports with data ordered by workstation name, and voila -- there's your application-specific view of database activity (and workstation name should also show up in your DB2 monitor's online displays of thread activity).

Want more information on this topic? You can find plenty -- with examples -- in an IBM redbook titled, DB2 for z/OS and WebSphere Integration for Enterprise Java Applications (downloadable at http://www.redbooks.ibm.com/abstracts/sg248074.html?Open). In particular, check out sections 5.5 and 8.2 of this document.

And one more thing: while I've described workstation name as a means of separating DB2 accounting information along the lines of WAS-based applications, you should keep in mind that identifiers provided by Java client information APIs can also be very useful for workload classification in a z/OS WLM policy.

Some DB2 for z/OS people who are relatively new to the client-server application scene may be a little uneasy about such applications, thinking that they can't monitor and control them as they could the DB2 transactional applications of old. In fact, the monitoring and controlling facilities you want are there. Use them, and rest a little easier.

Wednesday, July 16, 2014

DB2 for z/OS Buffer Pool Enlargement is NOT Just an Elapsed Time Thing

A couple of weeks ago, I got a question from a mainframe DB2 DBA about the impact of DB2 buffer pool enlargement on application and system performance. This individual had requested an increase in the size of a buffer pool on his system, and the system administration team had agreed to implement the change; however, one of the system administrators told the DBA that while the larger buffer pool would improve elapsed times for application processes accessing objects assigned to the pool, no CPU efficiency gains should be expected.

I am regularly surprised at the persistence of this notion that bigger DB2 for z/OS buffer pools do not drive CPU savings. Let me see if I can set the record straight in clear terms: YES, THEY DO. I'll explain herein why this is so, and I'll provide a means whereby you can measure the CPU -- yes, CPU -- impact of a DB2 buffer pool size increase.

I don't think that anyone would dispute that a larger buffer pool will decrease I/O activity (especially read I/O activity) for objects (table spaces and/or indexes) assigned to the pool. The disagreement is over the impact of I/O activity on the CPU consumption of DB2-accessing applications, and on the CPU consumption of DB2 itself. What I've found is that some people believe that a System z server's I/O assist processors handle ALL of the processing associated with I/O operations (this view seems to be more widely held by people who have been working with mainframes for a long time, perhaps because I/O assist processors were a more talked-about feature of the platform back in the day). This is not true. I/O assist processors offload from general-purpose engines a substantial portion -- but not all -- of the work involved in reading and writing data from and to disk. I/O assist processors are great, and they are one reason that System z has long excelled as a platform for I/O-intensive applications, but general-purpose engines (and zIIP engines, for that matter) still have to shoulder some of the read/write load.

Thus it is that a reduction in I/O activity will reduce CPU consumption on a mainframe system. If you enlarge a DB2 buffer pool (to reduce disk reads and writes) AND you change that buffer pool to be page-fixed in real storage (via -ALTER BUFFERPOOL bpname PGFIX(YES)), you'll get even more in the way of CPU savings, because one of the things that a general-purpose engine typically has to do in support of a DB2 I/O operation is fix in memory the page holding the DB2 buffer in question (the one into which data will be read or from which data will be written) until the I/O action is complete, after which the page is released (i.e., made pageable again). This is done so that the buffer won't be paged out to auxiliary storage in the middle of the I/O operation. When a pool's buffers are fixed in memory from the get-go (true when PGFIX(YES) is in effect), the page-fix/page-release actions formerly needed for I/Os are not required, and CPU consumption is reduced accordingly. In a DB2 10 or 11 for z/OS system, you can get even more CPU efficiency benefits from page-fixed buffer pools, because in those environments DB2 will request that a page-fixed buffer pool be backed by 1 MB page frames, versus 4 KB page frames (the LFAREA parameter of the IEASYSxx member of PARMLIB specifies the amount of a z/OS LPAR's memory that is to be managed in 1 MB frames). The 1 MB page frames save CPU by improving the efficiency of virtual storage to real storage address translation.
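As a sketch, the commands involved might look like this (the pool name and buffer count are hypothetical, and keep in mind that a PGFIX change takes effect the next time the pool is allocated):

```
-ALTER BUFFERPOOL(BP8) VPSIZE(200000)
-ALTER BUFFERPOOL(BP8) PGFIX(YES)
-DISPLAY BUFFERPOOL(BP8) DETAIL
```

The -DISPLAY BUFFERPOOL output lets you confirm the new settings and observe the subsequent change in read I/O activity for the pool.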

OK, on now to measuring the effect of a buffer pool change (such as enlarging a pool, or page-fixing the buffers in a pool) on application and DB2 CPU efficiency. For the application-level CPU effect, use Accounting Long Reports that can be generated by your DB2 for z/OS monitor (depending on the monitor that you use, these might be called Accounting Detail Reports). Input to these reports is the data contained in records generated when DB2 accounting trace classes 1, 2, and 3 are active (these records are typically written to SMF). With those trace classes active (and BEFORE you've implemented the buffer pool change), do the following:
  • Generate an Accounting Long Report for a particular day of the week (e.g., Tuesday) and a particular time period. That time period could capture a "peak" of system activity (e.g., 9-11 AM), or it might be an entire 24 hours -- go with the FROM and TO times that are of interest to you. You can have the DB2 monitor aggregate information in the report in a variety of ways (using an ORDER or GROUP specification -- or something similar, depending on the monitor that you use -- in the report control statement in the SYSIN part of the JCL for the report-generating job). Use the aggregation level (or levels -- you could choose to generate several reports) of interest to you. Want to see the CPU impact on the overall application workload for the DB2 system? Have the data aggregated at the DB2 subsystem level. Want to see the impact for different subcomponents of the workload (e.g., CICS-DB2 work, DRDA work, call attach facility batch work, etc.)? Have the data aggregated by connection type. Note that, by default, a DB2 monitor will typically aggregate accounting information by primary DB2 authorization ID within DB2 plan name -- that is an aggregation that I usually find to be not very useful.
  • Implement the buffer pool change.
  • Generate an "after" Accounting Long Report, for the same day of the week (e.g., Tuesday) and the same time period (e.g., 9-11 AM) as for the "before" report. Use the same aggregation specification as before (e.g., at the DB2 subsystem level). Looking at the "before" and "after" reports, find the average in-DB2 CPU time (also known as the average class 2 CPU time), which is the average CPU time for SQL statement execution. Note that this time will be in two fields: general-purpose engine time, and "specialty engine" CPU time (this is typically zIIP engine time). Do NOT overlook the specialty engine time -- for some workloads, particularly the DRDA workload that comes through the DB2 DDF address space, specialty engine CPU time can be greater than general-purpose CPU time. See how these CPU times (general-purpose and specialty engine) have changed, and there's your effect at the application level (the "average" is per DB2 accounting trace record -- one of these is usually generated per online transaction, and per batch job). If you requested that the monitor aggregate data at (for example) the connection type level, you will have in the accounting report a sub-report for each connection type (one for the CICS connection type, one for DRDA, one for call attach, etc.), and there will be an average in-DB2 CPU time (again, both a general-purpose engine and a specialty engine time) in each of these sub-reports.
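Monitor syntax varies; as one illustration, an IBM OMEGAMON for DB2-style control statement for these "before" and "after" reports might look something like the following (the dates and times are placeholders, and the exact keywords should be checked against your monitor's documentation):

```
GLOBAL
  FROM(07/22/14,09:00:00.00)
  TO(07/22/14,11:00:00.00)
ACCOUNTING
  REPORT
    LAYOUT(LONG)
    ORDER(CONNTYPE)
```

Run the same control statement against the "before" and "after" SMF data, and compare the average class 2 CPU times (general-purpose and specialty engine) in the resulting sub-reports.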

The procedure for measuring the impact of a buffer pool change on DB2's CPU consumption (i.e., on the CPU time charged to DB2 tasks versus tasks associated with DB2-accessing application programs) is similar to what I described above:
  • BEFORE making the buffer pool change, use your DB2 monitor to generate a Statistics Long Report for the subsystem (your monitor might refer to this as a Statistics Detail Report). Input to this report is the data in records generated by the "standard" DB2 statistics trace classes (1, 3, 4, 5, and 6). Use the same day of the week and same time period as for the aforementioned Accounting Long Reports.
  • AFTER making the buffer pool change, generate another Statistics Long Report, for the same day of the week and the same time period as before. In the "before" and "after" reports, find the section of the report in which the CPU times for the DB2 address spaces are provided. Look at the CPU times for the DB2 database services address space (the one most affected by I/O activity -- it handles prefetch reads and database writes), and there's your DB2 CPU impact. I say "look at the CPU times" because you should see both a total CPU time for the address space and a field with a name like "preemptable IIP SRB time." The latter is zIIP engine time, and it is NOT included in the former (reported "total" CPU time is general-purpose engine time).

To summarize this blog entry's message: buffer pool size increases should deliver CPU savings on your system, at both the application level and the DB2 subsystem level, by reducing I/O activity. Those CPU savings can be boosted further by page-fixing pools (usually done most effectively for your higher-activity pools), and page-fixed pools save additional CPU when they are backed by 1 MB page frames (automatic in DB2 10 and DB2 11 environments, when LFAREA in IEASYSxx sets aside some of the LPAR's memory resource to be managed in 1 MB frames). When you've made a buffer pool change that should provide enhanced CPU efficiency for your DB2 applications and subsystem, by all means measure that impact. Your best measurement tool for that purpose is your DB2 monitor, and the Accounting and Statistics Long Reports that it can generate.

I hope that this information will be useful to you.

Thursday, June 26, 2014

DB2 for z/OS: the Functional Advantages of Native SQL Procedures

I have been a big fan of DB2 for z/OS native SQL procedures ever since the functionality was introduced with DB2 9, back in 2007. I have posted quite a few blog entries on the topic, the first in 2008 (written while I was working as an independent consultant) and the most recent just last month. In these blog entries, and in presentations I've delivered over the past several years (and in discussions generally), I've focused largely on performance aspects of native SQL procedures versus external stored procedures. These performance pluses (native SQL procedures run in the DB2 database services address space, they run under the task of the calling application process, and they are zIIP-eligible when called by a DRDA requester) are important, but lately I've found myself thinking, more and more, about the functional advantages of native SQL procedures. That's the thrust of this blog post.

Right out of the gate with DB2 9 for z/OS (new-function mode) there was some space, functionality-wise, between native and external SQL procedures: a native SQL procedure could include a nested compound statement, while an external SQL procedure could not (a compound SQL statement, typically the heart of a SQL procedure, is a group of SQL statements, set off by BEGIN and END, in which there can be variable declarations and associated assignments, along with SQL logic flow control statements such as IF, WHILE, and ITERATE). What does support for a compound SQL statement within another compound SQL statement mean to a developer? It means that he or she can create SQL procedures that have multi-statement condition handlers, for more sophisticated processing of exception and/or error conditions that might occur in the execution of the procedure.
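As a sketch (the procedure, table, and column names are hypothetical), a native SQL procedure with a multi-statement condition handler looks like this -- the handler's BEGIN/END block is the nested compound statement:

```sql
CREATE PROCEDURE UPDATE_BALANCE
  (IN P_ACCT CHAR(10), IN P_AMT DECIMAL(11,2))
  LANGUAGE SQL
BEGIN
  -- The handler body below is a compound statement nested within
  -- the procedure's outer compound statement: it can log the error
  -- AND re-raise the condition, something a single-statement
  -- handler cannot do.
  DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
      INSERT INTO ERROR_LOG (ACCT_ID, ERR_TS)
        VALUES (P_ACCT, CURRENT TIMESTAMP);
      RESIGNAL;
    END;
  UPDATE ACCOUNTS
    SET BALANCE = BALANCE + P_AMT
    WHERE ACCT_ID = P_ACCT;
END
```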

The functional advantage of native SQL procedure usage advanced further with DB2 10 for z/OS (new-function mode), which allowed a native SQL procedure -- and only a native SQL procedure -- to have input and output parameters of the XML data type (and to specify the XML data type for variable declarations). Before DB2 10 (and even in a DB2 10 or DB2 11 environment, using anything other than a native SQL procedure), getting an XML data value to a stored procedure required serializing the XML document into a character string or a CLOB (depending on its size) and passing it to the stored procedure (and working with it in the stored procedure) in that form. Yuck.

DB2 11 for z/OS (in new-function mode) delivers two really cool SQL-procedure-only functional enhancements: array parameters and autonomous procedures. More information on these enhancements follows.

Array parameters

An array parameter is a parameter that contains, essentially, a "stack" of values. Before passing an array parameter to a native SQL procedure (or receiving an array as an output parameter of a native SQL procedure, or declaring and using an array variable in a native SQL procedure), you first have to create the array type. Why? Because an array in a DB2 for z/OS context is a user-defined data type (UDT). In creating an array type, you have two choices: ordinary and associative. In an ordinary array, elements are referenced by their ordinal position within the array (for example, the third element added to an ordinary array would be referenced as value 3 of that array). Elements in an associative array are referenced by user-provided index values; so, if I assigned an index value of 'Home' to a data value in an associative array containing phone numbers, I could reference a person's home number by using the 'Home' index value.

Here is the CREATE statement for an ordinary array:


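A statement fitting the description that follows would be along these lines (the type name EMPNUMS is a hypothetical stand-in):

```sql
CREATE TYPE EMPNUMS AS CHAR(6) ARRAY[20];
```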
In the above example statement, CHAR(6) refers to the data type of the values placed in the array (an example would be employee number '089234'), and 20 refers to the number of values that the array can hold (if no value were there, i.e., if [] had been specified instead of [20] after the keyword ARRAY, the maximum number of values that the array could hold would default to the high positive value for the INTEGER data type, which is 2147483647).

The CREATE statement for an associative array would look like this:


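A statement fitting the description that follows would be along these lines (the type name SCHOOLS and the VARCHAR lengths are hypothetical):

```sql
CREATE TYPE SCHOOLS AS VARCHAR(40) ARRAY[VARCHAR(30)];
```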
As with the ordinary array, the data type after the AS keyword refers to the data values that will be stored in the array ('Eastern State University' could be an example). The second data type specified for the array (and that second data type is your indication that it's an associative array) refers to index values for the array ('Graduate school - Masters' could be one such value). Note that an associative array, unlike an ordinary array, does not have an explicitly or implicitly specified maximum cardinality -- the cardinality of an associative array is based on the number of unique index values used when elements are assigned to the array.

As mentioned, a DB2 for z/OS stored procedure defined with array-type input and/or output parameters (or with references to array variables in the procedure body) must be a native SQL procedure; furthermore, the CALL that invokes such a SQL procedure can come from only two types of program: another SQL procedure language routine (SQL PL is the language in which SQL procedures are coded, and in which compiled SQL scalar functions can be coded) or a Java program that accesses the DB2 for z/OS server via the IBM Data Server Driver for JDBC and SQLJ type 4 driver. That second program type is really important in my eyes. Java programmers regularly work with arrays, and plenty of those folks had requested, prior to DB2 11's debut, the ability to pass an array to, or receive an array from, a DB2 for z/OS stored procedure.

Autonomous procedures

Consider this scenario: you have a transaction for which you want to record some information for each execution, even if the transaction fails before completion and is rolled back by DB2. A rollback would undo any data changes made by the transaction, right? So, how do you persist some information associated with the transaction? With DB2 11, you can do that with an autonomous procedure, which is a type of native SQL procedure. How would this work? Well, the transaction would call the autonomous procedure, and that SQL procedure would do its thing -- inserting, for example, some data into DB2 table T1. Control would then pass back to the caller, and the transaction would do its thing -- updating, let's say, data in table T2. If the transaction fails after updating T2, what happens? DB2 will back out the transaction's change of T2 data, but the insert into T1 performed by the autonomous procedure will not be backed out. Cool, eh?

What makes a DB2 11 native SQL procedure an autonomous procedure? Technically, it's the specification of the AUTONOMOUS option in the associated CREATE PROCEDURE (or ALTER PROCEDURE) statement. AUTONOMOUS would be used in place of the COMMIT ON RETURN option. Completion of a called autonomous procedure will drive a commit, but that commit will "harden" only the data changes made by the autonomous procedure -- it will have NO EFFECT on any data changes made up to that point by the calling program. This is made possible by the fact that the autonomous procedure's DB2 unit of work is independent from that of the calling application process. Because of this independence, locks acquired by DB2 for an application process are not shared with locks acquired for an autonomous procedure called by the application process. It is therefore theoretically possible that an autonomous procedure will encounter lock contention vis-a-vis its caller. That possibility might influence decisions you'd make about the locking granularity that you'd like DB2 to use for a table space (e.g., row versus page), if an autonomous procedure and its caller will change data in the same table.
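A sketch of the logging procedure from the scenario above (the procedure, table, and column names are hypothetical):

```sql
CREATE PROCEDURE LOG_TRAN_EVENT
  (IN P_TRAN_ID CHAR(8), IN P_NOTE VARCHAR(100))
  LANGUAGE SQL
  AUTONOMOUS
BEGIN
  -- This insert is committed in the procedure's own, independent
  -- unit of work, so it survives a subsequent rollback of the
  -- calling transaction.
  INSERT INTO TRAN_LOG (TRAN_ID, NOTE, LOG_TS)
    VALUES (P_TRAN_ID, P_NOTE, CURRENT TIMESTAMP);
END
```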

The future?

Will future versions of DB2 for z/OS introduce other enhancements that fall into the "SQL procedures only" category? We'll have to wait and see about that, but it sure has been interesting to see the progressive augmentation of native SQL procedure functionality just over the past three DB2 versions. I'll be looking for more of the same, and that's one of the reasons that I'm a native SQL procedure advocate: they get better and better.

Friday, June 13, 2014

DB2 for z/OS: Getting to Universal Table Spaces

Often, there is a bit of a time lag between the introduction of a DB2 for z/OS feature and the widespread adoption of the new technology. Take universal table spaces, for example. These were introduced with DB2 9 for z/OS (almost 7 years ago), but some organizations are only now beginning to convert non-universal table spaces to the universal variety. In this blog post I want to go over the incentives for undertaking a conversion to universal table spaces, highlight the important way in which DB2 10 eased the conversion process, and raise some matters one should consider in the course of effecting table space conversions.

Why universal?

As I see it, there are two main reasons to convert non-universal table spaces to the universal kind. First, it's the only way in which you can leverage the benefits of partition-by-growth table spaces, one of two varieties of universal table space (the other being partition-by-range). A partition-by-growth (PBG) table space -- as the name implies -- is one that is partitioned as needed to accommodate a table's growth; so, if in creating a table space (or altering an existing table space) one specifies a DSSIZE (data set size) of 2G (2 gigabytes), then upon the table reaching 2 GB in size DB2 will add a second partition to the table space. If that partition fills up (i.e., when it reaches 2 GB in size), a third partition will be added to the table space, and so on up to the maximum number of partitions specified for the table space (that being MAXPARTITIONS, an alterable value).

What's good about this? Well, first and foremost it eliminates the 64 GB size limit that previously existed for table spaces that are not range-partitioned -- a PBG table space, like a range-partitioned table space, can reach a size of 128 TB (and that's for the non-LOB data in the table -- with LOB data the data capacity of a table space can far exceed 128 TB). Of course, you might think of a table in your DB2 for z/OS environment that would never approach 64 GB in size, and wonder, "Why should I convert THAT table's table space to universal PBG?" That would be a good question, if size were the only incentive for converting a non-universal table space to universal. There are, in fact, quite a few non-size-related reasons for getting these conversions done. I'll get to these presently, but first I want to clear up a misconception. Some folks think that PBG table spaces are not appropriate for small tables because of the "P" (for partition) in PBG: we've historically thought of partitioning as a means of getting more than 64 GB of data in a table, and so we equate "partition" with "big" and write off PBG for smaller tables. Time to change that thinking. Is PBG a good choice for a table that will never hold more than, say, 20 KB of data (this could be a reference or code table), even with 1G being the smallest allowable DSSIZE value? Sure it is. Will that little table's table space be 1 GB in size, with 20 KB of the space used and the rest wasted? Of course not. The table space's physical size will be determined by its PRIQTY and SECQTY specifications (primary and secondary space allocation, respectively). If those specifications are chosen appropriately, the table with 20 KB of data will occupy 20 KB of disk space. The 1G DSSIZE specification means that IF the table space size reaches 1 GB then DB2 will add another partition to the table space. If the table space size never reaches 1 GB then the table space will stay at one partition. Got it?
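For example, a small code table could be placed in a PBG table space like this (the table space and database names are hypothetical; as noted above, the table space will occupy only the space actually allocated for it, regardless of the 1G DSSIZE):

```sql
CREATE TABLESPACE TSCODES IN DBAPP01
  MAXPARTITIONS 4
  DSSIZE 1 G
  SEGSIZE 32;
```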

OK, on to the non-size related incentives for going universal. A growing list of DB2 features can ONLY be used in conjunction with universal table spaces. These include (and I've indicated the DB2 release through which these features were introduced):
  • Clone tables (DB2 9)
  • Hash-organized tables (DB2 10)
  • Inline LOBs (DB2 10)
  • Currently committed data access (DB2 10)
  • Pending DDL (DB2 10)
  • ALTER TABLE with DROP COLUMN (DB2 11)
It is likely that future releases of DB2 for z/OS will introduce more features with a universal table space requirement.

Getting there got a lot easier starting with DB2 10

In my universal table space incentive list above I mentioned pending DDL. This is a capability that became available with DB2 10 running in new-function mode. Big picture-wise, what pending DDL made possible was the non-disruptive alteration of a number of aspects of the definition of a table space or table or index. Want to change the SEGSIZE of a table space? The page size of an index? The DSSIZE (data set size) of a table space? No problem. Thanks to pending DDL, you just issue the appropriate ALTER statement and then materialize the change by way of an online REORG. Between the ALTER and the online REORG, is application access to the target object compromised? No. The table space or index affected is placed in the new (with DB2 10) and non-restrictive AREOR state (which basically means that a pending DDL change has been issued for the object but has not yet been materialized via online REORG).

So, what does this have to do with conversion of non-universal to universal table spaces? Well, it so happens that the only pending DDL change possible for a non-universal table space is a change that would, when materialized, result in the non-universal table space becoming universal. For a simple or segmented table space containing a single table, that change is an ALTER TABLESPACE that adds a MAXPARTITIONS specification to the object's definition. Issue such an ALTER, and after a follow-on online REORG the table space will be a universal partition-by-growth (PBG) table space. For a table-controlled partitioned table space, the change is an ALTER TABLESPACE that adds a SEGSIZE specification to the object's definition. Run an online REORG for the object after the ALTER, and voila -- you have a universal partition-by-range (PBR) table space. Easy.
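Sketched in DDL terms (the database and table space names are hypothetical), the two conversion paths look like this, each followed by an online REORG to materialize the change:

```sql
-- Single-table simple or segmented table space -> partition-by-growth:
ALTER TABLESPACE DBAPP01.TSCUST MAXPARTITIONS 10;

-- Table-controlled partitioned table space -> partition-by-range:
ALTER TABLESPACE DBAPP01.TSORDERS SEGSIZE 32;

-- Then, for each object, an online REORG such as:
-- REORG TABLESPACE DBAPP01.TSCUST SHRLEVEL CHANGE
```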

Some considerations

As you plan for the conversion of your non-universal table spaces to the universal variety, there are things to which you should give some thought:
  • The non-disruptive process described above for converting simple and segmented table spaces to PBG universal table spaces (ALTER with MAXPARTITIONS, followed by online REORG) is available, as I mentioned, only for single-table simple and segmented table spaces. For a simple or segmented table space containing multiple tables, you'd have to either go the unload/drop/create/re-load route (with one PBG universal table space created for each table in the multi-table simple or segmented table space), or wait and see if a future release of DB2 provides a non-disruptive universal table space conversion process for multi-table simple and segmented table spaces (this is a known requirement).
  • The non-disruptive process for converting traditional range-partitioned table spaces to PBR universal table spaces (ALTER with SEGSIZE, followed by online REORG) is available, as I mentioned, only for table-controlled partitioned table spaces. For an index-controlled partitioned table space, you'll first need to accomplish the conversion to table-controlled partitioning. That's most easily done via issuance of an ALTER INDEX statement with NOT CLUSTER for an index-controlled partitioned table space's partitioning index (as described in a blog post I wrote a couple of years ago).
  • Materialization of the change from non-universal to universal table space (via online REORG after the appropriate ALTER statement) will invalidate packages that depend on the table space. These packages will be automatically rebound (by default) when the associated programs are next executed, or you can rebind them explicitly. Plan management is active by default (i.e., the value of the PLANMGMT parameter in ZPARM is EXTENDED or BASIC). That being the case, on the off chance that a package rebind operation leads to performance degradation (performance will typically be the same or better following a package rebind), you can very quickly switch back to the previous instance of a package via a REBIND PACKAGE command with the SWITCH(PREVIOUS) option.
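A couple of the points above can be illustrated as follows (the index and package names are hypothetical, and the REBIND is shown as a comment because it is a DB2 command, not SQL):

```sql
-- Convert an index-controlled partitioned table space to
-- table-controlled partitioning by altering its partitioning index:
ALTER INDEX MYSCHEMA.PART_IX NOT CLUSTER;

-- If a rebind following materialization of the pending change were to
-- degrade a package's performance, fall back to the prior copy
-- (this relies on plan management being active):
-- -REBIND PACKAGE(MYCOLL.MYPKG) SWITCH(PREVIOUS)
```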

There you have it. If you've not already started the process of converting your non-universal table spaces to universal PBG or PBR table spaces, it's time to get that effort going. Your organization will benefit from your actions.

    Friday, May 30, 2014

    DB2 for z/OS: Some Matters Pertaining to Nested DB2 Stored Procedures

    I worked recently with some IT people who were themselves engaged in the development and enhancement of a high-volume application that accesses DB2 for z/OS data by way of stored procedures (the client side of the application communicates with the target DB2 system via the distributed data facility, also known as DDF). Various actions implemented and plans formulated by this IT team serve to effectively illustrate some important points regarding the nesting of DB2 stored procedures ("nesting," in this context, refers to a situation in which stored procedures call other stored procedures). Through this blog entry, I want to bring these points to light.

    Why nest in the first place?

    This basically comes down to what you want in terms of the granularity of the functionality provided by your DB2 for z/OS stored procedures. Suppose that you have a stored procedure that performs a variety of data-related actions -- perhaps inserting some rows into one table, updating rows in another table, and generating a result set that will be consumed by the calling process. If these actions are always to be accomplished every time the stored procedure is called, keeping them in the one stored procedure is probably the way to go for optimal performance and for the sake of simplicity (simple is generally good); however, if a typical CALL is likely to exercise just one of the stored procedure's data processing functions, separating those functions into several different, smaller stored procedures could be advantageous. Of course, such a break-up doesn't necessarily imply nesting -- client applications could just call the individual stored procedures directly. The question, then, is this: at what level do you want small-grain (referring to scope of functionality) stored procedures to be grouped so as to provide larger-grain database services? If you want this grouping to be done by client-side application programs, have those programs call the small-grain stored procedures directly. If, on the other hand, you want the grouping of small-grain stored procedures for larger-grain processing to be handled by higher-level "orchestrating" stored procedures, go the nesting route.

    In the case of the application I mentioned in the opening paragraph of this entry, the IT team wanted the flexibility provided by small-grain stored procedures, but they didn't want client-side developers to have to do the work of "stringing together" small-grain stored procedures to accomplish more-comprehensive data processing tasks. Given that situation, the decision to implement nested stored procedures is understandable.

    When one stored procedure invokes another: the CALL statement

    Because the application to which I've referred is quite dynamic with respect to changes in client-required data processing, the IT team in charge of stored procedure design and development went with a maximally flexible implementation: they created a "parent" stored procedure that would invoke nested, "child" stored procedures with calls of this form (here ":hvar" refers to a host variable):

    CALL :hvar (:hvar, :hvar, :hvar, :hvar) 

    That CALL-coding decision, in turn, dictated the use of an external stored procedure for the "parent" routine, versus a native SQL procedure. Why? Because a CALL in the body of a native SQL procedure must be of the form CALL procedure-name. In other words, a nested stored procedure invoked by a native SQL procedure must be explicitly named (i.e., "hard-coded") in the CALL statement issued by the native SQL procedure.
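To make the contrast concrete, here is a minimal sketch of a native SQL procedure (procedure names hypothetical) -- note that the nested CALL has to name its target:

```sql
CREATE PROCEDURE PARENT_PROC (IN P1 INTEGER)
  LANGUAGE SQL
BEGIN
  -- In a native SQL procedure, the called procedure must be
  -- explicitly named -- CALL :hvar is not an option here.
  CALL CHILD_PROC(P1);
END
```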

    COBOL was chosen by the IT team as the coding language for the "parent" stored procedure. It was further decided that the nested stored procedures would be native SQL procedures.

    "Hey, where's my zIIP offload?"

    When the stored procedure set-up went live, the supporting IT folks were surprised to see that the native SQL procedures were getting very little in the way of zIIP engine offload. Aren't native SQL procedures supposed to be majorly zIIP-eligible when they are called by DRDA requesters (i.e., through DDF)? Yes, but that is only true when a native SQL procedure is directly called by a DRDA requester (as I pointed out in an entry I posted to this blog a few months ago). See, a native SQL procedure always runs under the task of its caller. When the caller is a DRDA requester, that z/OS task is an enclave SRB in the DB2 DDF address space, and that makes the called native SQL procedure zIIP-eligible (to the tune of 55-60%). If a native SQL procedure is called by an external DB2 stored procedure, the native SQL procedure will execute under the task of the external stored procedure -- a TCB in a stored procedure address space -- and so will get little, if anything, in the way of zIIP offload.

    Making a result set generated by a nested stored procedure available to a program that is more than "one level up" from the nested procedure

    The aforementioned lack of zIIP offload seen for the nested native SQL procedures was exacerbated by the mechanism used to make result set rows generated by a cursor declared and opened in a nested stored procedure available to a "top-level" calling program (i.e., to a client-side program that initiated a chain of nested stored procedure calls): the result set rows were inserted by the nested stored procedure into a temporary table, and the top-level calling program would then retrieve the rows from that temporary table. Not an uncommon approach, but also not great from a performance perspective, and more complex than you'd like. DB2 10 for z/OS gave us a much better way of getting this job done: ditch the temporary table, and have the nested stored procedure declare its cursor WITH RETURN TO CLIENT (versus WITH RETURN TO CALLER, formerly the only valid specification, with which a cursor's result set can be directly fetched only by the direct caller of the result-set-generating stored procedure). When a cursor in a stored procedure is declared WITH RETURN TO CLIENT, the result set rows can be directly fetched by the top-level calling program, no matter how far down the result-set-generating stored procedure is in a chain of nested calls.
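A minimal sketch of the DB2 10 approach (procedure, table, and column names hypothetical):

```sql
CREATE PROCEDURE GET_ORDERS (IN P_CUST INTEGER)
  DYNAMIC RESULT SETS 1
  LANGUAGE SQL
BEGIN
  -- WITH RETURN TO CLIENT lets the top-level calling program fetch
  -- this result set directly, however deeply nested this procedure
  -- is in a chain of CALLs -- no temporary table required.
  DECLARE C1 CURSOR WITH RETURN TO CLIENT FOR
    SELECT ORDER_NO, ORDER_DATE
    FROM ORDERS
    WHERE CUST_ID = P_CUST;
  OPEN C1;
END
```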

    Rethinking things: there's more than one kind of CALL

    Even WITH RETURN TO CLIENT cursors might not have delivered the performance that was desired for the stored procedures of which I've been writing, particularly in light of the fact that, as mentioned, the nested, native SQL procedures were getting very little zIIP engine offload. Of course, the parent stored procedure could have been changed to a native SQL procedure from an external stored procedure, so as to maximize zIIP offload (and minimize general-purpose CPU consumption), but that would have required going with CALL procedure-name invocation of the nested stored procedures, and the IT team really wanted to stay with the CALL :hvar approach. Ultimately, a decision was made to go with an option made possible by the fact that the parent stored procedure was written in COBOL: the nested routines would be changed from native SQL procedures to COBOL subroutines, and they would be invoked as such from the parent stored procedure. That makes sense in this case: if zIIP offload isn't your ticket to minimizing consumption of general-purpose engine capacity, get CPU efficiency by reducing path length -- a COBOL subroutine called by a COBOL stored procedure program should be more CPU-efficient than an equivalent-function nested stored procedure called by that same COBOL stored procedure program (I say "should" because I haven't yet seen the results of the IT team's change to the COBOL subroutine approach, but I expect efficiency gains).

    In conclusion...

    You have plenty of options when you're looking to implement stored procedure-based DB2 database services -- external and native, nested and non-nested, WITH RETURN TO CALLER and WITH RETURN TO CLIENT cursors. There's not a one-size-fits-all "right way" that will always be the best way. Consider the particulars of your situation, and make the choices that deliver the flexibility, performance, and functionality that YOU want.

    Tuesday, May 13, 2014

    Some zIIP Things of Which DB2 for z/OS People Should be Aware

    zIIP is short for System z Integrated Information Processor. It's a so-called specialty engine for an IBM mainframe server. The zIIP raison d'etre is reduced cost of computing on the System z platform, accomplished in two ways: 1) zIIP engines cost less than general-purpose processors (often referred to as CPs -- short for central processors), and 2) zIIP engines do not factor into the cost of software that runs in z/OS LPARs. zIIP engines aren't new (they were introduced in 2006), but there are some relatively recent zIIP-related developments that are important from a DB2 for z/OS perspective. I hope that this blog entry will help mainframe DB2 people to keep their zIIP knowledge up to date.

    The circle of zIIP-eligible work continues to expand

    At many System z sites, the number one driver of zIIP utilization is execution of SQL statements that get to DB2 via the distributed data facility, or DDF (i.e., statements that come from DRDA requesters). Two factors emerged in recent years to boost zIIP offload for DDF workloads. First, there was an increase in zIIP eligibility for work executing under enclave SRBs in the DDF address space: a DB2 9 PTF took this zIIP offload percentage from about 55% to about 60%. Second, native SQL PL routines were introduced, also with DB2 9. SQL PL -- short for SQL procedure language -- is the language in which DB2 native SQL procedures are written. A native SQL procedure executes under the task of the application process that calls it. When the calling process is a DDF-connected application, the task is an enclave SRB in the DDF address space. In that case, a native SQL procedure is zIIP-offloadable to the tune of about 60%, as is true of SQL statements in general that are issued by DRDA requesters (in contrast, an external stored procedure always runs under its own TCB in a WLM-managed stored procedure address space, and so is not zIIP-eligible). A number of organizations have been engaged for some time in converting external DB2 stored procedures called by DRDA requesters to native SQL procedures, so as to boost redirection of work to zIIP engines.

    This SQL PL-related zIIP offload booster was expanded in scope when, with DB2 10, user-defined functions as well as stored procedures could be written in SQL PL. UDFs coded in SQL PL, along with UDFs that have a RETURN statement that includes a scalar fullselect, are referred to as compiled SQL scalar UDFs, or, sometimes, as non-inline SQL scalar UDFs. Compiled SQL scalar UDFs, like inline SQL scalar UDFs and SQL table UDFs, run under the task of the UDF-invoking application process, just as do native SQL procedures. As noted previously, when the application process is a DRDA requester, the task in the z/OS system is a DDF enclave SRB, and that makes any invoked SQL UDF -- whether compiled scalar, inline scalar, or table -- about 60% zIIP-eligible.
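By way of illustration, a compiled SQL scalar UDF might look like this (function, table, and column names hypothetical):

```sql
-- A compiled (non-inline) SQL scalar UDF: the body is written in
-- SQL PL, so when the function is invoked by a DRDA requester it
-- runs under a DDF enclave SRB and is substantially zIIP-eligible.
CREATE FUNCTION TOTAL_PAID (P_CUST INTEGER)
  RETURNS DECIMAL(11,2)
  LANGUAGE SQL
BEGIN
  DECLARE V_TOTAL DECIMAL(11,2);
  SET V_TOTAL = (SELECT SUM(AMOUNT)
                 FROM PAYMENTS
                 WHERE CUST_ID = P_CUST);
  RETURN V_TOTAL;
END
```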

    Some IBM DB2 utility operations contribute to zIIP engine utilization. In a DB2 9 system, it's the index maintenance activity associated with LOAD, REORG, and REBUILD execution. DB2 10 added RUNSTATS processing to this mix. DB2 11 delivers still greater zIIP offload for RUNSTATS, plus zIIP offload for inline statistics generation performed as an option of LOAD, REBUILD INDEX, and REORG.

    Other DB2-related zIIP utilization drivers include query parallelization (DB2 V8), prefetch read and database write operations (DB2 10), XML schema validation (DB2 10), log reads and writes (DB2 11), and clean-up of pseudo-deleted index entries (DB2 11).

    And don't think that DB2 alone exercises zIIP engines. I'm seeing more and more Java programs running in z/OS systems -- certainly in WebSphere Application Server for z/OS, but also in the form of DB2 for z/OS Java stored procedures, and as batch processes and started tasks that utilize IBM's JZOS Toolkit. Java code executing in a z/OS system can be redirected to another type of specialty engine called a zAAP, but on a mainframe server with zIIPs and no zAAPs, Java programs will drive utilization of zIIPs by way of what's called zAAP-on-zIIP functionality.

    The bottom-line message is this: DB2 itself, with each succeeding release, provides more and more opportunities for shifting work from general-purpose mainframe engines to zIIPs. Java programs, increasingly prevalent in z/OS systems, also boost zIIP utilization. That's good news from a System z cost-of-computing perspective, as it means that you can grow a mainframe workload without having to add general-purpose engines to a system; however, it also means that you need to keep an eye on zIIP engine utilization, because running those engines too hot could have a negative impact on application performance, as explained below.

    Don't over-utilize zIIP engines

    One thing that organizations have long liked about mainframe computers is the fact that you can run general-purpose engines at very high levels of utilization -- like 90% or more -- while still getting excellent application performance and throughput. The story is different for zIIP engines, and here's why: if a zIIP engine is not available when zIIP-eligible work is ready to be dispatched, that work can be directed instead to a general-purpose engine, but such redirection introduces a degree of delay. This delay can affect performance noticeably in a DB2 10 (or later) system, because (as previously noted) starting with DB2 10 prefetch processing became zIIP eligible. If prefetch reads are slowed because of overly high zIIP engine utilization, throughput for prefetch-intensive workloads (think batch, and decision support applications) can be negatively impacted. In a case of that nature, a slowdown in prefetch processing would show up in a DB2 monitor accounting long report (or an online display of thread detail data) as elevated "wait for other read" time (that is one of the so-called class 3 wait time categories).

    Frankly, I'd start thinking about adding more zIIP capacity to a System z server if I saw zIIP utilization regularly reaching 60% or more.

    July 2013: a zIIP capacity boost

    With more and more zIIP-eligible work in z/OS systems, and given the importance of not over-utilizing zIIP engines, you'll want to make sure that zIIP capacity on your mainframe servers stays well ahead of demand for same. That task became easier with an important announcement that IBM made in July of 2013: on a zEC12 or zBC12 server, the number of zIIP engines can be up to two times the number of general-purpose engines (prior to that announcement, the number of zIIP engines on a server could not exceed the number of general-purpose engines). The more heavily weighted your DB2 workload is towards DDF-connected applications (and the more Java code you have running in z/OS LPARs), the more heavily weighted your processing capacity should be towards zIIP engines versus general-purpose engines.

    zIIP engines are enabling organizations to grow z/OS system workloads in an ever more cost-effective manner. That said, you need to keep an eye on zIIP engine utilization. Yes, take actions to drive zIIP capacity utilization (so as to shift work off of general-purpose engines), but take actions as well to ensure that zIIP resources on your system are adequate to handle zIIP-eligible work in an optimal fashion.

    Tuesday, April 29, 2014

    A Tiger Changes His Stripes: I LIKE Java on z/OS

    I well remember a phone conversation, from about six or seven years ago, during which a friend of mine and I discussed an application project at his site which involved use of DB2 for z/OS stored procedures. Me: "In what language will the stored procedure programs be written?" My friend (sounding like someone whose favorite sports team had just lost the championship game): "Java." Me (groaning): "Java? Why Java?" My friend: "Because some 'insultants' told our management that that was the right way to go." I probably winced at that point -- both because I felt his pain, and because I was working at the time as an independent DB2 insultant (er, consultant).

    My disappointment on hearing my friend's news was genuine. Sometime later, I was asked to weigh in on an issue that people with another organization were trying to decide: should some new DB2 for z/OS stored procedures be coded in Java or SQL PL (i.e., SQL Procedure Language, used to create DB2 native SQL procedures)? That was a no-brainer, as far as I was concerned. SQL PL was my recommendation because I really like DB2's native SQL procedure functionality, and because I really didn't like the idea of running Java programs in z/OS systems. I was fine with Java code executing in a browser on my laptop, or in an application server on a distributed systems platform, but on a mainframe? No! I saw Java, in the context of a z/OS execution environment, as being a CPU and a memory hog, a waste of processing capacity. Want performance and efficiency from your z/OS programs? You don't want Java.

    Well, that was then, and a funny thing happened on the way to now: I started encountering more and more organizations running Java programs on z/OS systems, and they were getting excellent performance and throughput, and people weren't griping about mainframe resource utilization. How could that be? Had reality been warped in some way? No, nothing that cosmic. Basically, what changed was technology -- IBM System z and z/OS technology, in particular. A lot of the changes were on the hardware side, including:
    • zAAPs and zIIPs to reduce the cost of computing for z/OS-based Java applications. zAAPs (System z Application Assist Processor) and zIIPs (System z Integrated Information Processor) are specialty engines that pack the power and performance of System z general-purpose central processors (GCPs) but do not factor into the cost of mainframe software (that's based on GCP capacity). zAAPs are specifically intended to run Java workloads (as well as handling XML data processing functions). zIIPs, while typically driven primarily by DB2 for z/OS DDF/DRDA (i.e., network-attached client-server) processing, can also be used for zAAP-eligible work when a system does not have zAAP engines (that's called zAAP-on-zIIP). zAAPs and zIIPs have been around for a while, but they've come lately to the forefront in terms of Java processing in large part because of the aforementioned zAAP-on-zIIP capability, and as a result of the recent (July 2013) increase in the allowable ratio of zAAP and zIIP engines to GCPs on a zEC12 or zBC12 server (whereas the number of zAAP or zIIP engines on a mainframe formerly could not exceed the number of GCPs on the server, now for a zEC12 or zBC12 server there can be up to twice the number of zAAPs or zIIPs as GCPs).
    • More and cheaper mainframe memory, and new ways to manage it. I no longer think of Java as a memory hog on z/OS systems, any more than I think of DB2 as a memory hog. Like DB2, Java applications are exploiters of large server memory resources: give them more memory, and they'll perform better and with greater efficiency. Providing Big Memory for Java applications on z/OS is much more feasible and practical now than was the case just a few years ago. A modern mainframe (z196 or zEC12) can be configured with up to three terabytes of central storage, and that memory is far less costly per gigabyte than it would have been in the not-too-distant past. And, this is not just a more-and-cheaper story -- there's a smarter angle, too, in the form of large frame support. Instead of managing all memory in 4 KB page frames, modern mainframes provide the ability to have some of a z/OS system's memory managed in 1 MB frames, thereby boosting CPU efficiency by reducing the overhead associated with translating virtual to real storage addresses. A JVM (Java virtual machine) in a z/OS system can be set up to use non-pageable 1 MB page frames for heap storage, and (if the server has Flash Express for super-fast paging) pageable 1 MB frames for JIT (just-in-time compiler) work areas. With a zEC12 or zBC12 server, the Java heap can be backed by 2 GB page frames for even greater efficiency.
    • Transactional Execution (TX) facility. This capability, known in the industry as hardware transactional memory, is essentially an in-hardware implementation of optimistic locking for transactional workloads. The TX facility, a feature of zEC12 and zBC12 servers, is aimed largely at boosting the performance of Java application workloads through lock avoidance.
    • In-hardware boosts for decimal floating-point and IEEE binary floating-point operations. Java applications often make use of these operations.
    • Runtime instrumentation (RI). This is a hardware facility, introduced with the zEC12 and zBC12 servers, that delivers a dynamic and self-tuning online recompilation capability for Java applications.

    So, today's zEnterprise servers and the z/OS operating system provide an outstanding execution environment for Java applications. How are organizations implementing DB2 for z/OS-accessing Java programs in this environment? This is what I'm seeing:
    • The IBM JZOS Toolkit. This is a set of classes, distributed with the IBM Java SDKs for z/OS, that can be used to launch Java applications as batch jobs or started tasks in a z/OS system. Also included in the toolkit are Java methods that facilitate access to z/OS data sources and services such as DB2. More information about the classes associated with the JZOS Toolkit can be found in developerWorks on IBM's Web site.
    • DB2 for z/OS Java stored procedures. Java as a language for stored procedure development has been supported by DB2 for z/OS since Version 5 (late 1990s), but for a long time there was not a lot of activity here, due largely, I think, to the historically negative view of Java on z/OS to which I alluded in the opening part of this blog entry. Nowadays, Java stored procedures are showing up at a growing number of DB2 sites. This trend should get a boost from DB2 11 for z/OS, which added support for running Java stored procedures in 64-bit JVMs (previously, only 31-bit JVMs were supported for Java stored procedures). Not only that, but these are multi-threaded JVMs; so, in a DB2 11 system you'll have a single 64-bit JVM per WLM-managed stored procedure address space in which Java stored procedures run, versus having a 31-bit JVM for each TCB in such an address space. That means a smaller JVM storage footprint, less overhead associated with starting JVMs, and more concurrently active Java stored procedures per stored procedure address space.
    • WebSphere Application Server (WAS) for z/OS. I'm encountering WAS on z/OS a lot more than was the case just a few years ago. An organization with which I work pretty extensively is a case in point: they have a modern, high-volume, business-critical, client-server application that is wholly contained within a z/OS system, with business logic running in WAS and a DB2 subsystem on the same LPAR providing data management (and a good bit of data access logic in the form of stored procedures and user-defined functions). Plenty of other companies have gone, or are planning on going, this same route. One of the interesting aspects of such an environment is the decision presented regarding the choice of JDBC driver for the WAS-based, DB2-accessing Java applications: type 2 (a "local" connection using the DB2 Recoverable Resource Services Attach Facility) or type 4 (a "remote" connection through the DDF address space of the target DB2 subsystem). The type 2 JDBC driver provides the more CPU-efficient connection to a DB2 subsystem in the same LPAR as WAS. The type 4 driver, while adding to connection pathlength (this because the connection involves going into the LPAR's TCP/IP stack and then into the DB2 DDF address space), increases zIIP engine offload for SQL statement execution because the statements run under enclave SRBs in DDF. Which JDBC driver should you use? That will vary by application. For an application with transactions that issue a lot of quick-running SQL statements, the type 2 driver might be a good fit, as there could be a premium in that case on getting to DB2 and back as efficiently as possible. For applications characterized by a smaller number of longer-running SQL statements, the greater zIIP offload for SQL execution provided by the type 4 JDBC driver might tilt you in that direction. The good news? Going from one driver to the other for an application is a pretty simple thing. It's not a big deal to try both of them.

    And so you have it. The positive results that organizations with which I work have gotten in running Java on z/OS systems have caused me to change my tune (or my stripes, as suggested by this entry's title). Tell me now that you're running Java on z/OS, and you won't hear me groan. Instead, I'm likely to say, "That's a fine choice you've made, there." My colleagues at the IBM Poughkeepsie (zEnterprise and z/OS) and Silicon Valley Labs (DB2 for z/OS), who've worked hard to make z/OS an excellent platform for Java applications, would wholeheartedly agree.