Friday, October 28, 2011

IOD Dispatch (2) - DB2 9 and 10 for z/OS SQL, and DB2 Connect Set-Up

In this, my second dispatch from the 2011 IBM Information on Demand conference, more items of information from sessions attended.

SQL enhancements delivered in DB2 for z/OS Versions 9 and 10 - This presentation was given by Chris Crone, a Distinguished Engineer with the DB2 for z/OS development organization at IBM's Silicon Valley Lab. Among the DB2 9 SQL enhancements covered by Chris in the session were the following:
  • INTERSECT and EXCEPT - Given two query result sets that line up in terms of number of columns and data types of columns (i.e., the data type of the nth column of result set 1 is compatible with the data type of the nth column of result set 2), the INTERSECT set operator makes it easy to write a SELECT statement that will return only those rows that appear in both result set 1 and result set 2. EXCEPT is the ticket when you want the SELECT statement to return rows from result set 1 that do not appear in result set 2 (or vice versa).
  • INSTEAD OF triggers - I liked Chris's succinct description of this DB2 9 enhancement: "These are triggers on views." Basically, an INSTEAD OF trigger can be used to enable the updating of data through a view, when the data-change operation in question (INSERT, UPDATE, or DELETE) would be technically impossible or practically unfeasible in the absence of the INSTEAD OF trigger. For example, an INSTEAD OF UPDATE trigger could be used to change an employee's last name through an UPDATE targeting a view that joins the EMPLOYEE table to the DEPARTMENT table (this to bring in from the DEPARTMENT table the name of a department referenced only by department number in the EMPLOYEE table). Such a view is read-only by definition, but the INSTEAD OF trigger enables the UPDATE targeting the view to execute successfully, by changing the action to an update of the EMPLOYEE table.
  • MERGE - Using this statement, one can change data in a target table based on a comparison of target table rows with "rows" in a virtual "input table" ("input table" content is provided via array variables -- one for each "column" of the "input table"). When an "input table" row matches one or more rows in the target table (per matching criteria specified in the MERGE statement), the target table row (or rows) is updated with values in the matching "input table" row. When there is no target table match for an "input table" row, that row is inserted into the target table.
  • SELECT FROM UPDATE (and FROM DELETE and FROM MERGE) - This enhancement rounds out the SELECT FROM INSERT capability introduced with DB2 Version 8. The same concept applies: with one statement (versus separate SELECT and data-change statements), change data in a table and retrieve information about the data change operation. Chris showed how an INCLUDE column (a column, defined and assigned values via a SELECT FROM UPDATE/DELETE/MERGE statement) could be used to determine whether rows in the "input table" of a MERGE statement (see my description of MERGE, above) were inserted into the target table or were used to update target table rows.
  • TRUNCATE - This is a statement that can be used to clear data out of a table. Why use TRUNCATE versus a mass delete (that being a DELETE statement with no WHERE clause)? Here's one reason: TRUNCATE can be used to empty a table without causing delete triggers to fire (if that's what you want).
  • New OLAP functions - These functions -- RANK, DENSE_RANK, and ROW_NUMBER -- make it easy for a person to code a SELECT statement that will a) sort a query result set, in ascending or descending sequence, according to a user-specified single- or multi-column value (and this sort takes place BEFORE the sort for an ORDER BY clause that might appear at the "end" of the query), and b) assign an integer value to each row in this sorted result set, starting with 1 and counting up (as Chris put it, one can think of the OLAP specifications "as counting functions that count in different ways." The difference referred to in Chris's statement has to do with how "ties" (referring to the OLAP specification sort key value) are handled: ROW_NUMBER assigns an integer value of n to the nth row in the sorted result set, regardless of whether or not the sort key value in that row is the same as that of the previous row in the sorted set (in other words, "tied" rows get different ROW_NUMBER values, so if the result set is sorted in descending SALES order, and if rows four and five in the set have the same SALES value, the rows will get ROW_NUMBER values of 4 an 5, respectively). RANK and DENSE_RANK assign equal integer values to "tied" rows, but differ in how a row after a set of "tied" rows is numbered: RANK will skip integer values so that next row after a set of "tied" rows will be assigned a RANK value of n if it is the nth row in the result set. DENSE_RANK, on the other hand, will assign a rank of n+1 to the first row after a set of "tied" rows if those "tied" rows were assigned a rank of n. So, if rows three, four, and five in a set sorted by SALES have the same SALES value, they will each get a rank value of 3, whether the OLAP specification is RANK or DENSE_RANK. The next row in the set (row six) will get a rank value of 6 if the RANK specification is used, and a rank value of 4 if DENSE_RANK is specified (i.e., there are no "gaps" in rank values used if DENSE_RANK is specified).

In the second part of his presentation, Chris described SQL enhancements delivered in DB2 10 for z/OS, including the following:

  • Richer OLAP specifications: moving aggregates (e.g., moving sum and moving average) - These specifications enable one to code a SELECT that will partition a result set by a user-specified single- or multi-column value, sort within result set partitions by a user-specified single- or multi-column value, and then aggregate column values within partitions in a "moving window" fashion. For example, SALES values in rows in a partition might be averaged in this way: take a row's SALES value and average it with the SALES values of the two preceding rows in the partition. Row 1 in a partition has no preceding rows, so its "moving average" value would be the SALES value in the row. The "moving average" value for row 2 in a partition would be the average of the row 2 and row 1 SALES values (there is only one preceding row in that case). The "moving average" values for rows 3 through n in a partition will be the average of a row's SALES value and the SALES values of the two preceding rows in the partition. You could also generate other types of "moving" aggregates, such as moving counts and  moving sums.
  • TIMESTAMP enhancements - A DB2 10 timestamp value can have up to 12 digits of precision for the fractional second part of the value (that gets you to the picosecond level -- trillionths of a second). Additionally, the precision of the fractional second part of a timestamp value can be user-specified, with the range of the user-specified precision being 0 to 12 digits (6 digits is the default).
  • TIMESTAMP WITH TIME ZONE - This new optional specification for the TIMESTAMP data type allows a time zone value to be stored with a timestamp value (the time zone value is an offset from UTC, formerly known as Greenwich Mean Time). To go along with the new WITH TIME ZONE option for the TIMESTAMP data type, there are new special registers (e.g., SESSION TIME ZONE), and the built-in date/time functions (e.g., EXTRACT) have been extended to support TIME ZONE values.
  • Implicit data type casting - In a DB2 10 system, you can concatenate a character string value with a numeric value without having to first explicitly cast the numeric value as a character string -- that casting will be done for you by DB2. Similarly, DB2 10 will let you provide the character string representation of a numeric value as the argument for a built-in function such as FLOOR -- that character string will be implicitly cast as a numeric value by DB2. 
  • Extended indicator variables - Use of extended indicator variables is limited to host variables used to input data to DB2 (i.e., they can be used with host variables specified in INSERT, UPDATE, and MERGE statements); furthermore, extended input variables have to be enabled via the package bind option EXTENDEDINDICATOR(YES) or via the WITH EXTENDED INDICATORS option of PREPARE. An extended indicator value of -5 means that the target column for the associated host variable is to be set to its default value. An extended indicator value of -7 means that the target column for the associated host variable is to be treated as if it had not been specified in the statement.
  • Row permissions and column masks - Defined via CREATE PERMISSION and CREATE MASK, respectively, these are essentially table-level rules that control access to data rows and columns. Row permissions and column masks are better than views when it comes to establishing security as it pertains to row and column access, because they become part of the associated table's definition.
  • Cursors declared WITH RETURN TO CLIENT - When a cursor is declared in a stored procedure with this new option, the result set of that cursor can be retrieved by the "top-level" program that initiated a set of nested stored procedure calls, even if the stored procedure in which the cursor is declared is several levels down from the top-level program (WITH RETURN TO CALLER, formerly the only form of WITH RETURN for a cursor declaration, allows a result set to be retrieved by a program that is "one level up" from the stored procedure in which the cursor is declared, referring to levels of nested stored procedure calls). In talking about the WITH RETURN TO CLIENT option of DECLARE CURSOR, Chris noted that DB2 10 supports nested stored procedure calls up to 64 levels deep (versus a limit of 16 levels in prior releases of DB2).
  
Optimizing DB2 Connect deployments - This session was delivered by Brent Gross, a member of the DB2 for Linux, UNIX, and Windows development team at IBM's Toronto Lab. Brent noted, among other things, that with respect to DB2 clients, you want to go with "the lightest, smallest package possible." He recommends the Data Server Driver Package (aka the DS Driver) in most cases.

Brent also mentioned that a direct DB2 client to DB2 for z/OS server connection is preferred, in most cases, to a set-up that has DB2 clients accessing DB2 for z/OS through a DB2 Connect gateway. The main benefits of direct-to-DB2 connections are 1) simplified network topology and 2) improved performance. Regarding that second point, Brent said that elapsed time improvements of 15-30% have been observed when direct connection of clients to DB2 for z/OS has replaced a set-up that had client-to-DB2 connections going through a DB2 Connect gateway (OLTP workloads see the greatest performance gains).

Brent pointed out that there are some situations in which client connections to a DB2 for z/OS server have to go through a DB2 Connect gateway:
  • When a client-side transaction manager is using two-phase commit implemented through a "dual transport" model (Tuxedo and Encina are two transaction managers that implement two-phase commit in this way).
  • If homogeneous federation is to be used (this refers to the ability to create nicknames across DB2 and Informix servers).
  • When DB2 Connect Enterprise Edition is licensed with the "concurrent users" option.
Brent explained that direct-to-DB2 versus DB2 Connect gateway is a configuration issue, and that you need to be licensed for DB2 Connect in either case. He noted that DB2 Connect Unlimited Edition licensing makes life easier for systems support people because license key information is maintained at the DB2 for z/OS end of things (in other words, the DB2 Connect entitlement check is performed at the mainframe DB2 host, so license keys don't have to be stored at the DB2 client end).

Also in support of the general recommendation that DB2 clients connect directly to a DB2 for z/OS server, Brent pointed out that this approach carries with it no penalties with regard to functionality versus the DB2 Connect gateway alternative (for example, sysplex workload balancing, available for years for Java applications directly connected to DB2, was made available for .NET and ODBC and CLI clients with version 9.5 Fix Pack 3 of the DB2 client code).

For organizations looking to migrate from a DB2 Connect gateway to a direct client-to-DB2 set-up, Brent recommended starting with application servers (versus "fat client" workstations).

A few other items of information provided by Brent during the session:
  • DB2 10 for z/OS enables maximum connection, thread, and timeout values to be specified at the application server level via server profiles.
  • Full support of DB2 10 functionality for DRDA requesters requires the use of Version 9.7 Fix Pack 3a (or above) of DB2 client code.
  • For maximum availability in a DB2 for z/OS data sharing environment, dynamic virtual IP addresses (DVIPAs) should be assigned to DB2 member subsystems, and clients should connect to the data sharing group using the distributed DVIPA that is assigned to the Sysplex Distributor component of z/OS.
Signing off now. I'll post one more IOD 2011 entry within a day or two (and I mean within a BUSINESS day or two -- gotta chill over the weekend).

    No comments:

    Post a Comment