Wednesday, January 4, 2012

A MIN, a VIEW, a UNION ALL, and a DB2 SQLSTATE Mystery Solved

Last month, a DBA at an organization with which I work sent me a query that was getting a puzzling result. The query executed successfully in their DB2 for z/OS environment, generating the correct and desired result set; however, it was also getting an SQL warning code on execution. The returned SQLSTATE, 01003, indicated that "null values were eliminated from the argument of an aggregate function." Indeed, the query did contain, in a subquery predicate, an aggregate function (MIN). Here, though, was the puzzling part: the argument of the MIN was a column of a view (I'll refer to it as COL_X), and it didn't contain any NULL values. It COULDN'T contain any null values, as the view was defined on a UNION ALL of two very simple, single-table SELECT statements, and the column in question was defined as NOT NULL in both of the tables involved in the UNION ALL. Just to make sure, the DBA executed a SELECT COUNT(*) with a WHERE(COL_X) IS NULL predicate against the view, and got the expected result: 0. She even added the predicate AND COL_X IS NOT NULL to the subquery with the MIN and STILL got the SQLSTATE warning about null values being eliminated from the argument of the aggregate function.

How could it be that DB2 was telling us that null values were being eliminated from the argument of the MIN function, when that argument contained no null values? The DBA and I racked our brains over this, but could not come up with an answer. I turned for help to a friend in IBM's DB2 for z/OS development organization, and he got us onto the path toward resolution by asking one simple question: "Is it possible that one of the legs [i.e., one of the SELECT statements] of the UNION ALL returns no rows?" Here's why that matters: as it turns out, when the MIN aggregate function is applied to a UNION ALL, it will split into a MIN for each leg of the UNION ALL, and a "parent" MIN to generate the final result. If a leg of the UNION ALL generates an empty result set, the "split" MIN applied to that leg will return a null value (from the DB2 for z/OS SQL Reference: "If the [MIN] function is applied to an empty set, the result is the null value"). In that case, the "parent" MIN will have a null value to evaluate, and as "the function is applied to the set of values derived from the argument values by the elimination of null values" (again quoting from the SQL Reference, with emphasis added), that null value will be eliminated from consideration and the 01003 SQLSTATE code will be returned.

I posed to the DBA the question about the possibility of a leg of the UNION ALL view returning an empty set, and she found that, sure enough, given the predicates applied to the view in the subquery containing the MIN, one leg would in fact return no rows.

Mystery solved. Still, I was curious about the broader question: would the same SQLSTATE warning code be returned for other aggregate functions applied to a UNION ALL with an empty-set leg (and keep in mind: the 01003 SQLSTATE is FYI information -- not an indication of an error -- and the result set returned from the query generating this code is as it should be)? And what if the aggregate function were applied to a UNION DISTINCT with an empty-set leg (and DISTINCT, of course, is the default for UNION, and is in effect unless UNION ALL is specified)? On DB2 9 and DB2 10 systems, I checked this out and found that the 01003 SQLSTATE warning code (along with the correct result) was returned when MAX, AVG, SUM, STDDEV, and VARIANCE were applied to a UNION ALL with an empty-set leg. The 01003 was not received when the COUNT, COUNT_BIG, CORRELATION, and COVARIANCE aggregate functions were applied to the UNION ALL with the empty-set leg. When any of these aggregate functions were applied to a UNION DISTINCT view, the 01003 SQLSTATE warning code was not received (I did not test this with the XMLAGG aggregate function).

Note that even when applying an aggregate function such as MIN or AVG to a UNION ALL that has an empty-set leg, you might not see the 01003 SQLSTATE code, depending on the tool you're using to issue the query. That's because some tools, such as the DB2-supplied SPUFI, report warning as well as error SQLSTATE codes associated with statement execution, while some others, including some PC-based tools, report (by default) only error SQLSTATE codes, and as I mentioned previously, the 01003 SQLSTATE is FYI information, not error information. 

So, there you have it. Thanks to an inquiring DBA, I learned something new. I hope that you've learned something via this blog entry. Happy New Year!

3 comments:

  1. Thanks for the detailed post. I was encountering the same SQL warning message but not from within a UNION ALL clause. It was from a Group By and MAX combination done over some Case-When-Then derived columns from a table function. The logic was attempting to remove null values from an aggregation set.

    The solution was to add [Else '' ] to the Case logic so that nulls would not be injected from the Cross Join in the first plave. This allowed the MAX to do what it was supposed to... Return the specified element for a given Parsed_Table.Row_Number



    Select

    Parsed_Table.Row_Number
    as Ordinal ,

    -- Define Data Type and Precision of each ordered column to be
    -- returned from the parsed table based on the COLUMN_ORDINAL.

    Char( Max ( Case
    When Parsed_Table.Column_Number = 1 -- COLUMN_ORDINAL_PRODUCT_LINE
    Then Element
    Else ''
    End ) , 3 )
    as Product_Line ,

    Char( Max( Case
    When Parsed_Table.Column_Number = 2 -- COLUMN_ORDINAL_ITEM_NUMBER
    Then Element
    Else ''
    End ) , 14 )
    as Item_Number ,

    Dec( Max ( Case
    When Parsed_Table.Column_Number = 3 -- COLUMN_ORDINAL_QUANTITY
    Then Element
    Else ''
    End ) , 5 , 0 )
    as Quantity

    From (

    -- Select the Cartesian product using a Cross Join of
    -- the GP001TF_PARSE_DELIMITED_ELEMENTS Table Function
    -- results for the Parsed_Rows and Parsed_Columns.
    Select

    Parsed_Rows.Ordinal
    as Row_Number ,
    Parsed_Columns.Ordinal
    as Column_Number ,
    Parsed_Columns.Element
    as Element

    From Table( GP001TF_PARSE_DELIMITED_ELEMENTS (
    'FRA|PH8A|5$FRA|PH5|10$FRA|PH3506|15' , -- in_String ,
    '$' ) -- in_Row_Delimiter )
    ) As Parsed_Rows

    Cross Join Table( GP001TF_PARSE_DELIMITED_ELEMENTS (
    Parsed_Rows.Element ,
    '|' ) -- in_Column_Delimiter )
    ) As Parsed_Columns

    ) As Parsed_Table -- combined from Parsed_Rows and Parsed_Columns

    Group By
    Parsed_Table.Row_Number

    Order By
    Parsed_Table.Row_Number
    ;

    ReplyDelete
    Replies
    1. Thanks for posting the information about this scenario, Matthew. Interesting stuff.

      Robert

      Delete
  2. Thank you Robert, this helped me.

    -Walt Grace

    ReplyDelete