Read an Excerpt
SQL Built-In Functions and Stored Procedures
The i5/iSeries Programmer's Guide
By Mike Faust MC Press
Copyright © 2005 Mike Faust
All rights reserved.
ISBN: 978-1-58347-738-0
CHAPTER 1
SQL on the iSeries
Structured Query Language (SQL) is the industry standard language for database access. While the American National Standards Institute (ANSI) has a defined SQL standard, within that standard, each relational database platform has its own unique "flavor" of SQL. This can include things as simple as command syntax and as complex as supported functions. In this chapter, we'll explore what makes the iSeries implementation of SQL unique. This chapter acts as an introduction to the SQL functionality on the iSeries. If you're already somewhat familiar with that functionality, you may want to skip ahead to chapter 2.
Software Requirements
To use the examples in this book, you'll need to have the DB2 Query Manager and SQL Development Toolkit licensed program installed on your iSeries. This licensed program gives you a handful of useful tools, including the Query Manager report writing tool as well as the precompilers, which allow you to embed SQL code within other programming languages (SQL RPG, for example).
This licensed program also gives you the STRSQL interactive SQL command line utility. This utility comes in handy if you need to run ad hoc queries (i.e., action queries to modify records in a database on the fly). Throughout this book, many of the examples we cover will be executed from within this utility. Later in the book, we'll also be exploring the use of functions and procedures from client applications developed in Microsoft Access, Visual Basic (VB), and Active Server Pages (ASP).
SQL Tools
Throughout this book, we'll use several utilities (both GUI and green -screen) that make it easier to work with SQL statements on the iSeries. The STRSQL command can be executed from an iSeries command line. This command starts the interactive SQL utility. When you enter this utility, press the F4 key to display a list of the available SQL commands that can be executed from within the interactive SQL session. When a SELECT statement is entered, the resulting data can be either displayed to the screen or sent to a printer or output file. This selection is made by pressing F13 from within STRSQL to display the Interactive SQL Session Services menu and selecting option 1 for Current Session Attributes. Figure 1.1 shows the screen that is displayed.
This screen shows that option 1 has been selected for the "SELECT output" field. If you select option 2, you'll be prompted for the printer name. If you select option 3, you'll be presented with parameters for the file and library to contain the output data. Notice that you can specify other options here as well, such as the data and time format and the naming convention.
iSeries Navigator also includes an SQL tool. You can access this utility by first expanding the Databases option and clicking on your system name and then selecting Run an SQL Script link from the right side of the lower pane in iSeries Navigator. The window shown in Figure 1.2 will be displayed.
You can use this screen in much the same way that you'd use STRSQL. The Examples drop-down list contains templates of SQL statements to perform common tasks. Later, we'll examine using this application to create functions and procedures. As we examine sample SQL statements throughout this book, either of these two tools can be used.
Exploring SQL Statements
Let's take a look at the components and functions of the various SQL statements.
Starting with the Basics
Probably the single most common individual SQL statement is the SELECT statement, which allows you to define a set of data to be retrieved from a database. Figure 1.3 shows the basic structure of a SELECT statement.
Within this example, the FROM clause is required. However, the other clauses shown are optional. The field_list value would be replaced by a comma-delimited list of fields or simply an asterisk to return all fields from the defined table(s). The table_definition value can be a list of joined tables, along with a definition of how they are to be joined, or simply a physical or logical file name. (We'll examine joining tables a little later on.) The condition_expression on the WHERE clause determines exactly which records will be returned from the defined table or tables. The group_field_list used with the GROUP BY clause indicates that the data set returned is to be summarized by the list of fields provided. When a GROUP BY clause is specified, all of the column names appearing in the field_list either must appear in the group_field_list or must be a column function (SUM, AVG, MIN, MAX, etc.). The sort_field_list defined on the ORDER BY clause defines the sort order for the result set. The group_level_conditions defined on the HAVING clause is used in much the same way that the WHERE clause is used except that the condition defined here is checked only after a GROUP BY is processed. This allows you to use a column function as part of a condition.
As you can see, the SELECT statement can take on many forms. The code below shows a very simple form of a SELECT statement.
SELECT
FROM QSYS2.SYSTABLES
This example uses the SQL naming convention in place of the SYS file naming convention, which would use the format library/file. This statement would return all columns (or fields) from the file SYSTABLES in library QSYS2. A slightly more complex version of this statement could be used to select only required fields under specified conditions, as shown below:
SELECT TABLE_NAME, TABLE_TYPE
FROM QSYS2.SYSTABLES
WHERE TABLE_SCHEMA = 'QSYS'
ORDER BY TABLE_NAME
This example would display only the columns TABLE_NAME and TABLE_TYPE for records with a TABLE_SCHEMA value of 'QSYS'. The result set would be sorted by the TABLE_NAME column. To further elaborate on this example, we can add grouping clauses to create a summary result set, as shown below:
SELECT TABLE_SCHEMA, COUNTY(*) AS TABLE_COUNT, SUM(ROW_LENGTH) AS
TOTAL_ROW_LEN
FROM QSYS2.SYSTABLES
WHERE TABLE_TYPE = 'P'
GROUP BY TABLE_SCHEMA
HAVING COUNT(*)>10
This example would display library names (TABLE_SCHEMA) and the number of physical files within each library, along with the total length of rows between all tables within the library. The HAVING clause is used to include only libraries containing more than 10 physical files.
A slight variation of the SELECT statement can be found in SELECT INTO. This statement is used to pass values from a defined SELECT statement into host variables in a stored procedure or embedded SQL within an RPG program.
When using SELECT INTO, no more than one row can be returned at a time. This means that your statement must return results containing unique results.
While the SELECT statement is used to retrieve data from a database, other statements are used to manipulate a database. The DELETE statement, as its name would suggest, is used to remove rows from a table. Figure 1.4 shows the syntax used with the DELETE statement.
This statement takes on a somewhat simpler form than the SELECT statement, but the FROM and WHERE clauses function exactly as they do with the SELECT statement. Below is a sample DELETE statement.
DELETE FROM MYLIB.MYFILE WHERE RCSTAT = 'D'
This statement shows the code that would be used to remove all records from the table MYFILE where the field RCSTAT is equal to 'D'.
The UPDATE statement takes on a similar structure to the DELETE statement. Figure 1.5 shows what this structure looks like.
You'll notice that we don't precede the table_definition value with the FROM clause here. The SET clause is used to define the field to be modified and the value to place in that field. As it does in the SELECT and DELETE statements, the WHERE clause filters which records will be affected.
The INSERT INTO statement adds new records to a table. Figure 1.6 shows the syntax used with this statement.
The table_definition value identifies the table into which the records will be inserted. This value can optionally be followed by a list of fields to be populated. If this list is not supplied, it is assumed that values will be provided for all fields within the destination table. The values to be inserted can by supplied using one of two methods:
1. A VALUES list, containing the specific values to be inserted into a single row
2. A SELECT statement that will return one or more rows to be inserted into the table
The number of values specified must match the number of fields to be inserted into. This is true for both the VALUES clause and the SELECT statement option.
Slightly More Complex
Each statement examined here becomes somewhat more complicated when you take multiple files into consideration. The method used for retrieving data from multiple tables is different for each of the statements we've examined so far.
The SELECT statement uses the JOIN clause to define multiple tables as its data source. The JOIN clause has several modifiers that define the type of join to be used. Table 1.1 below breaks down each of these options and describes how each one is used.
Which of these joins you use depends on what your intentions are. An INNER JOIN would be used in circumstances where you know that matched records will exist in the tables in both sides of the join or where you want to see only the matched records. This means that if the table on the left side of the expression does not find a matching value in the table on the right side, the record will be omitted. This also means that if a value exists in the table on the right side and doesn't have a matching value in the table on the left, the record will also be omitted. A good example of how this is used would be an order header file with line detail records. Figure 1.7 shows an example of this type of statement.
This example would return all fields from ORDHEADR and all fields from ORDDETL where the field ORDNO is matched between the two tables.
Table 1.2 shows a graphic example of how the two tables would combine into the resulting data set.
As you can see in this example, when the tables are combined, the field values from ORDHEADR are repeated when multiple matching values exist in ORDDETL. You'll also notice that the row in ORDHEADR where ORDNO is 12348 does not have any matching values in ORDDETL; the row is omitted from the result set.
A left join includes all of the records from the table specified on the left side of the join and any matched records from the right side. This means that even if a record in the table on the left side of the join doesn't have a matching record in the table on the right side, the record is still included. An example of a situation where this might be used would be to retrieve comment records to be matched to a header record. In this type of situation, the comment records may or may not exist. Figure 1.8 shows an example of this.
This example will include all rows from the ORDHEADR table as well as any matching records from the COMMENTS table. Table 1.3 below shows how this data is combined.
As you can see in this example, any records in the table on the left side of the expression that do not have any matching records on the right side of the expression will still appear in the result set. However, the field values from the right side of the expression will be null. A right join works in an opposite fashion to a left join. Figure 1.9 shows an example of the same statement coded as a right join.
The results returned by this expression will be identical to the results in Table 1.3, except that the field order will appear differently because the primary table is now on the right side of the expression.
The LEFT and RIGHT EXCEPTION JOIN expressions are used to find mismatched records between a primary table and a secondary table. Figure 1.10 shows an example of a SELECT statement using a LEFT EXCEPTION JOIN expression.
This example will display any records in the CUSTOMERS table that do not have any matching records in the SALES table. Note the letter following each of the table names. This represents an alias for each table, which allows us to use a simpler name to refer to the table within the rest of our statement. This can be especially useful when executing statements between tables with many common field names.
The CROSS JOIN expression combines all records from one table with all of the records from another. Because of this, a cross join doesn't require that join fields be defined. Each time a record is read from the table on the left side of the expression, all of the records from the table on the right side of the expression are returned. Figure 1.11 shows how a CROSS JOIN expression is used.
When this statement is executed, it can result in a large volume of data being returned. Table 1.4 shows an example of how this data is combined.
As this example illustrates, the number of records returned by this statement would be equal to the number of records in the first table multiplied by the number of records in the second table.
Each of the join types we've examined here serves its own unique purpose. It's also possible to execute an UPDATE statement that uses multiple files; however, this is not accomplished using the JOIN clause. A sub-SELECT must be used to retrieve data from a secondary table or table. There are two distinct scenarios in which you'd use a sub -SELECT: within a criteria expression (WHERE clause) or as part of the value to be updated. In cases where the criteria or a portion of the criteria for records to be updated exists within another table, the solution is to replace the criteria value with an SQL SELECT statement. A good example would be when you want to update a value in an order line detail file based on a value from the order header, as shown in Figure 1.12.
In this example, the sub-SELECT will return only records in which the order date from the header is less than 20030101. The WHERE clause from the UPDATE statement will find a match only for those records returned by the sub-SELECT shown. The WHERE clause in the sub -SELECT is able to use values from the file you are updating — in this case, the field ODORDR comes from the file MYLIB.ORDLINS, which is what you're updating. The result is that your update is based on criteria contained in a file other than the file to be updated.
When performing an SQL UPDATE, there are often times when not only a criterion is based on values from another file, but also the value of the field to be updated. A good example would be updating a total quantity on a header record by summarizing the values from the line details. You can do this by using a sub-SELECT in a method similar to that used earlier. Figure 1.13 shows an example of this type of SQL UPDATE.
This example uses the sub-SELECT as the target for the SET clause. When the statement is run, the value of the field ODOQTY in the file ORDLINS will be summarized for all records where the value of ODORDR is equal to the field OHORDR from ORDHEAD (the file to be updated). Again, you use a value from the "primary" file, ORDHEAD, within the WHERE clause of the sub-SELECT.
It's important to remember that the sub-SELECT used must return only one value for each record in the file to be updated, meaning that there must be a one-to-one relationship between each record in the file to be updated and the value returned by the sub-SELECT. If not, you'll get an SQL0811 error stating that the result of the SELECT is more than one row. The same would be true when using the sub-SELECT as part of the WHERE clause. This is not a problem in this example because the SUM() function summarizes all records within the ODORDR file that match the criteria.
Both methods can be combined to use the sub-SELECT as part of the SET clause and the WHERE clause. This method can be useful when, for example, you want to recalculate an order total in an order header file by summarizing the line details from the order detail file for customers in a specific region based on a value from a customer master file. Figure 1.14, a modified version of an earlier example, performs this task.
The WHERE clause used in this SQL statement may be a bit confusing at first. The sub-SELECT statement will return the customer number from the file CUSTMAST only if the value of CMREGN is 'NE'. Placing this sub-SELECT inside of the IN predicate generates a list of values, which are compared to the value of OHCUSN. The sub- SELECT used with the SET clause summarizes the value of the field ODOQTY for all records where the value of the field ODORDR matches the value of OHORDR from ORDHEAD. When this statement is executed, the order total summary will be recalculated for all customers in the region 'NE'. In this example, we are actually using values from two separate tables in addition to the table being updated.
(Continues...)
Excerpted from SQL Built-In Functions and Stored Procedures by Mike Faust. Copyright © 2005 Mike Faust. Excerpted by permission of MC Press.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.