The Minimum You Need to Know to Be an OpenVMS Application Developer
By Roland Hughes
Logikal Solutions
Copyright © 2006 Roland Hughes
All rights reserved.
Chapter One
DCL and Utilities We Need
2.1 DCL for Application Development
Operating system command languages usually aren't used for application development. Some operating systems include such a powerful command language that you can develop entire systems in it if you wish. The real drawback is such a system won't be running in compiled mode; it will be interpreted. If your application or system is to be heavily used, it is advisable to look at a compiled language rather than an interpreted one.
DCL (Digital Command Language) is an incredibly robust command interface. You can create indexed files and perform IO on them directly in the language. If you wish to code escape sequences in the command file, you can even do some nice screen formatting. DCL has one drawback with respect to indexed files: it cannot effectively read or write floating point data. Yes, if you know exactly the bit pattern you need, it can be done; but it is too confusing for most people to follow in the code. You will understand once you see what is involved in just creating long integers for the primary data file.
I am going to develop an import program completely in DCL. We are going to develop this program within the confines of DCL on the OpenVMS platform for the following reasons:
It allows me to show you some more tools and functionality.
If you end up at a client site that needs something written but doesn't own a compiler, you will have at least some idea of how to go about it.
Some shops use DCL heavily, and will code import routines in DCL rather than compile a program.
2.2 FDL and Our Indexed Files
Please take a moment to glance at our sample application file layouts found in Section 1.5. You will notice that the layouts of the stats files are exactly the same. The reason we have two separate files is that the input source is different. The second set of stats is calculated only from the column mega_no.
If you happen to know the complete syntax of FDL (File Definition Language), you can simply fire up your favorite text editor and key in a definition. The rest of us need to use the EDIT/FDL utility. It is a good habit to put the FDL extension on any FDL you create. You could name the file FRED.SMITH if you wanted, but it would be hard for someone to find in the wee hours of the morning while answering a production support call.
$ edit/fdl drawing_data.fdl
Parsing Definition File
DKA1200:[HUGHES.MEGA_ZILLIONARE]DRAWING_DATA.FDL; will be created.
Press RETURN to continue (^Z for Main Menu)
After hitting return you will see a menu much like the following:
OpenVMS FDL Editor
Add       to insert one line into the FDL definition
Delete    to remove one line from the FDL definition
Exit      to leave the FDL Editor after creating the FDL file
Help      to obtain information about the FDL Editor
Invoke    to initiate a script of related questions
Modify    to change an existing line in the FDL definition
Quit      to abort the FDL Editor with no FDL file creation
Set       to specify FDL Editor characteristics
View      to display the current FDL Definition
Main Editor Function (Keyword)[Help] :
The option we want is "Invoke". After entering it and hitting return you will see a screen much like the following. I have already chosen Indexed and taken the defaults for the two questions.
Script Title Selection
Add_Key     modeling and addition of a new index's parameters
Delete_Key  removal of the highest index's parameters
Indexed     modeling of parameters for an entire Indexed file
Optimize    tuning of all indices' parameters using file statistics
Relative    selection of parameters for a Relative file
Sequential  selection of parameters for a Sequential file
Touchup     remodeling of parameters for a particular index
Editing Script Title (Keyword)[-] : Indexed
Target disk volume Cluster Size (1-2Giga) :
Number of Keys to Define (1-255) :
The following is the rest of the edit session:
Key 0 Graph Type Selection
Line    Bucket Size vs Index Depth as a 2 dimensional plot
Fill    Bucket Size vs Load Fill Percent vs Index Depth
Key     Bucket Size vs Key Length vs Index Depth
Record  Bucket Size vs Record Size vs Index Depth
Init    Bucket Size vs Initial Load Record Count vs Index Depth
Add     Bucket Size vs Additional Record Count vs Index Depth
Graph type to display (Keyword)[Line] :
Number of Records that will be Initially Loaded into the File (0-2Giga)[-] : 1000
(Fast_Convert NoFast_Convert RMS_Puts) Initial File Load Method (Keyword)[Fast] : RMS
Will Initial Records Typically be Loaded in Order by Ascending Primary Key (Yes/No)[No] : yes
Number of Additional Records to be Added After the Initial File Load (0-2147482645) :
Key 0 Load Fill Percent (50-100) :
(Fixed Variable) Record Format (Keyword)[Var] : FIX
Record Size (1-32224)[-] : 32
(Bin2 Bin4 Bin8 Int2 Int4 Int8 Decimal String Collated Dbin2 Dbin4 Dbin8 Dint2 Dint4 Dint8 Ddecimal Dstring Dcollated) Key 0 Data Type (Keyword)[Str] :
Key 0 Segmentation desired (Yes/No)[No] :
Key 0 Length (1-32)[-] : 8
Key 0 Position (0-24) :
Key 0 Duplicates allowed (Yes/No)[No] :
File Prolog Version (0-3) :
Data Key Compression desired (Yes/No)[Yes] :
Data Record Compression desired (Yes/No)[Yes] :
Index Compression desired (Yes/No)[No] :
I did not include the line graph which gets displayed nor did I include the entry of FD to finish the design.
Text for FDL Title Section (1-126 chars)[null] : Data File for mega zillionare application
Data File file-spec (1-512 chars)[null] :
(Carriage_Return FORTRAN None) Carriage Control (Keyword)[Carr] : FORTRAN
Emphasis Used In Defining Default: ( Flatter_files )
Suggested Bucket Sizes:            ( 3 3 12 )
Number of Levels in Index:         ( 1 1 1 )
Number of Buckets in Index:        ( 1 1 1 )
Pages Required to Cache Index:     ( 3 3 12 )
Processing Used to Search Index:   ( 8 8 10 )
Key 0 Bucket Size (1-63) :
Key 0 Name (1-32 chars)[null] : draw_dt
Global Buffers desired (Yes/No)[No] :
The Depth of Key 0 is Estimated to be No Greater than 1 Index levels, which is 2 Total levels.
Press RETURN to continue (^Z for Main Menu)
Notice that I did enter a title. Any time you create an FDL using this utility you should enter some kind of title to let the reader know this was a hand-generated FDL and not one generated by analyzing the file. (We will cover that later, after we populate the file.)
Since we will also be doing some FORTRAN development later in this book I chose to use FORTRAN control instead of the normal Carriage_Return.
Whenever you create an FDL by hand, either with this tool or by editing an FDL created from analyzing a file, name the keys! You will see I named our key with the field name we will be using. Get in the habit of doing this. Systems on OpenVMS have a tendency to stay in place 20-30 years. When somebody wants you to do something new with this file you need a quick method of looking up what the keys are on it to see if you can get the data. If they ask five years after you originally created the file you will not remember the key structure.
After hitting return and typing EXIT at the resulting menu the process is complete. Here is the resulting FDL file.
TITLE   "Data File for mega zillionare application"

IDENT   "30-NOV-2004 14:49:30 OpenVMS FDL Editor"

SYSTEM
        SOURCE                  "OpenVMS"

FILE
        ORGANIZATION            indexed

RECORD
        CARRIAGE_CONTROL        FORTRAN
        FORMAT                  fixed
        SIZE                    32

AREA 0
        ALLOCATION              108
        BEST_TRY_CONTIGUOUS     yes
        BUCKET_SIZE             3
        EXTENSION               27

AREA 1
        ALLOCATION              3
        BEST_TRY_CONTIGUOUS     yes
        BUCKET_SIZE             3
        EXTENSION               3

KEY 0
        CHANGES                 no
        DATA_AREA               0
        DATA_FILL               100
        DATA_KEY_COMPRESSION    yes
        DATA_RECORD_COMPRESSION yes
        DUPLICATES              no
        INDEX_AREA              1
        INDEX_COMPRESSION       no
        INDEX_FILL              100
        LEVEL1_INDEX_AREA       1
        NAME                    "draw_dt"
        PROLOG                  3
        SEG0_LENGTH             8
        SEG0_POSITION           0
        TYPE                    string
Note the areas highlighted in grey. The title will actually survive with the indexed file created by this FDL and come out in any FDL files created by the ANALYZE command. Our record size is 32 bytes (six integer fields at four bytes each plus one character field of eight bytes). The name for the key is actually listed in the FDL along with its data size, offset and data type.
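The record size arithmetic can be checked with a small sketch. This is Python used purely as an illustration, not DCL or anything that runs against RMS. It assumes the 8-byte key sits at offset 0 (as SEG0_POSITION and SEG0_LENGTH state), that the six integers follow it as 4-byte little-endian values, and that the key holds an 8-character text value; the field order after the key and the sample values are hypothetical.

```python
import struct

# Assumed layout: 8-byte string key "draw_dt" at offset 0, then six
# 4-byte little-endian signed integers. "<" turns off padding so the
# size is exactly the sum of the fields.
RECORD_FORMAT = "<8s6i"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)

def pack_record(draw_dt, numbers):
    """Pack one fixed-length record; draw_dt is space-padded to 8 bytes."""
    if len(numbers) != 6:
        raise ValueError("expected exactly six integer fields")
    key = draw_dt.encode("ascii").ljust(8, b" ")
    if len(key) != 8:
        raise ValueError("draw_dt must fit in 8 bytes")
    return struct.pack(RECORD_FORMAT, key, *numbers)

# Hypothetical sample values, for illustration only.
record = pack_record("20041130", [5, 12, 23, 34, 45, 7])
print(RECORD_SIZE, len(record))   # 32 32
print(record[:8])                 # b'20041130'
```

Eight bytes of key plus six times four bytes of integers gives the 32 that was entered at the Record Size prompt.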
We won't create FDLs for the stats files now. There will only be 52 records each in them. No matter how bad the defaults are for an indexed file created from any of the languages on this platform, they will work for that.
2.3 Indexed File Lore
I need to cover some ancient lore about RMS indexed files. We won't be creating indexed files of this type with our simple little application, but as you go out into the world you will encounter multi-typed files at a good many installations. The systems will have been in place a very long time and you need to know how to handle this situation. I will wager many of you were not old enough to shave when these systems were written. If you graduated from what passes for college courses these days you didn't even cover indexed files, only relational model tables.
RMS stores the indexes in the same file along with the data. This is unlike most of the PC file systems you may have encountered. The exact mechanics of this I will not cover, but feel free to locate the Records Management System manuals on-line if you really wish to know the internals of it all. With PC indexed file systems you typically have one external file for each key. If somebody deletes that file, you have to hope there is a utility included to rebuild the index from the data. Unlike the PC indexed file systems, RMS actually does record level locking and can manage that locking for every user on the cluster. If you turned on Journaling when the file was created, each user can be adding, updating and deleting records from within their own distributed transaction, and none of it will actually be written to the file until they commit the transaction.
Many of the so-called "multi-user" PC systems are really single user systems. They do page level locking. Even worse, they do page level updates. This tragedy is compounded by the tools which come with them. Because they lock entire pages, the tools don't lock anything until the actual write needs to occur. The flaw is that they only check for changes in the one record the user wished to change before doing the lock-commit-unlock cycle, which rewrites the entire page. The poor schmoe who updated a different record on that same page just a few minutes earlier loses all his/her changes and doesn't even know it happened. An even sadder note is the rumor that some of these file systems got ported to other platforms. In an effort to combat this problem, some designers change the page size to match their record size (or vice versa), tanking the performance, then screaming they need better hardware when what they really need is better design.
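The lost update described above can be simulated in a few lines. This is a toy Python model, not any real PC file system: the page is a dictionary, the "users" are private snapshots, and all names and values are hypothetical.

```python
import copy

# A "page" holds several records; the file system can only rewrite whole pages.
disk_page = {"rec_a": "balance=100", "rec_b": "balance=200"}

def read_page():
    """Each user gets a private, unlocked snapshot of the page."""
    return copy.deepcopy(disk_page)

def naive_update(snapshot, key, new_value):
    """Lock-commit-unlock as the flawed tools do it: verify only the one
    record being changed, then rewrite the entire page from the snapshot."""
    if disk_page[key] != snapshot[key]:
        return False                  # someone changed *this* record
    snapshot[key] = new_value
    disk_page.clear()
    disk_page.update(snapshot)        # whole-page rewrite
    return True

user1 = read_page()
user2 = read_page()
first_ok = naive_update(user1, "rec_a", "balance=150")
second_ok = naive_update(user2, "rec_b", "balance=250")

print(first_ok, second_ok)  # True True
print(disk_page)            # {'rec_a': 'balance=100', 'rec_b': 'balance=250'}
```

Both updates report success, yet the change to rec_a is gone because the second writer's stale snapshot overwrote it. Record level locking avoids this by never rewriting a record the writer did not lock.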
If you wish to measure the robustness of the indexed file system you are looking at, you need to look at how many different types of records can be in the same file. Yes, you read that correctly. It is something you will see on many a midrange and mainframe system. Indexed file systems which allow for only one type of record to be in a file typically took shortcuts in their design which will burn you when it hurts the most. (We are talking about indexed file systems here, not fully relational databases.)
When it comes to multi-typed records in indexed file systems, you will find it doesn't matter much who wrote the system or on what platform, they tend to all have the same general layout. We will take a look at some general layouts for customer profile and invoice files.
Key 0:
    Customer number  char 10
    Rec_type         char 2
    Sequence_no      char 2
Generic map with filler at the end for some amount.
The very first record in the file will be a customer number of all spaces, a record type of "00" and a sequence of spaces. This is what is known as the control record. The primary field on it will be either the next available or the last used customer number. Each implementation may have several other control fields on here. No new customers get added to this file without first locking this record. The rest of the record types will be much along the following lines:
10  Customer header
11  customer notes        uses sequence number
20  bill to address
30  ship to address       uses sequence number
31  shipping comments     same sequence as ship to
                          One line comment for each shipping address
                          Such as "deliver only on Tuesdays".
40  general credit info
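A short sketch shows why the control record always lands first in such a file. It is Python for illustration only and assumes the key is compared as a plain ASCII string in ascending order; the helper name make_key and the sample customer number are hypothetical.

```python
def make_key(customer_no, rec_type, sequence=""):
    """Build the 14-character primary key: char 10 + char 2 + char 2,
    each segment space-padded on the right."""
    if len(customer_no) > 10 or len(rec_type) > 2 or len(sequence) > 2:
        raise ValueError("field too long for its key segment")
    return customer_no.ljust(10) + rec_type.ljust(2) + sequence.ljust(2)

# Control record: blank customer number, type "00", blank sequence.
CONTROL_KEY = make_key("", "00")

keys = [
    make_key("0000001234", "30", "01"),   # ship to address
    make_key("0000001234", "10"),         # customer header
    make_key("0000001234", "31", "01"),   # shipping comment
    CONTROL_KEY,
    make_key("0000001234", "11", "01"),   # customer note
    make_key("0000001234", "20"),         # bill to address
]

for key in sorted(keys):
    print(repr(key))
```

Because a space sorts below every digit in ASCII, the all-blank customer number puts the control record ahead of every real customer, and each customer's records then follow in record type order.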
Invoice:
Key 0:
    Invoice number   char 10    15 in systems written later.
    Rec_type         char 2
    Sequence_no      char 2     Sometimes called line number
Generic map with filler at the end for some amount.
Once again the first record in the file will have a blank invoice number, record type "00" and blank sequence or line number. It is again the control record. In systems that allow for segmented invoice numbers it will contain beginning, ending and next available invoice number for each range. Many places segmented their invoice numbers to be grouped by general invoices, credit invoices, foreign invoices and foreign credits. It is worthy of note that some systems reverse the order of record type and line number. These systems came about later in life. It is a trade-off between insertion speed and generation speed. A few systems even changed the key to be invoice number, record type, sequence number and line or sub-sequence number. It depended upon how many comment lines they allowed with each detail line.
10  Invoice header
20  Bill to information
30  Ship to information
40  Carrier information
60  Invoice detail
61  Detail comment
62  Credit or discount line
70  Credit or discount summary
80  Invoice summary
The 70 and 80 record types are not that common. It depends upon how full the invoice header record was at the time of initial system design. Many of these systems were written as "canned" systems to be sold in customized form to each individual client. You will notice there are quite a few gaps in the record type values. This was to allow for new record types with each customer. On some systems the 40 record can really be the summary record and on other systems the summary information (including line count) was stored on the invoice header.
Most of these systems exploited the ability of RMS to store variable length records in a single indexed file. Every time a record was written to one of these files it was done so with a length supplied. You will find a few of these systems blundered badly trying to improve efficiency. They made the records 512 bytes to match the disk block size. The blunder was that they forgot about the extra byte or more added to the end of each record. Remember when we created the FDL and it asked for "Carriage return, FORTRAN, or none"? Those extra few bytes actually get written with the record to the file so you need to allow for them when trying to size each record to a disk block size. Of course, sizing your records to be exactly one disk block in size only does any good if you have data compression turned off when the file is created.
There was a method to the madness with these file designs. You did one keyed hit to the beginning of the section you needed, then sequentially read a handful or so of records, and you had the complete information. You must remember that back when many of these designs were laid out there was a limit of 12 open IO channels and two of those were taken by SYS$OUTPUT and SYS$ERROR. Once the VAX hardware came out the available channel count increased dramatically, but the systems which were originally designed on PDP hardware recompiled and went on their way under OpenVMS.
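The "one keyed hit, then read sequentially" pattern can be sketched as follows. This is a Python stand-in, not RMS: a sorted list plays the part of the indexed file, bisect plays the part of the keyed lookup, and every name and value is hypothetical.

```python
import bisect

def invoice_key(invoice_no, rec_type, line_no=""):
    """Invoice number char 10 + record type char 2 + line number char 2."""
    return invoice_no.ljust(10) + rec_type.ljust(2) + line_no.ljust(2)

# Stand-in for the indexed file: (key, payload) pairs in ascending key order.
records = sorted([
    (invoice_key("", "00"), "control record"),
    (invoice_key("0000000041", "10"), "header for 41"),
    (invoice_key("0000000041", "80"), "summary for 41"),
    (invoice_key("0000000042", "10"), "header for 42"),
    (invoice_key("0000000042", "20"), "bill to"),
    (invoice_key("0000000042", "30"), "ship to"),
    (invoice_key("0000000042", "60", "01"), "detail line 1"),
    (invoice_key("0000000042", "61", "01"), "comment on line 1"),
    (invoice_key("0000000042", "60", "02"), "detail line 2"),
    (invoice_key("0000000043", "10"), "header for 43"),
])
keys = [key for key, _ in records]

def read_invoice(invoice_no):
    """One keyed lookup (first key >= invoice number), then sequential
    reads until the invoice number changes."""
    wanted = invoice_no.ljust(10)
    start = bisect.bisect_left(keys, wanted)
    result = []
    for key, payload in records[start:]:
        if key[:10] != wanted:
            break
        result.append((key[10:12], payload))
    return result

for rec_type, payload in read_invoice("0000000042"):
    print(rec_type, payload)
```

A single positioned read followed by a handful of sequential reads returns the whole invoice through one open file. With record type ahead of line number in the key, both detail lines come back before the detail comment.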
Some really good software design came out of systems written to use these file layouts. The concept of IO routines was quickly adopted. This was a bundle of record maps and callable functions which were put into the application library. They had standard interfaces and had all conceivable errors trapped inside of them returning only a single error result and a populated error map when bad things happened. External IO routines dramatically cleaned up the readability of source code. People who learned to code during the last decade probably cannot imagine any other way, but it was a novel concept back then. Most programs had an ugly mass of error handling code immediately following each IO operation which made the code incredibly difficult to read for program flow. You had to have program flow charts back then to have any hope of understanding a module. In fact, it is this very argument that keeps the "ON ERROR GOTO" clause widely used in BASIC programming.
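The shape of such an IO routine can be sketched like this. It is a Python illustration of the idea only, not a period implementation: the status values, the routine name customer_read, and the in-memory "file" are all hypothetical.

```python
IO_SUCCESS = 0
IO_NOT_FOUND = 1
IO_FAILURE = 2

# Stand-in for the indexed file, keyed by customer number.
_customer_file = {"0000001234": {"name": "ACME", "credit_limit": 5000}}

def customer_read(customer_no, record_map, error_map):
    """Fill record_map on success; on failure fill error_map instead.
    Always returns a single status value; no error escapes the routine."""
    error_map.clear()
    try:
        record = _customer_file[customer_no]
    except KeyError:
        error_map.update(routine="customer_read", key=customer_no,
                         text="record not found")
        return IO_NOT_FOUND
    except Exception as exc:          # anything else is trapped here too
        error_map.update(routine="customer_read", key=customer_no,
                         text=str(exc))
        return IO_FAILURE
    record_map.clear()
    record_map.update(record)
    return IO_SUCCESS

# The caller's program flow stays readable: one call, one status check.
record, errors = {}, {}
status_found = customer_read("0000001234", record, errors)
status_missing = customer_read("9999999999", {}, errors)
print(status_found, record["name"])     # 0 ACME
print(status_missing, errors["text"])   # 1 record not found
```

All of the error handling lives inside the routine, so the calling program checks one result instead of carrying a block of error code after every IO statement.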
Excerpted from The Minimum You Need to Know to Be an OpenVMS Application Developer by Roland Hughes Copyright © 2006 by Roland Hughes. Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.