The term “sequential file” refers to the manner in which the records of a file are processed and, to a lesser extent, to the way in which the records are physically organized on a storage medium.  A sequential input file is usually read starting with the first record, proceeding to the second, then the third, and continuing in this fashion to the end.  For a sequential output file, we write the first record, then the second, then the third, proceeding in this manner to the last record.  The most commonly used sequential file on IBM mainframes is a QSAM file.  QSAM is an acronym for “queued sequential access method”.  In this topic we investigate how QSAM files are created and processed.  We will also learn about record blocking as well as “locate mode” and “move mode” input and output.

 

Defining a QSAM File (Queued Sequential Access Method)

 

   QSAM files are defined inside a program using IBM’s DCB macro.  This macro generates a block of storage called a “data control block”, which contains information the operating system uses when processing the file that the macro defines.  The macro is non-executable and serves only to generate a control block at assembly time.  Because it is non-executable, the macro must be coded at a point in the program that never becomes part of the execution sequence.  Many programmers choose to code the DCB just after the executable portion of the program and before the declarations for variables.  This is a fairly safe location and helps keep the DCBs from being accidentally corrupted by the program.  At run time, the information in the DCB is combined with information in the job control language data definition (DD) statement and in the data set label to complete the DCB.  The data set label is a control block that is created and stored with the file when the file is created.  Later we will investigate how the information from the DCB, the DD statement, and the data set label is combined at run time.

 

   Let’s first look at a sample DCB macro as it might appear in a program.

 

                 CUSTFILE DCB  DDNAME=CUSTOMER,                    +
                               DSORG=PS,                           +
                               LRECL=80,                           +
                               MACRF=(GM),                         +
                               RECFM=FB,                           +
                               EODAD=ENDFILE

 

By coding a DCB, we are defining a file and its characteristics.  In the DCB above, the file’s internal name, the name that will be used inside the program, is “CUSTFILE”.  Whenever the program references the file, this is the name that will be used.  For instance, to read a record in the file we might code “GET  CUSTFILE,MYREC”.  The internal name appears in the first 8 columns of the macro.  This is followed by the macro name, “DCB”.  The rest of the macro is a sequence of keyword parameters which may be coded in any order:

 

1)  DDNAME -  This parameter assigns the file a “second” name which appears in the DD statement in the JCL that is used to execute the program.  The following is typical of the DD statement that would be part of the JCL.

 

            //CUSTOMER  DD  DSN=TSYSAD2.PRIVATE.DATA,DISP=SHR

 

Notice that the word “CUSTOMER”, which we will call the “JCL name”,  appears in the name field of the DD statement and is used to associate the JCL name with a “physical file name”.  The physical file name is the actual file name as it might appear in the system catalog.  As a result, we have three names for the file we are processing:  an internal name, a JCL name, and a physical file name.  The purpose for having three names is to provide some “indirection” in the file names so that our program is not tied to a single physical file.  By changing the DD statement we can change the physical file that the program references.  This is illustrated below. 

 

       //CUSTOMER  DD   DSN=TSYSAD2.PRIVATE.DATA1

       //CUSTOMER  DD   DSN=TSYSAD2.PRIVATE.DATA2

 

 

Only one of the DD statements above would appear in the JCL.  If the first was coded, the connections in the upper path would be indicated.  If the second DD was coded, the connections in the lower path would be indicated.  As you can see from the diagram, changing the JCL DD statement allows the program to easily process different physical files.

 

2)  DSORG -  This keyword parameter stands for “data set organization” and is used to control the basic structure of the file.  In this case, PS means “physical sequential” and identifies the file as a sequential data set, the organization that QSAM processes.  The records in the file are stored and processed sequentially.  Records which are “logically” in sequence are also physically stored in sequence on the disk.

 

3)  LRECL -  The “logical record length” is the number of bytes in the record structure defined by the programmer and processed by the program. 

 

4)  MACRF -  This parameter determines the format of the I/O macros that will be used to process the file and the mode in which the I/O will occur.  For QSAM files, four values are commonly coded:  GM, PM, GL, and PL.  The “G” in the parameter value means that the GET macro will be used for accessing the file.  This implies the file is an input file and already exists.  The “P” means the PUT macro will be used to process the file.  In this case the file is an output file.  The second letter indicates the mode in which the I/O will occur.  “L” means locate mode I/O and “M” means move mode I/O.  These two modes will be discussed later in this topic.

 

5)  RECFM -  The RECFM parameter determines whether the logical records the program processes are fixed in length (F), or of variable lengths (V).  This parameter also controls whether the records are blocked (B) or unblocked.  Here are the typical values for this parameter and their meanings.

                                    RECFM=F       Fixed size records, unblocked
                                    RECFM=FB      Fixed size records, blocked
                                    RECFM=V       Variable size records, unblocked
                                    RECFM=VB      Variable size records, blocked

 

The concept of record blocking will be discussed later in this topic.

 

6)  EODAD -  The “End of Data” parameter is only coded for input files (MACRF=GM or GL).  This parameter provides a label to which the program will automatically branch when the “end of file” condition occurs.  The operating system detects the “end of file” condition while executing a GET macro when there are no records left in the file to be read.  Upon detecting “end of file”, the operating system transfers control to the address coded on the EODAD parameter.
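
For comparison, an output file defined with the same keywords might look like the sketch below.  The internal name MASTFILE, the JCL name MASTER, the data set name, the record length of 100, and the space values on the DD statement are all assumptions chosen for illustration.  Because the file is written rather than read, MACRF=(PM) is coded and EODAD is omitted.

                 MASTFILE DCB  DDNAME=MASTER,                      +
                               DSORG=PS,                           +
                               LRECL=100,                          +
                               MACRF=(PM),                         +
                               RECFM=FB

            //MASTER    DD  DSN=TSYSAD2.PRIVATE.MASTER,
            //              DISP=(NEW,CATLG,DELETE),
            //              UNIT=SYSDA,SPACE=(TRK,(5,5))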

 

Opening a QSAM File

   

   Before you can process any records in a file, the file must be “opened”.  The open process causes the “empty” fields in the DCB to be filled in so the file can be processed correctly.  It is helpful to understand how information in the DCB is constructed.  This process is illustrated in the diagram below.  Some of the parameters are impractical but serve to illustrate how the information is combined.

 

At assembly time, the DCB is created and initialized with the information coded on the program’s DCB macro.  When the job is submitted for execution, the system scans the JCL and builds a Job File Control Block (JFCB) for each DD statement, using the parameters coded on that statement.  When the file is opened at run time, any fields still missing from the JFCB are filled in from the data set label (for a disk file, the data set control block, or DSCB) if the file already exists, and any fields missing from the DCB are then filled in from the JFCB.  Because of the order in which the information is combined, the parameters coded on the program’s DCB macro take precedence.  They are supplemented by information from the DD statement in the JCL:  the only DD statement information stored in the DCB is for fields that were not supplied on the DCB macro.  In other words, if a parameter is supplied both on the DCB macro and on the DD statement, only the DCB macro value is used.  Finally, the only information taken from the data set label is that which was found neither on the DCB macro nor on the DD statement.
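
As a sketch of how the merge works in practice, suppose the program’s DCB macro omits the record format, record length, and block size.  The DCB subparameter on the DD statement can then supply them, and they are copied into the DCB when the file is opened.  The BLKSIZE value of 8000 is an assumption used only for illustration.

                 CUSTFILE DCB  DDNAME=CUSTOMER,                    +
                               DSORG=PS,                           +
                               MACRF=(GM),                         +
                               EODAD=ENDFILE

            //CUSTOMER  DD  DSN=TSYSAD2.PRIVATE.DATA,DISP=SHR,
            //              DCB=(RECFM=FB,LRECL=80,BLKSIZE=8000)

If the DD statement also omitted these values and the data set already existed, they would be taken from the data set label.  Had LRECL=80 been coded on the DCB macro as well, the macro’s value would be the one used.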

 

   The OPEN macro has the following format,

 

                  OPEN   (dcb-address,(processing option))

 

          dcb-address -  The label in the name field of the DCB macro.

 

          processing option - 

  

              INPUT  -  An existing data set is to be used for retrieving records.
              OUTPUT -  A new data set is being created, or the records in an existing file will be
                        replaced by the records that will be written by the program.
              EXTEND -  Records will be added to the end of an existing file.
              UPDAT  -  An existing data set is to be used for retrieving records.  Additionally,
                        existing records can be modified.

 

For example, the following statement opens the file called “CUSTFILE” for input processing.

 

                  OPEN  (CUSTFILE,(INPUT))
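
A program should not assume that OPEN succeeded; a missing DD statement, for example, leaves the file unopened.  One common way to check is to test the “open” flag in the DCB after the OPEN macro completes.  The sketch below assumes the standard DCB layout, in which the flag byte (DCBOFLGS) is at offset 48 and the bit X'10' is on when the open was successful.  The label OPENERR is a hypothetical error routine.

                  OPEN  (CUSTFILE,(INPUT))
                  TM    CUSTFILE+48,X'10'     WAS THE OPEN SUCCESSFUL?
                  BZ    OPENERR               NO - BRANCH TO ERROR ROUTINE

Several files can also be opened with one statement, each DCB name followed by its own processing option:

                  OPEN  (CUSTFILE,(INPUT),MASTFILE,(OUTPUT))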

 

Closing a QSAM File

 

   After a file has been processed, it should be “closed”.  This process logically disconnects the program from the file.  During the close processing, the program DCB is reconfigured with the parameters it initially contained at assembly time.  This means the file can be opened again for further processing.

 

   The format of the CLOSE statement follows below,

 

                  CLOSE  (dcb-address-1,dcb-address-2,...)

 

          dcb-address-n -  The label in the name field of the DCB macro.

 

The two statements below are examples of how the CLOSE statement can be coded.

 

                  CLOSE  (CUSTFILE)

                  CLOSE  (CUSTFILE,MASTFILE)

 

In the first CLOSE statement, the CUSTFILE is closed while the second CLOSE statement closes two files with one statement.

 

 

 

 

 

 

Record Queuing 

 

   The “Q” in QSAM stands for the term “queued”, and refers to the queuing of records that occurs during input and output processing.  On the input side, records are brought from an external source (disk, tape,...) into the main memory of the machine.  The records are delivered into storage areas called “system buffers” where they reside until they are retrieved by the program using a GET macro.  The queuing process begins when the file is opened with records being delivered to the system buffers even before the first GET is executed.  During execution of the program the operating system tries to keep the system buffers full of records so that the program will not have to wait for a record to be retrieved from the external device.  This has the effect of speeding program execution.

 

   During output processing, records that are produced by the program using the PUT macro are also placed into system buffers where they reside until the operating system can retrieve them and transfer them to an external storage device.  By “buffering” the output, the program can continue execution without waiting on the external device.

 

   Input and Output queuing is illustrated in the diagram below.

 

 

   As depicted above, the diagram illustrates “move mode” input and output.  The term “move mode” refers to the process of moving a record from a system input buffer to a program input area or from a program output area to a system output buffer.  These moves occur as a result of coding GET and PUT.

 

Reading and Writing QSAM Records in Move Mode

 

   After a file has been opened for input, the records are available for retrieval.  To read a record you must code a GET macro.  There are two formats for this macro depending on how the MACRF parameter is coded in the file’s DCB macro.  Coding “MACRF=GM” determines that records will be retrieved in “move mode”.  This means that when a record is read, a copy of it will be delivered to a storage area defined by the programmer in the program.  This storage area is called a “buffer” and is designed to reflect the contents of the record.  Here is an example.

 

                  GET   CUSTFILE,CUSTREC
                  ...
         CUSTREC  DS    0CL80
         CUSTNAME DS     CL40
         CUSTINF1 DS     CL20
         CUSTINF2 DS     CL20

 

The GET macro above names the file as its first parameter and the program buffer area as its second parameter.  Since we are assuming move mode input, a record is delivered from a system buffer to the storage area called CUSTREC.
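
Putting the pieces together, a typical move mode read loop might be sketched as follows.  It assumes the CUSTFILE DCB shown earlier, with EODAD=ENDFILE, and the CUSTREC area defined above.  The label READLOOP and the processing step are placeholders.

                  OPEN  (CUSTFILE,(INPUT))
         READLOOP GET   CUSTFILE,CUSTREC    COPY NEXT RECORD INTO CUSTREC
                  ...                       PROCESS THE FIELDS IN CUSTREC
                  B     READLOOP            GO BACK FOR THE NEXT RECORD
         ENDFILE  CLOSE (CUSTFILE)          CONTROL ARRIVES HERE AT END OF FILE

The loop contains no explicit end-of-file test.  The branch back to READLOOP repeats until a GET finds no more records, at which point the operating system transfers control to ENDFILE.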

   For output processing, the PUT macro is used for writing records to a file.  The MACRF parameter determines the mode in which records will be processed, with MACRF=(PM) indicating move mode output.  An example PUT macro is listed below.

 

                  PUT   MASTFILE,MASTREC
                  ...
         MASTREC  DS   0CL100
         MASTID   DS    CL8
                  ...

 

First the record that is to be recorded on the file is created in a program area called MASTREC.  All the fields of the record would be initialized with appropriate values.  When the record has been constructed, it is written to the file by executing the PUT macro.  The first parameter in the macro is the DCB name of the output file.  The second parameter is the buffer containing the record.  Execution of the macro causes the information in MASTREC to be transferred to a system buffer where it will be processed at a later time.
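
As a sketch of this sequence, the fragment below clears MASTREC to blanks, fills in one field, and then writes the record.  The source field NEWID is a hypothetical 8-byte area; the remaining fields would be filled in the same way.

                  MVI   MASTREC,C' '                    BLANK THE FIRST BYTE
                  MVC   MASTREC+1(L'MASTREC-1),MASTREC  PROPAGATE THE BLANK
                  MVC   MASTID,NEWID                    MOVE IN THE ID
                  ...                                   FILL IN OTHER FIELDS
                  PUT   MASTFILE,MASTREC                COPY TO A SYSTEM BUFFER
                  ...
         NEWID    DS    CL8

Once the PUT completes, MASTREC can be reused immediately to build the next record, since the system buffer holds its own copy.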

 

Reading and Writing Records in Locate Mode

 

   When MACRF=(GL) is coded, input processing will occur in “locate mode”.  Processing in this mode is more efficient than in move mode because the records we read are not copied to the program’s storage area, but are left in the system buffers.  For files with large numbers of records, or large record sizes, the processing time saved by using locate mode rather than move mode can be substantial.  If the records are not copied to a program buffer, how can the program access the information in a record?  The answer is that the programmer must use a DSECT to reference the storage.  (See DSECTS.)  The GET macro takes an alternate form for locate mode processing.  In this format, the macro has a single parameter, which is the file’s DCB name.  A sample locate mode GET is coded below.

 

                  GET   CUSTFILE

             

After executing the macro, the operating system initializes register one with the address of the record that was delivered as a result of the GET.  This address will be inside a system buffer.  Providing access is a simple matter of loading the address of the record into the register which is associated with the DSECT.  This is illustrated below.

 

          CUSTREC DSECT
          CUSTBAL DS    PL4         CUSTOMER BALANCE
                  ...               OTHER DSECT FIELDS

                  USING CUSTREC,R5
                  GET   CUSTFILE
                  LR    R5,R1         MAKE R5 POINT AT THE RECORD
                  ZAP   TEMP,CUSTBAL  PROCESS THE FIELDS IN THE RECORD
                  ...

 

After the GET is executed, the address of the delivered record is placed in register one.  Subsequently, the address is copied into register 5 by the LR instruction.  The USING statement provides the association between the DSECT name and register 5.  With register 5 loaded with the appropriate address, addressability to the record is established through the names in the DSECT.
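
A complete locate mode read loop might be sketched as follows.  It assumes CUSTFILE is defined with MACRF=(GL) and EODAD=ENDFILE, and that R1 and R5 are equated to registers 1 and 5.  The accumulator TOTAL and the label READLOOP are hypothetical names.

                  USING CUSTREC,R5
         READLOOP GET   CUSTFILE            R1 = ADDRESS OF RECORD IN BUFFER
                  LR    R5,R1               MAKE R5 POINT AT THE RECORD
                  AP    TOTAL,CUSTBAL       ADD THE BALANCE TO THE TOTAL
                  B     READLOOP            GO BACK FOR THE NEXT RECORD
         ENDFILE  DS    0H                  END OF FILE - TOTAL IS COMPLETE
                  ...
         TOTAL    DC    PL8'0'

No record is copied here:  each pass simply repoints register 5 at the next record in the system buffer.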

 

   Locate mode output is indicated by coding MACRF=(PL) in the DCB macro.  In locate mode, the PUT macro is executed first, before the record is built.

 

                  PUT   MASTFILE

Executing the PUT causes the operating system to place the address of an available buffer in register one.  Using the LR instruction, this address is then copied to a register that controls an output DSECT.  Once addressability has been established to the output record, the record is created by the program.  The record remains available for further program processing until the next PUT is issued, or the file is closed.  The following code is typical of locate mode output.

 

               MASTREC   DSECT
               MASTNO    DS     CL5     CUSTOMER NUMBER
               MASTBAL   DS     PL5     CUSTOMER BALANCE
                         ...
                         USING  MASTREC,R6
                         PUT    MASTFILE
                         LR     R6,R1    MAKE R6 POINT AT EMPTY REC
                         ZAP    MASTBAL,BALPK   REC FIELDS AVAILABLE
                         ...

 

Keep in mind that register one is a “volatile” register and is subject to change when executing a system macro or calling another program.  Be sure to make a copy of  register 1 immediately after executing PUT or GET. 
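
The sketch below shows why this matters when locate mode is used for both input and output.  Each macro returns its own address in register 1, so each address is copied before the next macro is issued.  The fragment assumes CUSTFILE is defined with MACRF=(GL) and MASTFILE with MACRF=(PL), that both files have 80-byte records (an assumption for this sketch only), and that R1, R5, and R6 are equated to registers 1, 5, and 6.  The label COPYLOOP is hypothetical.

         COPYLOOP GET   CUSTFILE            R1 = ADDRESS OF INPUT RECORD
                  LR    R5,R1               SAVE IT BEFORE THE NEXT MACRO
                  PUT   MASTFILE            R1 = ADDRESS OF OUTPUT BUFFER
                  LR    R6,R1               SAVE IT AS WELL
                  MVC   0(80,R6),0(R5)      BUILD OUTPUT RECORD IN THE BUFFER
                  B     COPYLOOP            REPEAT UNTIL EODAD IS TAKEN

If the first LR were omitted, the PUT would overwrite register 1 and the address of the input record would be lost.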

 

Record Blocking and Deblocking

 

   The term blocking refers to the operating system process of combining multiple logical records into larger physical records called “blocks”.  A logical record is the record structure defined by the programmer and consists of a collection of related fields that logically belong together.  A physical record is a collection of logical records which have been combined for the purpose of storing them efficiently on an external device like a disk or tape drive.  Records are blocked because accessing externally stored data is very slow compared with the speed of the CPU.  For instance, in the time it takes to move a disk arm, hundreds of thousands of instructions can be executed by the CPU.  For efficiency, rather than returning a single record when a program requests a “read” operation, the operating system delivers an entire block of records from disk to memory.  The process of separating records from a block and delivering them individually to a program is called deblocking.

 

   The choice of blocking or not blocking a group of records is made by the programmer when coding the RECFM parameter.  Choosing RECFM=FB or RECFM=VB selects the blocked format.  In practice, most files are blocked.  The exception is made for files with large records containing thousands of bytes.  In the pictures above, the system buffers correspond to blocks from which individual records are delivered to the program either in move or locate mode.

 

   The programmer can also control the size of the blocks in a file using the BLKSIZE parameter coded in the program’s DCB.  For example, if the programmer has coded LRECL=80 and BLKSIZE=8000, then each block will contain 100 logical records.  Computing an optimal block size requires knowledge of the device on which the data is recorded and is beyond the scope of this discussion.  Under IBM’s MVS/ESA operating system, block sizes will be computed automatically if the BLKSIZE parameter is omitted in the DCB and on the DD statement when the file is created.
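
The number of logical records per block is simply the block size divided by the logical record length.  For the values in the example above:

                  records per block  =  BLKSIZE / LRECL  =  8000 / 80  =  100

A DCB that requests this blocking might be coded as in the sketch below.  The file name and the other keyword values repeat the earlier CUSTFILE example and are used only for illustration.

                 CUSTFILE DCB  DDNAME=CUSTOMER,                    +
                               DSORG=PS,                           +
                               LRECL=80,                           +
                               BLKSIZE=8000,                       +
                               MACRF=(GM),                         +
                               RECFM=FB,                           +
                               EODAD=ENDFILE

With these values, a file of 1,000 records occupies 10 blocks, so reading it takes 10 physical reads rather than 1,000.  For fixed-length blocked records, BLKSIZE must be a multiple of LRECL.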