
Writing maintainable code

Maintainable code is designed and programmed so that it does not have to be changed when the attributes of related data items change, e.g. when a constant increases in length or the number of entries in an array changes.

The objectives of maintainable code

Typically, the data attributes that change are length, number of occurrences, and position.

In rarer cases, the type attribute may change from, say, packed decimal to binary, or from fixed-point to floating-point.

The examples below illustrate independence from length, occurrence, and position, as these are the most common scenarios.

Independence from type is more difficult to achieve, and is usually done by macros incorporated into the system design.

 

Caveat: The code samples below are intended as a starting point for your own use; it is your responsibility to ensure they work correctly in your situation.
For instance, the code below assumes that the symbols R0 through R15 have been equated to 0 through 15, whereas equates are sometimes set up so that the register symbols are R0 through R9 and RA through RF.

Independence from length

The best way of avoiding length-dependent instructions is not to code explicit lengths.


Here is an example of a length-dependent instruction which would have to be changed if the size of the input area were changed.

 

         TRT   INPUT(80),TRTTABLE    Scan for separator

 

If the length attribute of the symbol INPUT is correctly established, for example by INPUT DS CL80, the length value can be omitted from the instruction, which would therefore not have to be changed if the length changed.
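
As a sketch of this point, assuming INPUT is defined with its true length of 80 bytes, the same scan can be coded with no explicit length, leaving the assembler to supply the length attribute of INPUT:

INPUT    DS  CL80              Input area
*
         TRT INPUT,TRTTABLE    Scan for separator (implied length)

If the definition later becomes, say, CL100, the TRT picks up the new length when the program is reassembled.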

In some situations the area defined by INPUT needs to be different lengths in different places. In this case, it should be redefined, so that there are separate symbols for different uses, as in the following example:

 

INPUT    DS 0CL80   Input area
STMT     DS CL72    Statement portion
SEQ      DS CL8     Sequence number

 

The use of the zero duplication factor makes INPUT and STMT occupy the same position within the current control or dummy section. This could also be achieved by an ORG INPUT immediately preceding the definition of STMT. The definition of STMT makes it unnecessary to code INPUT(72) in instructions that exclude the sequence number.
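
The ORG alternative might look like the following sketch, in which INPUT is defined with a normal duplication factor and ORG moves the location counter back to its start:

INPUT    DS  CL80     Input area
         ORG INPUT    Back to start of input area
STMT     DS  CL72     Statement portion
SEQ      DS  CL8      Sequence number

Because STMT and SEQ together fill all 80 bytes, the location counter ends up just past INPUT and no further ORG is needed here.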

Even if it is inconvenient or unreasonable to set up symbols with the correct length attributes, it is still possible to avoid length dependence by supplying a suitable symbol or expression as the length modifier in the instruction, as shown in the following example:

 

         MVI OUTAREA,C' '                    Set 1st byte to blank 
         MVC OUTAREA+1(L'OUTAREA-1),OUTAREA  Blank remainder

This would be valid for any length of OUTAREA from 2 to 257. Although the MVC instruction is rather long, in this case that is preferable to confusing the data description by adding an extra symbol for OUTAREA+1.

If there is ever a possibility that a length will change to something outside the range of a single instruction, special coding should be included to cover it. This would either be in the form of a loop, repeating a storage-to-storage instruction such as TRT, or the use of the instructions MVCL and CLCL. In both cases, length independence could be achieved by referring to the length attribute of the symbol used in the data area definition, as the following examples show:

 

         LM   R0,R3,MVCLCONS  Load address,length,fill char
         MVCL R0,R2           Blank out output area
*
* The following DC's must be in the order given
*
MVCLCONS DC A(OUTPUT)    Address of output area
         DC A(L'OUTPUT)  Length of output area
         DC A(0)         Dummy 'FROM' Address
         DC C' '         Blank fill character
         DC AL3(0)       Zero 'FROM' Length
*
* End of MVCLCONS DC's
OUTPUT   DS CL250        Output area

 

Performance Note:

The more complicated instructions such as MVCL have a high initialisation overhead, so it is unwise to use an MVCL in a tight loop if it will be moving only a few bytes on average.

 

So far, only length-independence of instructions has been discussed. However, it often happens that two or more data items are related. For example, if OLDSEQ is a field in which previous values of the field NEWSEQ are stored, the two fields must always be the same length. If the absolute value of this length is coded only once, then only one statement will have to be changed if the length changes, as shown in the following example:

 

NEWSEQ DS CL8            New sequence number
OLDSEQ DS CL(L'NEWSEQ)   Previous sequence number

 

Note that the definition of NEWSEQ must precede that of OLDSEQ.

It may be neater to define the length in a separate EQU statement, as follows:

 

 

SEQLEN   EQU 8            Length for sequence number
NEWSEQ   DS  CL(SEQLEN)   New sequence number
OLDSEQ   DS  CL(SEQLEN)   Old sequence number

 

In this example, note that SEQLEN must be predefined. The positioning of EQU statements is more flexible than that of DC statements.

 

A similar technique could have been used for the first example at the head of this section. If the semantic interpretation is "the last eight bytes of the input area are reserved for the sequence number, the remaining forming a statement", the area could have been defined as follows:

 

INPUT    DS 0CL80               Input area
STMT     DS CL(L'INPUT-8)            Statement portion
SEQ      DS CL(L'INPUT-L'STMT)       Sequence portion

 

A change to either the entire input length, or the length of the sequence number portion, will require a change to only one statement.

 

A common case of related data items occurs when an internal number is formatted for printing. There will have to be a packed decimal field containing the internal number, an edit word to match this field, and an output area. The three fields could be coded as:

 

INTNUM   DC PL3'0' 
EDITWORD DC X'402020202120'
OUTPUT   DS CL6

 

However, if five digits were found to be insufficient, all three definitions would have to be changed. Such length dependencies can be avoided, as follows:

 

* The following defines a packed decimal number
* of a certain length, and an edit word and output area
* for that number, in such a way that the
* definitions are still valid if the length
* has to be changed
*
NUMLEN   EQU  5                   No. of digits required
INTNUM   DC   PL(NUMLEN/2+1)'0'   Packed version
EDITWORD DC   C' '                Fill char
         DC   (NUMLEN)X'20'       Digit selector bytes
         ORG  *-2                 Overlay penultimate byte
         DC   X'21'               Significance start ind.
         ORG  ,                   Reset location counter
OUTLEN   EQU  *-EDITWORD          Length needed for output
OUTPUT   DS   CL(OUTLEN)          Output area

 

In practice, OUTPUT would be embedded in a print-line description. The contents of INTNUM can now be formatted by:

 

         MVC OUTPUT,EDITWORD Editword to output area
         ED  OUTPUT,INTNUM   Format internal number

 

The equated value of NUMLEN can be changed to any odd number less than 32. If it is changed to an even number, the data definition is still valid, although extra instructions would be required in the formatting, in which case, independence could only be achieved by writing a macro to cater for odd and even lengths.

 

Note that the definition of OUTLEN gives a symbol equated to the length of a composite data area. A decimal point or minus sign could be added to the edit word, without having to change the definitions of OUTLEN and OUTPUT.

 

An alternative to the above method is to write a macro that uses arithmetic variable symbols to calculate consistent length modifiers.

Independence from number of occurrences

If an item of data is repeated to form an array whose elements can be processed individually, by a controlled loop, there should be a symbol equated to the number of occurrences.

 

In the following example, which might appear in a statistical program, there is an array of half-word counters. During execution, these counters have to be added into a corresponding array of grand totals, which occupy full-words, and then reset to zeros. The program contains loops to do this. The application is such that additional counters may be required for future enhancements.

 

N        EQU 30       Number of counters
COUNTERS DC (N)H'0'   The actual counters
ACCUMS   DC (N)F'0'   An accumulator for each counter
*
* Loop to process each element in array
*
         LA R0,N      Set loop controller
LOOP     DS 0H
         BCT R0,LOOP  Repeat N times (Processing loop)

 

Note that these arrays are simplified by having only one data item in each array element.
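
The body of the loop is not shown in the example above. One way it might be filled in is sketched below. The choice of R1, R2 and R3 as work registers is an assumption made for illustration, and the step sizes are taken from the length attributes of COUNTERS and ACCUMS rather than being coded as 2 and 4:

         LA   R1,COUNTERS              Point to first counter
         LA   R2,ACCUMS                Point to first accumulator
         LA   R0,N                     Set loop controller
LOOP     DS   0H
         LH   R3,0(,R1)                Pick up counter
         A    R3,0(,R2)                Add current grand total
         ST   R3,0(,R2)                Store new grand total
         XC   0(L'COUNTERS,R1),0(R1)   Reset counter to zeros
         LA   R1,L'COUNTERS(,R1)       Step to next counter
         LA   R2,L'ACCUMS(,R2)         Step to next accumulator
         BCT  R0,LOOP                  Repeat N times

Changing N alters the number of iterations without touching the loop. The LH, A and ST instructions still depend on the counters being half-words and the accumulators full-words, which is the type dependence described earlier as harder to remove.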

If the amount of space to contain the array is fixed, so that the number of entries varies with element size, the occurrence count can be coded as in the following example:

 

ARRAY    DS  CL360              Space for array
         ORG ARRAY              Back to start of array
ELEMENT  DS  CL18               Space for first element
COUNT    EQU L'ARRAY/L'ELEMENT  Occurrence count        
         ORG ,                  ORG past array

 

Note the use of ORG to reset the location counter. Forgetting to code this ORG or deleting it during maintenance changes is a frequent cause of errors.

 

Another occasion when symbolic duplication factors can be useful is when save areas are to be used by subroutines for storing a range of registers. If registers 3 through 7 are to be used by a subroutine, they would have to be saved on entry, and restored on return to the caller. There may also be a need for more registers when changes are applied during testing, or future enhancements. This can be based on two EQU statements, as follows:

 

SUBRTN   DS  0H
         STM LOWREG,HIGHREG,SAVEAREA Save registers
*
*   ... subroutine code goes here ...
*
*
         LM  LOWREG,HIGHREG,SAVEAREA Restore registers
         BR  R7
LOWREG   EQU R3                  Lowest register saved
HIGHREG  EQU R7                  Highest register saved
SAVEAREA DS (HIGHREG-LOWREG+1)F  One word for each 
*                                register saved 

 

This is a justifiable violation of the standard that all register names should include the register number. Note, however, that the names are equated to R3 and R7, rather than 3 and 7, so that the EQU statements are cross-referenced. Furthermore, the symbols LOWREG and HIGHREG should not be used other than in connection with the saving and restoring of registers.

Independence from position

As with length and occurrence, positional independence is achieved by avoiding 'hard coded' numbers. This means that more symbols have to be defined.

Position-dependent instructions such as:

 

    MVC HEADING+76(4),=X'40202020'

 

could become much more readable if a suitable structured definition of the page heading area is coded, so that the instruction can be written as follows:

 

    MVC PAGENO,PGNOEDIT Put edit word in page number field    
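
The structured definition itself is not shown above. A sketch of what it might look like follows, assuming an 80-byte heading line whose last four bytes hold the page number. The length of HEADING and the unnamed text portion are assumptions for illustration; PAGENO and PGNOEDIT are the names used in the MVC:

PGNOEDIT DC  X'40202020'                   Edit word for page number
HEADING  DS  0CL80                         Page heading line
         DS  CL(L'HEADING-L'PGNOEDIT)      Text portion of heading
PAGENO   DS  CL(L'PGNOEDIT)                Page number field

PGNOEDIT is defined first so that its length attribute is already known when it is used in the length modifiers. The length of PAGENO then follows the edit word automatically, and its offset within HEADING is never written as a number.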

 

Note that the two length dependencies have been removed, as has the position dependence.


In a case like this, where the data area in question is internal to the program, the best way to avoid dependent code is to write all the instructions first, without coding any explicit lengths, positions, or occurrence counts, and then code the data definitions to suit. If 'public' data areas are involved, everything must be worked out at design level. It may be worth coding extra symbol definitions, even if they are not immediately required.

 

Positionally dependent data definitions can occur when a descriptive mapping, in the form of a DSECT, is provided for a table of constants. This usually happens when a system-wide control block is set up as a CSECT, or part of a CSECT, and is referred to by other modules as a DSECT. Positional dependence is avoided by making the same macro generate the definition statements, either as a DSECT, or as a CSECT, depending on what is required.

 

Table-driven decoding routines sometimes use tables of constants in which each entry contains parameters associated with the corresponding input item. A DSECT is made available that gives a generalised description of an entry. It is therefore important that entry formats correspond to the DSECT format. In particular, it should be possible to change the entry format without having to recode the entire table.

 

To illustrate this, consider standard keyword input. Each input item is identified by a keyword, followed by an equals sign (=), followed by the item, for example:

 

   DSNAME=THISFILE
*
* The decoding routine might use a table of constants as:
*
DSN DC CL8'DSNAME'    Keyword
    DC FL1'1'         Minimum length
    DC FL1'44'        Maximum length  
    DC AL4(DSNRTN)    Addr. routine to handle value
    DC CL8'UNIT'      Keyword
    DC FL1'3'         Minimum length
    DC FL1'22'        Maximum
    DC AL4(UNITRTN)   Addr. routine to handle value
* etc..
*
* The corresponding DSECT would be
*
KWTAB    DSECT    DSECT FOR KEYWORD TABLE
KEYWORD  DS   CL8      Keyword
MINLEN   DS   FL1      Minimum length
MAXLEN   DS   FL1      Maximum length
RTNADDR  DS   AL4      Routine to handle value
ENTRYLEN EQU  *-KWTAB  Entry length       

 

If any of the lengths have to be changed, or if new parameters have to be added, or if the order of parameters has to be changed, the whole table will have to be recoded.

 

The following example shows how the table itself can be coded independently, given the above DSECT:

 

DSN    DC XL(ENTRYLEN)'0'        Dsname entry
       ORG DSN+KEYWORD-KWTAB     Get to right position
       DC CL(L'KEYWORD)'DSNAME'  Keyword
       ORG DSN+MINLEN-KWTAB     
       DC FL(L'MINLEN)'1'        Minimum length
       ORG DSN+MAXLEN-KWTAB
       DC FL(L'MAXLEN)'44'       Maximum length
       ORG DSN+RTNADDR-KWTAB
       DC AL(L'RTNADDR)(DSNRTN)  Routine to handle it
       ORG DSN+L'DSN             ORG to next entry
UNIT   DC XL(ENTRYLEN)'0'        UNIT entry
       ORG UNIT+KEYWORD-KWTAB
       DC CL(L'KEYWORD)'UNIT'    Keyword
*      ...  etc.

 

Note that any new parameters that have to be created would have a default value of binary zeros, which might avoid them having to be added to each entry in the table.

 

A major disadvantage of the above is that it takes a long time to read and understand because "the wood is obscured by the trees". However, the reader is going to be more interested in the executable instructions and the DSECT, than in the definition of constants. Furthermore, readability could be improved by writing a special macro to generate the DSECT in one call and the entry definitions in subsequent calls, using parameters to specify the constants, allowing for individual default values.

Using symbols sensibly

Having demonstrated the advantages of using symbols to achieve independent code, it is important to note that the same symbol should not be used for unrelated purposes. In an earlier example, SEQLEN is equated to 8, the length of the sequence number fields. Therefore, it would destroy the whole purpose of the exercise to code a statement such as:

 

DWORD DS XL(SEQLEN)  2-Word work area

 

if DWORD is in no way related to NEWSEQ or OLDSEQ, since a programmer changing the lengths of NEWSEQ and OLDSEQ would also have to change the definition of DWORD.

 


If that is a bit abstract then treat yourself to something lighter on the assembler hints and tips page, or maybe try the macros page.