VSAM Tuning Parameters

KSDS index levels

A KSDS is a direct access file. Access is through a hierarchical index structure, which can have several levels. The lowest level is called the 'sequence set'. By default, VSAM gives you one index buffer, which is used for the highest index level. For random processing with default buffering, several I-O operations will therefore be needed to retrieve a record: one for each index level below the top, and one for the data CI. With modern large-cache DASD, there is a high probability that all of the index will be in cache, so I-O times are minimised, but it is still worthwhile keeping the number of index levels down to avoid the I-O altogether. More than 4 index levels should be considered a problem.

Index levels can be determined from the index statistics in a LISTCAT, as shown below:

	STATISTICS
	  REC-TOTAL-----------6    - 6 index records
	  REC-DELETED---------0
	INDEX:
	  LEVELS--------------2    - 2 level index, 5 record sequence set

The number of records in the sequence set is the same as the number of control areas (CAs) in the data component. The bigger the sequence set, the more levels you will have in the index. You can reduce the number of control areas by making them bigger, though they will probably be at maximum size anyway. If you allocate a file in cylinders, the CA size is 1 cylinder. If you allocate in tracks, the CA size is the smaller of the primary and secondary space allocations, up to a maximum of 15 tracks. Reducing the amount of freespace will also mean fewer control areas, but see below.
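The relationship between control areas and index levels can be sketched in a few lines of Python. This is only an illustration, not how VSAM computes anything internally: the function name is made up for this example, and entries_per_index_record (how many lower-level records one index record can point to) is an assumed input that in practice depends on index CI size and key compression.

```python
import math

def estimate_index_levels(control_areas, entries_per_index_record):
    """Return (levels, total_index_records) for a KSDS index.

    Assumes one sequence set record per data CA, and that every
    higher-level index record points to at most
    entries_per_index_record records on the level below it.
    """
    if control_areas < 1 or entries_per_index_record < 2:
        raise ValueError("need at least 1 CA and a fan-out of 2 or more")
    records_at_level = control_areas      # the sequence set
    levels = 1
    total_records = records_at_level
    # Keep adding index levels until a single top record remains.
    while records_at_level > 1:
        records_at_level = math.ceil(records_at_level / entries_per_index_record)
        levels += 1
        total_records += records_at_level
    return levels, total_records

# 5 control areas, with an assumed fan-out of 50 entries per index record
print(estimate_index_levels(5, 50))   # (2, 6)
```

With 5 control areas this gives a 2 level index holding 6 index records, which matches the LISTCAT figures quoted earlier. Trying larger CA counts shows how quickly (or slowly) extra levels appear for a given fan-out.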

Freespace in a KSDS

Freespace is specified to avoid CI & CA splits, of which more later. A reasonably common specification is FREESPACE (20 25), which leaves 20% of each CI free and 25% of the CIs in each CA free, so that after initial load roughly 40% of the file is empty. One of the features of STK's Iceberg is that you get FREESPACE for 'free', as the empty space is not actually stored. This implies that we can specify large freespace values, and so avoid those CI/CA splits. However, life is not that simple. If large values of freespace are specified, then the index levels can increase, and fewer actual records are retrieved in each I-O operation. Both of these impact random performance, and the second one will also impact sequential performance. The bottom line is that you need to analyse the insert activity to your file to determine which freespace parameters are best. A reasonable rule of thumb is:

few inserts, or most inserts at end of file: FREESPACE(0 0)
inserts clustered at points in the file: FREESPACE(0 50)
data inserted at random, all over the file: FREESPACE(20 25)
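To compare freespace settings, it can help to work out how much of the file each one leaves empty straight after the initial load. The sketch below assumes the two percentages simply compound (the CI percentage applies inside each loaded CI, the CA percentage to the CIs in each CA). Real VSAM rounds to whole records and whole CIs, so treat the result as an approximation; the function name is invented for this example.

```python
def empty_percent_after_load(ci_free_pct, ca_free_pct):
    """Approximate percentage of the data component left empty
    after initial load for FREESPACE(ci_free_pct ca_free_pct)."""
    for value in (ci_free_pct, ca_free_pct):
        if not 0 <= value <= 100:
            raise ValueError("freespace percentages must be between 0 and 100")
    # Share of the space that actually holds records:
    # loaded CIs in each CA, times the filled part of each loaded CI.
    used_pct = (100 - ci_free_pct) * (100 - ca_free_pct) / 100
    return 100 - used_pct

for ci_pct, ca_pct in [(0, 0), (0, 50), (20, 25)]:
    print(f"FREESPACE({ci_pct} {ca_pct}): "
          f"{empty_percent_after_load(ci_pct, ca_pct):.0f}% empty")
```

The same figure also shows how much larger the file, and therefore the sequence set, becomes for a given number of records.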

CI/CA size

The CI size is the 'blocksize' for a VSAM file. A small CI size will impact sequential processing, while a large CI size can adversely impact random processing. A small CI size will also use more disk space. A reasonable rule of thumb is to use a 4K CI for small files, and 8K for large files, with performance assisted by buffering when necessary.

You cannot specify a CA size directly, but a big CA size will reduce the frequency of CA splits, and reduce the number of index levels. The downside, of course, is that when you do a CA split, you move more data. If a VSAM file is allocated in cylinders, the CA size will be one cylinder. Otherwise, it is the smaller of the primary and the secondary allocation unit.

CI/CA splits

People tend to get all excited if they see a VSAM file with numerous CI splits and, especially, CA splits. With SLED (single large expensive disk) DASD, splits meant extra seek time during sequential processing, as the VSAM dataset was no longer arranged in sequential order. How relevant is that in RAID boxes? However, when a VSAM file is actually performing a CA split, it has to move half a cylinder of data (assuming the file is defined in cylinders, and so the CA size is one cylinder). This can take an appreciable time, so it is worthwhile trying to avoid splits by manipulating CI/CA sizes and freespace. Whether it is worthwhile running regular file reorganisations to remove CI & CA splits is open to question.
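To put a rough number on 'half a cylinder', the sketch below multiplies out CI size, CIs per track and tracks per CA. The geometry values are assumptions for illustration only (15 tracks per cylinder, and 12 4K CIs or 6 8K CIs per track, as on 3390-type devices); check the figures for your own device and CI size. The function name is hypothetical.

```python
def ca_split_bytes_moved(ci_size_bytes, cis_per_track, tracks_per_ca=15):
    """Approximate bytes moved by one CA split, assuming the split
    moves half of the CIs in the control area to a new CA."""
    if min(ci_size_bytes, cis_per_track, tracks_per_ca) < 1:
        raise ValueError("all arguments must be positive")
    ca_bytes = ci_size_bytes * cis_per_track * tracks_per_ca
    return ca_bytes // 2

# Assumed geometry: 4K CIs, 12 per track, one-cylinder (15 track) CA
moved = ca_split_bytes_moved(4096, 12)
print(f"{moved} bytes, about {moved // 1024}K per CA split")
```

Passing a smaller tracks_per_ca shows the trade-off discussed under CI/CA size: smaller CAs move less data per split, but there are more of them to index.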

IMBED & REPLICATE

These parameters were both introduced to speed up access to KSDS files.
IMBED places the sequence set record on the first track of each data CA, and so speeds up sequential performance, as there is then no need to reference the index dataset.
REPLICATE will replicate as many index records as will fit on a track, to avoid rotational delay. On large cache controllers, neither of these parameters will improve read performance, as the index will almost always be held in cache. However, they will adversely affect write performance, as more data needs to be written, so they should be avoided.
These parameters are no longer supported from z/OS 1.4. Your allocation will not fail if you use them, but they will be ignored. The KEYRANGE and ORDERED parameters are also ignored.
