The physical process of going out to disk (or tape) for data takes time. The point of buffering is to avoid going to disk where possible by loading data into memory, which is faster to access than even disk cache. VSAM has default values for how many buffers to allocate to the data and index components of a VSAM file. However, these defaults were worked out when all storage was below the 16MB line, all disks were real CKD devices that were spinning and slow, and the comms channels were slow too. The defaults have not changed to keep up with technology such as flash storage, RAID FBA disks emulating CKD, PAV and FICON channels. It can be difficult to work out exactly what the best options are, so a number of Rules of Thumb have been developed over the years. Be aware, though, that the old Rules of Thumb that were designed to avoid disk seek time are no longer valid.
Modern DASD devices have large cache capacity, and read-ahead algorithms will preload up to a cylinder's worth of data into cache. This eliminates I-O delay due to seek times and rotational positioning, but there is still a lot of benefit to be gained from I-O buffer tuning. The best way to manage buffers is with SMB (System Managed Buffering) or Batch LSR. If you can't use either SMB or Batch LSR, then you can manually set the buffers for batch. If you do nothing, then batch uses the defaults, and these are usually not the best. Good buffer tuning can cut your batch run-time in half.
Buffer optimisation is generally a trial and error process. Add buffers, test the results, then make more changes as necessary until you get the best performance. If you add too many buffers, your job may start paging, and that degrades performance. There will be an optimal value; the difficult part is finding that balance. Data buffers matter most for sequential access and index buffers for random access, so you need to choose between them: more data buffers for sequential access, or more index buffers for direct access.
Before adding any extra buffers to a job, consider what impact this might have on allocated memory, and whether you might get a GETMAIN region abend. The best option is to use REGION=0M, as this lets the job use memory below the 16MB line, between 16MB and 2GB, and above the 2GB bar. You have a lot of control over where VSAM batch jobs allocate their buffers. If your application is storage constrained, you can usually improve performance by loading the VSAM control blocks and buffers above the 'line'. The available JCL parameters are:
The VSAM unit of data transfer is the CI size, and this is often defaulted to 4096. By default, VSAM provides 2 Data and 1 Index buffers, but these generally do not provide adequate performance.
NSR buffers are best for sequential access as they perform read-aheads. However, they don’t work as well with a file that is mostly read with direct access. A starting point for allocating buffers for sequential access with NSR is below, where “#ci” is the number of CIs per control area and “#il” is the number of index levels.
A couple of VSAM definitions that are needed to understand buffering:
Applications using VSAM can allow more than one process to access a file concurrently. Only one write access is possible, but up to 255 concurrent read requests are allowed. These concurrent accesses are called Strings and the total number of strings allowed is set by the STRNO parameter. You would need to check out an RMF report to see how many concurrent accesses might be required, but STRNO=3 is considered a good default.
The lowest level of a VSAM index is called the Sequence Set. The index records above the sequence set are called the Index Set. As well as the hierarchical index structure, each CI in the sequence set points to the next one in sequence, and each index record in the sequence set points to a data component CA. So it is possible to read a file sequentially by following the sequence set.
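To make the sequence set chaining concrete, here is a toy model in Python. It is only a sketch of the idea described above and not how VSAM stores index records: the class name, field names and sample keys are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class SequenceSetRecord:
    """Toy stand-in for one sequence set index record (hypothetical layout)."""
    data_ca: List[str]                                  # records held in the data CA it points to
    next_record: Optional["SequenceSetRecord"] = None   # pointer to the next sequence set record


def read_sequentially(first: SequenceSetRecord) -> Iterator[str]:
    """Walk the sequence set chain, never touching the index set above it."""
    record = first
    while record is not None:
        yield from record.data_ca
        record = record.next_record


# Three data CAs, chained in key order through their sequence set records.
third = SequenceSetRecord(["K700", "K800"])
second = SequenceSetRecord(["K400", "K500"], third)
first = SequenceSetRecord(["K100", "K200"], second)

all_keys = list(read_sequentially(first))
```

The point of the model is that a sequential read only needs the chain of sequence set records, which is why index set buffers matter much more for direct access than for sequential access.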
Do not use the BUFFERSPACE parameter, as this sets the buffer sizes for the whole VSAM cluster. Data and index components need different size buffers, so use BUFNI for the index buffers and BUFND for the data component buffers. IBM recommends that you optimise buffering for random processing. In this case, you need 1 buffer for each string, as data buffers are not shared between strings for direct processing. You also need an extra one for system processing like splits.
BUFND = STRNO + 1
LSR buffers work best for direct access, as they allow several files to share buffers. You should aim to define enough index buffers to contain the entire index set. The index set buffers are shared between strings, but the sequence set buffers are not. So you need enough buffers to hold the entire index set, and one buffer for each string for the sequence set. To work these out, you also need the file to be loaded with data.
To work out the number of buffers required, first note that each index CI only holds one index record, so every sequence set CI maps to one Data CA. So if you know how many Data CAs exist, you know how many sequence set records there are. A LISTCAT will tell you the CISIZE, the CI/CA ratio, the total number of index records and the HURBA or high used byte address, but not the number of Index Set records. If you multiply the CISIZE by CI/CA you get the CA size. Divide this into the HURBA and round up to the next integer, and that tells you the number of CAs in the file. Subtract that from the number of index records, and you get the number of records in the Index Set.
Finally, the number of index buffers is the number of strings, plus the number of Index Set records. To express this as a formula:
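BUFNI = STRNO + (TI - HURBA / (CISIZE * CI/CA)), with the division rounded up to the next integer and CI/CA taken as the single LISTCAT value for CIs per CA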
TI = Total number of index records of the Index component
HURBA = High-used field from the Data component
All these parameters can be found by examining LISTCAT of the VSAM cluster. Just issue the TSO command LISTCAT ENT(cluster name) ALL
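The arithmetic above can be sketched as a small Python helper. This is a minimal illustration under stated assumptions, not a tool: the function names are invented here, and the LISTCAT figures in the example are hypothetical values you would replace with the ones from your own cluster. The AMP string at the end is one common way of passing BUFND and BUFNI on a DD statement, shown only to illustrate where the numbers end up.

```python
def data_ca_count(hurba: int, cisize: int, cis_per_ca: int) -> int:
    """Number of data CAs in use: HURBA divided by the CA size, rounded up."""
    ca_size = cisize * cis_per_ca             # bytes in one data control area
    return (hurba + ca_size - 1) // ca_size   # integer round-up


def index_set_records(total_index_records: int, hurba: int,
                      cisize: int, cis_per_ca: int) -> int:
    """Index set records = all index records minus one sequence set record per data CA."""
    return total_index_records - data_ca_count(hurba, cisize, cis_per_ca)


def bufnd_direct(strno: int) -> int:
    """Data buffers for direct access: one per string plus one for splits."""
    return strno + 1


def bufni_direct(strno: int, total_index_records: int, hurba: int,
                 cisize: int, cis_per_ca: int) -> int:
    """Index buffers: one per string plus the whole index set."""
    return strno + index_set_records(total_index_records, hurba, cisize, cis_per_ca)


def amp_parameter(bufnd: int, bufni: int) -> str:
    """Format the values as an AMP subparameter string (illustrative only)."""
    return f"AMP=('BUFND={bufnd},BUFNI={bufni}')"


# Hypothetical LISTCAT figures, chosen purely for illustration.
STRNO = 3
TOTAL_INDEX_RECORDS = 103
HURBA = 73_000_000
CISIZE = 4096
CIS_PER_CA = 180

bufnd = bufnd_direct(STRNO)
bufni = bufni_direct(STRNO, TOTAL_INDEX_RECORDS, HURBA, CISIZE, CIS_PER_CA)
amp = amp_parameter(bufnd, bufni)
```

With these made-up figures the CA size is 737,280 bytes, so the file has 100 data CAs and therefore 100 sequence set records. That leaves 3 index set records, giving BUFNI = 6 and BUFND = 4.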
The following JCL can be used to place index data in buffers.
//DD1 DD DSN=aa.bb.cc,DISP=SHR,
With good buffering, I-O rates on a heavily accessed index can come down by a factor of ten or more.