This usually means I-O tuning, as a batch run is usually I-O bound. Be aware that if you optimise your I-O and you have little spare CPU capacity, you can replace the I-O bottleneck with a CPU bottleneck, so make sure you have adequate spare CPU capacity. You also need adequate memory to avoid paging. Before throwing Storage resource at an I-O problem, see if high I-O rates are caused by program inefficiency. For example
SELECT * FROM tablename, with no WHERE clause. This reads in the entire table, so loads of I-O is needed!
The main problem with most of the above issues is that fixing them requires Programming resource, which is usually charged out to the customer. If the customer isn't suffering from the problems, they won't want to pay for fixing them! Storage solutions always seem to come free.
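As a rough illustration of the difference, the sketch below uses Python's built-in sqlite3 module as a stand-in for a mainframe DBMS. The `orders` table, its columns and the index name are hypothetical, and the access plans are SQLite's own, so treat it as an analogy rather than a DB2 measurement.

```python
import sqlite3

# Hypothetical in-memory table standing in for a large production table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [(f"CUST{i % 100:03d}", float(i)) for i in range(10_000)],
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

# Unqualified SELECT *: every row of the table is read and returned.
all_rows = conn.execute("SELECT * FROM orders").fetchall()

# Qualified query: only the rows for one customer are fetched via the index.
wanted = conn.execute(
    "SELECT id, amount FROM orders WHERE customer = ?", ("CUST042",)
).fetchall()


def plan(sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


full_plan = plan("SELECT * FROM orders")
filtered_plan = plan(
    "SELECT id, amount FROM orders WHERE customer = ?", ("CUST042",)
)

print(len(all_rows), "rows read by SELECT *:", full_plan)
print(len(wanted), "rows read with WHERE:", filtered_plan)
```

The first plan is a full table scan, while the second is an index search that touches only the matching rows.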
If your application uses VSAM files, either native, or as databases, then a number of tuning techniques exist to make them run faster. The article on VSAM buffering gives a few tips on how to do this, including how to use VSAM System Managed Buffering (SMB). An alternative solution is to let a product like EADM or Rocket's Performance Essential do it all for you.
In the old days, about 10 years ago, IO tuning was a major industry, but modern disk subsystems with flash storage fronted by SCM have removed a lot of the art from the job. There are one or two things still to watch out for.
z/OS can only schedule one IO to a logical UCB at a time. If applications need to access the same logical disk, then they will queue up to get access. This is called IOSQ. You can almost eliminate IOSQ by using PAV aliases. See the PAV section for PAV details, and the RMF section for IOSQ details. Even if you have PAV installed, it is worth checking from time to time to see if you have IOSQ problems. If you do, then the answer is to add more PAV aliases.
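To see why aliases help, here is a deliberately simple model, not a description of how z/OS or any disk subsystem actually schedules work. It assumes a burst of I/Os arrives at the same instant for one logical volume, every I/O takes the same service time, and each PAV alias allows one more I/O to run concurrently. The function name and parameters are illustrative only.

```python
def average_iosq_ms(requests, service_ms, aliases=0):
    """Average queue wait for a burst of I/Os against one logical volume.

    Toy model: the base UCB plus each alias can run one I/O at a time,
    so the i-th request waits for (i // concurrent) earlier service slots.
    """
    concurrent = 1 + aliases
    waits = [(i // concurrent) * service_ms for i in range(requests)]
    return sum(waits) / requests


# Eight simultaneous 1 ms I/Os to the same volume.
for alias_count in (0, 3, 7):
    wait = average_iosq_ms(requests=8, service_ms=1.0, aliases=alias_count)
    print(f"{alias_count} aliases: average IOSQ {wait} ms")
```

In this model the average wait falls from 3.5 ms with no aliases to 0.5 ms with three aliases, and disappears once there are as many UCBs as concurrent requests.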
If you do not have PAV, then you will need to identify all your files that are VERY IO intensive, and try to isolate them onto their own volume. Examples that may need to be isolated include the RACF database, catalogs, and DB2 and other DBMS log files.
If you see a lot of wait time for disks, especially for write IO, then it is possible that your disk cache is not big enough. Most disk subsystems have software that allows you to monitor this.
Another possibility is that your disk channels are too busy. You can check this out by using RMF reports as explained in the RMF section.
Finally, the performance of some disk subsystems can be affected by the way you mix your data. In general, it is a bad idea to fill up part of a subsystem with the same kind of data. For example, most disk subsystems are physically made up of numbers of RAID arrays. It is best to spread data from each application among these arrays, and not reserve an array for each application.
EADM creates an I/O profile for your mainframe storage, which can be for all your disks, for a given storage pool, control unit, application or single disk. EADM extracts data from existing RMF and CMF files and builds up a history of 'normal' disk activity. It can then very quickly detect and report on abnormal events, and suggest resolutions.
It will report on deficiencies such as
EADM is said to be very easy to install and use. Once set up, reports can be fully automated and take about 5 minutes to process, then the results are sent to you in Word or HTML format by e-mail.
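The general idea of comparing current activity against a history of 'normal' activity can be sketched as follows. This is not EADM's algorithm, which is not described here; it is a generic baseline check that assumes response times in milliseconds and a hypothetical threshold of three standard deviations.

```python
from statistics import mean, stdev


def abnormal_intervals(history_ms, recent_ms, threshold=3.0):
    """Flag recent samples well above the historical baseline.

    history_ms: past response times that define 'normal' (assumed data).
    recent_ms:  new samples to check against that baseline.
    Returns (index, value) pairs for samples above mean + threshold * stdev.
    """
    limit = mean(history_ms) + threshold * stdev(history_ms)
    return [(i, value) for i, value in enumerate(recent_ms) if value > limit]


# Made-up response times for one volume, purely for illustration.
history = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.0, 1.1]
recent = [1.0, 1.1, 4.5, 0.9]
print(abnormal_intervals(history, recent))
```

With these sample figures only the 4.5 ms interval is reported, because the others fall within the normal spread of the history.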
Other interesting products from Technical Storage include
EAMC (Easy Analyse Mainframe Check) which monitors MSU / MIPS consumption per LPAR.
EATM (Easy Analyse Tape Mainframe) analyses various mainframe tape control files (TMS, TLMS, ControlT, SLS files, MVC files) to build up a normal pattern of tape activity and can alert when capacity is underused, or real capacity in virtualised systems is nearly full.
This tip originally came from Dave Hollingsworth. If you code a JOBLIB in your JCL, z/OS will search the JOBLIB datasets for every program the job runs, even programs that are in SYS1.LINKLIB. Dave quotes IEFBR14 as an example. At his site it occurred in 30,000 jobs that had a concatenation of 17 JOBLIB datasets, which meant 510,000 unnecessary accesses. They changed their JCL to use STEPLIBs instead of JOBLIBs and shaved 20 minutes off their batch run. Other potential problem programs include SORT and IEBGENER.
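The arithmetic behind those figures can be checked with a few lines of Python. The per-search cost is only a back-of-envelope estimate: it assumes one wasted directory search per JOBLIB dataset per job, and that the whole 20 minutes came from removing those searches, neither of which is confirmed by the original tip.

```python
jobs = 30_000          # jobs that executed IEFBR14
joblib_datasets = 17   # datasets in the JOBLIB concatenation
minutes_saved = 20     # reported reduction in batch run time

# One unnecessary search per JOBLIB dataset for every job (assumption).
wasted_searches = jobs * joblib_datasets

# Implied average cost of each search if it explains all the saved time.
ms_per_search = minutes_saved * 60 * 1000 / wasted_searches

print(f"{wasted_searches:,} unnecessary searches")
print(f"about {ms_per_search:.2f} ms per search")
```

That works out to 510,000 searches at a little over 2 ms each, which is plausible for a directory read that goes to disk.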