Fixing Tape Errors

Long term retention

You can get problems with old data that has been stored on tape for many years. Before discussing the physical retention issues, how about starting with software issues? Say you have a requirement to keep a copy of application data for 7 years. You faithfully back up your application to tape, and six years later you need to restore it to check something. In the meantime, your application programs have changed, and so have your databases. You can get the data off tape with no problem, but you can't read it in a meaningful way. The answer? Back up your data, your database definitions, your application programs, and anything else you might need, like database compression tables. Even then, 5 years down the line you could move to a different operating system and the whole lot is useless. The point is that long term backups have to be thought through carefully. It is not enough just to get the data down to reliable tapes.
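One way to make this concrete is to keep a manifest with each long term backup and check that it lists everything needed to interpret the data later. The following is a minimal Python sketch of that idea. The component names and the `missing_components` helper are illustrative assumptions, not part of any backup product.

```python
# Components the text suggests saving together with the application data.
# These labels are hypothetical names chosen for this sketch.
REQUIRED_COMPONENTS = {
    "application_data",
    "database_definitions",
    "application_programs",
    "compression_tables",
}


def missing_components(manifest):
    """Return the required components absent from a backup manifest, sorted."""
    return sorted(REQUIRED_COMPONENTS - set(manifest))


if __name__ == "__main__":
    # A backup that saved the data and the definitions, but nothing else.
    backup = ["application_data", "database_definitions"]
    print(missing_components(backup))
```

A check like this only confirms that the pieces were saved. It cannot tell you whether they will still run on a future operating system.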

Avoiding Tape Errors

OK, you've got a good application backup, but you're not sure about tape life. Your manufacturer quotes 10 years, but will EVERY tape last for 10 years? There are a number of reasons why tapes can go faulty. The obvious one is manufacturing errors, but tapes will also go progressively faulty due to chemical reactions between the tape surface and the water vapour and dust particles in the air. Fluctuating storage temperatures speed these reactions up. Manufacturers recommend that tapes should be stored at a temperature of about 20 degrees Centigrade and a relative humidity of 50%, so the correct storage environment is critical.
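If your tape store has temperature and humidity logging, a simple check against the recommended figures can highlight problem periods. This is a rough Python sketch. The targets come from the recommendation above, but the tolerance bands and the `out_of_range` function are assumptions made for illustration, so check your own manufacturer's limits.

```python
TARGET_TEMP_C = 20.0        # recommended storage temperature
TARGET_HUMIDITY = 50.0      # recommended relative humidity, in percent
TEMP_TOLERANCE_C = 3.0      # assumed band for this sketch
HUMIDITY_TOLERANCE = 10.0   # assumed band for this sketch


def out_of_range(readings):
    """Return the (temp_c, humidity) readings outside the assumed bands."""
    return [
        (temp_c, humidity)
        for temp_c, humidity in readings
        if abs(temp_c - TARGET_TEMP_C) > TEMP_TOLERANCE_C
        or abs(humidity - TARGET_HUMIDITY) > HUMIDITY_TOLERANCE
    ]


if __name__ == "__main__":
    log = [(20.0, 50.0), (26.5, 48.0), (21.0, 65.0)]
    print(out_of_range(log))
```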

Keep your tape store clean. It's always a temptation to use free space in a tape store to hold other equipment. Avoid storing paper or frayed cardboard boxes with tapes, as these are a prime source of airborne dust.

Keep the drive heads clean. It's best to establish a cleaning routine of once per day, or even once per shift if the drives are busy. Robotic tape silos will do this automatically.

Monitoring and detecting errors

Occasional read errors are usually corrected internally by the parity bits on the tape. These errors might be reported, either in an external error recording system like IBM's EREP, or internally on the tape. For example, LTO tapes hold error information in the LTO-CM, the cartridge memory chip. The trick here is to intercept these errors, and use them to decide that a tape is deteriorating and that the data should be copied off it. On a mainframe you should also use EREP reports to identify faulty tape drives. If you do not do this monitoring, then you might have a lot of old, faulty tapes around.
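As a rough illustration of using corrected errors to spot a deteriorating tape, the Python sketch below flags volumes whose error counts are high or steadily rising. It assumes you have already extracted a count of corrected read errors per mount for each volume. The data layout, the thresholds, and the `tapes_to_copy` function are hypothetical and do not reflect the EREP or LTO-CM formats.

```python
def tapes_to_copy(error_history, max_errors=50, rising_mounts=3):
    """Return volume labels whose corrected-error counts suggest deterioration.

    error_history maps a volume label to corrected read-error counts,
    oldest mount first. A tape is flagged if its latest count exceeds
    max_errors, or if the counts rose strictly across the last
    rising_mounts mounts. Both thresholds are assumptions for this sketch.
    """
    flagged = []
    for volume, counts in sorted(error_history.items()):
        if not counts:
            continue  # no mounts recorded, nothing to judge
        recent = counts[-rising_mounts:]
        rising = len(recent) == rising_mounts and all(
            earlier < later for earlier, later in zip(recent, recent[1:])
        )
        if counts[-1] > max_errors or rising:
            flagged.append(volume)
    return flagged


if __name__ == "__main__":
    history = {
        "A00001": [2, 1, 3, 2],   # noisy but stable
        "A00002": [1, 4, 9],      # rising on every mount
        "A00003": [0, 0, 75],     # sudden jump above the limit
    }
    print(tapes_to_copy(history))
```

Grouping the same counts by drive instead of by volume would give a similar view of faulty drives.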

Another avoidance technique is to read the tapes occasionally, and catch faulty ones before they deteriorate too much. Software products exist to help with this. For example, FATS/FATAR from Innovation Software can be used to check that mainframe tapes are still readable.

One final idea is to set up a schedule to copy tapes every 3 years or so. Backup products like DFHSM and Spectrum Protect (TSM) will do this for you, as they reclaim older tapes once the active data on them falls below a pre-defined threshold.
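The two triggers described here, tape age and the amount of active data left, can be sketched in a few lines of Python. This is an illustration only, not how DFHSM or Spectrum Protect implement reclamation. The `needs_copy` function, the 40% threshold, and the parameter names are assumptions.

```python
from datetime import date


def needs_copy(written_on, active_percent, today,
               max_age_years=3, reclaim_below=40.0):
    """Decide whether a tape should be copied to fresh media.

    A tape qualifies if it is at least max_age_years old, or if the
    share of still-active data has fallen below reclaim_below percent.
    Both limits are assumed values for this sketch.
    """
    age_years = (today - written_on).days / 365.25
    return age_years >= max_age_years or active_percent < reclaim_below


if __name__ == "__main__":
    today = date(2019, 1, 1)
    print(needs_copy(date(2015, 1, 1), 90.0, today))  # old tape
    print(needs_copy(date(2018, 6, 1), 25.0, today))  # mostly expired data
    print(needs_copy(date(2018, 6, 1), 80.0, today))  # recent and well used
```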

Fixing Tape Errors

Most tape problems these days are down to human error, that is, someone has overwritten a tape that was required. If this happens, then it might be possible to recover some of the older data, especially if the new file is quite small. When you write a file to a tape, you write an end-of-tape marker after the file, and normal applications will not read past that marker. However, specialist apps do exist that can read past the end-of-tape marker and recover the older data. OnTrack for Open Systems and FATS/FATAR for mainframe are two products that come to mind, but a Google search will bring up others. Unfortunately, the data that was overwritten by the new file will have been lost.

If you get problems reading a tape, you could try reading it on a different drive. If that fails, check the cartridge for physical damage and check your system logs for error messages. Another question to consider: is the tape encrypted or compressed? If so, you will need the encryption key or decompression tables to read it.
You should start by checking with either the company that sold you your drives, or the manufacturer if they are not the same. Ask them for assistance, and check their web sites for FAQs and troubleshooting tips that may help you diagnose your tape errors.

If your tape has deteriorated so far that the oxide surface is peeling off it, then you are past the point of no return. However, if you can't read the data on a tape, and it's critical, then it is worth trying the following.

A mainframe alternative is to try recovery software like Innovation's FATAR, which will try to clean the tape by moving the faulty spot backwards and forwards quickly under the drive head. If all else fails, FATAR will drop faulty blocks and strip off all the data it can read.

Open Systems issues

Modern LTO tapes are very reliable and can usually handle 20,000 load and unload cycles. If you are adding your own sticky bar-code labels, make sure they fit inside the recessed area on the cartridge, because labels that overlap the recess can cause problems.
If the leader pin on an LTO cartridge gets dislodged, it is possible to replace it. The recommendation is to get an official leader pin re-attachment kit from your supplier and follow their instructions. Generally, you would just do this to recover the data on the tape, and discard the tape once you have copied it.
The first meter or so of the tape is the leader tape and does not hold data. The next portion is the LTO reference area, and if that becomes damaged, the tape data cannot be recovered.
LTO records data in a serpentine fashion up and down the tape surface, with many tracks spanning the tape. While the tapes are robust, if you do get a physical error on the tape surface, it will affect a lot of tracks and so would most likely mean that the data cannot be recovered.

The tape surface on DAT drives is angled to the tape drive drum. Some products, OnTrack for instance, can re-align the tape if the tracking becomes skewed. OnTrack can also recover a damaged System Area at the front of a DAT drive.


Lascon updates

I retired 2 years ago, and so I'm out of touch with the latest in the data storage world. The Lascon site has not been updated since July 2021, and probably will not get updated very much again. The site hosting is paid up until early 2023 when it will almost certainly disappear.
Lascon Storage was conceived in 2000, and technology has changed massively over those 22 years. It's been fun, but I guess it's time to call it a day. Thanks to all my readers in that time. I hope you managed to find something useful in there.
All the best