Tape has been declared dead several times over the years, but the technology is still very much alive in 2012 and seems to be going places. Some of the reasons given for tape's imminent demise were:
- It can be quite difficult to fill up a high capacity tape in a reasonable time. Take a 4TB tape: writing at 800GB/hour, it takes 5 hours to fill. This is true, but it is a consequence of the large capacity rather than a limitation of the technology. It is only a real problem when recycling tapes where some of the data on the tape has expired, for example DFHSM archives or TSM backups.
- An associated problem occurs when the data on a tape expires at different rates, and again DFHSM and TSM are good examples of this. If you recycle at 50%, then up to 2TB of capacity is wasted on a 4TB tape. This problem, of course, is little different for several smaller tapes, as up to 50% of capacity is still wasted.
- Getting a specific piece of data off a large tape can take a while (I've waited 40 minutes to recover a critical file from the end of a 3480). This is partly because a tape used to be read sequentially from start to finish. This problem has been fixed with linear serpentine tapes, where it is only necessary to read down the track that holds the data, rather than the whole tape.
- Only one task can use a tape at a time. This causes problems when several people want to recall migrated data which is stored on one tape. This is still true, though LTFS promises to make it less of an issue.
- The cost differential between disk and tape is eroding. True, but a TB of data on tape is still significantly cheaper than disk, and once data is on tape, there is little power required to maintain it. This means that the 'green' environmental cost of tape is a lot less than disk.
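The fill-time and recycling figures quoted in the list above are simple arithmetic. As a minimal sketch, using hypothetical function names and working in GB, they can be reproduced like this:

```python
def fill_time_hours(capacity_gb: float, write_rate_gb_per_hour: float) -> float:
    """Hours needed to fill a tape when writing at a steady rate."""
    return capacity_gb / write_rate_gb_per_hour


def max_wasted_gb(capacity_gb: float, recycle_threshold: float) -> float:
    """Upper bound on expired (wasted) space held on a tape just before
    it reaches the recycle threshold, e.g. 0.5 for recycling at 50%."""
    return capacity_gb * recycle_threshold


# The example from the text: a 4TB tape written at 800GB/hour,
# recycled once 50% of its data has expired.
hours = fill_time_hours(4000, 800)
wasted = max_wasted_gb(4000, 0.5)
print(f"fill time: {hours} hours, worst-case waste: {wasted} GB")
```

The same waste function gives 50% for any tape size, which is the point made above: smaller tapes do not remove the problem, they only divide it into smaller pieces.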
Tape was traditionally used for backup, and while disk based backups are available, explosive data growth makes them relatively expensive, even with de-duplication and compression. A suitable backup storage strategy seems to be disk-disk-tape, or DDT. The reason for this is that 95% of restore requests are made within 14 days of a backup, so it might be worth keeping backups on disk for the first two weeks, but after that, tape is suitable and cheap.
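The two-week rule behind DDT can be expressed as a very small placement policy. This is only an illustrative sketch: the function name and the fixed 14 day cut-off are assumptions taken from the figures above, not the behaviour of any particular backup product.

```python
from datetime import date

# Assumption from the text: most restores happen within 14 days of the backup.
DISK_RETENTION_DAYS = 14


def backup_tier(backup_date: date, today: date,
                disk_days: int = DISK_RETENTION_DAYS) -> str:
    """Return 'disk' while a backup is young enough to be restored often,
    and 'tape' once it is older than the disk retention period."""
    age_days = (today - backup_date).days
    return "disk" if age_days <= disk_days else "tape"


print(backup_tier(date(2012, 1, 1), date(2012, 1, 10)))  # 9 days old
print(backup_tier(date(2012, 1, 1), date(2012, 2, 1)))   # 31 days old
```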
Tape is also useful for long term archives, and Active Archives. Tape has a number of advantages over disk when used for long term archive. Cost is the obvious one, but less obvious is that since tape reliability has improved by 700% in the last decade, it is now more reliable than disk for long term retention.
Active Archive is a relatively new idea where a number of companies are trying to establish standards for accessing data on different types of storage. The difference between Active Archive and HSM is that whereas with HSM, when a migrated file is required it is recalled back to primary storage, with Active Archive, files are accessed from wherever they reside in the storage hierarchy.
IBM is driving a new initiative to introduce a common file system for tape, called LTFS. This has obvious advantages for Active Archive, as it will make it easier to access individual files on tape. It will also have benefits for long term archiving, as currently most applications use their own data storage formats. If a universal format is agreed, then it should be much easier to read 30 year old data. There is no guarantee that any current backup application will be available 30 years from now to read a proprietary tape format, but if the application supports an open source piece of software like LTFS, then the chance is much higher that the LTFS drivers will still be available.
The improvement that comes with LTFS is that the media is mounted and read by the operating system instead of the application. There have been other attempts at introducing open tape formats, but because IBM has released LTFS as open source and because it works at the file system level, it has a better chance of adoption than previous solutions, which were vendor proprietary.
LTFS combines well with LTO5. LTO5 introduced an ability to carve a tape up into two media partitions. LTFS has a directory that details the contents of a tape. This can be placed in Partition 0, which is small and can be quickly read to allow anyone to see what is on the tape. Partition 1 is larger and is used to hold the real data.
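The split between a small directory partition and a large data partition can be pictured with a toy model. The sketch below is not the real LTFS on-tape format, and the class and method names are hypothetical; it only shows why listing the contents needs nothing more than Partition 0.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyTape:
    # "Partition 0": a small directory mapping file name -> (offset, length).
    index: Dict[str, Tuple[int, int]] = field(default_factory=dict)
    # "Partition 1": the large area that holds the real data.
    data: bytearray = field(default_factory=bytearray)

    def write_file(self, name: str, payload: bytes) -> None:
        """Append the payload to the data partition and record it in the index."""
        offset = len(self.data)
        self.data.extend(payload)
        self.index[name] = (offset, len(payload))

    def list_files(self) -> List[str]:
        """See what is on the tape by reading the index only."""
        return sorted(self.index)

    def read_file(self, name: str) -> bytes:
        """Look up the position in the index, then read just that range."""
        offset, length = self.index[name]
        return bytes(self.data[offset:offset + length])


tape = ToyTape()
tape.write_file("a.txt", b"hello")
tape.write_file("b.txt", b"world!")
print(tape.list_files())
print(tape.read_file("b.txt"))
```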
Not only would an LTFS export option be ideal for situations where large volumes of data need to be shared with others, but it could be seen as a best practice for archive data.
Another potential use for LTFS/LTO5 is for large data transfers to the cloud. One of the issues with the cloud is that the internet is not geared up for transferring terabytes of data. Customers who require large data uploads frequently do this by shipping magnetic disks to a cloud data centre for upload. An LTO5 tape that could be read anywhere with an LTFS driver would be an easier and cheaper option.
Where does virtual tape fit in? Virtual tape consists of virtual tape drives which write to a disk buffer. The disk buffer is then flushed out to tape when enough data has been stored. The Virtual Tape section explains this in detail. Virtual tape can be used to fill tapes, and can even allow concurrent access to 'tape' data while it is in the disk cache. However, the downside is that the cost ratio of disk to tape is not as good with virtual tape.
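The buffer-then-flush behaviour can be sketched in a few lines. This is a simplified illustration with hypothetical names and a single size threshold; real virtual tape products use their own, more elaborate policies.

```python
class VirtualTapeBuffer:
    """Toy model of a virtual tape drive: writes land in a disk buffer,
    which is flushed to physical tape once enough data has built up."""

    def __init__(self, flush_threshold_gb: float) -> None:
        self.flush_threshold_gb = flush_threshold_gb
        self.buffered_gb = 0.0
        self.tape_writes = []  # size in GB of each flush to physical tape

    def write(self, size_gb: float) -> None:
        """Accept a write into the disk buffer, flushing if it is full enough."""
        self.buffered_gb += size_gb
        if self.buffered_gb >= self.flush_threshold_gb:
            self.flush()

    def flush(self) -> None:
        """Stream everything in the buffer to tape in one large write."""
        if self.buffered_gb > 0:
            self.tape_writes.append(self.buffered_gb)
            self.buffered_gb = 0.0


vt = VirtualTapeBuffer(flush_threshold_gb=100)
for size_gb in (40, 40, 40):
    vt.write(size_gb)
print(vt.tape_writes, vt.buffered_gb)
```

Batching small writes into one large flush is what lets virtual tape fill physical tapes efficiently, at the price of the extra disk that holds the buffer.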