Open Systems tape usage is almost entirely confined to backup and recovery. Some backup products, TSM for example, work just like a VTS in that they back up data to a disk cache and then stage it off to tape later. SATA disk is now so cheap that backup to disk is a viable option, and it will speed up recovery. So is there a requirement for Virtual Tape in the Open Systems arena?
The key is to look at the backup options and see what they can deliver.
Backups traditionally go to tape. This is disruptive to applications and tends to be slow. Recoveries can also be slow. Tape is a serial medium; only one backup job can use a tape at a time, and it can be difficult to fill tapes to capacity. On the plus side, tape is cheap and makes it easy to take data off-site for disaster recovery. Direct tape backups are suitable for moderate quantities of data where a reasonable overnight window is available.
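Whether a direct tape backup fits the overnight window is simple arithmetic: data volume divided by sustained throughput. The sketch below is only an illustration; the function names, the drive throughput and the data sizes are assumptions, not figures from any product.

```python
def backup_hours(data_gb, drive_mb_per_s, drives=1):
    """Estimate elapsed hours to stream data_gb to tape.

    Assumes each drive sustains drive_mb_per_s and that the data splits
    evenly across the drives; both are simplifying assumptions.
    """
    total_mb = data_gb * 1024
    seconds = total_mb / (drive_mb_per_s * drives)
    return seconds / 3600


def fits_window(data_gb, drive_mb_per_s, window_hours, drives=1):
    """Return True if the estimated backup time fits the backup window."""
    return backup_hours(data_gb, drive_mb_per_s, drives) <= window_hours
```

With these assumed numbers, 2,000 GB to a single drive at 80 MB/s takes just over 7 hours, so it fits an 8 hour window but not a 6 hour one unless a second drive is used.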
The fastest way to take a backup is to take a disk snap-copy. It is possible to back up terabytes of data this way in a few minutes. The initial copy happens by creating pointers to the old data. This is explained in detail in the snap copy section. If requested, the snap copy software can continue to copy the data under the covers until a full physical copy of the data exists, but this operation can take several hours. Once a full copy is complete, it is possible to snap copy the data back again for fast recovery. The advantage of this process is the ability to take very fast backups of large amounts of data almost non-disruptively. The disadvantages are the cost of maintaining two sets of disks, plus the cost of the snap copy software, and the fact that only one backup is possible unless you purchase another set of disks.
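The pointer-based behaviour can be illustrated with a toy model. This sketch assumes a copy-on-write design, which is one common way of implementing snap copies; the class and method names are hypothetical and are not taken from any vendor's software.

```python
class Volume:
    """A toy block volume; blocks are held in a list."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snap(self):
        # Creating the snapshot copies no data: it starts with an empty
        # table of preserved blocks and reads through to the live volume.
        snapshot = Snapshot(self)
        self.snapshots.append(snapshot)
        return snapshot

    def write(self, index, data):
        # Copy-on-write: before overwriting, preserve the old block for any
        # snapshot that has not already saved its own copy.
        for snapshot in self.snapshots:
            snapshot.preserved.setdefault(index, self.blocks[index])
        self.blocks[index] = data


class Snapshot:
    """A point-in-time view of a Volume."""

    def __init__(self, volume):
        self.volume = volume
        self.preserved = {}  # block index -> data as it was at snap time

    def read(self, index):
        # Unchanged blocks are still read from the live volume.
        if index in self.preserved:
            return self.preserved[index]
        return self.volume.blocks[index]

    def full_copy(self):
        # The optional background copy: materialise every block so the
        # result no longer depends on the source volume.
        return [self.read(i) for i in range(len(self.volume.blocks))]
```

The snap itself is instant because nothing is copied until a block changes; only `full_copy` touches every block, which is why the full physical copy is the slow part.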
You fix that last issue by snap copying data to disk then copying the snap data off to tape at your leisure. FDRinstant is an example of this. Once the data is copied off to tape, the second copy disks are available for another backup. This method can cope with large amounts of data quickly and can store several versions of backups, but it is expensive.
A final way is to back up the data to disk instead of tape. Cheap SATA disks can be used to make the process economical. This is not the same as snap-copy; it requires a backup product that can recognise the SATA drives as valid backup media. It also does not require a full set of disks, just enough space to take a compressed backup. Unlike tape, it is possible to multi-stream backups to the same set of disks, and also possible to do multiple restores from a set of disks.
But what if your backup and recovery product does not support disk, or you do not want to expend the effort needed to switch your product from tape to disk? This is where a virtual tape system comes in. A Virtual Tape system will actually write data to disk, but it looks just like tape drives to your backup applications.
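The idea of disk that looks like tape can be sketched in a few lines. The toy class below offers only sequential, tape-like operations (rewind, write a record, read the next record) while actually storing the cartridge as a file on disk. Real VTLs emulate tape drives and libraries at the device level; the record format, class name and helper here are assumptions made purely for illustration.

```python
import os
import struct


class VirtualTape:
    """Sequential, tape-like access to a cartridge image held in a disk file."""

    def __init__(self, path):
        self.path = path
        # Create the image file if it does not exist, then open it read/write.
        open(path, "ab").close()
        self.image = open(path, "r+b")

    def rewind(self):
        self.image.seek(0)

    def write_record(self, data):
        # As on real tape, writing at the current position discards
        # everything that previously followed it.
        self.image.truncate(self.image.tell())
        self.image.write(struct.pack(">I", len(data)))  # 4-byte length header
        self.image.write(data)

    def read_record(self):
        # Return the next record, or None at end of data.
        header = self.image.read(4)
        if len(header) < 4:
            return None
        (length,) = struct.unpack(">I", header)
        return self.image.read(length)

    def close(self):
        self.image.close()


def create_cartridge(directory, barcode):
    """Create (or open) a virtual cartridge named after its barcode."""
    return VirtualTape(os.path.join(directory, barcode + ".img"))
```

Because each cartridge is just a file, many of them can be written or read at once, which is what removes the one-job-per-drive limit of physical tape.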
There are essentially three types of virtual tape libraries: those that eliminate tape by replacing it entirely with disk; those that augment tape by using a disk cache to virtualise the real tapes; and hybrid products that give you the option either to be disk only or to use backend tape. From here on, these types are referred to as VTE (Virtual Tape Elimination), VTA (Virtual Tape Augmented) and VTH (Virtual Tape Hybrid) products.
An advantage of virtual tape is the ability to configure multiple virtual libraries. With a single library, all hosts see one communal library. A single library is easy to manage, tends to tune its own performance and uses media more efficiently. The downside of a single library is that backups from multiple hosts are mixed together on one tape, and that may not be what you want.
Multiple virtual libraries are more flexible. You can keep backups from individual hosts on separate tapes. Multiple libraries are useful if:
You run more than one backup application, as you can then assign a virtual library to each backup application.
You have multiple SAN fabrics, as you can then configure one virtual library per SAN.
Your backup application does not handle SAN sharing, as you can then assign a library to each host.
The licensing agreement for your backup application is based on drive usage or back end storage and you need to limit available drives or storage. If you dedicate virtual libraries to some hosts you can optimise license fees.
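As a rough planning aid, the first two rules above can be expressed as a grouping: one virtual library per combination of backup application and SAN fabric. The host names, fabric names and dictionary keys below are hypothetical, and this is a sketch of the idea rather than any product's configuration format.

```python
from collections import defaultdict


def plan_virtual_libraries(hosts):
    """Group hosts into one virtual library per (backup application, fabric)."""
    libraries = defaultdict(list)
    for host in hosts:
        key = (host["backup_app"], host["fabric"])
        libraries[key].append(host["name"])
    return dict(libraries)


# Hypothetical inventory used only to show the shape of the result.
example_hosts = [
    {"name": "db01", "backup_app": "TSM", "fabric": "fabric-A"},
    {"name": "db02", "backup_app": "TSM", "fabric": "fabric-A"},
    {"name": "web01", "backup_app": "NetBackup", "fabric": "fabric-A"},
    {"name": "app01", "backup_app": "TSM", "fabric": "fabric-B"},
]
```

For this inventory the plan contains three virtual libraries: TSM on fabric-A (two hosts), NetBackup on fabric-A and TSM on fabric-B.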
There are a few suppliers of open systems virtual tape; the ones below are a sample.
FalconStor deserve to come first as so many other vendors base their offerings on FalconStor software, including IBM, EMC, SUN and COPAN. FalconStor also manufacture their own full VTL systems with de-duplication, IP based replication, tape management, snapshot, continuous data protection and a global management console.
EMC have been in the market for some time with a system that was based on Clariion hardware and FalconStor software. Version 3 of the VTE Disk Library uses 4Gb Fibre Channel internal communications and is integrated with Legato and NetBackup.
Fujitsu-Siemens do not appear in the table below, as their CentricStor product supports both mainframe and Open Systems and is detailed in the Mainframe VTS page. The CentricStor is VTH, is based on a grid architecture and is very scalable. Several VTLs can be configured across multiple sites to look like a single VTL. Fujitsu-Siemens have a much larger presence in Europe than North America.
HP offer two classes of VTLs, one aimed at SMBs and an enterprise class. The Enterprise models are based on Sepaton software, while the smaller versions use HP's own software.
IBM are strong in the mainframe area, but came to the Open Systems side more recently. The VTA TS7500 series are based on FalconStor software. They do not support de-duplication yet, but IBM recently bought Diligent, which had very good de-duplication software called ProtecTIER.
NetApp also produce VTLs that span entry level to enterprise. They use their own hardware and software throughout, and their Decru encryption is especially well thought of. As yet they do not support de-duplication, but it is on the horizon, as are clustering and full replication.
SUN produce two types of virtual tape, VTLplus which is VTA and based on FalconStor, and VTLprime, which is VTE. The VTLplus is discussed below. The VTLprime has smaller capacity, but full support for FalconStor de-duplication. There is a disappointing lack of real product data on the SUN website.
COPAN is almost unique, as it produces very high capacity VTE systems based on large arrays of MAID disks (Massive Array of Idle Disks: the disks spin down if they are not used, making them very power efficient). The software is FalconStor based and capacity can scale to 448 TB, making them very suitable for large archive stores.
Sepaton produces a VTE product that is content aware and relies on de-duplication to make effective use of its 1.2 PB physical capacity. It appears to be very popular with TSM sites.
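Several of the products above rely on de-duplication to stretch their physical disk capacity. The sketch below shows the general principle using fixed-size chunks identified by a SHA-256 hash, so that a chunk seen before is stored only once. This is a generic technique chosen for illustration; it is not a description of Sepaton's DeltaStor, FalconStor's SIR or any other vendor's method, and all names are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy de-duplicating store: identical chunks are kept only once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}   # digest -> chunk bytes, stored once
        self.backups = {}  # backup name -> ordered list of digests

    def store(self, name, data):
        digests = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Only chunks not seen before consume new space.
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.backups[name] = digests

    def restore(self, name):
        # Rebuild the backup by following its list of chunk digests.
        return b"".join(self.chunks[d] for d in self.backups[name])

    def ratio(self):
        # Logical bytes backed up divided by physical bytes stored.
        logical = sum(
            len(self.chunks[d]) for ds in self.backups.values() for d in ds
        )
        physical = sum(len(c) for c in self.chunks.values())
        return logical / physical
```

Repeated full backups of largely unchanged data are where this pays off, since each new backup adds only the chunks that differ from what is already stored.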
The following table identifies some of the selection criteria that can be used to pick a virtual tape system, with data for eight of the principal large systems vendors. This data was correct in October 2008.
Vendors and VTL equipment discussed. Information comes from manufacturers' data sheets and was correct in October 2008.

| Vendor | FalconStor | HP | IBM | SUN | Sepaton | COPAN | EMC | NetApp |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| VTL Model | VTLe | 12000 VTS | TS7530 | VTL Plus | S2100-ES2 | 300T/TX | DL 4400 | VTL 1400 |
| Virtual Scalability: how many nodes does the solution support, and what is the maximum number of virtual libraries, drives and tapes the solution can emulate per node? | Max 8 nodes. Maximum per node: 128 libraries, 1,024 drives, 65,536 tapes | Max 8 nodes. Maximum per node: 16 libraries, 128 drives, 65,536 tapes | Max 4 nodes. Maximum per node: 128 libraries, 1,024 drives, 64,000 tapes | Max 4 nodes. Maximum per node: 256 libraries, 1,024 drives, 64,000 tapes | Max 16 nodes. Maximum per node: 192 devices, either libraries or drives; 331,250 tapes | Maximum figures: 8 libraries, 56 drives, 8,192 tapes | Maximum figures: 256 libraries, 2,048 drives, 128,000 tapes | Maximum figures: 512 libraries, 3,000 drives, 20,000 tapes |
| Physical Tape capability: does the solution support physical tape (VTA), is it disk only (VTE) or a hybrid (VTH)? | VTA | VTH | VTA | VTA | VTE | VTE | VTE | VTH |
| Physical Scalability: what are the physical limits in terms of disk capacity? You would expect VTE products to have a large disk capacity. | Disk cache scales up to 256 TB | Disk cache is 1028 TB | Disk cache scales to 1.7 PB | Disk cache is 630 TB | Scales to 1.6 PB | Scales to 634 TB physical, or 8 PB after de-dup | Disk capacity to 1 PB usable | 550 TB usable |
| Interoperability: how open is the solution in terms of server support and external hardware support? | Supports a large range of tested servers, tape libraries, drives and backup software. | Supports a wide range of servers and backup software. Hardware support and drive emulation limited to HP products. | Supports a wide range of servers and backup software. Hardware support and drive emulation limited to IBM products. | No data | Supports a wide range of virtualised drives and library types | Supports a wide range of virtualised drives, library types and backup software | Supports a wide range of virtualised drives, library types and backup software | Supports a wide range of virtualised drives, library types and backup software |
| De-duplication: what type of data de-duplication does the solution support? | Post-process de-duplication called Single Instance Repository (SIR). | Enterprise products use Sepaton's post-process DeltaStor de-duplication software. | No support | SUN states that the VTLplus is 'de-dup ready' | DeltaStor de-dup software, currently guarantees 40:1 de-dup ratio | Post-process de-duplication | Post-process de-duplication as an extra cost option | None at present, but A-SIS de-dup is on the horizon |
| Clustering, failover and replication: how resilient is the solution? Does it support clustering and automatic failover? | Supports up to 8 nodes in 4 cluster pairs. IP based replication can be one to one or one to many. Replication is de-duplication aware, only unique data is replicated. | Not available October 2008, but on roadmap | 4 node cluster with active-active failover and path failover. IP replication between VTLs. | No data on clustering. Remote IP based replication is supported. | No data on clustering, replication is supported with Side2 software | Clustering with auto-failover. Remote IP based replication is supported | No data on clustering. Remote IP based replication is supported. | No clustering, replication is by manual tape export with manual failover |
| Encryption and WORM: security features include encryption support and WORM capability | Virtual tape shredding to US defense standards. | No data | Encryption at replication level, and on TS1120 drives. Virtual tape shredding supported | Encryption at tape level, and virtual tape shredding support. | Uses Decru encryption | Encryption supported on both internal storage and exported tapes | No data | Uses Decru encryption |
| User management: how do you manage the VTL? Can you manage multiple VTLs from one console? | Up to 8 nodes can be managed as a single group, including single sign-on, configuration management, central reporting and user management | Central management | Central web console for all management functions. | No data | No data | GUI console | Single console interface manages up to 8 systems | GUI console |
| VTL Model (table footer) | VTLe V5 | 6000 VTS | TS7530 | VTL Plus | S2100-ES2 | 300T/TX | DL 3.1 | Nearstor VTL |
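The criteria in the table lend themselves to a simple first-pass filter. The sketch below transcribes just two rows of the table (physical tape capability and maximum node count) and shortlists vendors against them; the function name and the choice of criteria are illustrative assumptions, and the data is only as current as the table itself.

```python
# Transcribed from the table: (vendor, tape capability, max nodes).
# None means the table gives no node count for that product.
PRODUCTS = [
    ("FalconStor", "VTA", 8),
    ("HP", "VTH", 8),
    ("IBM", "VTA", 4),
    ("SUN", "VTA", 4),
    ("Sepaton", "VTE", 16),
    ("COPAN", "VTE", None),
    ("EMC", "VTE", None),
    ("NetApp", "VTH", None),
]


def shortlist(products, tape_types, min_nodes=1):
    """Return vendors with a wanted tape capability that scale to min_nodes."""
    selected = []
    for vendor, tape_type, max_nodes in products:
        if tape_type not in tape_types:
            continue
        # Products with no published node count only pass when there is
        # no multi-node requirement.
        if max_nodes is None and min_nodes > 1:
            continue
        if max_nodes is not None and max_nodes < min_nodes:
            continue
        selected.append(vendor)
    return selected
```

For example, asking for disk-only (VTE) products returns Sepaton, COPAN and EMC, while asking for products with backend tape (VTA or VTH) that scale to 8 nodes returns FalconStor and HP.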
Of course there are other important criteria to consider when selecting a product, but they will really be unique to your site and situation. Consider:
Price! - no prices given here, be prepared to negotiate
Support - do you have good local support, with a second level backup for difficult problems? Ask the vendor to tell you about their support and escalation processes.
Financial stability - is the vendor financially secure? Your legal people should be able to help here.
Market Presence - how many of these VTLs have been sold? Ask the vendor for reference sites that are similar to yourselves and take the references up.
Roadmap - ask the vendor to share their future plans with you, under a non-disclosure agreement if necessary.