Open Systems tape usage is almost entirely confined to backup and recovery and long-term archive data. Some backup products, TSM for example, work just like a VTS: they back up data to a disk cache, then stage it off to tape later. SATA disk is so cheap that backup to disk is a viable option, and it speeds up recovery. Many companies use the Cloud to store backups, and while that data does not end up stored in fluffy white stuff, the physical storage is no longer your problem. So is there a requirement for Virtual Tape in the Open Systems arena?
The fastest way to take a backup is a disk snap-copy, but a snap-copy is dependent on the original disk: if that fails, the backup is lost too. You could also back up direct to disk, as cheap SATA disks make the process economical. But what if your backup and recovery product does not support disk, or you do not want to expend the effort needed to switch it from tape to disk? This is where a virtual tape system comes in. A virtual tape system actually writes data to disk, but it looks just like a set of tape drives to your backup applications.
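The principle can be sketched in a few lines of Python (all names here are hypothetical, purely to illustrate the idea): the backup application sees a sequential, append-only 'tape drive', while the data actually lands in an ordinary disk file.

```python
import os
import tempfile

class VirtualTapeDrive:
    """Hypothetical sketch: presents tape-like sequential block I/O to a
    backup application, but stores the blocks in an ordinary disk file."""

    def __init__(self, image_path):
        self.image_path = image_path          # disk file standing in for the cartridge
        open(image_path, "wb").close()        # mount a fresh 'cartridge'

    def write_block(self, data: bytes):
        # Tape semantics: strictly sequential, append-only writes
        with open(self.image_path, "ab") as img:
            img.write(len(data).to_bytes(4, "big") + data)

    def read_blocks(self):
        # A restore streams the blocks back in the order they were written
        with open(self.image_path, "rb") as img:
            while header := img.read(4):
                yield img.read(int.from_bytes(header, "big"))

# The backup application just 'writes to tape' as usual
path = os.path.join(tempfile.gettempdir(), "cartridge_0001.img")
drive = VirtualTapeDrive(path)
drive.write_block(b"backup block 1")
drive.write_block(b"backup block 2")
print(sum(len(b) for b in drive.read_blocks()), "bytes restored")
```

A real VTL does this at the SCSI/Fibre Channel level, emulating specific drive and library models, but the disk-behind-a-tape-interface idea is the same.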
There are essentially three types of virtual tape library: those that eliminate tape entirely by replacing it with disk; those that augment tape by using a disk cache to virtualise the real tapes; and hybrid products that give you the option to run disk only or to use backend tape. These are referred to below as VTE (Virtual Tape Elimination), VTA (Virtual Tape Augmented) and VTH (Virtual Tape Hybrid) products. Most Open Systems virtual tape systems are VTE architecture.
An advantage of virtual tape is the ability to configure multiple virtual libraries. With a single library, all hosts see one communal library. A single library is easy to manage, tends to tune its own performance, and uses media more efficiently. The downside is that backups from multiple hosts are mixed together on one tape, and that may not be what you want.
Multiple virtual libraries are more flexible. You can keep backups from individual hosts on separate tapes. Multiple libraries are useful if -
There are quite a few suppliers of open systems virtual tape; the ones below are a sample. A word about VTL capacity: vendors usually quote a 'logical capacity', by which they mean the equivalent capacity you would need without deduplication. This is fine, except that the deduplication ratios used to calculate it can be very optimistic.
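The arithmetic behind a quoted 'logical capacity' is simply physical capacity multiplied by an assumed deduplication ratio, which is why the assumed ratio matters so much. A quick illustration:

```python
def logical_capacity(physical_tb: float, dedup_ratio: float) -> float:
    """Vendor 'logical capacity' = physical capacity x assumed dedup ratio."""
    return physical_tb * dedup_ratio

# A 100 TB box quoted at an optimistic 20:1 looks like 2 PB of backups...
print(logical_capacity(100, 20))   # 2000.0 TB
# ...but the same box at a more conservative 8:1 holds far less
print(logical_capacity(100, 8))    # 800.0 TB
```

The physical capacity is fixed; only the marketing assumption changes, so it pays to check what ratio a vendor used and whether your data is likely to achieve it.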
The Hitachi Protection Platform for virtual tape was formerly called Sepaton, which is "No Tapes" spelled backward. No surprise then that a Hitachi / Sepaton VTL is VTE architecture, a disk-only device which can emulate either tape devices or Symantec's OST disk devices.
The hardware consists of a number of HP ProLiant servers running a 64-bit Linux kernel. Each server component is called a node, and the nodes are coupled together into a grid with DeltaScale software to provide automatic performance tuning and failover. Nodes can be added to boost performance as required, and each node delivers 10TB/h, so with the maximum of 8 nodes a Sepaton S2700 can handle a backup throughput of 80TB/h.
Host connectivity uses 4*16Gb Fibre Channel or 4*10Gb Ethernet, with 2*16Gb Fibre Channel connections to the disk storage.
Backup data can be segregated into storage pools to separate different kinds of data. Space reclamation is also managed by DeltaStor and runs continuously.
DeltaStor deduplication software runs concurrently with backups and Sepaton claims that it can provide deduplication ratios up to 100:1 as it processes parallel streams.
ContentAware software checks backup data for type (Word, Excel, etc.) and so picks out data likely to be good deduplication candidates. DeltaRemote software is used to maintain an offsite copy of the data, transmitting only changed data.
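The file-type detection idea can be sketched as follows (a hypothetical illustration, not Sepaton's implementation): inspect the first few 'magic' bytes of the data to work out what kind of file it is, so the deduplication engine can treat each type appropriately.

```python
# Hypothetical sketch of content-aware classification: identify backup
# data by its magic bytes. The signatures below are standard, published
# file-format markers.
SIGNATURES = {
    b"\xd0\xcf\x11\xe0": "legacy Office (doc/xls)",
    b"PK\x03\x04": "Office Open XML (docx/xlsx) or other zip container",
    b"%PDF": "PDF",
}

def classify(data: bytes) -> str:
    """Return a file-type label for the start of a data stream."""
    for magic, kind in SIGNATURES.items():
        if data.startswith(magic):
            return kind
    return "unknown"

print(classify(b"%PDF-1.7 ..."))             # PDF
print(classify(b"PK\x03\x04more-bytes..."))  # Office Open XML ...
```

A production engine would use far more signatures and deeper inspection, but the principle of steering deduplication by content type is the same.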
A VTE solution, Data Domain storage was designed to be 'the storage of last resort'; that is, the designers concentrated on ensuring that data was always valid and available rather than making concessions to boost performance. Extensive checksum processes guarantee that the backup data is the same as the data sent from the server. The architecture is log-structured file, so updated data is always written to a new location. Contrast this with standard RAID architectures, which require that when a RAID stripe is updated, some old data must be loaded into cache to recalculate parity, so there is always some possibility of losing older data after a power failure. Data Domain eliminates that risk, and its RAID6 implementation protects static data from two device failures.
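The log-structured idea above can be shown with a minimal sketch (hypothetical names, not Data Domain's code): an update is appended to the end of the log and an index repointed, so the old copy is never touched and a failure mid-write cannot corrupt it.

```python
class LogStructuredStore:
    """Sketch of a log-structured layout: every update is appended to the
    end of the log; nothing is overwritten in place, so older data can
    never be damaged by a failed update."""

    def __init__(self):
        self.log = []     # append-only log of (key, value) records
        self.index = {}   # key -> position of its newest record

    def put(self, key, value):
        self.index[key] = len(self.log)   # repoint index at the new copy
        self.log.append((key, value))     # old copy remains intact in the log

    def get(self, key):
        return self.log[self.index[key]][1]

store = LogStructuredStore()
store.put("block42", b"v1")
store.put("block42", b"v2")      # update: appended, not overwritten
print(store.get("block42"))      # b'v2'
print(store.log[0])              # ('block42', b'v1') - old version still intact
```

Contrast with a RAID stripe update, where old data and parity must be read, modified and rewritten in place: a power failure in the middle of that sequence can leave the stripe inconsistent.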
The smaller DD3300 can scale up to 32 TB of usable native capacity, with a logical capacity of 4.8PB with deduplication and Cloud support, and a data transfer rate of 7 TB/h. The larger DD9800 can hold up to 3PB of usable native capacity, with normal backup rates up to 68 TB/h. These rates can be more than doubled with Data Domain Boost, a performance-enhancing add-on that offloads some of the deduplication processing onto the clients. Data Domain can also use a Cloud tier, which significantly extends its capacity, to 150 PB.
EMC claims inline deduplication ratios of between 20:1 and 30:1, but this is very dependent on the type of data being copied.
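Why the ratio depends so heavily on the data can be seen with a toy deduplication engine (a sketch of the general technique, not EMC's implementation): split the stream into fixed-size chunks, keep each unique chunk once, and compare logical bytes in against unique bytes stored.

```python
import hashlib
import os

def dedup_ratio(stream: bytes, chunk_size: int = 4096) -> float:
    """Fixed-size-chunk dedup sketch: store each unique chunk once and
    report logical bytes in / unique bytes actually kept."""
    seen = set()
    for i in range(0, len(stream), chunk_size):
        chunk = stream[i:i + chunk_size]
        seen.add(hashlib.sha256(chunk).hexdigest())   # fingerprint the chunk
    unique_bytes = len(seen) * chunk_size
    return len(stream) / unique_bytes

# Highly repetitive data (think repeated full backups) dedupes very well...
print(dedup_ratio(b"A" * 4096 * 100))       # 100.0
# ...while unique data (random, encrypted, compressed) barely dedupes at all
print(dedup_ratio(os.urandom(4096 * 100)))  # close to 1.0
```

Real products use variable-size, content-defined chunking to cope with inserted bytes shifting chunk boundaries, but the ratio arithmetic is the same.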
FalconStor VTL with deduplication is a disk-based backup solution. A FalconStor VTL can be bought as a software-only option, as an integrated appliance on Dell, IBM or Hitachi hardware (servers plus storage) or as a gateway function on top of existing storage. The gateway function is appropriate for large scale enterprise solutions and consists of multiple FalconStor VTLs in a cluster, often backed by an HDS USP storage subsystem, supporting up to 2PB capacity in a RAID6 configuration. A single-node FalconStor VTL can achieve aggregate backup speeds of 20TB/hour, and it can handle up to 160TB/h backup data with a maximum 8 node cluster.
The FalconStor SIR deduplication engine offers a choice of deduplication options, including inline, post-process, concurrent deduplication, or no deduplication at all.
When used with high-speed protocols such as 16Gb Fibre Channel (FC) and 10GbE iSCSI, FalconStor VTL can sustain deduplication rates of over 5TB/hour per node, and scale up to a sustained deduplication rate of 20TB/hour as cluster nodes are added.
The FalconStor VTL supports a variety of protocols, such as FC, iSCSI, and NDMP, and it provides a plug-in component for Symantec NetBackup and Backup Exec Media Servers that works with the Symantec OST API to integrate with NetBackup. FalconStor VTL supports up to 250,000 Symantec OST images per node and also offers a Fibre Channel (FC) SAN target for Symantec OST.
The VTL also supports Fibre Channel attachment to physical tape libraries for import and export of tape data.
StarWind is a software-defined virtual tape library with relatively small data capacities. The three models are:
- VTL-160, with a usable raw capacity of 16TB, expected to be 160TB after deduplication and compression
- VTL-320, with 32TB raw and 320TB effective capacity
- VTL-640, with 64TB raw and 640TB effective capacity
The backend storage is presented as RAID5 by default, with the option to use RAID6, 50, 60 or RAID10. This storage can be extended to various public Clouds, including Amazon S3 and Glacier; Azure blob (premium, hot, cool and archive tiers); Backblaze B2; Wasabi; and IronMountain IronCloud. The virtual tapes are presented as LTO, and connectivity is 2x 1 GbE plus 2x 10 GbE, with the option to use 25 GbE.
Fujitsu has two Open Systems tape solutions. The ETERNUS CS800 is open systems only, while the ETERNUS CS8000 supports both open systems and mainframes.
The ETERNUS CS800 is a deduplication backup appliance designed for small and midsized environments. Two models are available, 'Scale' and 'Enterprise'. The backend storage is disk only with from 27 to 315 TB (Scale) and 102 to 1.02 PB (Enterprise) of usable capacity. Fujitsu claims the enterprise model can achieve up to 24 TB/hour in speed target mode, without restricting backup applications or requiring additional software installation. Deduplication and Replication software are included as standard. It emulates up to 256 (Scale) or 512 (Enterprise) DLT or LTO drives and can run disk backups with simultaneous NAS, VTL and OST interfaces. There is an 'export to tape' option for long term retention.
The ETERNUS CS8000 is discussed in detail on the mainframe page, but in short it supports major Unix and Windows operating systems as well as major tape libraries. It combines VTL and the NAS option to consolidate backup, archiving, compliant archiving and second-tier file storage in one appliance.
While Cybernetics specialises in IBM iSeries servers, the iSAN V 6000 VTL is also designed to provide virtual tape functionality to pSeries, Windows, Linux, Unix and Mac servers. The Cybernetics VTL is a disk-based appliance with the option to offload archives to physical tape if required. The appliance uses a cache to store the data initially, which assists when the library is performing parallel operations. All tape drives and libraries are supported for remote offload, and it is also possible to offload to removable USB or eSATA devices.
The iSAN V 6000 Series Virtual Tape Library holds up to 720 TB of raw, native capacity. Backups can write in sixteen concurrent, parallel data streams at up to 4 GB/s. Connectivity options are 1 GbE, 10 GbE, 40 GbE, Fibre Channel, SAS, or SCSI. The standard network interface is four 1 GbE ports, four USB 3.0 ports and two 12 Gb/s SAS ports, with the option to upgrade to 1/10 GbE, 8/16/32Gb FC, or 6/12Gb SAS.
Deduplication is post-process and performed at the byte level. Cybernetics claims dedup ratios averaging 50:1 for Windows, and as high as 700:1 for IBM midrange devices. iSAN uses AES-256 encryption, with the option to encrypt data at rest on disk and offline data on tape. Data can be replicated between two iSAN devices, in which case encryption is enabled by default. Replication also uses WAN-optimising compression to save bandwidth between the devices.
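The WAN-optimising compression idea is straightforward to illustrate (a generic sketch using zlib, not Cybernetics' actual algorithm): compress the payload before it crosses the link, and decompress on the remote device.

```python
import zlib

def replicate(payload: bytes) -> bytes:
    """Sketch of WAN-optimised replication: compress before sending so
    less bandwidth is needed on the link between the two devices."""
    return zlib.compress(payload, level=9)

backup = b"customer record " * 10_000     # repetitive backup data
wire = replicate(backup)
print(f"sent {len(wire)} of {len(backup)} bytes over the WAN")
assert zlib.decompress(wire) == backup    # remote side recovers it exactly
```

Like deduplication, the saving is data-dependent: repetitive backup streams compress dramatically, while already-compressed or encrypted data gains almost nothing.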