Cloud computing - is this finally the end of the line for Tape?
Have you heard about the new backup acronym D2CD? It stands for disk to cloud disk and is part of a new paradigm called cloud computing. In August 2008 IBM was telling its customers not to use tape for backup any more as it announced a $300 million investment in cloud centers. It looks like cloud could replace two of the main functions of storage management: backup and recovery, and storage provision. Is this the end of storage as we know it? So what is cloud computing and who is it targeted at?
Cloud is all about delivering computing as a service, where the customer just focuses on the service and does not need to worry about the technical implementation behind that service. Cloud at present typically consists of server, storage and network resources. Cloud is supposed to save money as customers just pay for the resources that they are actually using. It should also make IT more responsive to the needs of the business, as new services or capacity can be added quickly.
It appears that there are two types of cloud. 'Public' clouds are typically made available via the Internet and may be free but are limited in capacity and use and the customer usually lacks control.
'Private' clouds cost money, but in return they offer almost unlimited network bandwidth, availability and capacity. They also have better security facilities than public clouds and the customer has a lot more control over the facilities.
Cloud suppliers
Several companies are offering cloud computing services, including IBM, EMC, Amazon and Google. The first three are discussed here.
IBM
IBM is preparing to tell its customers to stop using tape and back up their data to 155 cloud-based IBM Business Resilience data centers.
IBM bought out Arsenal Digital Solutions in December 2007, and is basing its backup services on their software. It is calling its new backup to cloud disk offering IBM Information Protection Services. The combination of the IBM storage appliances and Arsenal's software is called a Data Protection Vault.
Business customers will be able to store and retrieve backed up information from their vault in the cloud for purposes of file restoration, general business continuance, compliance audits and disaster recovery. Data can be restored to the originating site or an alternate one. There will be a pay-as-you-go subscription service. IBM plans to make its "Resilient Cloud" program fully available in early 2009.
The Data Protection Vault is intended for servers, PCs and laptops. It provides flexible retention policies, long term archiving, fast backup and recovery that does not depend on tape, and support for security and compliance.
EMC
EMC runs an enterprise class platform called Fortress, on which it plans to run a number of Software as a Service (SaaS) applications. The first SaaS application is MozyEnterprise, intended for online backup of desktops, laptops and remote Windows servers.
MozyEnterprise automates secure online backup and recovery over the Internet for consistent and reliable off-site data protection for remote desktops, laptops and branch office servers. EMC claims to have more than 500,000 businesses and consumers using the Mozy technology to back up their information, including General Electric, Pariveda, NTG Systems, Vanderbilt University and Free The Children.
MozyEnterprise is available either through EMC's direct sales organization or through a number of resellers. EMC prices the service per client: a desktop or laptop costs $5.25 per month plus $0.70 per month per gigabyte protected, and a Windows server costs $9.25 per month plus $2.35 per month per gigabyte protected. So a 1TB Windows server would cost just under $30,000 to protect for a year. That is 1TB of actual user data, of course.
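The arithmetic behind that estimate can be sketched as follows. This is only an illustration built from the prices quoted above; the function name and the choice of 1 TB = 1,024 GB are assumptions made for the example, not anything published by EMC.

```python
def mozy_annual_cost(base_per_month, per_gb_per_month, protected_gb, months=12):
    """Annual cost for one client: a flat monthly fee plus a per-gigabyte charge."""
    monthly = base_per_month + per_gb_per_month * protected_gb
    return monthly * months

# Prices quoted in the article, in US dollars per month.
SERVER_BASE, SERVER_PER_GB = 9.25, 2.35
DESKTOP_BASE, DESKTOP_PER_GB = 5.25, 0.70

GB_PER_TB = 1024  # assumption: a binary terabyte

server_cost = mozy_annual_cost(SERVER_BASE, SERVER_PER_GB, 1 * GB_PER_TB)
print(f"1 TB Windows server: ${server_cost:,.2f} per year")
```

With a binary terabyte this comes to roughly $29,000 a year; with a decimal terabyte (1,000 GB) it is a little over $28,000. The per-gigabyte charge dominates, and the flat monthly fee barely registers at this size.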
Amazon
Amazon offers two cloud components: EC2 and EBS.
Amazon EC2 (Elastic Compute Cloud) is a virtual server provisioning service.
Amazon EBS (Elastic Block Store) provides storage volumes of between 1 GB and 1 TB in size that are normally attached to EC2 instances. Amazon splits the world up into Availability Zones, and an EBS volume must be attached to an EC2 instance that is in the same zone. Each volume is also automatically replicated within the same Availability Zone to cope with physical device failure, and you can optionally take space-efficient snapshots of those volumes for backup and recovery purposes, but again, the snapshots must be in the same Availability Zone.
These volumes behave like raw, unformatted block devices, with user supplied device names and a block device interface. You can create a file system on top of Amazon EBS volumes, or use them in any other way you would use a block device (like a hard drive).
In November 2008, Amazon was quoting prices for EBS at $0.10 per allocated GB per month and $0.10 per 1 million I/O requests you make to your volume.
As an example, a 1TB disk averaging 200 I/O requests per second would make about 6.3 billion requests in a year. It would cost about $1,200 a year for the disk and about $630 for the I/O, giving a total annual cost of about $1,830.
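The EBS example can be worked through directly from the quoted prices. What follows is a minimal sketch, assuming a decimal terabyte (1,000 GB), a 365-day year and a constant I/O rate; the function and parameter names are invented for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000

def ebs_annual_cost(size_gb, io_per_second,
                    storage_price_per_gb_month=0.10,
                    io_price_per_million=0.10):
    """Return (storage, io, total) annual cost in US dollars."""
    storage = size_gb * storage_price_per_gb_month * 12
    requests = io_per_second * SECONDS_PER_YEAR
    io = requests / 1_000_000 * io_price_per_million
    return storage, io, storage + io

storage, io, total = ebs_annual_cost(size_gb=1000, io_per_second=200)
print(f"storage ${storage:,.0f}, I/O ${io:,.0f}, total ${total:,.0f}")
```

At 200 I/O requests per second a volume issues roughly 6.3 billion requests a year, which at $0.10 per million comes to about $630. Added to about $1,200 for the allocated space, the total is roughly $1,830 a year.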
Problems with the cloud
Data security
The cloud is not a perfect system. The cloud can burst, and you still need to make sure your data is getting backed up. Ylastic resells Amazon cloud storage and recently lost some of its customers' data. When they tried to recover it, they discovered that the EBS snapshotting had not worked for some weeks. They had to recover to an older snapshot, losing some customer data in the process. This is just a reminder that the customer cannot just ignore what is happening in the cloud; they need to make sure that processes are working.
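One way a customer might keep an eye on this is a simple freshness check on the most recent successful snapshot of each volume. The sketch below is purely illustrative: it does not call any real cloud API, and the volume names and inventory are hypothetical. It assumes the snapshot timestamps have already been gathered from whatever reporting the provider offers.

```python
from datetime import datetime, timedelta, timezone

def stale_volumes(latest_snapshot, max_age, now=None):
    """Return the volumes whose newest snapshot is missing or older than max_age.

    latest_snapshot maps a volume name to the datetime of its newest
    successful snapshot, or None if no snapshot has ever completed.
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name for name, taken in latest_snapshot.items()
        if taken is None or now - taken > max_age
    )

# Hypothetical inventory, fixed in time so the example is repeatable.
now = datetime(2008, 11, 30, tzinfo=timezone.utc)
inventory = {
    "vol-orders": now - timedelta(hours=6),
    "vol-mail": now - timedelta(days=21),   # snapshots silently stopped
    "vol-archive": None,                    # never snapshotted
}
overdue = stale_volumes(inventory, max_age=timedelta(days=1), now=now)
print(overdue)
```

Run daily, a check like this would flag a snapshot process that had stopped working after a day rather than after some weeks.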
Data Protection
There are also legal implications around where backup or hosted data is stored. Data on EU citizens is not supposed to leave the EU, so a potential cloud customer will need to seek assurances that this is not happening. There are also implications around entrusting your confidential data to a third party, and relying on them to look after it for you.
What about the future?
At present, it looks like cloud computing is limited to PCs, laptops and small to maybe medium servers, mainly those hosting internet applications. However, any CEO must be tempted by the prospect of handing over responsibility for their storage functions to others. Surely it is just a matter of time before the cloud is extended to include large servers and even mainframes.