More and more companies need 24x7 service. They can neither afford to be offline for a couple of days after a problem, nor have systems down for a couple of hours for maintenance. As hardware is now relatively cheap compared to the cost of downtime, it is economic to have two data centres and mirror the data between them. This gives you very fast disaster recovery, and you can also switch applications to your secondary site while you close the primary site for maintenance.
There are basically two sorts of mirroring:
Synchronous, which means the data updates at each site are kept precisely in step. However, the applications have to wait for updates to both disk systems to complete before a transaction is complete.
Asynchronous, which means data updates at the secondary can be a few seconds behind the primary, but the mirroring has practically no effect on application performance.
The choice is yours: which is more important to you, performance or data integrity? If you look after my bank account, then it should be data integrity, please.
Two numbers often come up when Disaster Recovery is discussed: Recovery Point Objective (RPO) and Recovery Time Objective (RTO).
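The trade-off between the two mirroring types can be sketched with a toy model. This is only an illustration, not how any real disk subsystem works: the class name `MirroredVolume`, its methods and the latency figures (1 ms local, 5 ms remote) are all assumptions made up for the example.

```python
from collections import deque


class MirroredVolume:
    """Toy model of a primary disk mirrored to a secondary site."""

    def __init__(self, synchronous, local_ms=1.0, remote_ms=5.0):
        self.synchronous = synchronous
        self.local_ms = local_ms    # assumed latency of a local write
        self.remote_ms = remote_ms  # assumed extra latency to reach the remote site
        self.primary = []
        self.secondary = []
        self.pending = deque()      # updates not yet applied at the secondary

    def write(self, update):
        """Apply one update and return the latency the application sees."""
        self.primary.append(update)
        if self.synchronous:
            # The transaction completes only after both sites hold the update.
            self.secondary.append(update)
            return self.local_ms + self.remote_ms
        # Asynchronous: acknowledge at once and ship the update later.
        self.pending.append(update)
        return self.local_ms

    def drain(self):
        """Background transfer: apply queued updates at the secondary."""
        while self.pending:
            self.secondary.append(self.pending.popleft())

    def updates_lost_if_primary_fails(self):
        """Updates that exist only at the primary right now."""
        return len(self.primary) - len(self.secondary)


sync_vol = MirroredVolume(synchronous=True)
async_vol = MirroredVolume(synchronous=False)
for update in ("debit", "credit", "transfer"):
    print("sync latency:", sync_vol.write(update),
          "async latency:", async_vol.write(update))
print("exposed updates:", sync_vol.updates_lost_if_primary_fails(),
      async_vol.updates_lost_if_primary_fails())
```

In this model the synchronous volume pays the remote latency on every write but never has updates exposed, while the asynchronous volume responds faster and has updates at risk until `drain()` runs.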
RPO means how much data you can afford to lose, expressed as a period of time. The RPO for a nightly backup to tape will be 24 hours plus the time it takes to get your tapes offsite. The RPO for synchronous mirroring is zero.
RTO means how long it will take to get your services back up and running. The RTO for a full restore from tape backups could be about 4 days, depending on how much data you have. The RTO for synchronous mirroring to a remote site with automated failover is up to 2 hours.
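As a rough worked example, the figures above can be checked against a pair of objectives. The helper names are hypothetical, and the 6-hour offsite delay and the 1-hour RPO / 4-hour RTO targets are assumptions chosen only to make the arithmetic concrete.

```python
from dataclasses import dataclass


@dataclass
class RecoveryEstimate:
    data_loss_hours: float  # worst-case age of the newest recoverable data
    recovery_hours: float   # time to get services running again

    def meets(self, rpo_hours, rto_hours):
        """True if this solution satisfies both objectives."""
        return (self.data_loss_hours <= rpo_hours
                and self.recovery_hours <= rto_hours)


def tape_backup_estimate(backup_interval_hours, offsite_delay_hours, restore_hours):
    # Worst case: the disaster strikes just before the next backup is safely
    # offsite, so the loss is the backup interval plus the transport delay.
    return RecoveryEstimate(backup_interval_hours + offsite_delay_hours,
                            restore_hours)


# Nightly backup, assumed 6 hours to get tapes offsite, about 4 days to restore.
tape = tape_backup_estimate(24, 6, 4 * 24)
# Synchronous mirroring with automated failover, using the figures in the text.
mirror = RecoveryEstimate(data_loss_hours=0, recovery_hours=2)

for name, estimate in (("tape", tape), ("sync mirror", mirror)):
    print(name, estimate, "meets RPO 1h / RTO 4h:", estimate.meets(1, 4))
```

With these assumed targets the tape solution fails on both counts, while synchronous mirroring meets them.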
SHARE once defined seven disaster recovery capability tiers, numbered 0 to 6. GDPS adds an eighth, Tier 7, as shown below. Higher tiers offer better data protection, but are more expensive than lower tiers.
|TIER||Data loss||Recovery time||Comment|
|Tier 0||All||N/A||No Disaster Recovery plan, no offsite backups, all data is lost and recovery is not possible.|
|Tier 1||Up to 48 hours worth of updates||48 hours plus||Pickup Truck Access Method (PTAM). All the data needed to recover the system is dumped to tape and periodically moved offsite. Any data that has not been moved offsite will be lost in the event of a disaster. In a disaster, the first issue is to find a suitable recovery site with suitable hardware installed. Then the backup tapes must be taken to that site and the system, applications and application data restored. This could take several days.|
|Tier 2||24-48 hours worth||24-48 hours||PTAM and Hot Site. This improves on Tier 1 as a DR facility is pre-arranged. Recovery time is then just the time to restore the data.|
|Tier 3||Up to 24 hours||Up to 24 hours||Electronic vaulting. This is the same as Tier 2 except that the backup data is taken or copied to a remotely-attached tape library subsystem. Data loss will depend on when the last backup was created.|
|Tier 4||Minutes||Up to 24 hours||Active Secondary Site (electronic remote journaling). This is an extension of Tier 3. As well as offsite backups, transaction and DBMS recovery logs are copied to the DR site.|
|Tier 5||Seconds||Less than 2 hours||Two-Site Two-Phase Commit. This extends Tier 4 with applications performing two-phase commit processing between two sites. Data loss will be seconds and the recovery time will be 2 hours or less.|
|Tier 6||Seconds or Minutes||Less than 2 hours||Zero Data Loss (remote copy). The system, the subsystem and the application infrastructure, along with application data, are mirrored to a DR site. The data loss will depend on the mirroring type: seconds, or maybe zero, for synchronous mirroring, and seconds to minutes for asynchronous mirroring. The recovery window will be the time required to restart the servers and applications from the secondary disks. However, be aware that databases might not start up, as their data components might not be consistent.|
|Tier 7||None / seconds||1-2 hours||Geographically Dispersed Parallel Sysplex (GDPS). GDPS adds another level to the SHARE-defined DR tiers since it provides total IT business recovery. DBMS data is synchronised at the point of failure, and failover to a DR site can be largely automated. GDPS is discussed in detail elsewhere on this site.|
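Since higher tiers cost more, one way to use the table is to pick the lowest tier that still meets your RPO and RTO. The sketch below does that under loud assumptions: the numeric bounds are rough readings of the table rather than official figures ("minutes" is taken as up to one hour, "seconds" as up to one minute, and "48 hours plus" as unbounded), and the function name is hypothetical.

```python
import math

# (tier, assumed worst-case data loss in hours, assumed worst-case recovery in hours)
# These bounds are illustrative interpretations of the table, not SHARE or IBM figures.
TIERS = [
    (0, math.inf, math.inf),  # no DR plan: everything is lost
    (1, 48, math.inf),        # PTAM: "48 hours plus" treated as open-ended
    (2, 48, 48),              # PTAM and hot site
    (3, 24, 24),              # electronic vaulting
    (4, 1, 24),               # remote journaling: "minutes" read as <= 1 hour
    (5, 1 / 60, 2),           # two-site two-phase commit: "seconds" read as <= 1 minute
    (6, 1, 2),                # remote copy: seconds or minutes
    (7, 1 / 60, 2),           # GDPS: none or seconds
]


def lowest_tier_meeting(rpo_hours, rto_hours):
    """Return the lowest tier whose assumed bounds satisfy both objectives."""
    for tier, loss_hours, recovery_hours in TIERS:
        if loss_hours <= rpo_hours and recovery_hours <= rto_hours:
            return tier
    return None  # nothing in the table is good enough


print(lowest_tier_meeting(rpo_hours=24, rto_hours=24))
print(lowest_tier_meeting(rpo_hours=1, rto_hours=24))
print(lowest_tier_meeting(rpo_hours=1, rto_hours=2))
```

Because the bounds are coarse, treat the result as a starting point for discussion rather than a sizing decision. A strict zero-data-loss requirement returns `None` here, since even the top tiers are modelled as losing up to a minute.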