Symmetrix Remote Data Facility (SRDF) is a family of software products for remote storage replication, and it works between two Symmetrix storage arrays connected by a SAN or an IP network. The local SRDF device is called the primary, or RDF1, and the remote device is called the secondary, or RDF2. SRDF is mainly used for disaster recovery, but also for remote backup and data center migration.
A few words of warning. EMC often uses the term 'mirroring' to describe RAID1 protection, which can cause confusion when discussing SRDF. A local volume can be RAID1 mirrored and also in an SRDF mirror relationship. In these pages, 'mirroring' means remote mirroring between Symms. The disks in the diagrams are shown as single disks, but each will in fact be a RAID1 mirrored pair.
SRDF has four modes: Synchronous, Semi-synchronous, Asynchronous and Adaptive Copy. The choice basically depends on whether you want the best possible performance, or an absolute guarantee that your data is consistent between sites.
In Synchronous mode, a copy of the data must be stored in cache in both the local and remote machines before the calling application is signaled that the I/O is complete. This means that data consistency between sites is guaranteed. The two Symms can be up to 200km apart, but performance may progressively degrade with distance once the remote Symmetrix is more than 15km away.
In Asynchronous mode, data is written asynchronously to the secondary device and can be up to 65535 I/Os behind the primary. Data which has not yet been copied is called 'dirty tracks', and the number of dirty tracks permitted is set by a 'skew value' parameter. If the skew value is exceeded, the mode switches to Synchronous or Semi-synchronous until the remote Symm catches up, at which point it switches back to Asynchronous. Asynchronous mirroring is useful where sites are too far apart for synchronous operation, and some data loss is acceptable.
In Semi-synchronous mode, the data on a secondary logical volume can be one write I/O behind the primary, which may sound almost as good as Synchronous, but Semi-synch will not give you I/O consistency across volumes. The local Symmetrix returns Channel End / Device End once a write I/O is safely in the local cache, then sets the logical volume to busy status so it will not accept any more writes. SRDF then passes the write I/O to the remote Symmetrix, and once it is safely stored in cache there, the busy flag is removed from the logical volume.
The advantage of Semi-synch is that the application does not have to wait for the remote I/O to complete, so performance does not suffer.
The disadvantage is that in a disaster there is no guarantee that all the I/Os an application thinks it completed actually made it to the remote site. There could be several write I/Os queued up in the local controller (one for each logical disk), and these are processed by a FIFO queue. If an application is sending I/Os to more than one controller, there is no FIFO synchronisation between controllers, so the remote data could be inconsistent.
Adaptive Copy mode is intended for electronically moving data between sites. There is no I/O consistency across volumes; data is simply moved without any acknowledgment.
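The mode for a device group can be switched with the SYMCLI `symrdf set mode` command. A minimal sketch, assuming a device group named `prod_dg` (a placeholder name; the exact mode keywords can vary between Solutions Enabler releases, so check `symrdf` help on your installation):

```shell
# Switch the device group prod_dg between SRDF modes.
# Synchronous: write completes only when safe in both caches
symrdf -g prod_dg set mode sync

# Semi-synchronous: remote may be one write I/O behind per volume
symrdf -g prod_dg set mode semi

# Asynchronous (SRDF/A): remote can lag behind the primary
symrdf -g prod_dg set mode async

# Adaptive copy (disk mode): bulk data movement, no ordering guarantees
symrdf -g prod_dg set mode acp_disk
```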
An SRDF device can have one of three states with respect to the hosts: R1 devices are accessed by the host connected to the primary system, R2 devices by the host connected to the secondary system, and R21 devices (used in cascaded configurations) act as both.
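To see which devices on an array are R1 or R2, the `symrdf list` query is the usual starting point. A sketch, using the same example serial number that appears in the consistency group commands later on this page:

```shell
# List the RDF devices on Symmetrix 1234, showing the R1/R2 type,
# mode and pair state of each device
symrdf list -sid 1234
```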
A three-site configuration. The Primary or R1 site synchronously mirrors the data to a Secondary or R21 site, which then asynchronously mirrors the data to a third, or R2, site. This means that if you lose your Primary site, you still have a working, mirrored configuration that you can use quite quickly to continue your production service. Either the second or third site can be used as the new primary, but typically Open Systems will fail over to the third site while mainframe solutions will fail over to the secondary site.
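The actual switch of production to a surviving site is normally driven with the `symrdf failover` command against the affected device group. A minimal sketch, assuming a hypothetical device group called `prod_dg`:

```shell
# Make the R2 devices read/write so the DR host can take over;
# the R1 side, if still reachable, is write disabled
symrdf -g prod_dg failover

# Later, once the original primary site is back, return service
# and resynchronise changed tracks back to the R1 devices
symrdf -g prod_dg failback
```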
Another three-site disaster recovery solution where the primary devices are mirrored locally to two R1 SRDF mirrors, each configured to a different SRDF group. These two R1 mirrors are then mirrored concurrently to two R2 devices that can reside in one or two remote Symmetrix systems.
The problem with Cascaded SRDF is that if the secondary site is lost, then while production service will be unaffected, it is now running without a DR position. If the primary site is lost in a Concurrent SRDF topology, then while the service can fail over to one of the secondaries, there is no DR link between them. SRDF/Star resolves those issues by providing extra links between sites, shown in orange in the diagram below. SRDF uses differential synchronization between the two surviving sites, which allows SRDF/Star to rapidly re-establish cross-site mirroring to the new production site.
With CONCURRENT SRDF/STAR, the data is synchronously mirrored between SiteA and SiteB, and asynchronously mirrored between SiteA and SiteC. Links exist between SiteB and SiteC, so that if SiteA is lost, differential mirroring can be established between SiteB and SiteC, and so a DR position can be maintained. Production systems will have to be switched to SiteB, so there will be some impact on services.
With CASCADED SRDF/STAR, the data is synchronously mirrored between SiteA and SiteB, and asynchronously mirrored between SiteB and SiteC. Links exist between SiteA and SiteC, so that if SiteB is lost, differential mirroring can be established between SiteA and SiteC, and so a DR position can be maintained. As SiteA is not affected, there should be no service impact.
SRDF also supports four-site solutions for optimal recovery from regional disasters. The four-site SRDF solution for open systems host environments replicates FBA data by using both concurrent and cascaded SRDF topologies. SRDF/SQAR (Symmetrix Quadrilateral Asynchronous Replication) is a four-site implementation of SRDF/S and SRDF/A for mainframe host environments. EMC GDDR is required to implement SRDF/SQAR.
SRDF volumes must be formed into SRDF device groups. There are two types of SRDF device group, RDF1 and RDF2, corresponding to primary and secondary devices. Each device group contains the SRDF devices and SRDF directors that reside on one Symmetrix system. Device groups come in pairs, one on the local Symm and one on the mirror Symm. If a primary device has several mirrors, then each mirror needs to be in a separate device group. The command used to create a pair of new SRDF groups depends heavily on parameter values that are specific to your site. The CLI commands needed to set up and manage SRDF can be found on the SRDF commands page.
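As a hedged sketch of what group creation looks like, a dynamic SRDF group pair is typically built with `symrdf addgrp`. The serial numbers, group numbers and director names below are invented placeholders; substitute your own site's values:

```shell
# Create a dynamic SRDF group pair between two Symms.
# All IDs below are site-specific placeholders: substitute your own
# Symmetrix serials, RA directors and free RDF group numbers.
symrdf addgrp -label prod_rdf \
       -sid 1234 -rdfg 10 -dir 7e \
       -remote_sid 0011 -remote_rdfg 10 -remote_dir 8e
```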
A Consistency Group is a collection of volumes in one or more Symmetrix devices that need to be kept in a consistent state. If a write to a Symmetrix cannot be propagated to the remote site, the Symmetrix will hold the I/O for a fixed period of time. At the same time it presents a SIMM back to the host. The Congroup STC will detect the SIMM and issue the equivalent of a PPRC FREEZE to all the other Symmetrix systems online to that host. All volumes in that Consistency Group will then be suspended. Once they are all suspended, the equivalent of a PPRC RUN is issued and I/O can complete, including the first I/O that triggered the SIMM.
Consistency Group processing with SRDF does not lose data because it employs a FREEZE/RUN approach similar to PPRC FREEZE/RUN.
To create a consistency group, add devices to it, and enable it, you use commands like these:
symcg create r1_cg001 -type rdf1
symcg -cg r1_cg001 -sid 1234 add dev 0220
symcg -cg r1_cg001 -sid 0011 add dev 001C
symcg -cg r1_cg001 enable
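Once the group is enabled, its membership and state can be checked. A short sketch using the same group name as above:

```shell
# Display the members and current state of the consistency group
symcg show r1_cg001

# List all consistency groups defined on this host
symcg list
```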