
Mirroring modes

Synchronous mirroring, such as IBM's Peer-to-Peer Remote Copy (PPRC) [6] and EMC's Symmetrix Remote Data Facility (SRDF) [12], is a technique often used in disaster-tolerance solutions. It guarantees that local copies of data are consistent with the copies at a remote site, and that the mirror site is as up to date as possible. The drawback is the added I/O latency on every write operation; furthermore, over long-distance links this technique becomes prohibitively expensive.
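
The synchronous write path can be summarized in a few lines. The sketch below is illustrative only; the LocalDisk, RemoteLink, and SyncMirror names are our own stand-ins and do not describe the PPRC or SRDF protocols. The point is simply that the acknowledgment to the application waits for the remote copy, so every write pays the wide-area round trip.

    # Minimal sketch of synchronous mirroring (illustrative; hypothetical interfaces).
    import time

    class LocalDisk:
        """Stand-in for local storage; a write is assumed durable when it returns."""
        def write(self, block_id, data):
            pass

    class RemoteLink:
        """Stand-in for the link to the remote mirror site."""
        def __init__(self, rtt_seconds):
            self.rtt = rtt_seconds
        def write(self, block_id, data):
            time.sleep(self.rtt)   # model the wide-area round trip before the remote ack

    class SyncMirror:
        """A write is acknowledged only after both the local and remote copies are
        updated, so the mirror never lags, but every write waits on the WAN link."""
        def __init__(self, local, remote):
            self.local, self.remote = local, remote
        def write(self, block_id, data):
            self.local.write(block_id, data)
            self.remote.write(block_id, data)   # ack the caller only after this returns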

An alternative is asynchronous remote mirroring [19,24,31]. For example, SnapMirror [31] mirrors file systems asynchronously by periodically transferring self-consistent data snapshots from a source volume to a destination volume. Users are given a knob for setting the update frequency: a high value keeps the mirror nearly current with the source, while a low value reduces network bandwidth consumption at the cost of greater potential data loss.

SnapMirror works at the block level, using the WAFL [17] file system's active block map to identify changed blocks and to avoid sending deleted blocks. Because it operates at this level, it is also able to optimize data reads and writes. The authors showed that, even for update intervals as short as one minute, data transfers were reduced by 30% to 80%.
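
The changed-block selection can be illustrated with a small sketch. The Snapshot structure, the send_block callback, and the content comparison below are simplifications introduced here for illustration; the real mechanism relies on WAFL's snapshot metadata rather than comparing block contents.

    # Illustrative sketch of snapshot-based asynchronous transfer; not the WAFL API.
    from dataclasses import dataclass, field

    @dataclass
    class Snapshot:
        active_map: set = field(default_factory=set)   # block numbers live in this snapshot
        blocks: dict = field(default_factory=dict)     # block number -> contents

    def blocks_to_transfer(prev: Snapshot, curr: Snapshot):
        """Yield (block number, data) for blocks the mirror is missing.
        Deleted blocks are never sent because they are absent from curr.active_map."""
        for blkno in sorted(curr.active_map):
            if blkno not in prev.active_map or prev.blocks.get(blkno) != curr.blocks[blkno]:
                yield blkno, curr.blocks[blkno]

    def periodic_update(prev, curr, send_block):
        """Run once per update interval; source writes are never delayed."""
        for blkno, data in blocks_to_transfer(prev, curr):
            send_block(blkno, data)

    # Example: block 1 was deleted, block 2 changed, block 3 is new; only 2 and 3 are sent.
    prev = Snapshot(active_map={1, 2}, blocks={1: b"a", 2: b"b"})
    curr = Snapshot(active_map={2, 3}, blocks={2: b"B", 3: b"c"})
    periodic_update(prev, curr, send_block=lambda n, d: print("send block", n))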

Seneca [19] is another asynchronous mirroring solution; like SnapMirror, it attempts to reduce the traffic sent over the wide-area network, but it also increases the risk of data loss. Unlike SnapMirror, Seneca operates at the level of a storage area network (SAN) rather than the file system.

Semi-synchronous mirroring is yet another mode of operation, closely related to both synchronous and asynchronous mirroring. In this mode, a write is sent to the local and remote storage sites at the same time, and the I/O operation returns as soon as the local write completes; subsequent write I/Os, however, are delayed until the preceding remote write command completes. In [42], the authors show that by keeping a log of the active remote write commands, the system can allow a limited number of write I/O operations to proceed before waiting for acknowledgment from the remote site, thereby reducing latency significantly.
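
A small sketch of this bounded-outstanding-writes idea follows. The SemiSyncMirror class and its max_outstanding parameter are illustrative names, not the interface described in [42]; local and remote are any objects exposing write(block_id, data), as in the earlier sketch. A counting semaphore stands in for the log of active remote write commands: local writes complete immediately, and new writes are delayed only when the window of unacknowledged remote writes is full.

    # Illustrative sketch of semi-synchronous mirroring with a bounded write window.
    import threading

    class SemiSyncMirror:
        """Allow up to max_outstanding remote writes to be in flight before
        delaying further write I/Os (semaphore models the log of active writes)."""
        def __init__(self, local, remote, max_outstanding=4):
            self.local, self.remote = local, remote
            self.window = threading.Semaphore(max_outstanding)

        def write(self, block_id, data):
            # Delay only when max_outstanding remote writes are still unacknowledged.
            self.window.acquire()
            self.local.write(block_id, data)          # the I/O completes with the local write
            threading.Thread(target=self._remote_write,
                             args=(block_id, data)).start()

        def _remote_write(self, block_id, data):
            try:
                self.remote.write(block_id, data)     # remote ack frees a window slot
            finally:
                self.window.release()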

