Check out the new USENIX Web site. next up previous
Next: 2.4 Restoring data Up: 2 The simplified scheme Previous: 2.2 Creating a reliable


2.3 Backing up data

Each computer backs up its data on the reliable logical disk it has constructed from its partners' disks. Exactly how this is done--e.g., incrementals vs. full backups, compression, ignoring caches and application binaries, etc.--is orthogonal to our scheme; we assume here only that backups occur at most daily. Because our backup space is of limited size and somewhat expensive compared to removable media, it may be useful to conserve space. In particular, one may not want to require backup space for two full snapshots so that a crash while writing a new snapshot does not leave the system without a viable backup. We demonstrate one way of doing this in Section 4.1.

The following procedure can be used to stream a snapshot to logical disk starting on a block stripe boundary: For each $k$ data blocks of the stream, perform the following operations in turn: use the erasure-correcting code to generate the $m$ redundancy blocks from the data blocks, generate and attach a checksum and the same new version number to each of the $k{+}m$ blocks, send a request to write each block to the appropriate partner in the appropriate place, and, finally wait for acknowledgments from at least $w \geq \max(k,m+1)$ partners. It is important that we write at least $k$ blocks to the current block stripe before starting to write the next one to ensure that a crash while writing will leave at most one block stripe unreadable; we need to write at least $m{+}1$ blocks to ensure that the old version is overwritten.

The parameter $w$ here represents a tradeoff. Smaller values of $w$ allow the backup to proceed faster because it is not necessary to wait for as many partners to be up, but the resulting written data has lower reliability than normal because some of the blocks are missing: only $w{-}k$ failures can be tolerated before some of the written data is unrecoverable. $w$ should be chosen based on empirical data and the uptime-level agreements being used. Note that if more than $n{-}w$ partners fail, it will no longer be possible to make new backups with this procedure until some of the failing partners have been replaced. Alternatively, $w$ could be updated based on the number of partners scheduled to be replaced.

We run a cleaner in the background on each computer to help limit how long recently written data has less than the maximum redundancy available. The cleaner scans its logical disk looking for incompletely-written block stripes. Each time it finds such a block stripe, it reads as many blocks as it can from it and tries to decode the stripe. If it succeeds, it generates the missing blocks and writes them to the appropriate partners, thus increasing the stripe's redundancy. Note that both the cleaner and streaming procedure use only a block stripe worth of extra local storage, avoiding the need for an extra snapshot worth of temporary disk space during backing up.


next up previous
Next: 2.4 Restoring data Up: 2 The simplified scheme Previous: 2.2 Creating a reliable
Mark Lillibridge 2003-04-07