
Error Handling

Ip-rsync loses the atomic update property of rsync. Rsync creates a temporary file that contains the updated version. When the update completes, rsync calls rename(), which atomically replaces the existing version of the file and unlinks the old one. Processes with an open handle to the old file continue to read the old data until they reopen the file. New processes open the new version of the file. Because it operates on a single version, ip-rsync instead makes many intermediate changes to the file itself, and the old version of the data is irrecoverably modified. Even if ip-rsync completes successfully, inconsistent views of the data may occur during its operation: an application reading the file concurrently with an ip-rsync update may read a mixture of the old and new versions.
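The temporary-file-and-rename pattern described above can be sketched as follows. This is a minimal illustration in Python, not rsync's actual code; the function name atomic_update and the ".tmp-" prefix are assumptions made for the example. It relies on POSIX rename() semantics, which Python exposes as os.replace().

```python
import os
import tempfile


def atomic_update(path, new_data):
    """Write new_data to a temporary file, then rename it over path.

    Hypothetical helper for illustration; not taken from rsync.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file is created in the target's directory so that both
    # names are on the same file system; rename() is only atomic in that case.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Atomically replace the old version with the new one.
        os.replace(tmp_path, path)
    except BaseException:
        # On failure the target is untouched; only the temporary file goes.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

On a POSIX system, a reader that opened the file before the call keeps reading the old contents, while any later open() sees the new contents. This is the property that an in-place update gives up.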

To avoid these inconsistent views, ip-rsync opens files exclusively (rsync opens files in shared mode by default). If another process has the file open, ip-rsync fails, leaving the original file intact. If another process attempts to open the file during an ip-rsync session, that process either fails to open the file or blocks until ip-rsync completes. The outcome depends on the arguments to open() and on operating system semantics.
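The text does not say which mechanism ip-rsync uses to obtain exclusive access, and the available mechanisms differ between operating systems. As one possible sketch, assuming a Unix-like system and cooperating processes, an advisory lock taken with flock() gives the fail-fast behavior described above. The function name open_exclusive is hypothetical.

```python
import fcntl
import os


def open_exclusive(path):
    """Open path for in-place update, failing fast if it is already locked.

    Assumption: advisory flock() locking stands in for the exclusive open
    described in the text. It only excludes processes that also take the lock.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        # LOCK_NB makes the call fail with BlockingIOError instead of
        # waiting; dropping LOCK_NB gives the blocking behavior instead.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd
```

Closing the returned descriptor releases the lock. A second caller that omits LOCK_NB would block until then, which corresponds to the "blocks until completion" outcome.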

If a failure occurs during synchronization, ip-rsync may leave the target file in an inconsistent state. This occurs when the network, the receiver process, or the sender process fails after the receiver has written data. If the receiver process fails, no recovery action can be taken: rsync leaves a temporary file in the file system, and ip-rsync leaves an incompletely synchronized file. The inconsistent file left by ip-rsync does not present a problem upon restart; the source still holds the new version of the data, and a new ip-rsync session synchronizes it against the inconsistent file. If the sender or the network fails, the receiver process continues to run and may take recovery action. Rsync merely removes the temporary file, preserving the state of the file prior to synchronization. Ip-rsync cannot recover the original state.

We balance several factors when deciding how ip-rsync should handle inconsistent files. Options include: (1) deleting the file at the receiver and (2) leaving the inconsistent file in the file system. The first approach prevents applications from reading inconsistent data, but discards a file whose contents would allow a subsequent rsync to complete quickly. The second approach has the opposite properties: it allows inconsistent data to be read, but preserves the file for faster synchronization. We realize both benefits by having ip-rsync rename the corrupt file, creating a hidden recovery file that contains the inconsistent data. The original file is effectively deleted, so applications cannot access inconsistent data under the old file name. However, the recovery file remains available for use in a subsequent rsync session. When ip-rsync uses the recovery file, it updates the recovery file in-place and then renames the hidden file to the original file name. Renaming the recovery file avoids producing two copies of the file, which might exceed the storage capacity of the target. The recovery file is not implemented in our current release.
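Since the recovery file is not part of the current release, the following is only a sketch of how the renaming scheme described above might look. The names recovery_path and synchronize, the ".recover" suffix, and the update_in_place callback (a stand-in for the in-place synchronization step) are all assumptions made for illustration.

```python
import os


def recovery_path(path):
    """Hypothetical naming scheme: a dot-prefixed sibling of the target."""
    directory, name = os.path.split(path)
    return os.path.join(directory, "." + name + ".recover")


def synchronize(path, update_in_place):
    """Run an in-place update, hiding the file if the update fails.

    update_in_place(target) is supplied by the caller and is expected to
    raise an exception if synchronization is interrupted.
    """
    hidden = recovery_path(path)
    if os.path.exists(hidden):
        # A previous session failed: update the recovery file in place,
        # then rename it back so that only one copy ever exists.
        update_in_place(hidden)
        os.replace(hidden, path)
        return
    try:
        update_in_place(path)
    except Exception:
        # Hide the inconsistent data from applications, but keep it
        # on disk so a later session can reuse it.
        os.replace(path, hidden)
        raise
```

If the update of the recovery file itself fails, the file simply stays under its hidden name, so applications still cannot read it under the original name and a later session can try again.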


2003-04-08