Libckpt: Transparent Checkpointing under Unix
 
              James S. Plank, Micah Beck, Gerry Kingsley
                      University of Tennessee
                              Kai Li
                       Princeton University
 
Abstract
 
Checkpointing is a simple technique for rollback recovery: the state
of an executing program is periodically saved to a disk file from
which it can be recovered after a failure.  While recent research has
developed a collection of powerful techniques for minimizing the
overhead of writing checkpoint files, checkpointing remains
unavailable to most application developers.  In this paper we
describe libckpt, a portable checkpointing tool for Unix that
implements all applicable performance optimizations reported in the
literature.  While libckpt can be used in a mode that is almost
totally transparent to the programmer, it also supports the
incorporation of user directives into the creation of checkpoints.
This "user-directed" checkpointing is an innovation unique to our
work.

