Check out the new USENIX Web site.

Home About USENIX Events Membership Publications Students
2001 FREENIX Track Technical Program - Abstract

User-level Checkpointing for LinuxThreads Programs

William R. Dieter, and James E. Lumpp, Jr., University of Kentucky

Abstract

Multiple threads running in a single, shared address space is a simple model for writing parallel programs for symmetric multiprocessor (SMP) machines and for overlapping I/O and computation in programs run on either SMP or single processor machines. Often a long running program's user would like the program to save its state periodically in a checkpoint from which it can recover in case of a failure. This paper introduces the first system to provide checkpointing support for multithreaded programs that use LinuxThreads, the POSIX based threads library for Linux.

The checkpointing library is simple to use, flexible, and efficient. Virtually all of the overhead of the checkpointing system comes from saving the checkpoint to disk. The checkpointing library added no measurable overhead to tested application programs when they took no checkpoints. Checkpoint file size is approximately the same size as the checkpointed process's address space. On the current implementation WATER-SPATIAL from the SPLASH2 benchmark suite saved a 2.8 MB checkpoint in about 0.18 seconds for local disk or about 21.55 seconds for an NFS mounted disk. The overhead of saving state to disk can be minimized through various techniques including varying the checkpoint interval and excluding regions of the address space from checkpoints.

  • View the full text of this paper in HTML form, and PDF form.

  • If you need the latest Adobe Acrobat Reader, you can download it from Adobe's site.

  • To become a USENIX Member, please see our Membership Information.
?Need help? Use our Contacts page.

Last changed: 13 Feb 2002 ml
Technical Program
Conference index
USENIX home