Abstract - Technical Program - 2nd USENIX Windows NT Symposium

A Transparent Checkpoint Facility On NT

Johny Srouji, Paul Schuster, Maury Bach, and Yulik Kuzmin
Intel Corporation

Abstract

With the increased use of networks of NT workstations for long-running engineering applications, process checkpointing and process migration can avoid wasted computer cycles and improve system utilization. The problem we solve is how to capture and reconstruct process state transparently and efficiently without affecting the correctness of the application.

A checkpoint facility enables the intermediate state of a process to be saved to a file. Users can later resume execution of the process from the checkpoint file. This prevents the loss of data generated by long-running processes due to program or system failures, and it also facilitates debugging when the bug appears after the program has executed for a long time.
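The save-and-resume cycle described above can be illustrated with a minimal sketch. It is not the library presented in this paper: it checkpoints explicitly at the application level, in Python, whereas the facility described here captures process state transparently. The function names (`save_checkpoint`, `load_checkpoint`, `run`), the `stop_after` parameter used to simulate a failure, and the checkpoint interval are all assumptions made for illustration.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write `state` to `path` atomically, so a crash never leaves a partial file."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the finished file into place in one step.
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the saved state, or `default` if no checkpoint file exists."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as f:
        return pickle.load(f)


def run(path, total_steps, stop_after=None, interval=100):
    """Sum the integers 0..total_steps-1, checkpointing every `interval` steps.

    `stop_after` is a hypothetical knob that simulates a failure: the
    function returns None once that step is reached, discarding any work
    done since the last checkpoint.
    """
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    while state["step"] < total_steps:
        if stop_after is not None and state["step"] >= stop_after:
            return None  # simulated crash
        state["acc"] += state["step"]
        state["step"] += 1
        if state["step"] % interval == 0:
            save_checkpoint(path, state)
    return state["acc"]
```

Calling `run` once with `stop_after` set and then again without it resumes from the last saved step instead of starting over; only the work done after the most recent checkpoint is repeated.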

This paper describes the implementation of a checkpoint library that lets users save the intermediate state of long-running multi-threaded programs on a Windows NT system and resume execution from the checkpointed state at a later time. It is the first such implementation for this operating system that we are aware of. The implementation is portable, maintains good performance, and is transparent.

The checkpoint facility is currently used in several major internal projects at Intel.


Last changed: 9 April 2002 aw