Sledgehammer: Cluster-Fueled Debugging

Authors: 

Andrew Quinn, Jason Flinn, and Michael Cafarella, University of Michigan

Abstract: 

Current debugging tools force developers to choose between power and interactivity. Interactive debuggers such as gdb let them quickly inspect application state and monitor execution, which is perfect for simple bugs. However, they are not powerful enough for complex bugs such as wild stores and synchronization errors where developers do not know which values to inspect or when to monitor the execution. So, developers add logging, insert timing measurements, and create functions that verify invariants. Then, they re-run applications with this instrumentation. These powerful tools are, unfortunately, not interactive; they can take minutes or hours to answer one question about a complex execution, and debugging involves asking and answering many such questions.

In this paper, we propose cluster-fueled debugging, which provides interactivity for powerful debugging tools by parallelizing their work across many cores in a cluster. At sufficient scale, developers can get answers to even detailed queries in a few seconds. Sledgehammer is a cluster-fueled debugger: it improves performance by timeslicing program execution, debug instrumentation, and analysis of results, and then executing each chunk of work on a separate core. Sledgehammer enables powerful, interactive debugging tools that are infeasible today. Parallel retro-logging allows developers to change their logging instrumentation and then quickly see what the new logging would have produced on a previous execution. Continuous function evaluation logically evaluates a function such as a data-structure integrity check at every point in a program’s execution. Retro-timing allows fast performance analysis of a previous execution. With 1024 cores, Sledgehammer executes these tools hundreds of times faster than single-core execution while returning identical results.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {222625,
author = {Andrew Quinn and Jason Flinn and Michael Cafarella},
title = {Sledgehammer: {Cluster-Fueled} Debugging},
booktitle = {13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)},
year = {2018},
isbn = {978-1-939133-08-3},
address = {Carlsbad, CA},
pages = {545--560},
url = {https://www.usenix.org/conference/osdi18/presentation/quinn},
publisher = {USENIX Association},
month = oct
}

Presentation Audio