To Err Is Human, to Log Divine: Expediting Production Failure Diagnosis with Better Logging
Ding Yuan, University of California, San Diego; University of Illinois at Urbana-Champaign; University of Toronto
When systems fail in the field, logged data are frequently the only evidence available to support engineers and developers for assessing and diagnosing the underlying cause. Consequently, the efficacy of such logging data is a matter of significant practical importance. We have empirically studied tens of thousands of log messages and hundreds of production failures from several widely used systems, and built several tools for log automation and postmortem log analysis. In this talk, I will summarize our experience in exploring questions such as "How much do log messages really help in debugging?", "Are they good enough?", "What are the opportunities for improving log quality?", "Can we automatically improve log messages?", and "How can we automate log inference?" I will also discuss where the greatest opportunities for impact are likely to be found in the future.
This talk is based on joint work with: Y. Zhou, P. Huang, S. Park, J. Zheng, H. Mai, Y. Liu, M. Lee, X. Tang, W. Xiong, L. Tan, S. Savage, and S. Pasupathy.
Ding Yuan is a graduating Ph.D. candidate at the University of Illinois at Urbana-Champaign and a visiting student at the University of California, San Diego. He will join the University of Toronto as an assistant professor in 2013. His research focuses on practical approaches to failure diagnosis via log messages. He has received two ASPLOS best paper nominations, an ACM SIGSOFT Distinguished Paper Award, an Outstanding Teaching Assistant Award, and a Saburo Muroga Fellowship. His research systems for failure diagnosis have been requested for release by large vendors including Cisco, EMC, Huawei, NetApp, and Qualcomm.
author = {Ding Yuan},
title = {To Err Is Human, to Log Divine: Expediting Production Failure Diagnosis with Better Logging},
year = {2012},
address = {Hollywood, CA},
publisher = {USENIX Association},
month = oct
}