Debugging at Scale—Going from Single Box to Production

Thursday, June 07, 2018 - 11:25 am–11:50 am

Kumar Srinivasamurthy, Microsoft Corp


It's very easy to launch a debugger on your dev box, attach to the right process, and step through code. However, things are different when you need to debug an issue in a production service that's handling tens of thousands of requests per second. What if the issue reproduces only in production? How do you debug without affecting production traffic? What techniques can you use during development to make issues easier to debug? Does your application use tracing? What debug logs are written out to aid in analysis?

This talk will cover:

  1. Challenges with debugging in production
  2. Various approaches that are used in the industry
  3. Examples from Bing & Cortana incidents and steady state problems to illustrate the techniques
  4. How to design services so that they are easier to debug

@conference {214933,
author = {Kumar Srinivasamurthy},
title = {Debugging at Scale{\textemdash}Going from Single Box to Production},
year = {2018},
publisher = {{USENIX} Association},