Building Service Ownership Using Documentation, Telemetry, and a Chance to Make Things Better

Monday, December 07, 2020 - 1:50 pm–2:30 pm

Daniel "Spoons" Spoonhower, Lightstep


Adopting Kubernetes, deploying a service mesh, or breaking up a monolith are all ways of building distributed software systems, but if we are going to build and operate software at scale, we need to think about how to build scalable and distributed people systems too.

In this talk, I'll cover a journey from a monolithic team (and a small set of collectively owned services) to a set of teams and many more services. I'll talk about how to use documentation, divide on-call responsibilities, and set clear objectives. I'll also cover when to ask humans to drive and maintain the process (be it system documentation or alert runbooks) and when to depend on automated processes that use telemetry from the application itself.

Successfully building distributed ownership requires not just defining how we are going to hold teams accountable, but also giving those teams agency to make things better. That agency is often overlooked but is critical to success.

Daniel "Spoons" Spoonhower is CTO and a co-founder at Lightstep. He is an author of Distributed Tracing in Practice (O'Reilly Media, 2020). Previously, Spoons spent almost six years at Google as part of Google's infrastructure and Cloud Platform teams. He has published papers on the performance of parallel programs, garbage collection, and real-time programming. He has a Ph.D. in programming languages from Carnegie Mellon University but still hasn't found one he loves.

