How Could Small Teams Get Ready for SRE

Monday, May 22, 2017 - 10:55am–11:20am

Zehua Liu, Zendesk Singapore

Abstract: 

Site Reliability Engineering (SRE) covers a wide range of topics: the SRE book alone runs to 34 chapters and more than 500 pages. It is not easy for a small team to start adopting these practices. At Zendesk Singapore, we went through an initial period of chaos in engineering and reliability when the engineering team grew from 10 to 40 engineers and the product focus shifted from SME to enterprise customers. Several initiatives we took during this period helped stabilize the product and got the team into a shape where it was ready to apply more SRE best practices from the SRE book and other sources. In this talk, we will share details of some of the projects we consider essential in preparing a small, young team to tackle more serious site reliability issues. We will discuss how some of these key ideas can be combined to form the foundations of the principles discussed in the SRE book. We hope this talk will help teams facing similar growth and product changes cope with them while keeping the product reliable.


Zehua established and leads the tooling team at Zendesk, where he works to make sure that developers are happy building what they want to build and that the products they deliver are of high quality. He is currently based in Singapore.


BibTeX
@conference {202727,
author = {Zehua Liu},
title = {How Could Small Teams Get Ready for {SRE}},
year = {2017},
publisher = {USENIX Association},
month = may,
}
