Canarying Well: Lessons Learned from Canarying Large Populations

Thursday, 30 August, 2018 - 16:00–16:45

Štěpán Davidovič, Google


Canarying, the process of controlled and observed partial rollout in production to mitigate risk, is one of the common techniques used to ensure safe production changes. In this talk, we will cover common pitfalls, discuss best practices, and outline an end-to-end strategy for the canary process.

Štěpán Davidovič is a Site Reliability Engineer at Google. He currently works on internal infrastructure for automatic monitoring. In previous Google SRE roles, he developed Canary Analysis Service, worked on a distributed Cron solution, and contributed to a wide range of shared infrastructure projects as well as AdSense reliability. He obtained his bachelor's degree from Czech Technical University, Prague, in 2010.

