“Disorganizing” Your SRE Organization

Tuesday, December 08, 2020 - 2:10 pm–2:50 pm

Leonid Belkind, StackPulse

Abstract: 

The COVID-driven WFH/all-remote model has amplified the traditional challenges remote teams face with incident response and reliability: silos and reduced information exchange, difficulty onboarding or cross-training engineers, and increased noise and toil, to name a few. All of these make it harder for teams to continue delivering reliable services at a time when reliable software is what's keeping the world connected.

Instead of trying to translate existing roles, responsibilities, processes, and methods to the new normal, we'll share how to improve reliability by 'dis-organizing' and democratizing the SRE function, empowering the entire engineering team to own reliability and adopt an SRE mindset. We'll cover goals for SREs/SWEs, training for SWEs, automating knowledge management/sharing, getting started with code-based incident response playbooks, and the role of the SRE in orchestrating it all.


Leonid Belkind is a Co-Founder and CTO at StackPulse, a Site Reliability Engineering orchestration platform. Prior to StackPulse, Leonid co-founded (and was CTO of) Luminate, where he guided this enterprise-grade service from inception to widespread Fortune 500 adoption to acquisition by Symantec. Before Luminate, Leonid managed software development organizations at CheckPoint.

Over his career, Leonid has watched modern software engineering practices replace traditional ones, first around Continuous Integration and Delivery pipelines, then Infrastructure Management and Monitoring, and onward as software services replaced on-premises products. Throughout this journey, Leonid has become passionate about building reliability-first architectures, methodologies, and organizational culture.

BibTeX
@inproceedings {262239,
author = {Leonid Belkind},
title = {{{\textquotedblleft}Disorganizing{\textquotedblright}} Your {SRE} Organization},
booktitle = {SREcon20 Americas (SREcon20 Americas)},
year = {2020},
url = {https://www.usenix.org/conference/srecon20americas/presentation/belkind},
publisher = {USENIX Association},
month = dec
}
