Venue:
Hyatt Regency Santa Clara
5101 Great America Pkwy
Santa Clara, CA 95054
How to Improve a Service by Roasting It
Caskey L. Dickson and Jake Welch, Microsoft
At Microsoft, SRE is not part of the current operational landscape; instead, it is an ongoing project that is being adapted into a VERY mature company. Our team has had to develop new and interesting ways to introduce SRE and its tenets to a traditional IT-Ops-based organization. This process has proven to be quite complex and socially delicate. You can't go into a team and just tell them they are doing things wrong, even if they clearly are (as evidenced by their crushing operational load). You need to find the right way to show a developer all the warts on their baby and motivate them to work with you on addressing them. Furthermore, you have to deal with their earnest desire to treat you as "just another ops team" who is only there to take the pager from them.
All that said, one of the tools we've experimented with to get into this kind of conversation is to hold what we call a Service Roast for some of our SRE engagements. Named after the famous Friars Club roasts, a Service Roast aims to (in as safe a manner as possible) dig into and expose those warts, wrinkles, design flaws, shortcomings, and problems everyone knows a service has but doesn't want to talk about. (We can't help you if you won't tell us where it hurts.)
To run these roasts, we've developed a process, ground rules, a new role of impartial referee, and some useful structure for hosting this kind of meeting. Thus far we've gained great insight into some of our services and, more importantly, sparked some very interesting (and lively) conversations.
To be sure, this is a high-risk activity, and shouldn't be done without careful consideration of the teams participating, but we'll present what we've learned about holding these roasts, guidance teams need for successful participation, and (importantly) why we don't use this approach everywhere.
Caskey L. Dickson is a Site Reliability Engineer at Microsoft, where he is part of the leadership team reinventing operations at Azure. Before that he was at Google, where he worked as an SRE/SWE writing and maintaining monitoring services that operate at "Google scale" as well as business intelligence pipelines. He has worked in online services since 1995, when he turned up his first web server, and has been online ever since. Before working at Google, he was a senior developer at Symantec, wrote software for various Internet startups such as CitySearch and CarsDirect, ran a consulting company, and even taught undergraduate and graduate computer science at Loyola Marymount University. He has a B.S. in Computer Science, a Master's in Systems Engineering, and an M.B.A. from Loyola Marymount.
Jake Welch is a Site Reliability Engineer/Software Engineer on the Microsoft Azure team in NYC. He has worked on large-scale services at Microsoft for eight years, primarily in Azure infrastructure and Storage in software engineering/operational/managerial roles and on the major disaster on-call team. In 2014, he started the first SRE pilot in Azure and continues to drive forward Microsoft SRE culture. Prior to Microsoft, Jake worked as a developer building websites and automating backend business workflows across OS X and Windows.
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.
author = {Jake Welch and Caskey L. Dickson},
title = {How to Improve a Service by Roasting It},
year = {2016},
address = {Santa Clara, CA},
publisher = {USENIX Association},
month = apr
}