Managing OS Release Transitions at Netflix Scale

Tuesday, October 30, 2018 - 4:00 pm–4:30 pm

Edward Hunter, Netflix

Abstract: 

Netflix runs more than 150k instances of Ubuntu inside the Amazon cloud (AWS), supporting hundreds of microservices that serve over 125 million customers worldwide. A small team of engineers is responsible for maintaining and evolving the base OS (BaseAMI) on which every service depends. Over the past year or so we have migrated the majority of the fleet from Ubuntu's Trusty release to Xenial. When Bionic was released, we were ready to start moving services very shortly after the release date.

Our goals with the migration were simple:

  1. Don't break Netflix
  2. Minimize developer pain/complexity during the migration
  3. Be ready for the next release of Ubuntu as soon as practical after its release

Meeting these goals required changes to packaging, tools and processes. This talk will reveal some of what we do to manage the OS and allow Netflix to deploy it quickly to thousands of VMs on a daily basis. It will also look at what it takes to stay up-to-date with patches and other changes in the ecosystem all while supporting our users, both internal and external, 24x7.

Edward Hunter, Netflix

Ed is an engineering leader at Netflix responsible for Performance, OS, and Capacity engineering. Prior to Netflix, he spent time at Juniper Networks as a director of OS engineering for Junos. He also spent many years at Sun Microsystems managing part of the Solaris team and working in SunSoft and Sun Labs. He finished his time at Sun as chief of staff to the CTO.

BibTeX
@conference {221708,
author = {Edward Hunter},
title = {Managing {OS} Release Transitions at Netflix Scale},
year = {2018},
address = {Nashville, TN},
publisher = {{USENIX} Association},
}