


How Netflix Embraces Failure to Improve Resilience and Maximize Availability

Cloud System Administration

Ariel Tseitlin, Director, Cloud Solutions, Netflix

Netflix created a suite of tools, collectively called the Simian Army, to improve resiliency and maintain the cloud environment. Failure modes are typically corner cases, which are poorly tested, if at all. Only by failing often can we ensure that we are resilient to failure. We therefore look for ways to induce failure in our production environment to better prepare for the inevitable failures that will occur. This presentation covers the motivation for inducing failure in production and the mechanics of how Netflix achieves it.
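The idea of deliberately inducing failure can be illustrated with a small simulation. The sketch below is not Netflix's implementation and does not use any Simian Army API. The names `Group`, `induce_failure`, `run_experiment`, and the `min_healthy` threshold are hypothetical, and "termination" here only removes an entry from an in-memory list. It shows the basic loop: pick a random instance per group, remove it, then check that the service still has enough capacity.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Group:
    """A hypothetical group of interchangeable instances behind one service."""
    name: str
    instances: list = field(default_factory=list)
    min_healthy: int = 1  # assumed threshold for the service to stay available

    def is_available(self) -> bool:
        return len(self.instances) >= self.min_healthy


def induce_failure(groups, probability, rng):
    """Terminate at most one randomly chosen instance per group.

    Returns the (group name, instance id) pairs that were terminated.
    """
    terminated = []
    for group in groups:
        if group.instances and rng.random() < probability:
            victim = rng.choice(group.instances)
            group.instances.remove(victim)  # stand-in for a real termination call
            terminated.append((group.name, victim))
    return terminated


def run_experiment(seed=0):
    """Build a toy fleet, induce one failure per group, and return the result."""
    rng = random.Random(seed)
    groups = [
        Group("api", ["api-1", "api-2", "api-3"], min_healthy=2),
        Group("search", ["search-1", "search-2"], min_healthy=1),
    ]
    terminated = induce_failure(groups, probability=1.0, rng=rng)
    return groups, terminated


if __name__ == "__main__":
    groups, terminated = run_experiment()
    for name, victim in terminated:
        print(f"terminated {victim} in group {name}")
    for g in groups:
        print(f"{g.name}: available={g.is_available()}")
```

In this toy setup each group is provisioned with one instance more than `min_healthy`, so losing a single instance leaves every service available. A group sized exactly at its threshold would report `available=False` after the same experiment, which is the kind of weakness that frequent induced failure is meant to surface early.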

Ariel Tseitlin manages the Netflix Cloud and is interested in all things cloudy. At Netflix, he is Director of Cloud Solutions, helping Netflix be successful in the cloud, including cloud tooling, monitoring, performance and scalability, and cloud operations and reliability engineering. Ariel's team builds Asgard and the Simian Army, including the Chaos Monkey. Prior to Netflix, Ariel was VP of Technology and Products at Sungevity and before that was the Founder and CEO of CTOWorks.




© USENIX
EIN 13-3055038

LISA is a registered trademark of the USENIX Association.
