

Chaos Patterns—Architecting for Failure in Distributed Systems

Jos Boumans, Krux

Abstract: 

As we architect our systems for greater demands, scale, uptime, and performance, the hardest thing to control becomes the environment in which we deploy and the subtle but crucial interactions between complicated systems. Chaos Patterns help us establish and implement a virtuous cycle that lets us both prove and improve our system along each of these dimensions before the inevitable happens.

While deliberately inducing failure may seem reckless or counterintuitive, our experience has shown that it is a matter of how and when (not if) we will learn about the limitations and failure modes of the system.

This is the story of the pitfalls we encountered, and how, through architecture, convention, and common sense, we managed to build an infrastructure that is "Always Up" from the end-user perspective, and incredibly economical to build, scale, and operate. Using chaos testing, we learn more about how our system fails from a 10-second controlled failure than from a multi-hour uncontrolled outage.
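To make the idea of a short, controlled failure concrete, here is a minimal sketch in Python. It is not taken from the talk: the `ControlledFailure` class, its parameters, and the choice of `ConnectionError` are all hypothetical names used for illustration. The wrapper makes a dependency call fail for a fixed window and then recover on its own, so the blast radius is bounded in time.

```python
import time
from typing import Any, Callable, Optional


class ControlledFailure:
    """Hypothetical wrapper: makes a call fail for a fixed window, then recover."""

    def __init__(
        self,
        call: Callable[..., Any],
        duration_s: float = 10.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._call = call
        self._duration_s = duration_s
        self._clock = clock  # injectable so the window can be tested without sleeping
        self._fail_until: Optional[float] = None

    def start(self) -> None:
        """Begin the failure window; it ends automatically after duration_s."""
        self._fail_until = self._clock() + self._duration_s

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # While the window is open, simulate the dependency being unreachable.
        if self._fail_until is not None and self._clock() < self._fail_until:
            raise ConnectionError("injected failure")
        return self._call(*args, **kwargs)
```

In use, one would wrap a real client call, invoke `start()`, and watch how callers behave for the next ten seconds. Because the window closes by itself, a forgotten experiment cannot turn into a long outage.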

In this session we will cover various implementation techniques, available to any developer and operator, which will vastly increase the resilience of your systems and provide a superior end-user experience: from optimizing your use of DNS for failure, to configuring your CDN to have your back, to synthetic responses and expected database outages.
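As one illustration of the synthetic-response technique named above, the following Python sketch treats a database outage as an expected condition and answers with a safe default instead of an error. This is an assumption about how such a fallback might look, not the speaker's implementation. The `Response` type, the `with_synthetic_fallback` helper, and the `SYNTHETIC_BODY` payload are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Response:
    status: int
    body: str
    synthetic: bool = False  # True when the body is a canned default


# Hypothetical default payload served while the backing store is unreachable.
SYNTHETIC_BODY = '{"items": []}'


def with_synthetic_fallback(fetch: Callable[[str], str]) -> Callable[[str], Response]:
    """Wrap a data-fetching function so that outages yield a synthetic response."""

    def handler(key: str) -> Response:
        try:
            return Response(200, fetch(key))
        except (ConnectionError, TimeoutError):
            # The outage is expected: serve a valid but empty answer to the end user.
            return Response(200, SYNTHETIC_BODY, synthetic=True)

    return handler


def healthy_db(key: str) -> str:
    """Stand-in for a working database lookup."""
    return '{"items": ["' + key + '"]}'


def down_db(key: str) -> str:
    """Stand-in for a database that is currently unavailable."""
    raise ConnectionError("database unavailable")
```

The `synthetic` flag lets monitoring distinguish real answers from canned ones, so a degraded backend stays visible to operators even though end users never see an error.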


BibTeX
@conference {208666,
author = {Jos Boumans},
title = {Chaos Patterns{\textemdash}Architecting for Failure in Distributed Systems},
year = {2015},
address = {Washington, D.C.},
publisher = {{USENIX} Association},
month = nov,
}