USENIX

Best Practices for When s*IT Hits the Fan

Invited Talk
Wednesday, November 12, 2014 - 11:45am-12:30pm

Dave Cliffe, PagerDuty

Abstract: 

Outages suck; how you handle them shouldn’t. At PagerDuty, we talk to real customers experiencing real outages all the time. Operations escalations and downtime can be handled in many ways:

  • During the incident: who to alert when, how to communicate, handling dependency and downstream failures, disclosure
  • After the incident: post-mortems, public disclosure, formalizing process vs. investing in automation, preventative actions

There are also ways to keep engineers sane, customers happy, and the $$$ flowing. In this talk, come learn about best practices from across the industry, including how PagerDuty executes during an outage (but trust us, those never happen).
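
As a small illustration of the "who to alert when" question raised in the abstract, the sketch below models a tiered escalation policy: each responder is paged once an incident has gone unacknowledged for a given number of minutes. The role names, delays, and the `responders_to_alert` helper are hypothetical and chosen only for illustration; they are not taken from the talk or from PagerDuty's product.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EscalationStep:
    responder: str      # hypothetical role name, not from the talk
    after_minutes: int  # minutes unacknowledged before this role is paged


# Hypothetical policy: the delays are illustrative assumptions.
POLICY = [
    EscalationStep("primary-on-call", 0),
    EscalationStep("secondary-on-call", 15),
    EscalationStep("engineering-manager", 30),
]


def responders_to_alert(policy: List[EscalationStep],
                        minutes_unacknowledged: int) -> List[str]:
    """Return every responder whose threshold has been reached, in escalation order."""
    if minutes_unacknowledged < 0:
        raise ValueError("minutes_unacknowledged must be non-negative")
    ordered = sorted(policy, key=lambda step: step.after_minutes)
    return [step.responder for step in ordered
            if step.after_minutes <= minutes_unacknowledged]


if __name__ == "__main__":
    # An incident that has been unacknowledged for 20 minutes.
    print(responders_to_alert(POLICY, 20))
```

Keeping the policy as plain data makes it easy to review and change after a post-mortem without touching the paging logic.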

Dave Cliffe, PagerDuty

Dave is an engineer who has adopted a more peaceful role as "sherpa" on the Product team at PagerDuty, a company whose sole goal is to make the lives of DevOps engineers everywhere a calmer, sanity-filled reality. Before PagerDuty, Dave worked in cloud computing at Microsoft on the Windows Azure team. Frequently, he wonders which is scarier: being an on-call engineer responsible for an outage or being a parent. The debate rages on.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

LISA is a registered trademark of the USENIX Association.