Uncovering Bugs in Distributed Storage Systems during Testing (Not in Production!)

Authors: 

Pantazis Deligiannis, Imperial College London; Matt McCutchen, Massachusetts Institute of Technology; Paul Thomson, Imperial College London; Shuo Chen, Microsoft; Alastair F. Donaldson, Imperial College London; John Erickson, Cheng Huang, Akash Lal, Rashmi Mudduluru, Shaz Qadeer, and Wolfram Schulte, Microsoft

Abstract: 

Testing distributed systems is challenging due to multiple sources of nondeterminism. Conventional testing techniques, such as unit, integration and stress testing, are ineffective in preventing serious but subtle bugs from reaching production. Formal techniques, such as TLA+, can only verify high-level specifications of systems at the level of logic-based models, and fall short of checking the actual executable code. In this paper, we present a new methodology for testing distributed systems. Our approach applies advanced systematic testing techniques to thoroughly check that the executable code adheres to its high-level specifications, which significantly improves coverage of important system behaviors.

Our methodology has been applied to three distributed storage systems in the Microsoft Azure cloud computing platform. In the process, numerous bugs were identified, reproduced, confirmed and fixed. These bugs required a subtle combination of concurrency and failures, making them extremely difficult to find with conventional testing techniques. An important advantage of our approach is that a bug is uncovered in a small setting and witnessed by a full system trace, which dramatically increases the productivity of debugging.
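The idea of uncovering a bug in a small setting, witnessed by a trace, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper's systems or tooling: the `ToyManager` class, the event names, the injected bug, and the invariant are all hypothetical. Exhaustive enumeration of delivery orders is used only because it is short to write; it is not claimed to be the authors' exploration strategy.

```python
from itertools import permutations


class ToyManager:
    """Hypothetical manager tracking which nodes hold a replica."""

    def __init__(self):
        self.replicas = set()
        self.failed = set()

    def handle(self, event):
        kind, node = event
        if kind == "sync_report":
            # Injected bug: a late report from a failed node re-adds it.
            self.replicas.add(node)
        elif kind == "node_failed":
            self.failed.add(node)
            self.replicas.discard(node)


def invariant(manager):
    # Safety property: no failed node is counted as holding a replica.
    return not (manager.replicas & manager.failed)


def explore(events):
    """Try every delivery order; return the first violating prefix, else None."""
    for order in permutations(events):
        manager = ToyManager()
        for step, event in enumerate(order):
            manager.handle(event)
            if not invariant(manager):
                # The prefix up to the violation is the witness trace.
                return list(order[: step + 1])
    return None


# Two sync reports and one failure notification, delivered in any order.
EVENTS = [("sync_report", "n1"), ("sync_report", "n2"), ("node_failed", "n1")]
trace = explore(EVENTS)
print(trace)
```

The violation appears only when the failure notification for `n1` is delivered before `n1`'s own sync report, an ordering that a stress test may rarely hit but that a controlled exploration of delivery orders reaches directly, returning the event sequence that reproduces it.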

BibTeX
@inproceedings {194442,
author = {Pantazis Deligiannis and Matt McCutchen and Paul Thomson and Shuo Chen and Alastair F. Donaldson and John Erickson and Cheng Huang and Akash Lal and Rashmi Mudduluru and Shaz Qadeer and Wolfram Schulte},
title = {Uncovering Bugs in Distributed Storage Systems during Testing (Not in {Production!})},
booktitle = {14th USENIX Conference on File and Storage Technologies (FAST 16)},
year = {2016},
isbn = {978-1-931971-28-7},
address = {Santa Clara, CA},
pages = {249--262},
url = {https://www.usenix.org/conference/fast16/technical-sessions/presentation/deligiannis},
publisher = {USENIX Association},
month = feb,
}