Beyond Availability: Towards a Deeper Understanding of Machine Failure Characteristics in Large Distributed Systems

Abstract: 

Although many previous research efforts have investigated machine failure characteristics in distributed systems, availability research has reached a point where properties beyond these initial findings become important. In this paper, we analyze traces from three large distributed systems to answer several subtle questions regarding machine failure characteristics. Based on our findings, we derive a set of fundamental principles for designing highly available distributed systems. Using several case studies, we further show that our design principles can significantly influence the availability design choices in existing systems.

Praveen Yalagandula, The University of Texas at Austin

Suman Nath, Carnegie Mellon University

Links

Paper: 
http://usenix.org/publications/library/proceedings/worlds04/tech/full_papers/yalagandula/yalagandula.pdf
Paper (HTML): 
http://usenix.org/publications/library/proceedings/worlds04/tech/full_papers/yalagandula/yalagandula_html/index.html

© USENIX
