Skip to main content
Back to USENIX
  • Conferences
  • Students
Sign in

USENIX Conference Policies

  • Event Code of Conduct
  • Conference Network Policy
  • Statement on Environmental Responsibility Policy

Are Disks the Dominant Contributor for Storage Failures? A Comprehensive Study of Storage Subsystem Failure Characteristics

Building reliable storage systems becomes increasingly challenging as the complexity of modern storage systems continues to grow. Understanding storage failure characteristics is crucially important for designing and building a reliable storage system. While several recent studies have been conducted on understanding storage failures, almost all of them focus on the failure characteristics of one component - disks - and do not study other storage component failures.

This paper analyzes the failure characteristics of storage subsystems. More specifically, we analyzed the storage logs collected from about 39,000 storage systems commercially deployed at various customer sites. The data set covers a period of 44 months and includes about 1,800,000 disks hosted in about 155,000 storage shelf enclosures. Our study reveals many interesting findings, providing useful guideline for designing reliable storage systems. Some of our major findings include: (1) In addition to disk failures that contribute to 20-55% of storage subsystem failures, other components such as physical interconnects and protocol stacks also account for significant percentages of storage subsystem failures. (2) Each individual storage subsystem failure type and storage subsystem failure as a whole exhibit strong self-correlations. In addition, these failures exhibit ``bursty'' patterns. (3) Storage subsystems configured with redundant interconnects experience 30-40% lower failure rates than those with a single interconnect. (4) Spanning disks of a RAID group across multiple shelves provides a more resilient solution for storage subsystems than within a single shelf.

Weihang Jiang, University of Illinois at Urbana-Champaign

Chongfeng Hu, University of Illinois at Urbana-Champaign

Yuanyuan Zhou, University of Illinois at Urbana-Champaign

Arkady Kanevsky, Network Appliance, Inc.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {268343,
author = {Weihang Jiang and Chongfeng Hu and Yuanyuan Zhou and Arkady Kanevsky},
title = {Are Disks the Dominant Contributor for Storage Failures? A Comprehensive Study of Storage Subsystem Failure Characteristics},
booktitle = {6th USENIX Conference on File and Storage Technologies (FAST 08)},
year = {2008},
address = {San Jose, CA},
url = {https://www.usenix.org/conference/fast-08/are-disks-dominant-contributor-storage-failures-comprehensive-study-storage},
publisher = {USENIX Association},
month = feb
}
Download

Presentation Video

Presentation Audio

MP3 Download OGG Download

Download Audio

Links

Paper: 
http://usenix.org/events/fast08/tech/full_papers/jiang/jiang.pdf
Paper (HTML): 
http://usenix.org/events/fast08/tech/full_papers/jiang/jiang_html/index.html
  • Log in or register to post comments

© USENIX
EIN 13-3055038

  • Privacy Policy
  • Contact Us