RAIDShield: Characterizing, Monitoring, and Proactively Protecting Against Disk Failures

Authors: 

Ao Ma, Fred Douglis, Guanlin Lu, and Darren Sawyer, EMC Corporation; Surendar Chandra and Windsor Hsu, Datrium, Inc.

Abstract: 

Modern storage systems orchestrate a group of disks to achieve their performance and reliability goals. Even though such systems are designed to withstand the failure of individual disks, failure of multiple disks poses a unique set of challenges. We empirically investigate disk failure data from a large number of production systems, specifically focusing on the impact of disk failures on RAID storage systems. Our data covers about one million SATA disks from 6 disk models for periods up to 5 years. We show how observed disk failures weaken the protection provided by RAID. The count of reallocated sectors correlates strongly with impending failures.

With these findings, we designed RAIDShield, which consists of two components. First, we have built and evaluated an active defense mechanism that monitors the health of each disk and replaces those that are predicted to fail imminently. This proactive protection has been incorporated into our product and is observed to eliminate 88% of triple disk errors, which account for 80% of all RAID failures. Second, we have designed and simulated a method of using the joint failure probability to quantify and predict how likely a RAID group is to face multiple simultaneous disk failures. This method can identify disks that collectively represent a risk of failure even when no individual disk is flagged in isolation. We find in simulation that RAID-level analysis can effectively identify most vulnerable RAID-6 systems, improving the coverage to 98% of triple errors.
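To make the group-level idea concrete, here is a minimal sketch of how a joint failure probability could be computed for one RAID-6 group. It is an illustration under stated assumptions, not the method from the paper: it assumes each disk already has an estimated failure probability (for example, derived from its reallocated sector count) and that disks fail independently. The function names, the per-disk probabilities, and both thresholds are hypothetical.

```python
from typing import Sequence


def prob_at_least_k_failures(p_fail: Sequence[float], k: int) -> float:
    """Probability that at least k disks fail.

    Assumption (not from the paper): disk failures are independent,
    and p_fail[i] is the estimated failure probability of disk i.
    """
    # dist[j] holds the probability that exactly j of the disks
    # processed so far fail; start with zero disks, zero failures.
    dist = [1.0]
    for p in p_fail:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)   # this disk survives
            nxt[j + 1] += q * p       # this disk fails
        dist = nxt
    return sum(dist[k:])


def raid6_group_at_risk(p_fail: Sequence[float], group_threshold: float) -> bool:
    """Flag a RAID-6 group whose chance of a triple failure is too high.

    RAID-6 tolerates two concurrent disk failures, so three or more
    failures is the event of interest. group_threshold is hypothetical.
    """
    return prob_at_least_k_failures(p_fail, 3) > group_threshold


if __name__ == "__main__":
    # Hypothetical group: 14 disks, each at 0.10, below an assumed
    # per-disk replacement threshold of 0.20, so none is flagged alone.
    group = [0.10] * 14
    per_disk_threshold = 0.20
    print("any disk flagged alone:", any(p > per_disk_threshold for p in group))
    print("P(>= 3 failures):", prob_at_least_k_failures(group, 3))
    print("group flagged:", raid6_group_at_risk(group, group_threshold=0.05))
```

The point of the example is the contrast between the two checks. A per-disk threshold looks at each probability in isolation, while the group-level check sums over all combinations of failures, so a group of moderately degraded disks can be flagged even when no single disk crosses its own threshold.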

BibTeX
@inproceedings {188450,
author = {Ao Ma and Fred Douglis and Guanlin Lu and Darren Sawyer and Surendar Chandra and Windsor Hsu},
title = {{RAIDShield}: Characterizing, Monitoring, and Proactively Protecting Against Disk Failures},
booktitle = {13th USENIX Conference on File and Storage Technologies (FAST 15)},
year = {2015},
isbn = {978-1-931971-201},
address = {Santa Clara, CA},
pages = {241--256},
url = {https://www.usenix.org/conference/fast15/technical-sessions/presentation/ma},
publisher = {USENIX Association},
month = feb,
}