Copysets: Reducing the Frequency of Data Loss in Cloud Storage

Asaf Cidon; Stephen Rumble; Ryan Stutsman; Sachin Katti; John Ousterhout; Mendel Rosenblum

Asaf Cidon, Stephen Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout, and Mendel Rosenblum, Stanford University
Awarded Best Student Paper!

Random replication is widely used in data center storage systems to prevent data loss. However, random replication is almost guaranteed to lose data in the common scenario of simultaneous node failures due to cluster-wide power outages. Due to the high fixed cost of each incident of data loss, many data center operators prefer to minimize the frequency of such events at the expense of losing more data in each event.

We present Copyset Replication, a novel general-purpose replication technique that significantly reduces the frequency of data loss events. We implemented and evaluated Copyset Replication on two open-source data center storage systems, HDFS and RAMCloud, and show that it incurs a low overhead on all operations. Such systems require that each node’s data be scattered across several nodes for parallel data recovery and access. Copyset Replication presents a near-optimal tradeoff between the number of nodes on which the data is scattered and the probability of data loss. For example, in a 5000-node RAMCloud cluster under a power outage, Copyset Replication reduces the probability of data loss from 99.99% to 0.15%. For Facebook’s HDFS cluster, it reduces the probability from 22.8% to 0.78%.
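The tradeoff described above can be illustrated with a rough back-of-the-envelope model. The sketch below is not the authors' implementation and is not expected to reproduce the figures quoted in the abstract. It assumes that a fixed number of nodes fail simultaneously, that data is lost whenever every replica of some chunk sits on failed nodes, and that the distinct replica sets ("copysets") can be treated as independent. The function names, the parameter `chunks_per_node`, and the rule that roughly S/(R-1) permutations each contribute N/R copysets are illustrative assumptions that the abstract itself does not state.

```python
from math import comb


def loss_probability(num_nodes, num_failed, replication, num_copysets):
    """Approximate chance that at least one copyset is entirely on failed nodes.

    Treats copysets as independent, which is a simplifying assumption.
    """
    # Probability that one particular copyset falls entirely inside the failed set.
    p_single = comb(num_failed, replication) / comb(num_nodes, replication)
    return 1.0 - (1.0 - p_single) ** num_copysets


def copysets_random(num_nodes, replication, chunks_per_node):
    """Upper bound on distinct copysets under random replication.

    Assumes (hypothetically) that every chunk lands on its own distinct set
    of nodes, capped by the number of possible sets.
    """
    total_chunks = num_nodes * chunks_per_node // replication
    return min(total_chunks, comb(num_nodes, replication))


def copysets_permutation(num_nodes, replication, scatter_width):
    """Copyset count for an assumed permutation-based construction.

    Assumes ceil(S / (R - 1)) permutations, each split into N // R copysets.
    """
    permutations = -(-scatter_width // (replication - 1))  # ceiling division
    return permutations * (num_nodes // replication)


if __name__ == "__main__":
    n, failed, r = 5000, 50, 3  # assumed: 1% of nodes fail at once
    p_random = loss_probability(n, failed, r, copysets_random(n, r, 10000))
    p_copyset = loss_probability(n, failed, r, copysets_permutation(n, r, 10))
    print(f"random replication:  {p_random:.6f}")
    print(f"copyset replication: {p_copyset:.6f}")
```

Under these assumptions, the loss probability is driven by the number of distinct copysets: random replication creates about one per chunk, while limiting the copysets to a small multiple of the node count keeps the probability low at the cost of a smaller scatter width.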

BibTeX

@inproceedings {180184,
author = {Asaf Cidon and Stephen Rumble and Ryan Stutsman and Sachin Katti and John Ousterhout and Mendel Rosenblum},
title = {Copysets: Reducing the Frequency of Data Loss in Cloud Storage},
booktitle = {2013 USENIX Annual Technical Conference (USENIX ATC 13)},
year = {2013},
isbn = {978-1-931971-01-0},
address = {San Jose, CA},
pages = {37--48},
url = {https://www.usenix.org/conference/atc13/technical-sessions/presentation/cidon},
publisher = {USENIX Association},
month = jun
}
