Realizing the Fault-Tolerance Promise of Cloud Storage Using Locks with Intent

Authors: 

Srinath Setty, Microsoft Research; Chunzhi Su, The University of Texas at Austin and Microsoft Research; Jacob R. Lorch and Lidong Zhou, Microsoft Research; Hao Chen, Shanghai Jiao Tong University and Microsoft Research; Parveen Patel and Jinglei Ren, Microsoft Research

Abstract: 

Cloud computing promises easy development and deployment of large-scale, fault-tolerant, and highly available applications. Cloud storage services are a key enabler of this promise, because they provide reliability, availability, and fault tolerance via internal mechanisms that developers need not reason about. Even so, challenges remain for developers of distributed cloud applications: they must still make their code robust against failures of the machines running it, and they must reason about concurrent access to cloud storage by multiple machines.

We address this problem with a new abstraction, called locks with intent, which we implement in a client library called Olive. Olive makes minimal assumptions about the underlying cloud storage, enabling it to operate on a variety of platforms including Amazon DynamoDB and Microsoft Azure Storage. Leveraging the underlying cloud storage, Olive’s locks with intent offer strong exactly-once semantics for a snippet of code despite failures and concurrent duplicate executions.
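
To make the exactly-once idea concrete, here is a minimal sketch in Python. It is not Olive's API and is not taken from the paper: the `Store`, `apply_once`, and `run_intent` names are hypothetical, and an in-memory dictionary stands in for a cloud table. The only assumption about storage is a per-item conditional write (update an item only if its version is unchanged). The sketch covers just the logging side of the idea: each step of an intent records a tag in the same item it updates, so a duplicate or recovering execution of the same intent skips steps that have already taken effect. It omits the lock itself and any isolation between different intents.

```python
import threading


class Store:
    """Toy stand-in for a cloud table that offers per-item conditional writes."""

    def __init__(self):
        self._items = {}  # key -> (version, value, applied_tags)
        self._mutex = threading.Lock()

    def read(self, key):
        """Return (version, value, applied_tags); a missing item has version 0."""
        with self._mutex:
            return self._items.get(key, (0, None, frozenset()))

    def conditional_write(self, key, expected_version, value, tags):
        """Write value and tags only if the item's version is still expected_version."""
        with self._mutex:
            version, _, _ = self._items.get(key, (0, None, frozenset()))
            if version != expected_version:
                return False
            self._items[key] = (version + 1, value, frozenset(tags))
            return True


def apply_once(store, key, intent_id, step, update):
    """Apply update to the item at key at most once per (intent_id, step)."""
    tag = (intent_id, step)
    while True:
        version, value, tags = store.read(key)
        if tag in tags:
            return  # an earlier or concurrent execution already did this step
        # The new value and the log tag are written in one conditional write,
        # so the step cannot take effect without also being recorded.
        if store.conditional_write(key, version, update(value), tags | {tag}):
            return
        # Lost a race with another writer: re-read and try again.


def run_intent(store, intent_id, steps):
    """Run every (key, update) step of an intent; safe to call repeatedly."""
    for index, (key, update) in enumerate(steps):
        apply_once(store, key, intent_id, index, update)


# Hypothetical intent: move 10 units from item "a" to item "b".
transfer = [
    ("a", lambda v: (v or 0) - 10),
    ("b", lambda v: (v or 0) + 10),
]

store = Store()
run_intent(store, "intent-1", transfer)
run_intent(store, "intent-1", transfer)  # duplicate execution changes nothing
```

Keeping the tag in the same item as the data is what lets a single conditional write cover both the update and its log entry. A real system would also need to bound the growth of the tag set, which this sketch does not attempt.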

To ensure exactly-once semantics, Olive incurs the unavoidable overhead of additional logging writes. However, by decoupling isolation from atomicity, it supports consistency levels ranging from eventual to transactional. This flexibility allows applications to avoid costly transactional mechanisms when weaker semantics suffice. We apply Olive’s locks with intent to build several advanced storage functionalities, including snapshots, transactions via optimistic concurrency control, secondary indices, and live table re-partitioning. Our experience demonstrates that Olive eases the burden of creating correct, fault-tolerant distributed cloud applications.

BibTeX
@inproceedings {199350,
author = {Srinath Setty and Chunzhi Su and Jacob R. Lorch and Lidong Zhou and Hao Chen and Parveen Patel and Jinglei Ren},
title = {Realizing the Fault-Tolerance Promise of Cloud Storage Using Locks with Intent},
booktitle = {12th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 16)},
year = {2016},
isbn = {978-1-931971-33-1},
address = {Savannah, GA},
pages = {501--516},
url = {https://www.usenix.org/conference/osdi16/technical-sessions/presentation/setty},
publisher = {{USENIX} Association},
month = nov,
}