Slicer: Auto-Sharding for Datacenter Applications

Authors: 

Atul Adya, Daniel Myers, Jon Howell, Jeremy Elson, Colin Meek, Vishesh Khemani, Stefan Fulger, Pan Gu, Lakshminath Bhuvanagiri, Jason Hunter, Roberto Peon, Larry Kai, Alexander Shraer, and Arif Merchant, Google; Kfir Lev-Ari, Technion—Israel Institute of Technology

Abstract: 

Sharding is a fundamental building block of large-scale applications, but most have their own custom, ad-hoc implementations. Our goal is to make sharding as easily reusable as a filesystem or lock manager. Slicer is Google’s general purpose sharding service. It monitors signals such as load hotspots and server health to dynamically shard work over a set of servers. Its goals are to maintain high availability and reduce load imbalance while minimizing churn from moved work.

In this paper, we describe Slicer’s design and implementation. Slicer has the consistency and global optimization of a centralized sharder while approaching the high availability, scalability, and low latency of systems that make local decisions. It achieves this by separating concerns: a reliable data plane forwards requests, and a smart control plane makes load-balancing decisions off the critical path. Slicer’s small but powerful API has proven useful and easy to adopt in dozens of Google applications. It is used to allocate resources for web service front-ends, coalesce writes to increase storage bandwidth, and increase the efficiency of a web cache. It currently handles 2-7M req/s of production traffic. The median production Slicer-managed workload uses 63% fewer resources than it would with static sharding.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {199309,
author = {Atul Adya and Daniel Myers and Jon Howell and Jeremy Elson and Colin Meek and Vishesh Khemani and Stefan Fulger and Pan Gu and Lakshminath Bhuvanagiri and Jason Hunter and Roberto Peon and Larry Kai and Alexander Shraer and Arif Merchant and Kfir Lev-Ari},
title = {Slicer: {Auto-Sharding} for Datacenter Applications},
booktitle = {12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)},
year = {2016},
isbn = {978-1-931971-33-1},
address = {Savannah, GA},
pages = {739--753},
url = {https://www.usenix.org/conference/osdi16/technical-sessions/presentation/adya},
publisher = {USENIX Association},
month = nov
}

Presentation Audio