DistCache: Provable Load Balancing for Large-Scale Storage Systems with Distributed Caching


Zaoxing Liu and Zhihao Bai, Johns Hopkins University; Zhenming Liu, College of William and Mary; Xiaozhou Li, Celer Network; Changhoon Kim, Barefoot Networks; Vladimir Braverman and Xin Jin, Johns Hopkins University; Ion Stoica, UC Berkeley
Awarded Best Paper!


Load balancing is critical for distributed storage to meet strict service-level objectives (SLOs). It has been shown that a fast cache can guarantee load balancing for a clustered storage system. However, when the system scales out to multiple clusters, the fast cache itself would become the bottleneck. Traditional mechanisms like cache partition and cache replication either result in load imbalance between cache nodes or have high overhead for cache coherence.

We present DistCache, a new distributed caching mechanism that provides provable load balancing for large-scale storage systems. DistCache co-designs cache allocation with cache topology and query routing. The key idea is to partition the hot objects with independent hash functions between cache nodes in different layers, and to adaptively route queries with the power-of-two-choices. We prove that DistCache enables the cache throughput to increase linearly with the number of cache nodes, by unifying techniques from expander graphs, network flows, and queuing theory. DistCache is a general solution that can be applied to many storage systems. We demonstrate the benefits of DistCache by providing the design, implementation, and evaluation of the use case for emerging switch-based caching.

FAST '19 Open Access Sponsored by NetApp

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

@inproceedings {227794,
author = {Zaoxing Liu and Zhihao Bai and Zhenming Liu and Xiaozhou Li and Changhoon Kim and Vladimir Braverman and Xin Jin and Ion Stoica},
title = {DistCache: Provable Load Balancing for Large-Scale Storage Systems with Distributed Caching},
booktitle = {2019 {USENIX} Annual Technical Conference ({USENIX} {ATC} 19)},
year = {2019},
isbn = {978-1-931971-48-5},
address = {Renton, WA},
pages = {143--157},
url = {https://www.usenix.org/conference/atc19/presentation/liu},
publisher = {{USENIX} Association},