Qing Wang, Youyou Lu, Erci Xu, Junru Li, Youmin Chen, and Jiwu Shu, Tsinghua University
Distributed shared memory (DSM) is experiencing a resurgence with emerging fast network stacks. Caching, which is still needed for reducing frequent remote access and balancing load, can incur high coherence overhead. In this paper, we propose CONCORDIA, a DSM with fast in-network cache coherence backed by programmable switches. At the core of CONCORDIA is FLOWCC, a hybrid cache coherence protocol, enabled by a collaborative effort from switches and servers. Moreover, to overcome limitations of programmable switches, we also introduce two techniques: (i) an ownership migration mechanism to address the problem of limited memory capacity on switches and (ii) idempotent operations to handle packet loss in the case that switches are stateful. To demonstrate CONCORDIA’s practical benefits, we build a distributed key-value store and a distributed graph engine on it, and port a distributed transaction processing system to it. Evaluation shows that CONCORDIA obtains up to 4.2x, 2.3x and 2x speedup over state-of-the-art DSMs on key-value store, graph engine and transaction processing workloads, respectively.
FAST '21 Open Access Sponsored by NetApp
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.
author = {Qing Wang and Youyou Lu and Erci Xu and Junru Li and Youmin Chen and Jiwu Shu},
title = {Concordia: Distributed Shared Memory with {In-Network} Cache Coherence},
booktitle = {19th USENIX Conference on File and Storage Technologies (FAST 21)},
year = {2021},
isbn = {978-1-939133-20-5},
pages = {277--292},
url = {https://www.usenix.org/conference/fast21/presentation/wang},
publisher = {USENIX Association},
month = feb
}