uKharon: A Membership Service for Microsecond Applications

Authors: 

Rachid Guerraoui and Antoine Murat, EPFL; Javier Picorel, Huawei Technologies; Athanasios Xygkis, EPFL; Huabing Yan and Pengfei Zuo, Huawei Technologies

Abstract: 

Modern data center fabrics open the possibility of microsecond distributed applications, such as data stores and message queues. A challenging aspect of their development is to ensure that, besides being fast in the common case, these applications react fast to changes in their membership, e.g., due to reconfiguration and failures. This is especially important as they form the backbone of numerous cloud-powered services, such as analytics and trading systems, trying to meet ever-stringent tail latency requirements. As the microservices-oriented architecture is the de facto standard for building cloud services, a single user request translates to a wide fan-out of microservices interactions sitting on the critical path. The outcome is implacable: the traditionally uncommon events of reconfiguration and failures are exacerbated by the fan-out of communication, making user requests commonly experience such events and quickly impacting the tail latency of the service.

We present uKharon, a microsecond-scale membership service that detects changes in the membership of applications and lets them fail over in as little as 50us. uKharon consists of (1) a multi-level failure detector, (2) a consensus engine that relies on one-sided RDMA CAS, and (3) minimal-overhead membership leases, all exploiting RDMA to operate at the microsecond scale. We showcase the power of uKharon by building uKharon-KV, a replicated Key-Value cache based on HERD. uKharon-KV processes PUT requests as fast as the state-of-the-art and improves upon it by (1) removing the need for replicating GET requests and (2) bringing the end-to-end failover down to 53us, a 10x improvement.

BibTeX
@inproceedings {280710,
author = {Rachid Guerraoui and Antoine Murat and Javier Picorel and Athanasios Xygkis and Huabing Yan and Pengfei Zuo},
title = {{uKharon}: A Membership Service for Microsecond Applications},
booktitle = {2022 USENIX Annual Technical Conference (USENIX ATC 22)},
year = {2022},
isbn = {978-1-939133-29-24},
address = {Carlsbad, CA},
pages = {101--120},
url = {https://www.usenix.org/conference/atc22/presentation/guerraoui},
publisher = {USENIX Association},
month = jul
}
