Confluo: Distributed Monitoring and Diagnosis Stack for High-speed Networks

Authors: 

Anurag Khandelwal, UC Berkeley; Rachit Agarwal, Cornell University; Ion Stoica, UC Berkeley

Abstract: 

Confluo is an end-host stack that can be integrated with existing network management tools to enable monitoring and diagnosis of network-wide events using telemetry data distributed across end-hosts, even for high-speed networks. Confluo achieves these properties using a new data structure, Atomic MultiLog, that supports highly concurrent read-write operations by exploiting two properties specific to telemetry data: (1) once processed by the stack, the data is neither updated nor deleted; and (2) each field in the data has a fixed pre-defined size. Our evaluation results show that, for packet sizes of 128 B or larger, Confluo executes thousands of triggers and tens of filters at line rate (for 10 Gbps links) using a single core.
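
To illustrate why the two properties named in the abstract help with concurrency, the following is a minimal sketch of an append-only log of fixed-size records. It is not Confluo's Atomic MultiLog or its API: the class name `AppendOnlyLog`, the record layout (timestamp, source IP, port), and the use of a lock as a stand-in for an atomic fetch-and-add are all assumptions made for illustration. Because records are never updated or deleted, a writer only needs to reserve the next slot; because every field has a fixed size, a record's byte offset is simple arithmetic on its slot number.

```python
import struct
import threading

# Hypothetical fixed-size record: 8-byte timestamp, 4-byte source IP, 2-byte port.
# "<" disables padding, so every record is exactly 14 bytes.
RECORD = struct.Struct("<QIH")


class AppendOnlyLog:
    """Illustrative append-only log; not Confluo's actual data structure."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._buf = bytearray(capacity * RECORD.size)
        self._next = 0
        # Stand-in for an atomic fetch-and-add on the tail (an assumption;
        # Python has no portable atomic integer).
        self._lock = threading.Lock()

    def append(self, timestamp, src_ip, port):
        # Reserve a slot. This is the only step where writers coordinate.
        with self._lock:
            slot = self._next
            if slot >= self._capacity:
                raise OverflowError("log is full")
            self._next += 1
        # Fixed-size fields: the offset is slot * record size. Slots never
        # overlap, so writers fill their own slots without further locking.
        RECORD.pack_into(self._buf, slot * RECORD.size, timestamp, src_ip, port)
        return slot

    def read(self, slot):
        # Only valid for a slot returned by an append that has completed.
        # A real design would also track which writes are finished; this
        # sketch omits that.
        return RECORD.unpack_from(self._buf, slot * RECORD.size)

    def __len__(self):
        # Number of slots reserved so far.
        with self._lock:
            return self._next
```

In this sketch, a record stored at a given slot is never moved or rewritten, so a reader can fetch it with `log.read(slot)` without taking any lock once the corresponding `append` has returned.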

NSDI '19 Open Access Sponsored by NetApp

BibTeX
@inproceedings {225998,
author = {Anurag Khandelwal and Rachit Agarwal and Ion Stoica},
title = {Confluo: Distributed Monitoring and Diagnosis Stack for High-speed Networks},
booktitle = {16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19)},
year = {2019},
isbn = {978-1-931971-49-2},
address = {Boston, MA},
pages = {421--436},
url = {https://www.usenix.org/conference/nsdi19/presentation/khandelwal},
publisher = {USENIX Association},
month = feb
}
