HyperRouter: Lessons Learnt from Building an L4 Load Balancing Service

Wednesday, 8 October, 2025 - 09:00–09:45

Linhua Tang and Jayaganesh Kalyanasundaram, Huawei Ireland Research Center

This talk shares hard-won lessons from designing and operating a large-scale Layer 4 load balancing service built for high performance, resilience, and reliability. We’ll cover critical design decisions: choosing DPDK over eBPF/XDP for the data plane, using BGP AS-path prepending for safer node degradation, adopting local health checks, and building a decentralized peer-to-peer control plane to survive network partitions. Beyond architecture, we’ll explore how focusing observability on Critical User Journeys (CUJs) improved monitoring and incident response. Intended for engineers, SREs, and architects, this session offers practical insights into building robust, scalable infrastructure, with real-world trade-offs and operational strategies that can be applied across distributed systems.

Linhua Tang (also known as James) is a software engineer and tech lead for global server load balancing and formal methods at Huawei Ireland Research Center. Before that, he worked on various distributed systems at Microsoft and Amazon.

Jayaganesh Kalyanasundaram is a principal software engineer working on observability at Huawei Ireland Research Center. Before that, he worked at Google as a tech lead for the CI/CD team.

BibTeX
@conference {311842,
author = {Linhua Tang and Jayaganesh Kalyanasundaram},
title = {{HyperRouter}: Lessons Learnt from Building an L4 Load Balancing Service},
year = {2025},
address = {Dublin},
publisher = {USENIX Association},
month = oct
}
