Debugging Linux Issues with eBPF

Tuesday, October 30, 2018 - 2:00 pm–2:30 pm

Ivan Babrou, Cloudflare

Abstract: 

This is a technical dive into how we used eBPF to solve real-world issues uncovered during an innocent OS upgrade. We'll see how we debugged a 10x CPU increase in Kafka after a Debian upgrade and what lessons we learned. We'll go from high-level effects like increased CPU usage, to flamegraphs showing where the problem lies, to tracing timers and function calls in the Linux kernel.

The focus is on tools that operational engineers can use to debug performance issues in production. This particular issue happened at Cloudflare on a Kafka cluster handling 100 Gbps of ingress and many multiples of that in egress.

This talk also serves as an introduction to the training on ebpf_exporter by Alexander Huynh.

Ivan Babrou, Cloudflare

Ivan is a Performance Engineer at Cloudflare. He spends his days finding performance bottlenecks, fixing them, and making sure a large chunk of the internet runs as fast and as efficiently as possible.

BibTeX
@conference {221746,
author = {Ivan Babrou},
title = {Debugging Linux Issues with eBPF},
year = {2018},
address = {Nashville, TN},
publisher = {{USENIX} Association},
month = oct,
}
