Ben Hartshorne, Honeycomb
The two main methods of reducing high-volume instrumentation data to a manageable load are aggregation and sampling. Aggregation is well understood, but sampling remains a mystery.
We'll start by laying down the basic ground rules for sampling—what it means and how to implement the simplest methods. There are many ways to think about sampling, but with a good starting point, you gain immense flexibility. Once we have the basics of what it means to sample, we'll look at some different traffic patterns and the effect of sampling on each. When do you lose visibility into your service with simple sampling methods? What can you do about it?
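As a minimal sketch of the simplest method, constant-rate sampling keeps roughly one event in every N and records N on each kept event, so that totals can be estimated later. The function names and the `sample_rate` field are assumptions made for illustration; they are not taken from the talk.

```python
import random


def sample_events(events, sample_rate, rng=None):
    """Keep roughly 1 in `sample_rate` events, tagging each kept event with the rate."""
    rng = rng or random.Random()
    kept = []
    for event in events:
        # Each event is kept independently with probability 1/sample_rate.
        if rng.random() < 1.0 / sample_rate:
            kept.append({**event, "sample_rate": sample_rate})
    return kept


def estimated_total(kept):
    """Estimate the original volume: each kept event stands in for `sample_rate` events."""
    return sum(event["sample_rate"] for event in kept)
```

Storing the rate alongside each event is the design choice that matters here: without it, the receiving side cannot scale counts back up to the original volume.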
Given the traffic patterns of a modern web infrastructure, there are solid methods for changing how you sample so that you keep visibility into the most important parts of your infrastructure while still transmitting only a portion of your volume to your instrumentation service.
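One way to picture this, purely as an illustration, is a different fixed rate per class of traffic: rare, interesting events such as server errors are always kept, while plentiful successes are sampled heavily. The rate table, the `status` field, and the function names below are hypothetical and not from the talk.

```python
import random

# Hypothetical rates: keep every server error, 1 in 10 client errors,
# and 1 in 100 successful requests.
RATES_BY_STATUS = {"5xx": 1, "4xx": 10, "2xx": 100}
DEFAULT_RATE = 10  # assumed fallback for any other status class


def status_class(status_code):
    """Map an HTTP status code such as 503 to its class, "5xx"."""
    return f"{status_code // 100}xx"


def maybe_keep(event, rng=random):
    """Return the event tagged with its sample rate if kept, otherwise None."""
    rate = RATES_BY_STATUS.get(status_class(event["status"]), DEFAULT_RATE)
    if rng.random() < 1.0 / rate:
        return {**event, "sample_rate": rate}
    return None
```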
Taking it a step further, you can push these sampling methods beyond their expected boundaries by using feedback from your service and its volume to affect your sampling rates! Your application knows best how the traffic flowing through it varies; allowing it to decide how to sample the instrumentation can give you the ability to reduce total throughput by an order of magnitude while still maintaining the necessary visibility into the parts of the system that matter most.
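The feedback idea can be sketched as a sampler that counts events per key during one window and uses those counts to set each key's rate for the next window. This is an illustrative assumption about how such feedback might work, not the implementation described in the talk; the class name, the `target_per_key` parameter, and the example keys are all hypothetical.

```python
import random
from collections import Counter


class DynamicSampler:
    """Adjust per-key sample rates from the volume seen in the previous window."""

    def __init__(self, target_per_key=10, rng=None):
        self.target_per_key = target_per_key  # desired kept events per key per window
        self.rng = rng or random.Random()
        self.current_counts = Counter()       # events seen so far in this window
        self.rates = {}                       # rates derived from the last window

    def rate_for(self, key):
        # Keys not seen in the last window are kept unsampled (rate 1).
        return self.rates.get(key, 1)

    def sample(self, key):
        """Return the sample rate if this event is kept, otherwise None."""
        self.current_counts[key] += 1
        rate = self.rate_for(key)
        if self.rng.random() < 1.0 / rate:
            return rate
        return None

    def roll_window(self):
        """Recompute rates so that frequent keys are sampled more heavily."""
        self.rates = {
            key: max(1, count // self.target_per_key)
            for key, count in self.current_counts.items()
        }
        self.current_counts = Counter()
```

Under these assumptions, a key seen 1,000 times in a window with `target_per_key=10` gets a rate of 100 for the next window, while a key seen 3 times stays at a rate of 1, so every one of its events is kept.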
I'll finish by bringing up some examples of dynamic sampling in our own infrastructure and talking about how it lets us see individual events of interest while keeping only 1/1000th of the overall traffic.
Ben Hartshorne
Ben Hartshorne has been looking for the needles in a haystack of servers for a decade. He has finally figured out that (with the right kind of needles) magnets can make a huge difference. Observability tools may look different, but they have mostly kept the same leftover ideas for the last 15 years; it's time to shake things up. Ben joined Honeycomb.io to bring some of the tools the big companies have to the rest of us and help every engineer build better products, sleep better, and resolve problems faster.
author = {Ben Hartshorne},
title = {Sample Your Traffic but Keep the Good Stuff!},
year = {2017},
address = {San Francisco, CA},
publisher = {USENIX Association},
month = oct
}