Gathering all kinds of telemetry data is key to operating reliable distributed systems at scale. Once you have set up your monitoring systems and recorded all relevant data, the challenge is to make sense of it and extract valuable information. Key questions include:
- How do you interpret the telemetry data emitted by the systems you are running?
- How do you measure the quality of the APIs you provide and consume?
- How do you aggregate metrics from single nodes into service-level views?
In this workshop we will address these questions with statistical methods and concepts such as data visualisation, averages, percentiles, outlier analysis, histograms, regressions, robustness, and mergeability. We will cover the material from both a theoretical and a practical perspective. Bring pen and paper and a laptop!
Heinrich Hartmann is the Analytics Lead at Circonus. He drives the development of analytics methods that transform monitoring data into actionable information as part of the Circonus monitoring platform. In his prior life, Heinrich pursued an academic career as a mathematician. He later transitioned into computer science and worked as a consultant for a number of companies and research institutions.