Gabe Krabbe, Google
This 20-minute talk intends to fill some of the gap between "you need good SLIs" and "the code increments a counter": what exactly should be gathered, and for which purpose? There will be concrete examples of good data to gather and export, so that Prometheus, Nagios, OpenCensus, and their friends and relatives provide useful information instead of distracting noise and misleading lies.
Gabe Krabbe has been a Site Reliability Engineer at Google for over 14 years. He has worked on, and sometimes against, multiple generations of the Ads management and serving infrastructure. Before joining Google, he worked for various companies as a system administrator. Gabe frequently tells his servers and his children that he doesn't care who started it because it takes two to fight.
author = {Gabe Krabbe},
title = {Practical Instrumentation for Observability},
year = {2019},
address = {Singapore},
publisher = {USENIX Association},
month = jun
}