Panel: Observability

Note: Presentation times are in Coordinated Universal Time (UTC).

Tuesday, 12 October, 2021 - 17:45–18:30

Moderator: Daria Barteneva, Microsoft

Panelists: Liz Fong-Jones, honeycomb.io; Gabe Wishnie, Microsoft; Štěpán Davidovič, Google; Richard Waid, LinkedIn; Partha Kanuparthy, Facebook

Abstract: 

In this panel on Observability, we will ask several industry experts what they see as the big questions and challenges in this field, what has changed significantly in the past few years, and, finally, what comes next.

Daria Barteneva, Microsoft

Daria Barteneva is a Principal Site Reliability Engineer in Observability Engineering in Azure. With a background in Applied Mathematics, Artificial Intelligence, and Music, Daria is passionate about machine learning, diversity in tech, and opera. In her current role, Daria is focused on changing organizational culture, processes, and platforms to improve service reliability and on-call experience.

Liz Fong-Jones, honeycomb.io

Liz is a developer advocate, labor and ethics organizer, and Site Reliability Engineer (SRE) with 16+ years of experience. She is an advocate at Honeycomb for the SRE and Observability communities, and previously was an SRE working on products ranging from the Google Cloud Load Balancer to Google Flights.

Gabe Wishnie, Microsoft

Gabe Wishnie is a Partner Engineering Manager at Microsoft. He leads teams responsible for both the metrics and distributed tracing capabilities for Microsoft. The products are utilized across the company for large internal workloads and externally by Azure customers.

Štěpán Davidovič, Google

Štěpán Davidovič is a site reliability engineer at Google. He currently works on internal infrastructure for automatic monitoring. In previous Google SRE roles, he developed Canary Analysis Service and has worked on both a wide range of shared infrastructure projects and AdSense reliability. He obtained his bachelor's degree from Czech Technical University, Prague, in 2010.

Richard Waid, LinkedIn

Richard Waid is the Director of Monitoring Infrastructure at LinkedIn, encompassing emission and storage of time series telemetry, as well as alerting through triage, auto-remediation, and notifications. In addition, he is leading the team to automate the migration of LinkedIn to Azure.

Partha Kanuparthy, Facebook

Partha Kanuparthy is a Software Engineer in the Monitoring area at Facebook. His work covers the overall Observability space: scalable systems for logs, relational data, traces, metrics, events, and metadata; leveraging them for real-time automated analyses and data interfaces; and domain-specific observability use cases.

SREcon21 Open Access Sponsored by Indeed

BibTeX
@conference {276665,
author = {Daria Barteneva and Liz Fong-Jones and Gabe Wishnie and {\v S}t{\v e}p{\'a}n Davidovi{\v c} and Richard Waid and Partha Kanuparthy},
title = {Panel: Observability},
year = {2021},
publisher = {USENIX Association},
month = oct
}
