Quality Gates in Production: How We Turn OpenTelemetry Signals into Deployment Decisions

Thursday, 9 October, 2025 - 09:00–09:20

Marcel Birkner, Dash0

Engineering teams constantly balance deployment velocity with system reliability. This challenge is particularly relevant for critical infrastructure like monitoring platforms that need to be more reliable than the systems they observe.

This talk demonstrates how to build automated quality gates that make deployment decisions based on production monitoring data. I'll show a practical implementation, built on open-source tools including GitHub Actions, Argo CD, Testcontainers, Playwright, and OpenTelemetry, that validates deployments before they reach users.

You'll see the actual pipeline in action, including:

  • How we correlate deployment events with error rates, latency, and business metrics using OpenTelemetry traces, logs, and metrics
  • Quality gate criteria that catch regressions across data pipelines and application services
  • Open-source tooling integration that teams can adapt to their environments
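
As a rough illustration of what a quality gate criterion of this kind might look like, the sketch below compares a post-deployment observation window against a baseline window and returns a pass/fail decision with reasons. It is not the pipeline shown in the talk: the WindowStats shape, the evaluate_gate function, and every threshold are assumptions made for this example, and the numbers would in practice come from aggregated OpenTelemetry metrics rather than literals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowStats:
    """Aggregated request signals for one observation window (hypothetical shape)."""
    requests: int
    errors: int
    p95_latency_ms: float

    @property
    def error_rate(self) -> float:
        # Avoid division by zero when the window saw no traffic.
        return self.errors / self.requests if self.requests else 0.0


def evaluate_gate(before, after, *, max_error_rate_increase=0.01,
                  max_latency_ratio=1.2, min_requests=100):
    """Return (passed, reasons) for the post-deploy window versus the baseline.

    All thresholds are illustrative assumptions, not values from the talk.
    """
    reasons = []
    if after.requests < min_requests:
        # Too little traffic to judge: fail closed instead of guessing.
        reasons.append(
            f"insufficient traffic: {after.requests} < {min_requests} requests"
        )
        return False, reasons

    increase = after.error_rate - before.error_rate
    if increase > max_error_rate_increase:
        reasons.append(f"error rate rose by {increase:.4f}")

    # Multiply instead of dividing so a zero baseline cannot raise an error.
    if after.p95_latency_ms > before.p95_latency_ms * max_latency_ratio:
        reasons.append(
            f"p95 latency {after.p95_latency_ms:.0f} ms exceeds "
            f"{max_latency_ratio}x baseline"
        )

    return not reasons, reasons


# Example windows with made-up numbers.
baseline = WindowStats(requests=10_000, errors=20, p95_latency_ms=180.0)
healthy = WindowStats(requests=5_000, errors=12, p95_latency_ms=190.0)
regressed = WindowStats(requests=5_000, errors=150, p95_latency_ms=260.0)

print(evaluate_gate(baseline, healthy))
print(evaluate_gate(baseline, regressed))
```

A CI step could exit with a non-zero status when the gate does not pass, which is one way a pipeline might block promotion. Failing closed on low traffic is a design choice in this sketch; a real gate might wait and re-evaluate instead.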

This approach is particularly useful for data-heavy and AI workloads where traditional health checks provide limited insight. Using MLflow for experiment tracking and model management, we've also implemented quality gates that validate model accuracy, detect drift, and verify inference performance before promoting AI services to production.
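
To make the three model checks concrete, here is a minimal sketch of a promotion gate covering accuracy, drift, and inference latency. The abstract does not say how drift is detected, so this example swaps in the Population Stability Index (PSI) over binned feature proportions as one common choice. The function names, the input values, and all thresholds are assumptions for illustration; in a real setup the inputs might be read from an experiment tracker such as MLflow rather than passed as literals.

```python
import math


def population_stability_index(expected, actual, eps=1e-6):
    """PSI between two binned distributions given as lists of proportions."""
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same number of bins")
    psi = 0.0
    for e, a in zip(expected, actual):
        # Clamp empty bins so the logarithm stays defined.
        e = max(e, eps)
        a = max(a, eps)
        psi += (a - e) * math.log(a / e)
    return psi


def model_gate(candidate_accuracy, production_accuracy, psi, p95_inference_ms, *,
               max_accuracy_drop=0.01, max_psi=0.2, max_p95_inference_ms=250.0):
    """Return a list of failed checks; an empty list means the model may be promoted.

    Thresholds are illustrative assumptions, not values from the talk.
    """
    failures = []
    if production_accuracy - candidate_accuracy > max_accuracy_drop:
        failures.append("accuracy regression")
    if psi > max_psi:
        failures.append("input drift")
    if p95_inference_ms > max_p95_inference_ms:
        failures.append("inference latency")
    return failures


# Made-up bin proportions: training data versus recent production traffic.
training_bins = [0.25, 0.25, 0.25, 0.25]
stable_bins = [0.25, 0.25, 0.25, 0.25]
shifted_bins = [0.70, 0.10, 0.10, 0.10]

stable_psi = population_stability_index(training_bins, stable_bins)
shifted_psi = population_stability_index(training_bins, shifted_bins)

print(model_gate(0.93, 0.92, stable_psi, 120.0))
print(model_gate(0.85, 0.92, shifted_psi, 400.0))
```

Returning the list of failed checks rather than a bare boolean keeps the reason for a blocked promotion visible in pipeline logs.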

Key takeaways include practical quality gate patterns you can implement with existing tools, the specific metrics that indicate deployment success, and lessons learned from operating this system in production. Whether you're deploying traditional applications or AI workloads, you'll gain concrete strategies to improve deployment confidence while maintaining development velocity.

Marcel Birkner is a Founding Engineer and Head of Platform at Dash0, where he architects and operates the infrastructure that powers Dash0. With extensive experience as a Site Reliability Engineer at ClickHouse and Instana (IBM), Marcel specializes in building secure and scalable cloud infrastructure.

BibTeX
@conference {315266,
author = {Marcel Birkner},
title = {Quality Gates in Production: How We Turn {OpenTelemetry} Signals into Deployment Decisions},
year = {2025},
address = {Dublin},
publisher = {USENIX Association},
month = oct
}
