Circonus: Design (Failures) Case Study

Wednesday, August 29, 2018 - 09:00–09:40

Theo Schlossnagle and Heinrich Hartmann, Circonus

Abstract: 

The Circonus platform is a telemetry (time-series) ingest, storage, and analysis platform that provides engineers with tooling to manage systems via SLOs. As SREs, we use SLOs to manage Circonus. Herein lie some interesting recursive lessons. This talk will detail the systems architecture from inception to the present day, including a migration from bare metal to Google Cloud. Along this path have been many crimes against computing. I will talk specifically about the architectural evolution as punctuated by my failure.

Theo Schlossnagle, Circonus

The Founder/CEO of Circonus, Theo Schlossnagle is a practicing software engineer and serial entrepreneur. At Johns Hopkins University he earned undergraduate and graduate degrees in computer science, with a focus on graphics and randomized algorithms in distributed systems. Theo founded four technology startups focusing on large systems scalability and distributed systems. He is a Distinguished Member of the ACM, sits on the ACM Practitioners Board, and serves as co-chair for ACM Queue.

Heinrich Hartmann, Circonus

Heinrich Hartmann is the Analytics Lead at Circonus. He is driving the development of analytics methods that transform monitoring data into actionable information as part of the Circonus monitoring platform. In his prior life, Heinrich pursued an academic career as a mathematician. Later he transitioned into computer science and worked as a consultant for a number of different companies and research institutions.

BibTeX
@inproceedings {218915,
author = {Theo Schlossnagle and Heinrich Hartmann},
title = {Circonus: Design (Failures) Case Study},
booktitle = {SREcon18 Europe/Middle East/Africa (SREcon18 Europe)},
year = {2018},
address = {Dusseldorf},
url = {https://www.usenix.org/node/218916},
publisher = {{USENIX} Association},
}