How (Not) to Scale a Project: A Post-Mortem

Thursday, June 13, 2019 - 10:00 am–10:30 am

Giacomo Bagnoli, Facebook


This talk is a multi-year retrospective of a real-life project, from its initial wins as a proof of concept to the scaling challenges that almost jeopardized it.

The project is a network monitoring tool that showed promising initial results but scaled too fast, too soon. The talk covers how it gained a lot of traction as a proof of concept but failed to scale and reach production, how expectations became misaligned with results, and how the customer perceived it afterwards.

This is ultimately a success story about how applying best practices that "we all know" helped rectify a potentially bad situation. We'll go through the history of this project and show how applying those practices to customer communications, software design, system design, and operational excellence got it back on track.

Giacomo Bagnoli is a Production Engineer at Facebook in Dublin, where he works on network monitoring tools. Previously at Etsy, Amazon, and various small startups, he has been breaking and fixing systems for more than a decade.

@conference {233205,
author = {Giacomo Bagnoli},
title = {How (Not) to Scale a Project: A Post-Mortem},
year = {2019},
address = {Singapore},
publisher = {{USENIX} Association},
month = jun
}
