At Stripe, we use our AWS invoice to gain observability into our cloud infrastructure spend: it takes a Redshift cluster and enough SQL queries to power 1,000 homes in Portland. We run our infrastructure on AWS because its services enable rapid prototyping and the cloud’s elasticity enables immense scale. Our efficiency engineering and finance teams are tasked with analyzing and optimizing the system that emerges from this flexibility. Optimizing cloud spend requires iterating on our code, infrastructure, observability, and organizational processes.
In this talk, I will explain how we added observability to our AWS infrastructure using the Cost and Usage Report and custom reporting. I will show how our custom reports led to cost optimizations through internal scoreboarding, alerting with SignalFx, and enabling teams to independently assess the cost of deploying infrastructure. I will outline the impact that reserved instances, CI autoscaling, and cross-AZ network traffic minimization had on our costs. I will show that it is feasible to reduce AWS costs by up to 50% by operating a reserved instance strategy, and discuss how we created incentives in our engineering organization to keep costs optimized.
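As a rough illustration of the kind of custom reporting described above, the following is a minimal sketch of attributing spend to teams from Cost and Usage Report line items. It is not the pipeline from the talk, which runs as SQL on Redshift. The `team` tag key, the `cost_by_team` helper, and the sample rows are hypothetical and made up for illustration; the column names follow the report's `lineItem/...` and `resourceTags/user:...` naming.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Assumption: resources carry a user-defined "team" tag. The tag key is
# hypothetical; tag columns in the report look like "resourceTags/user:<key>".
TEAM_COLUMN = "resourceTags/user:team"
COST_COLUMN = "lineItem/UnblendedCost"


def cost_by_team(report_csv):
    """Sum unblended cost per team tag from report-style CSV text."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(report_csv)):
        # Rows with an empty or missing tag are grouped as "untagged".
        team = row.get(TEAM_COLUMN) or "untagged"
        totals[team] += Decimal(row[COST_COLUMN])
    return dict(totals)


# Made-up rows for illustration only; these are not real billing data.
SAMPLE = """lineItem/ProductCode,lineItem/UnblendedCost,resourceTags/user:team
AmazonEC2,12.50,payments
AmazonEC2,7.25,ci
AmazonS3,1.75,payments
AmazonEC2,3.00,
"""

totals = cost_by_team(SAMPLE)

# Print a simple scoreboard, highest spend first.
for team, cost in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{team}: ${cost}")
```

Grouping untagged line items into their own bucket, rather than dropping them, keeps the totals reconcilable with the invoice and makes the unattributed spend visible on the scoreboard.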
Ryan founded and led Stripe's Efficiency Engineering team, which focuses on improving the efficiency and rigor of infrastructure decisions through data reporting and tooling. Ryan has worked on ETLs for cost attribution, infrastructure for capacity forecasting, and observability for costs, all to help find areas to optimize Stripe's cloud. These systems provide visibility to engineers, product, and leadership and are used to drive organizational initiatives.