Preetha Appan, Indeed.com
At Indeed, we strive to build systems that can withstand an unreliable network. We want to anticipate and prevent failures, rather than just react to them. Our applications run in a private cloud, sharing infrastructure with other services on the same host. This interconnectedness of systems and resources makes it hard to induce failures that simulate a slow or lossy network. We need the ability to slow down the network for one service or data source and test how this affects the applications that use it, without causing side effects for other applications on the same host.
In this talk, we'll describe Sloth, a Go tool for inducing network failures. Sloth is a daemon that runs on every host in our infrastructure, including database and index servers. It works by adding and removing complex traffic shaping rules via the Linux tc and iptables utilities. Sloth is implemented with access control and audit logging so that it stays usable without compromising security. It provides a web UI for manual testing and offers an API for embedding destructive testing into integration tests. We will discuss specific examples of how we used Sloth to discover and fix problems in monitoring, graceful degradation, and usability.
Preetha Appan is a principal software engineer at Indeed with expertise in building performant distributed systems for recommendations and search. Her past contributions to Indeed's job and resume search engines include text segmentation improvements, query expansion features, and other major infrastructure and performance improvements. She loves the SRE philosophy and is embracing destructive testing by breaking everyone's applications to improve their resilience.
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone.