Networks for SREs: What Do I Need to Know for Troubleshooting Applications

Wednesday, 30 August, 2017 - 13:40–14:30

Michael Kehoe, LinkedIn


All of us depend on the underlying network to be stable, whether in the datacenter or in the cloud. We all have a basic knowledge of how traditional networks run; however, in the past 10 years, we’ve moved to building redundant physical topologies in our networks, optimized the routing methodologies accordingly, moved into the cloud, and gained greater visibility and tunables in the Linux kernel network stack. A lot has changed!

However, the way we troubleshoot the network in relation to the applications we support hasn’t adapted. In this session, we’ll review the progress that network infrastructure has made, look at specific examples where traditional troubleshooting responses fail us, and demonstrate our need to rethink our approach to making applications and the network interact harmoniously.

Michael Kehoe, Staff Site Reliability Engineer in the Production-SRE team, joined the LinkedIn operations team as a New College Graduate in January 2014. Prior to that, Michael studied Engineering at the University of Queensland (Australia) where he majored in Electrical Engineering. During his time studying, he interned at NASA Ames Research Center working on the PhoneSat project.

@conference {205476,
author = {Michael Kehoe},
title = {Networks for {SREs}: What Do I Need to Know for Troubleshooting Applications},
year = {2017},
address = {Dublin},
publisher = {USENIX Association},
month = aug