Predictive Caching@Scale


Vaishnav Janardhan and Adit Bhardwaj, Akamai Technologies

The growth of content and services available on the Internet has led to a substantial increase in network traffic. A large distributed caching platform facilitates low-latency, high-throughput delivery of web and video content over the public Internet. However, the edge infrastructure cannot grow at the rate of traffic while maintaining quality of service at low cost. With growing long-tail content footprints and performance-sensitive users, content-agnostic caching schemes fail to adapt to changing traffic popularity profiles, which leads to poor caching decisions. At Akamai we built a high-performance, cost-sensitive, content-aware caching system that uses machine learning and runs on our distributed delivery platform. The resulting ML-based caching algorithm, Prediction Error-Correcting Caching (PeCC), is cost-competitive with a classical algorithm like LRU even when deployed on commodity hardware, while achieving cache-hit ratios close to those of theoretically optimal caching schemes. We will talk about the main challenges and details in deploying PeCC. First, we give an introduction to web traffic and how to build Deep Neural Network-based caching that scales cost-effectively. Second, we cover the deployment of very compute-intensive DNN models alongside a real-time web proxy with very tight performance guarantees. We will also discuss some key takeaways from deploying ML for system scalability.

Vaishnav Janardhan is a Principal Architect at Akamai and leads the efforts to use ML techniques to solve performance and scalability challenges on large distributed systems. Vaishnav previously worked on transitioning Akamai’s traditional web delivery platform into a video delivery platform to support the massive growth of video over the Internet. He has publications and patents on domain-specific file systems, micro-architectural CPU optimization, TCP congestion control to reduce tail latency of web traffic, and hierarchical caching. Most recently he worked on rewriting the monolithic web servers to work on hyper-parallel CPU architectures and to support multi-tenant and diverse workloads on the Akamai platform.

Connect:

@vaishnavj

Adit Bhardwaj is a Senior Software Engineer at Akamai. He is interested in leveraging data to optimize and solve engineering problems through machine learning. He received his B.Tech degree in Electrical Engineering and Computer Science from IIT Gandhinagar, India, in 2014 and a Master’s degree in Electrical Engineering from UC San Diego in 2017. Most recently, he has been developing machine learning systems for content-aware caching that can scale a large distributed platform. Adit previously worked on the constrained convex optimization problem of low-rank matrix recovery, using Lagrangian techniques for image composition problems like HDR imaging.

Connect:

@axd_adit

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX

@conference {232951,
author = {Vaishnav Janardhan and Adit Bhardwaj},
title = {Predictive {Caching@Scale}},
year = {2019},
address = {Santa Clara, CA},
publisher = {USENIX Association},
month = may
}
