Caerus: NIMBLE Task Scheduling for Serverless Analytics

Hong Zhang; Yupeng Tang; Anurag Khandelwal; Jingrong Chen; Ion Stoica

Hong Zhang, UC Berkeley; Yupeng Tang and Anurag Khandelwal, Yale University; Jingrong Chen, Duke University; Ion Stoica, UC Berkeley

Serverless platforms facilitate transparent resource elasticity and fine-grained billing, making them an attractive choice for data analytics. We find that while server-centric analytics frameworks typically optimize for job completion time (JCT), resource utilization, and isolation via inter-job scheduling policies, serverless analytics requires optimizing for JCT and cost of execution instead, introducing a new scheduling problem. We present Caerus, a task scheduler for serverless analytics frameworks that employs a fine-grained NIMBLE scheduling algorithm to solve this problem. NIMBLE efficiently pipelines task executions within a job, minimizing execution cost while being Pareto-optimal between cost and JCT for arbitrary analytics jobs. To this end, NIMBLE models a wide range of execution parameters (pipelineable and non-pipelineable data dependencies; data generation, consumption, and processing rates; etc.) to determine the ideal task launch times. Our evaluation results show that in practice, Caerus is able to achieve both optimal cost and JCT for queries across a wide range of analytics workloads.

BibTeX

@inproceedings {265037,
author = {Hong Zhang and Yupeng Tang and Anurag Khandelwal and Jingrong Chen and Ion Stoica},
title = {Caerus: {NIMBLE} Task Scheduling for Serverless Analytics},
booktitle = {18th USENIX Symposium on Networked Systems Design and Implementation (NSDI 21)},
year = {2021},
isbn = {978-1-939133-21-2},
pages = {653--669},
url = {https://www.usenix.org/conference/nsdi21/presentation/zhang-hong},
publisher = {USENIX Association},
month = apr
}
