Katib: A Distributed General AutoML Platform on Kubernetes

Authors: 

Jinan Zhou, Andrey Velichkevich, Kirill Prosvirov, and Anubhav Garg, Cisco Systems; Yuji Oshima, NTT Software Innovation Center; Debo Dutta, Cisco Systems

Abstract: 

Automatic Machine Learning (AutoML) is a powerful mechanism to design and tune models. We present Katib, a scalable, Kubernetes-native, general AutoML platform that supports a range of AutoML algorithms, including both hyperparameter tuning and neural architecture search. The system is divided into separate components, each encapsulated as a micro-service. Each micro-service runs in a Kubernetes pod and communicates with the others via well-defined APIs, allowing flexible management and scalable deployment at minimal cost. Together with a powerful user interface, Katib provides a universal platform for researchers and enterprises to try, compare, and deploy their AutoML algorithms on any Kubernetes platform.
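
To make the component split in the abstract more concrete, here is a minimal single-process sketch of hyperparameter tuning in which the part that proposes parameters is separated from the part that runs and compares trials. It is only an illustration under stated assumptions: the names `Trial`, `RandomSuggestion`, `run_experiment`, and `toy_loss` are hypothetical and are not taken from Katib, random search stands in for whichever algorithm is plugged in, and plain function calls stand in for the API calls that would connect separate micro-services.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# All names below are hypothetical and for illustration only; this is not Katib's API.


@dataclass
class Trial:
    """One evaluated hyperparameter assignment and its objective value."""
    params: Dict[str, float]
    objective: float


class RandomSuggestion:
    """Stand-in for a suggestion component: proposes hyperparameters by random search."""

    def __init__(self, space: Dict[str, Tuple[float, float]], seed: int = 0) -> None:
        self.space = space
        self.rng = random.Random(seed)  # seeded so repeated runs give the same proposals

    def suggest(self) -> Dict[str, float]:
        # Draw each parameter uniformly from its (low, high) range.
        return {name: self.rng.uniform(low, high)
                for name, (low, high) in self.space.items()}


def run_experiment(suggestion: RandomSuggestion,
                   evaluate: Callable[[Dict[str, float]], float],
                   max_trials: int) -> Trial:
    """Stand-in for a controller: request suggestions, run trials, keep the best."""
    trials: List[Trial] = []
    for _ in range(max_trials):
        params = suggestion.suggest()
        trials.append(Trial(params, evaluate(params)))
    # Lower objective is better in this sketch.
    return min(trials, key=lambda t: t.objective)


def toy_loss(params: Dict[str, float]) -> float:
    """Toy objective (an assumption, not a real training job); minimum at lr = 0.1."""
    return (params["lr"] - 0.1) ** 2


best = run_experiment(RandomSuggestion({"lr": (0.0, 1.0)}), toy_loss, max_trials=50)
```

Because `run_experiment` only depends on a `suggest()` method, a different search strategy could be swapped in without touching the trial loop, which is the kind of decoupling the abstract attributes to the micro-service design.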

BibTeX
@inproceedings {232981,
author = {Jinan Zhou and Andrey Velichkevich and Kirill Prosvirov and Anubhav Garg and Yuji Oshima and Debo Dutta},
title = {Katib: A Distributed General {AutoML} Platform on Kubernetes},
booktitle = {2019 USENIX Conference on Operational Machine Learning (OpML 19)},
year = {2019},
isbn = {978-1-939133-00-7},
address = {Santa Clara, CA},
pages = {55--57},
url = {https://www.usenix.org/conference/opml19/presentation/zhou},
publisher = {USENIX Association},
month = may
}