Katib: A Distributed General AutoML Platform on Kubernetes


Jinan Zhou, Andrey Velichkevich, Kirill Prosvirov, and Anubhav Garg, Cisco Systems; Yuji Oshima, NTT Software Innovation Center; Debo Dutta, Cisco Systems


Automatic Machine Learning (AutoML) is a powerful mechanism for designing and tuning models. We present Katib, a scalable, Kubernetes-native, general AutoML platform that supports a range of AutoML algorithms, including both hyper-parameter tuning and neural architecture search. The system is divided into separate components, each encapsulated as a micro-service. Each micro-service runs in a Kubernetes pod and communicates with the others through well-defined APIs, allowing flexible management and scalable deployment at minimal cost. Together with a powerful user interface, Katib provides a universal platform on which researchers and enterprises can try, compare, and deploy their AutoML algorithms on any Kubernetes platform.

