Next Generation of DevOps: AIOps in Practice @Baidu

Monday, May 22, 2017 - 9:55am–10:50am

Xianping Qu and Jingjing Ha, Baidu

Abstract: 

Baidu has thousands of applications and hundreds of thousands of servers. To keep services highly available and reliable, our SREs have developed many operations tools and systems. However, these tools are difficult to reuse and scale because of the wide variety of operations concepts, runtime environments, and operations strategies. We therefore built a platform named the AIOps platform (AI stands for automation and intelligence) to help SREs develop operations tools more quickly and efficiently. The platform provides a unified operations abstract layer, operations strategies, and automated scheduling and execution, so SREs can focus on building their custom and advanced features.

In this talk, we demonstrate the core workflow of the AIOps platform using real cases from the production environment of Baidu's core products. We will cover the platform architecture, OKB (operations knowledge base), OPAL (operations abstract layer), and practices in failover, auto scaling, etc.

Xianping Qu, Baidu

Xianping Qu is a manager of the DevOps team at Baidu, the largest search engine in China, and has built Baidu’s monitoring platform and data warehouse. He now leads the DevOps team on challenging projects such as anomaly detection, RCA, and auto-scaling. He is also interested in data analysis and machine learning.

BibTeX
@conference {202725,
author = {Xianping Qu and Jingjing Ha},
title = {Next Generation of DevOps: AIOps in Practice @Baidu},
year = {2017},
publisher = {{USENIX} Association},
}