Muskan Prajapati and Renisha Fernandes, VMware
In today’s technology-driven world, efficiency is crucial in all aspects of site reliability engineering. Analytical methods play a vital role in achieving efficiency by identifying areas for improvement and optimizing various systems. Join our talk, where we will discuss three services our team developed to improve site reliability engineering efficiency using analytics: Outage Management Service (OMS), a Slackbot, and Service Analytics. OMS automatically detects and resolves outages by analyzing past incidents, while the Slackbot predicts solutions based on past conversations. Service Analytics collects event data to generate reports for improving user engagement. These services significantly reduce Mean Time to Repair and alleviate on-call engineers’ burden, resulting in improved efficiency and productivity.
Muskan Prajapati, VMware
Muskan Prajapati has 3+ years of experience as a full stack developer and a year of experience as an SRE. She is passionate about ensuring code quality and scalability. Currently exploring the field of SRE, she is enthusiastic about learning scaling techniques and delivering exceptional user experiences.
Renisha Fernandes, VMware
Renisha Fernandes has worked in software development for the past 10 years, contributing to both backend and frontend development. For the past 5 years, she has contributed to the development and scaling of the automation platform actively used by VMware VMC SRE. She enjoys exploring distributed systems design and effective scaling.
author = {Muskan Prajapati and Renisha Fernandes},
title = {Leveraging Analytics for Technical Efficiency and Enhanced User Experience},
year = {2023},
address = {Singapore},
publisher = {USENIX Association},
month = jun
}