Vineeth Vadrevu, Yahoo
Yahoo's grid-ops team makes frequent changes (15 per week) to the following entities:
- Operating system configuration (packages, security patches)
- Hadoop (packages)
- Supporting applications, i.e. LDAP, Kerberos, logging, and monitoring
For a platform like Hadoop at Yahoo's scale, stability, reliability, and uptime are crucial because of the sheer magnitude of what the platform caters to. The objectives set for pushing changes, so that the platform doesn't take a hit on stability, reliability, and uptime, are:
- Continuous Delivery and DevOps philosophies are strictly adhered to
- Every node always has the right configs
- Each change is pushed within a specific time period and is uniform across all nodes
- Changes can be visualised, tracked, monitored, validated and, if required, easily reverted
- Feasible mechanisms are made available in the pipeline to promote pushes across Hadoop clusters in a staged manner so that change rollovers are smooth
- Effective gates are in place to reduce the impact of a wrong change
- Any change that is reviewed and committed makes its way automatically to every node
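As a rough illustration of how staged promotion, gates, and easy reverts from the objectives above might fit together, here is a minimal Python sketch. It is not Yahoo's pipeline: `Cluster`, `push`, `staged_rollout`, and the `gate` callback are hypothetical names invented for this example, and a real system would push to actual nodes rather than update an in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Cluster:
    """Hypothetical stand-in for one Hadoop cluster and its nodes' config versions."""
    name: str
    nodes: Dict[str, str] = field(default_factory=dict)  # node name -> config version


def push(cluster: Cluster, version: str) -> None:
    """Apply the same version to every node so the cluster stays uniform."""
    for node in cluster.nodes:
        cluster.nodes[node] = version


def staged_rollout(
    stages: List[Cluster],
    new_version: str,
    gate: Callable[[Cluster], bool],
) -> List[str]:
    """Promote new_version one cluster at a time.

    If the gate rejects a cluster, every cluster touched so far is restored
    to its earlier state. Returns the names of clusters left on new_version
    (an empty list after a revert).
    """
    previous: Dict[str, Dict[str, str]] = {}
    promoted: List[str] = []
    for cluster in stages:
        previous[cluster.name] = dict(cluster.nodes)  # snapshot for revert
        push(cluster, new_version)
        if not gate(cluster):
            # Gate failed: restore this cluster and all earlier stages.
            for touched in stages:
                if touched.name in previous:
                    touched.nodes = previous[touched.name]
            return []
        promoted.append(cluster.name)
    return promoted
```

In this sketch the gate is any validation check that returns `True` or `False` for a cluster; a wrong change that fails the gate on an early stage never reaches the later ones.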
This presentation will share and discuss:
- How the objectives are achieved through automation
- Experiences and lessons learned
We believe our practices can easily be adopted by SREs to effectively manage large-scale infrastructure.
Vineeth Vadrevu, Yahoo
Vineeth is a Sr. Production Engineer, Principal at Yahoo. He is with the GridOps team at Yahoo, working on large-scale production engineering applications and infrastructure monitoring and management.
Vineeth has a B.Tech in Computer Science (affiliated to Andhra University) and an Executive MBA in Operations Management from Symbiosis Intl. University, Pune, India.
author = {Vineeth Vadrevu},
title = {Managing Changes Seamlessly on Yahoo{\textquoteright}s Hadoop Infrastructure Servers},
year = {2017},
publisher = {USENIX Association},
month = may
}