Improving Machine Learning Development Reliability

Thursday, December 08, 2022 - 1:20 pm–2:20 pm AEDT

Brian Hansen and Yan Yan, Meta


The machine learning development lifecycle is not the same as the software development lifecycle. It is different enough that we believe we need new ways to reason about how we build, monitor, and alert on ML artifacts as they move through that process. This talk explores those differences. It highlights the challenges of ML reliability and scalability, describes what we’ve done so far, and makes the case for this community’s involvement in evolving how we think about developing and productizing machine learning as it spreads across our industry.

Brian Hansen, Meta

Brian leads the AdsML Production Engineering teams at Meta, focused on scaling machine learning in production environments. He has been a serial entrepreneur for two decades, taking multiple start-ups from early- to late-stage growth. Throughout his career, Brian has been a leader in building global teams that leverage infrastructure to improve business performance.

Yan Yan, Meta

Yan Yan is a production engineer within the AdsML PE: Model Ecosystem team. She’s defining and building a Machine Learning Development Lifecycle with partners across Meta. Yan has been a speaker at the 2019 & 2020 USENIX Operational Machine Learning conferences and the 2018 & 2019 Meta PE Summit. Prior to Meta, she graduated from UCLA with a Master’s degree in computer science.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone.

@conference {284917,
author = {Brian Hansen and Yan Yan},
title = {Improving Machine Learning Development Reliability},
year = {2022},
address = {Sydney},
publisher = {USENIX Association},
month = dec
}
