OSDI '24 Technical Sessions

Wednesday, July 10

8:00 am–9:00 am

Continental Breakfast

9:00 am–10:00 am

OSDI '24 and USENIX ATC '24 Joint Keynote Address

Scaling AI Sustainably: An Uncharted Territory

Carole-Jean Wu, Meta

The past 50 years have seen a dramatic increase in the amount of compute per person, in particular that enabled by AI. Despite its positive societal benefits, AI technology comes with significant environmental implications. I will talk about the scaling trend and the operational carbon footprint of AI computing by examining the model development cycle, spanning data, algorithms, and system hardware. At the same time, we will consider the life cycle of system hardware from the perspective of hardware architectures and manufacturing technologies. I will highlight key efficiency optimization opportunities for cutting-edge AI technologies, from deep learning recommendation models to multi-modal generative AI tasks. To scale AI sustainably, we need to make AI and computing more broadly efficient and flexible. We must also go beyond efficiency and optimize across the life cycle of computing infrastructures, from hardware manufacturing to datacenter operation and end-of-life processing for the hardware. Based on industry experience and lessons learned, my talk will conclude with important development and research directions to advance the field of computing in an environmentally responsible and sustainable manner.


Carole-Jean Wu is a Director at Meta. She is a founding member and a Vice President of MLCommons—a non-profit organization that aims to accelerate machine learning for the benefit of all. Dr. Wu also serves on the MLCommons Board as a Director, chaired the MLPerf Recommendation Benchmark Advisory Board, and co-chaired MLPerf Inference. Prior to Meta/Facebook, she was a tenured professor at Arizona State University. She earned her M.A. and Ph.D. from Princeton University and her B.Sc. from Cornell University.

Dr. Wu's expertise sits at the intersection of computer architecture and machine learning. Her work spans datacenter infrastructures and edge systems, including developing energy- and memory-efficient systems and microarchitectures, optimizing systems for machine learning execution at scale, and designing learning-based approaches for system design and optimization. Dr. Wu's work has been recognized with several awards, including IEEE Micro Top Picks and ACM/IEEE Best Paper Awards. She was the Program Co-Chair of the Conference on Machine Learning and Systems (MLSys) in 2022, the Program Chair of the IEEE International Symposium on Workload Characterization (IISWC) in 2018, and the Editor for the IEEE Micro Special Issue on Environmentally Sustainable Computing. She currently serves on the ACM SIGARCH/SIGMICRO CARES committee.

10:00 am–10:30 am

Break with Refreshments

10:30 am–10:45 am

Opening Remarks and Awards

Program Co-Chairs: Ada Gavrilovska, Georgia Institute of Technology; Douglas B. Terry, Amazon Web Services

10:45 am–12:45 pm

Memory Management

Managing Memory Tiers with CXL in Virtualized Environments

Yuhong Zhong, Columbia University; Daniel S. Berger, Microsoft Azure, University of Washington, and CMU; Carl Waldspurger, Carl Waldspurger Consulting; Ishwar Agarwal, Rajat Agarwal, Frank Hady, and Karthik Kumar, Intel; Mark D. Hill, Microsoft Azure; Mosharaf Chowdhury, University of Michigan; Asaf Cidon, Columbia University

12:45 pm–2:00 pm

Conference Luncheon

Sponsored by Roblox

2:00 pm–3:40 pm

Low-Latency LLM Serving

Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve

Amey Agrawal, Georgia Institute of Technology; Nitin Kedia, Ashish Panwar, Jayashree Mohan, Nipun Kwatra, and Bhargav Gulavani, Microsoft Research; Alexey Tumanov, Georgia Institute of Technology; Ramachandran Ramjee, Microsoft Research

3:40 pm–4:10 pm

Break with Refreshments

4:10 pm–5:30 pm

Distributed Systems

6:00 pm–7:30 pm

OSDI '24 Poster Session and Reception

Sponsored by Amazon

Thursday, July 11

8:00 am–9:00 am

Continental Breakfast

9:00 am–10:40 am

Deep Learning

Enabling Tensor Language Model to Assist in Generating High-Performance Tensor Programs for Deep Learning

Yi Zhai, University of Science and Technology of China; Sijia Yang, Huawei Technologies Co., Ltd.; Keyu Pan, ByteDance Ltd.; Renwei Zhang, Huawei Technologies Co., Ltd.; Shuo Liu, University of Science and Technology of China; Chao Liu and Zichun Ye, Huawei Technologies Co., Ltd.; Jianmin Ji, University of Science and Technology of China; Jie Zhao, Hunan University; Yu Zhang and Yanyong Zhang, University of Science and Technology of China

10:40 am–11:10 am

Break with Refreshments

11:10 am–12:50 pm

Operating Systems

High-throughput and Flexible Host Networking via Control and Data Path Physical Separation

Athinagoras Skiadopoulos, Zhiqiang Xie, and Mark Zhao, Stanford University; Qizhe Cai and Saksham Agarwal, Cornell University; Jacob Adelmann, David Ahern, Carlo Contavalli, Michael Goldflam, Vitaly Mayatskikh, Raghu Raja, and Daniel Walton, Enfabrica; Rachit Agarwal, Cornell University; Shrijeet Mukherjee, Enfabrica; Christos Kozyrakis, Stanford University

12:50 pm–2:00 pm

Conference Luncheon

2:00 pm–3:40 pm

Cloud Computing

ServiceLab: Preventing Tiny Performance Regressions at Hyperscale through Pre-Production Testing

Mike Chow, Meta; Yang Wang, The Ohio State University and Meta; William Wang, Ayichew Hailu, Rohan Bopardikar, Bin Zhang, Jialiang Qu, David Meisner, Santosh Sonawane, Yunqi Zhang, Rodrigo Paim, Mack Ward, Ivor Huang, Matt McNally, Daniel Hodges, Zoltan Farkas, Elvis Huang, and Chunqiang Tang, Meta

3:40 pm–4:10 pm

Break with Refreshments

4:10 pm–5:50 pm

Formal Verification

Anvil: Verifying Liveness of Cluster Management Controllers

Xudong Sun, Wenjie Ma, Jiawei Tyler Gu, and Zicheng Ma, University of Illinois Urbana-Champaign; Tej Chajed, University of Wisconsin-Madison; Jon Howell, Andrea Lattuada, and Oded Padon, VMware Research; Lalith Suresh, Feldera; Adriana Szekeres, VMware Research; Tianyin Xu, University of Illinois Urbana-Champaign

6:00 pm–7:30 pm

USENIX ATC '24 Poster Session and Reception

Friday, July 12

8:00 am–9:00 am

Continental Breakfast

9:00 am–10:20 am

Cloud Security

10:20 am–10:50 am

Break with Refreshments

10:50 am–12:10 pm

Data Management

12:10 pm–1:40 pm

Lunch (on your own)

1:40 pm–3:20 pm

Analysis of Correctness

3:20 pm–3:40 pm

Break with Refreshments

3:40 pm–5:20 pm

ML Scheduling

Fairness in Serving Large Language Models

Ying Sheng, Stanford University; Shiyi Cao, Dacheng Li, Banghua Zhu, and Zhuohan Li, UC Berkeley; Danyang Zhuo, Duke University; Joseph Gonzalez and Ion Stoica, UC Berkeley

5:20 pm–5:30 pm

Closing Remarks

Program Co-Chairs: Ada Gavrilovska, Georgia Institute of Technology; Douglas B. Terry, Amazon Web Services