Daydream: Accurately Estimating the Efficacy of Optimizations for DNN Training

Authors: 

Hongyu Zhu, University of Toronto & Vector Institute; Amar Phanishayee, Microsoft Research; Gennady Pekhimenko, University of Toronto & Vector Institute

Abstract: 

Modern deep neural network (DNN) training uses a complex software/hardware stack, and the configurations used by machine learning (ML) practitioners are often heterogeneous. The efficacy of software-level optimizations can vary significantly when applied to different configurations. It is onerous and error-prone for ML practitioners and system developers to implement each optimization separately and to determine which ones will improve performance in their own configurations. Unfortunately, existing profiling tools do not aim to answer predictive questions such as "How will optimization X affect the performance of my model?". This paper addresses this critical limitation and proposes a new profiling tool, Daydream, to help programmers efficiently explore the efficacy of DNN optimizations. Daydream models DNN execution with a fine-grained dependency graph based on low-level traces collected by CUPTI, and predicts runtime by simulating execution over the dependency graph. Daydream maps the low-level traces using DNN domain-specific knowledge, and introduces a set of graph-transformation primitives that can easily model a wide variety of optimizations. We show that Daydream is able to model most mainstream DNN optimization techniques, and accurately predict the efficacy of optimizations that will result in significant performance improvements.
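
The sketch below illustrates the general idea behind dependency-graph-based runtime prediction described in the abstract: tasks (CPU-side launches and GPU kernels) with durations and dependencies are replayed by a simple simulator, and a graph transformation is applied to estimate the effect of a hypothetical optimization. This is not Daydream's implementation; the task names, durations, resources, and the "scale GPU kernels" transformation are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical per-iteration tasks: name -> (resource, duration in us, dependencies)
TASKS = {
    "cpu_launch_fwd":    ("CPU", 50,  []),
    "gpu_fwd_kernel":    ("GPU", 400, ["cpu_launch_fwd"]),
    "cpu_launch_bwd":    ("CPU", 50,  ["cpu_launch_fwd"]),
    "gpu_bwd_kernel":    ("GPU", 700, ["gpu_fwd_kernel", "cpu_launch_bwd"]),
    "gpu_weight_update": ("GPU", 150, ["gpu_bwd_kernel"]),
}

def simulate(tasks):
    """Replay the dependency graph: each resource runs one task at a time;
    a task starts once all its dependencies finish and its resource is free.
    Returns the predicted iteration time (the makespan)."""
    finish = {}                          # task -> finish time
    resource_free = defaultdict(float)   # resource -> time it becomes free
    remaining = dict(tasks)
    while remaining:
        progressed = False
        for name, (res, dur, deps) in list(remaining.items()):
            if all(d in finish for d in deps):
                start = max([resource_free[res]] + [finish[d] for d in deps])
                finish[name] = start + dur
                resource_free[res] = finish[name]
                del remaining[name]
                progressed = True
        if not progressed:
            raise ValueError("cycle in dependency graph")
    return max(finish.values())

def scale_gpu_kernels(tasks, factor):
    """Illustrative graph-transformation primitive: shrink GPU kernel
    durations by `factor`, e.g. to model a faster kernel implementation."""
    return {name: (res, dur * factor if res == "GPU" else dur, deps)
            for name, (res, dur, deps) in tasks.items()}

baseline = simulate(TASKS)
optimized = simulate(scale_gpu_kernels(TASKS, 0.5))
print(f"baseline: {baseline:.0f} us, predicted after optimization: {optimized:.0f} us")
```

In this toy example, halving the GPU kernel durations does not halve the iteration time, because CPU-side launch work and dependency ordering still constrain the schedule; capturing such interactions is precisely why simulation over the dependency graph, rather than simple proportional scaling, is needed to estimate an optimization's end-to-end benefit.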

BibTeX
@inproceedings {254479,
author = {Hongyu Zhu and Amar Phanishayee and Gennady Pekhimenko},
title = {Daydream: Accurately Estimating the Efficacy of Optimizations for {DNN} Training},
booktitle = {2020 USENIX Annual Technical Conference (USENIX ATC 20)},
year = {2020},
isbn = {978-1-939133-14-4},
pages = {337--352},
url = {https://www.usenix.org/conference/atc20/presentation/zhu-hongyu},
publisher = {USENIX Association},
month = jul
}