Leo: A Profile-Driven Dynamic Optimization Framework for GPU Applications

Authors: 

Naila Farooqui, Georgia Institute of Technology; Christopher J. Rossbach and Yuan Yu, Microsoft Research; Karsten Schwan, Georgia Institute of Technology

Abstract: 

Parallel architectures like GPUs are a tantalizing compute fabric for performance-hungry developers. While GPUs enable order-of-magnitude performance increases in many data-parallel application domains, writing efficient code that actually realizes those increases is a non-trivial endeavor, typically requiring developers to exercise specialized architectural features exposed directly in the programming model. Achieving good performance on GPUs involves effort-intensive tuning, typically requiring the programmer to manually evaluate multiple code versions in search of an optimal combination of problem decomposition with architecture- and runtime-specific parameters. For developers struggling to apply GPUs to more general-purpose computing problems, the introduction of irregular data structures and access patterns only exacerbates these challenges and increases the level of effort required.

This paper proposes automating much of this effort by using dynamic instrumentation to inform dynamic, profile-driven optimizations. In this vision, the programmer expresses the application using higher-level front-end programming abstractions such as Dandelion, allowing the system, rather than the programmer, to explore the implementation and optimization space. We argue that such a system is both feasible and urgently needed. We present the design for such a framework, called Leo. For a range of benchmarks, we demonstrate that a system implementing our design can achieve speedups of 1.12x to 27x in kernel runtimes, which translate to a 7-40% improvement in end-to-end performance.
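
The general idea of letting a system, rather than the programmer, choose among code versions based on measurements can be illustrated with a minimal sketch. This is not Leo's implementation and does not use Dandelion or any GPU API; it assumes plain Python functions stand in for alternative kernel versions, and the names `profile`, `select_variant`, `sum_loop`, and `sum_builtin` are hypothetical.

```python
import time
from typing import Callable, Dict, Sequence, Tuple

Variant = Callable[[Sequence[float]], float]


def profile(variant: Variant, data: Sequence[float], repeats: int = 3) -> float:
    """Return the best observed wall-clock time (in seconds) over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        variant(data)
        best = min(best, time.perf_counter() - start)
    return best


def select_variant(
    variants: Dict[str, Variant], data: Sequence[float]
) -> Tuple[str, Dict[str, float]]:
    """Profile every candidate on the given input and pick the fastest one."""
    timings = {name: profile(fn, data) for name, fn in variants.items()}
    return min(timings, key=timings.get), timings


# Two hypothetical "code versions" that compute the same result.
def sum_loop(data: Sequence[float]) -> float:
    total = 0.0
    for x in data:
        total += x
    return total


def sum_builtin(data: Sequence[float]) -> float:
    return sum(data)


variants = {"loop": sum_loop, "builtin": sum_builtin}
best, timings = select_variant(variants, [1.0] * 10_000)
print(f"selected variant: {best}")
```

Which variant wins depends on the machine and the input, which is why the choice is made from measured timings instead of being fixed in advance.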

BibTeX
@inproceedings {187024,
author = {Naila Farooqui and Christopher J. Rossbach and Yuan Yu and Karsten Schwan},
title = {Leo: A Profile-Driven Dynamic Optimization Framework for {GPU} Applications},
booktitle = {2014 Conference on Timely Results in Operating Systems ({TRIOS} 14)},
year = {2014},
address = {Broomfield, CO},
url = {https://www.usenix.org/conference/trios14/technical-sessions/presentation/farooqui},
publisher = {{USENIX} Association},
month = oct,
}