Skip to main content
USENIX
  • Conferences
  • Students
Sign in
  • Home
  • Attend
    • Registration Information
    • Registration Discounts
    • Venue, Hotel, and Travel
    • Students and Grants
    • Co-located Events
      • SOUPS 2016
      • HotCloud '16
      • HotStorage '16
  • Program
    • At a Glance
    • Technical Sessions
  • Activities
    • Birds-of-a-Feather Sessions
    • Poster Session
  • Participate
    • Instructions for Authors and Speakers
    • Call for Papers
    • Call for Practitioner Talks
  • Sponsorship
  • About
    • Organizers
    • Help Promote!
    • Questions
    • Past Conferences
  • Home
  • Attend
  • Program
  • Activities
  • Participate
  • Sponsorship
  • About

sponsors

Gold Sponsor
Gold Sponsor
Gold Sponsor
Gold Sponsor
Silver Sponsor
Silver Sponsor
Silver Sponsor
Silver Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Media Sponsor
Industry Partner
Industry Partner
Industry Partner

help promote

USENIX ATC '16

Get
Help Promote graphics!

connect with us


  •  Twitter
  •  Facebook
  •  LinkedIn
  •  Google+
  •  YouTube

twitter

Tweets by @usenix

usenix conference policies

  • Event Code of Conduct
  • Conference Network Policy
  • Statement on Environmental Responsibility Policy

You are here

Home » Elfen Scheduling: Fine-Grain Principled Borrowing from Latency-Critical Workloads Using Simultaneous Multithreading
Tweet

connect with us

Elfen Scheduling: Fine-Grain Principled Borrowing from Latency-Critical Workloads Using Simultaneous Multithreading

Authors: 

Xi Yang and Stephen M. Blackburn, Australian National University; Kathryn S. McKinley, Microsoft Research

Abstract: 

Web services from search to games to stock trading impose strict Service Level Objectives (SLOs) on tail latency. Meeting these objectives is challenging because the computational demand of each request is highly variable and load is bursty. Consequently, many servers run at low utilization (10 to 45%); turn off simultaneous multithreading (SMT); and execute only a single service—wasting hardware, energy, and money. Although co-running batch jobs with latency critical requests to utilize multiple SMT hardware contexts (lanes) is appealing, unmitigated sharing of core resources induces non-linear effects on tail latency and SLO violations.

We introduce principled borrowing to control SMT hardware execution in which batch threads borrow core resources. A batch thread executes in a reserved batch SMT lane when no latency-critical thread is executing in the partner request lane. We instrument batch threads to quickly detect execution in the request lane, step out of the way, and promptly return the borrowed resources. We introduce the nanonap system call to stop the batch thread’s execution without yielding its lane to the OS scheduler, ensuring that requests have exclusive use of the core’s resources. We evaluate our approach for colocating batch workloads with latency-critical requests using the Apache Lucene search engine. A conservative policy that executes batch threads only when request lane is idle improves utilization between 90% and 25% on one core depending on load, without compromising request SLOs. Our approach is straightforward, robust, and unobtrusive, opening the way to substantially improved resource utilization in datacenters running latency-critical workloads.

Xi Yang, Australian National University

Stephen M. Blackburn, Australian National University

Kathryn S. McKinley, Microsoft Research

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {196288,
author = {Xi Yang and Stephen M. Blackburn and Kathryn S. McKinley},
title = {Elfen Scheduling: {Fine-Grain} Principled Borrowing from {Latency-Critical} Workloads Using Simultaneous Multithreading},
booktitle = {2016 USENIX Annual Technical Conference (USENIX ATC 16)},
year = {2016},
isbn = {978-1-931971-30-0},
address = {Denver, CO},
pages = {309--322},
url = {https://www.usenix.org/conference/atc16/technical-sessions/presentation/yang},
publisher = {USENIX Association},
month = jun,
}
Download
Yang PDF
View the slides

Presentation Audio

MP3 Download

Download Audio

  • Log in or    Register to post comments

Gold Sponsors

Silver Sponsors

Media Sponsors & Industry Partners

© USENIX

  • Privacy Policy
  • Contact Us