Quartet: Harmonizing Task Scheduling and Caching for Cluster Computing

Francis Deslauriers, Peter McCormick, George Amvrosiadis, Ashvin Goel, and Angela Demke Brown, University of Toronto

Cluster computing frameworks such as Apache Hadoop and Apache Spark are commonly used to analyze large data sets. The analysis often involves running multiple, similar queries on the same data sets. This data reuse should improve query performance, but we find that these frameworks schedule query tasks independently of each other and are thus unable to exploit the data sharing across these tasks. We present Quartet, a system that leverages information on cached data to schedule together tasks that share data. Our preliminary results are promising, showing that Quartet can increase the cache hit rate of Hadoop and Spark jobs by up to 54%. Our results suggest a shift in the way we think about job and task scheduling today, as Quartet is expected to perform better as more jobs are dispatched on the same data.
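The idea of scheduling together tasks that share data can be illustrated with a small sketch. This is not Quartet's actual algorithm or API; the `Task` type, the `order_tasks` function, and the block identifiers are hypothetical names chosen for illustration. The sketch assumes the scheduler knows which input blocks are currently cached, runs tasks on cached blocks first, and keeps tasks that read the same block adjacent so a block is read from disk at most once while it remains resident.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    job: str    # name of the job (query) this task belongs to (hypothetical)
    block: str  # identifier of the input block the task reads (hypothetical)


def order_tasks(pending, cached_blocks):
    """Order tasks so that cached blocks are consumed first and tasks
    sharing an input block run back to back.

    Illustrative heuristic only: an assumption, not the paper's method.
    """
    # Group pending tasks by the block they read.
    by_block = {}
    for task in pending:
        by_block.setdefault(task.block, []).append(task)

    # Sort blocks: cached ones first, then blocks shared by more tasks,
    # then by name so the result is deterministic.
    blocks = sorted(
        by_block,
        key=lambda b: (b not in cached_blocks, -len(by_block[b]), b),
    )
    return [task for b in blocks for task in by_block[b]]


# Two jobs over overlapping data; block "b2" is already in the cache.
pending = [
    Task("q1", "b1"),
    Task("q1", "b2"),
    Task("q2", "b2"),
    Task("q2", "b3"),
]
ordered = order_tasks(pending, cached_blocks={"b2"})
for task in ordered:
    print(task.job, task.block)
```

In this toy ordering, both tasks that read `b2` run first and together, across job boundaries, which a scheduler that treats each job's tasks independently would not guarantee.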
Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {196370,
author = {Francis Deslauriers and Peter McCormick and George Amvrosiadis and Ashvin Goel and Angela Demke Brown},
title = {Quartet: Harmonizing Task Scheduling and Caching for Cluster Computing},
booktitle = {8th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 16)},
year = {2016},
address = {Denver, CO},
url = {https://www.usenix.org/conference/hotstorage16/workshop-program/presentation/deslauriers},
publisher = {USENIX Association},
month = jun
}