Analysis of HDFS Under HBase: A Facebook Messages Case Study

Authors: 

Tyler Harter, University of Wisconsin—Madison; Dhruba Borthakur, Siying Dong, Amitanand Aiyer, and Liyin Tang, Facebook Inc.; Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau, University of Wisconsin—Madison

Abstract: 

We present a multilayer study of the Facebook Messages stack, which is based on HBase and HDFS. We collect and analyze HDFS traces to identify potential improvements, which we then evaluate via simulation. Messages represents a new HDFS workload: whereas HDFS was built to store very large files and receive mostly sequential I/O, 90% of files are smaller than 15MB and I/O is highly random. We find hot data is too large to easily fit in RAM and cold data is too large to easily fit in flash; however, cost simulations show that adding a small flash tier improves performance more than equivalent spending on RAM or disks. HBase’s layered design offers simplicity, but at the cost of performance; our simulations show that network I/O can be halved if compaction bypasses the replication layer. Finally, although Messages is read-dominated, several features of the stack (i.e., logging, compaction, replication, and caching) amplify write I/O, causing writes to dominate disk I/O.
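The last claim, that a read-dominated workload can become write-dominated at the disks, can be illustrated with a small back-of-the-envelope model. This is only a sketch under stated assumptions, not the simulator or methodology used in the study: the function name `disk_write_fraction` and all of its parameters (cache hit rate, logging and compaction factors, replica count) are hypothetical, and the numbers in the usage example are invented for illustration rather than taken from the paper.

```python
def disk_write_fraction(app_reads, app_writes, cache_hit_rate,
                        log_factor, compaction_factor, replicas):
    """Toy model: fraction of disk I/O that is writes.

    All parameters are hypothetical knobs, not measurements:
      app_reads, app_writes : I/O volume requested by the application
      cache_hit_rate        : share of reads absorbed by caching (0..1)
      log_factor            : extra bytes written to a log per byte written
      compaction_factor     : extra bytes rewritten by compaction per byte written
      replicas              : number of copies stored of every written byte
    """
    # Caching absorbs part of the read traffic before it reaches disk.
    disk_reads = app_reads * (1.0 - cache_hit_rate)
    # Each written byte is also logged and later rewritten by compaction,
    # and every resulting byte is stored `replicas` times.
    disk_writes = app_writes * (1.0 + log_factor + compaction_factor) * replicas
    total = disk_reads + disk_writes
    return disk_writes / total if total > 0 else 0.0


if __name__ == "__main__":
    # Illustrative inputs only: 99% reads at the application level.
    plain = disk_write_fraction(99, 1, 0.0, 0.0, 0.0, 1)
    amplified = disk_write_fraction(99, 1, 0.9, 1.0, 4.0, 3)
    print(f"no amplification: {plain:.2%} of disk I/O is writes")
    print(f"with amplification: {amplified:.2%} of disk I/O is writes")
```

In this toy model each feature pushes in the same direction: caching shrinks the read term while logging, compaction, and replication multiply the write term, so even a small application-level write share can end up as the majority of disk traffic.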

BibTeX
@inproceedings {179858,
author = {Tyler Harter and Dhruba Borthakur and Siying Dong and Amitanand Aiyer and Liyin Tang and Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau},
title = {Analysis of {HDFS} Under {HBase}: A Facebook Messages Case Study},
booktitle = {12th USENIX Conference on File and Storage Technologies (FAST 14)},
year = {2014},
isbn = {978-1-931971-08-9},
address = {Santa Clara, CA},
pages = {199--212},
url = {https://www.usenix.org/conference/fast14/technical-sessions/presentation/harter},
publisher = {USENIX Association},
month = feb
}
Open access to the FAST '14 Proceedings is sponsored by USENIX and Symantec.
