Skip to main content
USENIX
  • Conferences
  • Students
Sign in
  • Program
  • Program

connect with us


  •  Twitter
  •  Facebook
  •  LinkedIn
  •  Google+
  •  YouTube

twitter

Tweets by @usenix

usenix conference policies

  • Event Code of Conduct
  • Conference Network Policy
  • Statement on Environmental Responsibility Policy

You are here

Home » Characteristics of Backup Workloads in Production Systems
Tweet

connect with us

Characteristics of Backup Workloads in Production Systems

Authors: 

Grant Wallace, Fred Douglis, Hangwei Qian, Philip Shilane, Stephen Smaldone, Mark Chamness, and Windsor Hsu, EMC Corporation

Abstract: 

Data-protection class workloads, including backup and long-term retention of data, have seen a strong industry shift from tape-based platforms to disk-based systems. But the latter are traditionally designed to serve as primary storage and there has been little published analysis of the characteristics of backup workloads as they relate to the design of disk-based systems. In this paper, we present a comprehensive characterization of backup workloads by analyzing statistics and content metadata collected from a large set of EMC Data Domain backup systems in production use. This analysis is both broad (encompassing statistics from over 10,000 systems) and deep (using detailed metadata traces from several production systems storing almost 700TB of backup data). We compare these systems to a detailed study of Microsoft primary storage systems [22], showing that backup storage differs significantly from their primary storage workload in the amount of data churn and capacity requirements as well as the amount of redundancy within the data. These properties bring unique challenges and opportunities when designing a disk-based filesystem for backup workloads, which we explore in more detail using the metadata traces. In particular, the need to handle high churn while leveraging high data redundancy is considered by looking at deduplication unit size and caching efficiency.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {266404,
title = {Characteristics of Backup Workloads in Production Systems},
booktitle = {10th USENIX Conference on File and Storage Technologies (FAST 12)},
year = {2012},
address = {San Jose, CA},
url = {https://www.usenix.org/conference/fast12/characteristics-backup-workloads-production-systems},
publisher = {USENIX Association},
month = feb
}
Download
Full paper
Updated full paper, 2/7/12
Updated full paper, 2/9/12

Presentation Video

Presentation Audio

MP3 Download OGG Download

Download Audio

  • Log in or    Register to post comments

© USENIX

  • Privacy Policy
  • Contact Us