Yak: A High-Performance Big-Data-Friendly Garbage Collector


Khanh Nguyen, Lu Fang, Guoqing Xu, and Brian Demsky; University of California, Irvine; Shan Lu, University of Chicago; Sanazsadat Alamian, University of California, Irvine; Onur Mutlu, ETH Zurich


Most “Big Data” systems are written in managed languages, such as Java, C#, or Scala. These systems suffer from severe memory problems due to the massive volume of objects created to process input data. Allocating and deallocating a sea of data objects puts a severe strain on existing garbage collectors (GC), leading to high memory management overheads and reduced performance.

This paper describes the design and implementation of Yak, a “Big Data” friendly garbage collector that provides high throughput and low latency for all JVM-based languages. Yak divides the managed heap into a control space (CS) and a data space (DS), based on the observation that a typical data-intensive system has a clear distinction between a control path and a data path. Objects created in the control path are allocated in the CS and subject to regular tracing GC. The lifetimes of objects in the data path often align with epochs creating them. They are thus allocated in the DS and subject to region-based memory management. Our evaluation with three large systems shows very positive results.
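As a rough illustration of the heap split described above, the following Java sketch models a control space alongside epoch-scoped data regions that are released wholesale when an epoch ends. It is a toy simulation written in ordinary Java, not Yak's implementation or API: the class `ToyHeap` and the methods `epochStart`, `epochEnd`, and `allocate` are hypothetical names chosen for this example, and the sketch assumes that no data-path object outlives its epoch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/**
 * Toy model (hypothetical, for illustration only) of a heap split into a
 * control space and a data space made of per-epoch regions.
 */
class ToyHeap {
    // Control space: in a real collector these objects would be traced by a regular GC.
    private final List<Object> controlSpace = new ArrayList<>();
    // Data space: one region per active epoch; the innermost epoch sits on top.
    private final Deque<List<Object>> regions = new ArrayDeque<>();

    /** Opens a new epoch and gives it a fresh, empty region. */
    void epochStart() {
        regions.push(new ArrayList<>());
    }

    /**
     * Closes the innermost epoch and drops its whole region at once.
     * Assumption: nothing allocated in the epoch is still needed afterwards.
     * Returns the number of objects released.
     */
    int epochEnd() {
        List<Object> region = regions.pop();
        int freed = region.size();
        region.clear();
        return freed;
    }

    /** Records an allocation in the current region, or in the control space if no epoch is active. */
    <T> T allocate(T obj) {
        if (regions.isEmpty()) {
            controlSpace.add(obj);
        } else {
            regions.peek().add(obj);
        }
        return obj;
    }

    /** Number of objects recorded in the control space. */
    int controlSize() {
        return controlSpace.size();
    }

    /** Number of objects held across all currently open regions. */
    int liveDataObjects() {
        int total = 0;
        for (List<Object> region : regions) {
            total += region.size();
        }
        return total;
    }

    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap();
        heap.allocate("job configuration");      // control path: no epoch is open
        heap.epochStart();
        for (int i = 0; i < 1000; i++) {
            heap.allocate(new int[] {i});        // data path: lands in the epoch's region
        }
        System.out.println("live data objects: " + heap.liveDataObjects());
        System.out.println("released at epoch end: " + heap.epochEnd());
        System.out.println("control-space objects: " + heap.controlSize());
    }
}
```

The point of the sketch is the cost model: ending an epoch releases its region in a single step, without visiting the data objects one by one, while the control-space objects are left to a conventional collector.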


@inproceedings {199321,
author = {Khanh Nguyen and Lu Fang and Guoqing Xu and Brian Demsky and Shan Lu and Sanazsadat Alamian and Onur Mutlu},
title = {Yak: A {High-Performance} {Big-Data-Friendly} Garbage Collector},
booktitle = {12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)},
year = {2016},
isbn = {978-1-931971-33-1},
address = {Savannah, GA},
pages = {349--365},
url = {https://www.usenix.org/conference/osdi16/technical-sessions/presentation/nguyen},
publisher = {USENIX Association},
month = nov
}
