Squeezing out All the Value of Loaded Data: An Out-of-core Graph Processing System with Reduced Disk I/O

Authors: 
Zhiyuan Ai, Mingxing Zhang, and Yongwei Wu, Department of Computer Science and Technology, Tsinghua National Laboratory for Information Science and Technology (TNLIST), Tsinghua University and Research Institute of Tsinghua; Xuehai Qian, University of Southern California; Kang Chen and Weimin Zheng, Department of Computer Science and Technology, Tsinghua National Laboratory for Information Science and Technology (TNLIST), Tsinghua University, and Research Institute of Tsinghua
Abstract: 

The current primary concern of out-of-core graph processing systems is improving disk I/O locality, which leads to certain restrictions on their programming and execution models. Although improving the locality, these constraints also restrict the expressiveness. As a result, only sub-optimal algorithms are supported for many kinds of applications. When compared with the optimal algorithms, these supported algorithms typically incur sequential, but much larger, amount of disk I/O.

In this paper, we explore a fundamentally different tradeoff: less total amount of I/O rather than better locality. We show that out-of-core graph processing systems uniquely provide the opportunities to lift the restrictions of the programming and execution model (e.g., process each loaded block at most once, neighborhood constraint) in a feasible manner, which enable efficient algorithms that require drastically less number of iterations. To demonstrate the ideas, we build CLIP, a novel out-ofcore graph processing system designed with the principle of “squeezing out all the value of loaded data”. With the more expressive programming model and more flexible execution, CLIP enables more efficient algorithms that require much less amount of total disk I/O. Our experiments show that the algorithms that can be only implemented in CLIP are much faster than the original disk-locality-optimized algorithms in many real-world cases (up to tens or even thousands of times speedup).

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {203149,
author = {Zhiyuan Ai and Mingxing Zhang and Yongwei Wu and Xuehai Qian and Kang Chen and Weimin Zheng},
title = {Squeezing out All the Value of Loaded Data: An Out-of-core Graph Processing System with Reduced Disk I/O},
booktitle = {2017 {USENIX} Annual Technical Conference ({USENIX} {ATC} 17)},
year = {2017},
isbn = {978-1-931971-38-6},
address = {Santa Clara, CA},
pages = {125--137},
url = {https://www.usenix.org/conference/atc17/technical-sessions/presentation/ai},
publisher = {{USENIX} Association},
month = jul,
}

Presentation Audio