Load the Edges You Need: A Generic I/O Optimization for Disk-based Graph Processing
Keval Vora, University of California, Riverside; Guoqing Xu, University of California, Irvine; Rajiv Gupta, University of California, Riverside
Single-PC, disk-based processing of big graphs has recently gained much popularity. At the core of an efficient disk-based system is a well-designed partition structure that can minimize random disk accesses. All existing systems use static partitions that are created before processing starts. These partitions have static layouts and are loaded entirely into memory in every single iteration, even though much of the edge data does not change across many iterations, so these unchanged edges contribute nothing new to the computation of vertex values.
This work provides a general optimization that removes this I/O inefficiency by employing dynamic partitions whose layouts are dynamically adjustable. Our implementation of this optimization in GraphChi, a representative out-of-core vertex-centric graph system, yielded speedups of 1.5–2.8× on six large graphs. Our idea is generally applicable to other systems as well.
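To make the I/O argument concrete, here is a minimal in-memory sketch that counts how many edges are read per run under two policies. It is not the paper's partition design or GraphChi's API. It assumes a toy synchronous minimum-label propagation, and it assumes that an edge is only "needed" in an iteration if its source vertex changed in the previous one. The function name `min_label_propagation` and the flag `skip_inactive_edges` are hypothetical, chosen for illustration.

```python
def min_label_propagation(num_vertices, edges, skip_inactive_edges):
    """Toy synchronous label propagation over directed (src, dst) edges.

    Returns (labels, edges_loaded). edges_loaded stands in for disk I/O:
    it counts every edge that an iteration reads.
    Assumption: an edge can only affect its destination if its source
    changed in the previous iteration, so other edges may be skipped.
    """
    labels = list(range(num_vertices))
    changed = set(range(num_vertices))  # every vertex counts as new at the start
    edges_loaded = 0

    while changed:
        if skip_inactive_edges:
            # "Dynamic" policy: read only edges whose source just changed.
            to_load = [(u, v) for (u, v) in edges if u in changed]
        else:
            # "Static" policy: read every edge in every iteration.
            to_load = edges
        edges_loaded += len(to_load)

        new_labels = labels[:]
        next_changed = set()
        for u, v in to_load:
            if labels[u] < new_labels[v]:
                new_labels[v] = labels[u]
                next_changed.add(v)

        labels = new_labels
        changed = next_changed

    return labels, edges_loaded


# A 5-vertex chain plus a separate 2-vertex component, stored as
# directed edges in both directions.
undirected = [(0, 1), (1, 2), (2, 3), (3, 4), (5, 6)]
edges = [(u, v) for (a, b) in undirected for (u, v) in ((a, b), (b, a))]

static_labels, static_loaded = min_label_propagation(7, edges, False)
dynamic_labels, dynamic_loaded = min_label_propagation(7, edges, True)

print("same result:", static_labels == dynamic_labels)
print("edges read (static): ", static_loaded)
print("edges read (dynamic):", dynamic_loaded)
```

Both policies converge to the same labels because labels only decrease, so an edge whose source did not change cannot lower its destination any further. The sketch ignores what makes the real problem hard on disk: reading a subset of edges without introducing random accesses, which is what an adjustable partition layout has to address.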