RStream: Marrying Relational Algebra with Streaming for Efficient Graph Mining on A Single Machine


Kai Wang, UCLA; Zhiqiang Zuo, Nanjing University; John Thorpe, UCLA; Tien Quang Nguyen, Facebook; Guoqing Harry Xu, UCLA


Graph mining is an important category of graph algorithms that aim to discover structural patterns such as cliques and motifs in a graph. While a great deal of work has been done recently on graph computation such as PageRank, systems support for scalable graph mining is still limited. Existing mining systems such as Arabesque focus on distributed computing and need large amounts of compute and memory resources.

We built RStream, the first single-machine, out-of-core mining system that leverages disk support to store intermediate data. At its core are two innovations: (1) a rich programming model that exposes relational algebra for developers to express a wide variety of mining tasks; and (2) a runtime engine that implements relational algebra efficiently with tuple streaming. A comparison between RStream and four state-of-the-art distributed mining/Datalog systems (Arabesque, ScaleMine, DistGraph, and BigDatalog), each running on a 10-node cluster, demonstrates that RStream outperforms all of them, e.g., by at least a factor of 1.7, and that it can process large graphs on an inexpensive machine.


@inproceedings {222571,
author = {Kai Wang and Zhiqiang Zuo and John Thorpe and Tien Quang Nguyen and Guoqing Harry Xu},
title = {{RStream}: Marrying Relational Algebra with Streaming for Efficient Graph Mining on A Single Machine},
booktitle = {13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)},
year = {2018},
isbn = {978-1-939133-08-3},
address = {Carlsbad, CA},
pages = {763--782},
url = {},
publisher = {USENIX Association},
month = oct
}
