Towards Hybrid Programming in Big Data
Peng Wang, Chinese Academy of Sciences; Hong Jiang, University of Nebraska–Lincoln; Xu Liu, College of William and Mary; Jizhong Han, Chinese Academy of Sciences
Within the past decade, a number of parallel programming models have been developed for data-intensive (i.e., big data) applications. Typically, each model has its own strengths in performance or programmability for some kinds of applications but limitations for others. As a result, multiple programming models are often combined in a complementary manner to exploit their merits and hide their weaknesses. However, existing models can only be loosely coupled due to their isolated runtime systems.
In this paper, we present Transformer, the first system that supports hybrid programming models for data-intensive applications. Transformer has two unique contributions. First, Transformer offers a programming abstraction in a unified runtime system for different programming model implementations, such as Dryad, Spark, Pregel, and PowerGraph. Second, Transformer supports an efficient and transparent data sharing mechanism, which tightly integrates different programming models in a single program. Experimental results on Amazon’s EC2 cloud show that Transformer can flexibly and efficiently support hybrid programming models for data-intensive computing.
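To make the hybrid-model idea concrete, the sketch below shows two "engines" in the styles the abstract names, a dataflow (MapReduce-like) stage and a vertex-centric (Pregel-like) stage, exchanging data through one shared in-memory store rather than through files. All class and function names here are illustrative assumptions, not the Transformer API:

```python
# Hypothetical sketch of tightly coupled hybrid programming models:
# a dataflow stage builds a graph, and a vertex-centric stage consumes
# it directly from shared memory. Names are illustrative, not Transformer's API.

class SharedStore:
    """Stands in for the unified runtime's transparent data-sharing layer."""
    def __init__(self):
        self.data = {}

def dataflow_stage(store, pairs):
    # MapReduce-style stage: group edges by source vertex.
    edges = {}
    for src, dst in pairs:
        edges.setdefault(src, []).append(dst)
    store.data["edges"] = edges  # handed off in memory, no serialization

def graph_stage(store, rounds=50):
    # Pregel-style stage: iterative PageRank-like ranking over the edges
    # produced by the dataflow stage, read directly from the shared store.
    edges = store.data["edges"]
    vertices = set(edges) | {d for ds in edges.values() for d in ds}
    rank = {v: 1.0 for v in vertices}
    for _ in range(rounds):
        incoming = {v: 0.0 for v in vertices}
        for src, dsts in edges.items():
            share = rank[src] / len(dsts)
            for d in dsts:
                incoming[d] += share  # "messages" sent to neighbors
        rank = {v: 0.15 + 0.85 * incoming[v] for v in vertices}
    store.data["rank"] = rank

store = SharedStore()
dataflow_stage(store, [("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")])
graph_stage(store)
print(max(store.data["rank"], key=store.data["rank"].get))
```

In a loosely coupled setting, the two stages would run in separate runtime systems and exchange the edge list through a distributed file system; the shared store here illustrates the cross-model, in-memory handoff that a single unified runtime makes possible.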