Big Data Platforms as a Service: Challenges and Approach

James Horey, Edmon Begoli, Raghul Gunasekaran, Seung-Hwan Lim, and James Nutaro, Oak Ridge National Laboratory

Infrastructure-as-a-Service (IaaS) has revolutionized how users commission computing infrastructure. Coupled with Big Data platforms such as Hadoop and Cassandra, IaaS has democratized the ability to store and process massive datasets. For users who need to customize or create new Big Data stacks, however, readily available solutions do not yet exist. Users must first acquire the necessary cloud computing infrastructure and then manually install the prerequisite software. For complex distributed services, this can be a daunting challenge. To address this issue, we argue that a distributed service should be viewed as a single application composed of virtual machines. Users should no longer need to be concerned with individual machines or their internal organization. To illustrate this concept, we introduce Cloud-Get, a distributed package manager that simplifies installing distributed services in a cloud computing environment. Cloud-Get enables users to instantiate and modify distributed services, including Big Data services, using simple commands. Cloud-Get also simplifies creating new distributed services via standardized package definitions.

BibTeX

@inproceedings {181206,
author = {James Horey and Edmon Begoli and Raghul Gunasekaran and Seung-Hwan Lim and James Nutaro},
title = {Big Data Platforms as a Service: Challenges and Approach},
booktitle = {4th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 12)},
year = {2012},
address = {Boston, MA},
url = {https://www.usenix.org/conference/hotcloud12/workshop-program/presentation/horey},
publisher = {USENIX Association},
month = jun
}
