You are here
ClusterOn: Building Highly Configurable and Reusable Clustered Data Services Using Simple Data Nodes
Ali Anwar and Yue Cheng, Virginia Polytechnic Institute and State University; Hai Huang, IBM T. J. Watson Research Center; Ali R. Butt, Virginia Polytechnic Institute and State University
The growing variety of data storage and retrieval needs is driving the design and development of an increasing number of distributed storage applications such as keyvalue stores, distributed file systems, object stores, and databases. We observe that, to a large extent, such applications would implement their own way of handling features of data replication, failover, consistency, cluster topology, leadership election, etc. We found that 45– 82% of the code in six popular distributed storage applications can be classified as implementations of such common features. While such implementations allow for deeper optimizations tailored for a specific application, writing new applications to satisfy the ever-changing requirements of new types of data or I/O patterns is challenging, as it is notoriously hard to get all the features right in a distributed setting.
In this paper, we argue that for most modern storage applications, the common feature implementation (i.e., the distributed part) can be automated and offloaded, so developers can focus on the core application functions. We are designing a framework, ClusterOn, which aims to take care of the
messy plumbingof distributed storage applications. The envisioned goal is that a developer simply “drops” a non-distributed application into ClusterOn, which will convert it into a scalable and highly configurable distributed application.
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.