Clay Codes: Moulding MDS Codes to Yield an MSR Code


Myna Vajha, Vinayak Ramkumar, Bhagyashree Puranik, Ganesh Kini, Elita Lobo, Birenjith Sasidharan, and P. Vijay Kumar, Indian Institute of Science, Bangalore; Alexandar Barg and Min Ye, University of Maryland; Srinivasan Narayanamurthy, Syed Hussain, and Siddhartha Nandi, NetApp ATG, Bangalore


With increase in scale, the number of node failures in a data center increases sharply. To ensure availability of data, failure-tolerance schemes such as Reed-Solomon (RS) or more generally, Maximum Distance Separable (MDS) erasure codes are used. However, while MDS codes offer minimum storage overhead for a given amount of failure tolerance, they do not meet other practical needs of today's data centers. Although modern codes such as Minimum Storage Regenerating (MSR) codes are designed to meet these practical needs, they are available only in highly-constrained theoretical constructions, that are not sufficiently mature enough for practical implementation. We present {\em Clay codes} that extract the best from both worlds. Clay (short for Coupled-Layer) codes are MSR codes that offer a simplified construction for decoding/repair by using pairwise coupling across multiple stacked layers of any single MDS code. In addition, Clay codes provide the first practical implementation of an MSR code that offers (a) low storage overhead, (b) simultaneous optimality in terms of three key parameters: repair bandwidth, sub-packetization level and disk I/O, (c) uniform repair performance of data and parity nodes and (d) support for both single and multiple-node repairs, while permitting faster and more efficient repair.

While all MSR codes are vector codes, none of the distributed storage systems support vector codes. We have modified Ceph to support any vector code, and our contribution is now a part of Ceph's master codebase. We have implemented Clay codes, and integrated it as a plugin to Ceph. Six example Clay codes were evaluated on a cluster of Amazon EC2 instances and code parameters were carefully chosen to match known erasure-code deployments in practice. A particular example code, with storage overhead $1.25$x, is shown to reduce repair network traffic by a factor of $2.9$ in comparison with RS codes and similar reductions are obtained for both repair time and disk read.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

@inproceedings {210530,
author = {Myna Vajha and Vinayak Ramkumar and Bhagyashree Puranik and Ganesh Kini and Elita Lobo and Birenjith Sasidharan and P. Vijay Kumar and Alexandar Barg and Min Ye and Srinivasan Narayanamurthy and Syed Hussain and Siddhartha Nandi},
title = {Clay Codes: Moulding {MDS} Codes to Yield an {MSR} Code},
booktitle = {16th USENIX Conference on File and Storage Technologies (FAST 18)},
year = {2018},
isbn = {978-1-931971-42-3},
address = {Oakland, CA},
pages = {139--154},
url = {},
publisher = {USENIX Association},
month = feb