Knockoff: Cheap Versions in the Cloud


Xianzheng Dou, Peter M. Chen, and Jason Flinn, University of Michigan

Note: Due to technical difficulties, there is no audio of this talk.


Cloud-based storage provides reliability and ease-of-management. Unfortunately, it can also incur significant costs for both storing and communicating data, even after using techniques such as chunk-based deduplication and delta compression. The current trend of providing access to past versions of data exacerbates both costs.

In this paper, we show that deterministic recomputation of data can substantially reduce the cost of cloud storage. Borrowing a well-known dualism from the fault-tolerance community, we note that any data can be equivalently represented by a log of the nondeterministic inputs needed to produce that data. We design a file system, called Knockoff, that selectively substitutes nondeterministic inputs for file data to reduce communication and storage costs. Knockoff compresses both data and computation logs: it uses chunk-based deduplication for file data and delta compression for logs of nondeterminism. In two studies, Knockoff reduces the average cost of sending files to the cloud without versioning by 21% and 24%; the relative benefit increases as versions are retained more frequently.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

@inproceedings {202325,
author = {Xianzheng Dou and Peter M. Chen and Jason Flinn},
title = {Knockoff: Cheap Versions in the Cloud},
booktitle = {15th USENIX Conference on File and Storage Technologies (FAST 17)},
year = {2017},
isbn = {978-1-931971-36-2},
address = {Santa Clara, CA},
pages = {73--88},
url = {},
publisher = {USENIX Association},
month = feb