Non-Linear Compression: Gzip Me Not!

Michael F. Nowlan, Bryan Ford, and Ramakrishna Gummadi
Yale University

4th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage '12)
June 13-14, 2012, Boston, MA


Most compression algorithms used in storage systems today are based on an increasingly outmoded sequential processing model. Systems wishing to decompress blocks out-of-order or in parallel must reset the compressor's state before each block, reducing adaptiveness and limiting compression ratios. To remedy this situation, we present Non-Linear Compression, a novel compression model enabling systems to impose an arbitrary partial order on inter-block dependencies. Mutually unordered blocks may be compressed and decompressed out-of-order or in parallel, and a compressor can adaptively compress each block based on all causally prior blocks. This graph structure captures the system's data dependencies explicitly and completely, enabling the compressor to adapt using long-lived state without the constraint of sequential processing. Preliminary experiences with a simple Huffman compressor suggest that non-linear compression fits a diverse set of storage applications.

Paper: PDF

Slides: PDF

This research is sponsored by the National Science Foundation under grants CNS-0916413 and CNS-0916678.