Heading Off Correlated Failures through Independence-as-a-Service

Ennan Zhai, David Isaac Wolinsky, and Bryan Ford
Yale University

Ruichuan Chen
Bell Labs / Alcatel-Lucent

11th USENIX Symposium on Operating Systems Design and Implementation
October 7, 2014, Broomfield, CO


Today's systems pervasively rely on redundancy to ensure reliability. In complex multi-layered hardware/software stacks, however – especially in clouds, where many independent businesses deploy interacting services on common infrastructure – seemingly independent systems may share deep, hidden dependencies, undermining redundancy efforts and introducing unanticipated correlated failures. Complementing existing post-failure forensics, we propose Independence-as-a-Service (or INDaaS), an architecture to audit the independence of redundant systems proactively, thus avoiding correlated failures. INDaaS first uses pluggable dependency acquisition modules to collect structural dependency information (including network, hardware, and software dependencies) from a variety of sources. With this information, INDaaS then quantifies the independence of the systems of interest using pluggable auditing modules, which offer various tradeoffs among performance, precision, and data secrecy. While the most general and efficient auditing modules assume the auditor is able to obtain all required information, INDaaS can employ private set intersection cardinality protocols to quantify independence even across businesses unwilling to share their full structural information with anyone. We evaluate the practicality of INDaaS with three case studies that audit realistic network, hardware, and software dependency structures.
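
To make the idea of quantifying independence from dependency sets concrete, the following is a minimal sketch and not the INDaaS implementation. It assumes each provider's dependencies can be flattened into a set of component identifiers, and it uses Jaccard similarity as the overlap measure, a choice made here for illustration since the abstract does not name a metric. The provider names, component names, and the `rank_pairs` helper are all hypothetical. The sketch works on plaintext sets; a private variant would need only the sizes of the intersection and union, which is the quantity a private set intersection cardinality protocol is designed to reveal.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Return |a & b| / |a | b|; 0.0 means no shared dependencies."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_pairs(deps: dict) -> list:
    """Score every pair of providers, most independent (lowest overlap) first."""
    scored = []
    for x, y in combinations(sorted(deps), 2):
        scored.append((jaccard(deps[x], deps[y]), (x, y)))
    return sorted(scored)


# Hypothetical dependency sets (network, hardware, software) for illustration.
DEPS = {
    "cloud_a": {"isp1", "power_grid_east", "openssl-1.0.1"},
    "cloud_b": {"isp1", "power_grid_east", "libc-2.19"},
    "cloud_c": {"isp2", "power_grid_west", "openssl-1.0.1"},
}

if __name__ == "__main__":
    for score, pair in rank_pairs(DEPS):
        print(f"{pair[0]} + {pair[1]}: overlap {score:.2f}")
```

Under these assumed inputs, the pair with the lowest overlap score shares the fewest components and would be the preferred choice for a redundant deployment. A real audit would also need to weight components by failure likelihood, which this sketch ignores.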

This research is sponsored by the National Science Foundation under grants CNS-1017206 and CNS-1149936.