RESAR: Reliable Storage at Exabyte Scale

Appeared in Proceedings of the 24th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2016).

Abstract

Stored data needs to be protected against device failure and irrecoverable sector read errors, yet doing so at exabyte scale can be challenging given the large number of failures that must be handled. We have developed RESAR (Robust, Efficient, Scalable, Autonomous, Reliable) storage, an approach to storage system redundancy that only uses XOR-based parity and employs a graph to lay out data and parity. The RESAR layout offers greater robustness and higher flexibility for repair at the same overhead as a declustered version of RAID 6. For instance, a RESAR-based layout with 16 data disklets per stripe has about 50 times lower probability of suffering data loss in the presence of a fixed number of failures than a corresponding RAID 6 organization. RESAR uses a layer of virtual storage elements to achieve better manageability, a broader potential for energy savings, as well as easier adoption of heterogeneous storage devices.

Publication date:
September 2016

Authors:
Thomas Schwarz
Ahmed Amer
Thomas Kroeger
Ethan L. Miller
Darrell D. E. Long
Jehan-François Pâris

Projects:
Reliable Storage
Ultra-Large Scale Storage

BibTeX entry

@inproceedings{schwarz16-mascots,
  author       = {Thomas Schwarz and Ahmed Amer and Thomas Kroeger and Ethan L. Miller and Darrell D. E. Long and Jehan-François Pâris},
  title        = {RESAR: Reliable Storage at Exabyte Scale},
  booktitle    = {Proceedings of the 24th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2016)},
  month        = sep,
  year         = {2016},
}
Last modified 21 Sep 2016