Deduplication

Description

The Storage Systems Research Center is a pioneer in the use of deduplication to consolidate storage by eliminating duplicate and near-duplicate data. Deduplication reduces the total amount of storage needed by matching data sequences in new data against identical or similar sequences in already-stored data, and storing references to the existing data in place of the new data. While this approach reduces the demand for storage, it introduces new problems. The system must locate duplicates efficiently, a non-trivial task when the storage system contains terabytes to petabytes of data. The system must also restore some of the redundancy that deduplication removes; otherwise, the failure of a single chunk that many files reference can cause large volumes of data to be lost. Deduplication can also reduce performance by scattering data across the disk, and it can introduce security issues.
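
The core idea above (match new data against stored data, keep a reference instead of a second copy) can be illustrated with a minimal sketch. This is not the Deep Store design: it assumes fixed-size chunks, SHA-256 digests as chunk identifiers, and an in-memory index, and it handles only exact duplicates, not near-duplicates. The class name `DedupStore` and its methods are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy in-memory chunk store (hypothetical; for illustration only)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}    # digest -> chunk bytes, each unique chunk stored once
        self.refcount = {}  # digest -> number of references to that chunk

    def put(self, data):
        """Store data; return its recipe, the ordered list of chunk digests."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # First time this content is seen: keep the bytes.
                self.chunks[digest] = chunk
            # Seen before or not, the new data holds one more reference.
            self.refcount[digest] = self.refcount.get(digest, 0) + 1
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original data by following the references in order."""
        return b"".join(self.chunks[d] for d in recipe)

    def stored_bytes(self):
        """Bytes actually held, after duplicates have been eliminated."""
        return sum(len(c) for c in self.chunks.values())
```

In this sketch, a chunk with a high `refcount` is one whose loss would damage many stored objects, which is why a real system would need to add redundancy back for such chunks. The in-memory dictionary also sidesteps the indexing problem: at terabyte to petabyte scale, the digest index itself no longer fits in memory.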

Status

The Deep Store project initially explored the use of deduplication for archival systems. We are continuing to explore additional issues for deduplicated data, including performance, indexing, reliability, and security.


Last modified 11 Sep 2009