Archival Storage

Description

We have several active and past projects in archival storage, all of which are contributing to the ability to build more efficient, reliable, and secure long-term storage systems. In addition, we maintain a wiki page with links to resources on archival storage systems.

  • Archival Workload Studies: We have produced several detailed studies of archival storage user behavior and system evolution. Our studies provide relevant, up-to-date observations on archival system usage patterns to guide and validate future archival storage designs. Key results include weakening the oft-quoted "Write-Once, Read-Maybe" assumption and identifying that the vast majority of archival traffic comes from purely automated sources.
  • Improving Trace Analysis: Our experiences with analyzing long-term traces have highlighted shortcomings in current tracing and analysis techniques. We are using this experience to design new techniques and "best practices" to improve future traces and analyses, such as using traces and metadata snapshots to improve understanding of system state over time, and techniques for distinguishing logger failures from full system crashes when activity rates appear unusually low.
  • Economic Modeling of Long-Term Storage: One of the most pressing current issues in archival storage is understanding what will influence the long-term total cost of operation (TCO) for storing data for decades or longer. Factors that influence the TCO include electricity, labor, shifting media costs, and disasters. An insufficiently funded archive may store data safely only to run out of funds at a critical juncture, such as a media cost spike like the one caused by the 2011 Thailand floods or a slowdown in the growth of hard disk densities. Using a series of models and simulations, we aim to explore the factors that influence the long-term costs and survival of archives.
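
The funding risk described under Economic Modeling can be illustrated with a very small year-by-year model. This is only a sketch under stated assumptions, not the project's simulator: the function name `years_funded`, its parameters, and all numbers are hypothetical. It assumes an archive pays one bill per year from a fixed endowment, that costs shrink by a constant fraction each year (standing in for density growth), and that a single optional cost spike multiplies one year's bill.

```python
def years_funded(endowment, first_year_cost, cost_decline, interest=0.0,
                 spike_year=None, spike_factor=1.0, horizon=100):
    """Return how many years an archive can pay its bills (hypothetical model).

    endowment       -- money available at the start
    first_year_cost -- bill for year 0
    cost_decline    -- fraction by which the yearly bill shrinks (0.0 to 1.0)
    interest        -- yearly return earned on the remaining balance
    spike_year      -- year in which costs jump, or None for no spike
    spike_factor    -- multiplier applied to the bill in the spike year
    horizon         -- number of years to simulate
    """
    balance = endowment
    cost = first_year_cost
    for year in range(horizon):
        bill = cost * (spike_factor if year == spike_year else 1.0)
        if balance < bill:
            return year  # funds ran out before this year's bill could be paid
        balance = (balance - bill) * (1.0 + interest)
        cost *= (1.0 - cost_decline)  # next year's baseline bill
    return horizon  # survived the whole simulated period


# Illustrative, made-up inputs: flat costs, with and without a cost spike.
print(years_funded(100, 10, cost_decline=0.0))
print(years_funded(100, 10, cost_decline=0.0, spike_year=5, spike_factor=3.0))
# Slower cost decline (slower density growth) shortens the funded period.
print(years_funded(100, 10, cost_decline=0.20))
print(years_funded(100, 10, cost_decline=0.05))
```

Even in this toy setting, a one-year spike or a slower rate of cost decline shortens the period the endowment can cover, which is the kind of sensitivity a fuller model would explore with more realistic inputs.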

Status

  • Archival Workload Studies: We have recently completed and published several studies of both private and public historical and scientific archives, and are looking towards analysis of a newer dataset obtained from the US Library of Congress.
  • Improving Trace Analysis: We are in the midst of initial proof-of-concept simulations and analysis, creating artificial snapshots and workloads to better understand the strengths and limitations of our proposed techniques.
  • Economic Modeling of Long-Term Storage: We have completed a working discrete event simulator, and are exploring a variety of questions. For example, what is the impact of increased device lifetime in scenarios with low overall device density growth?
  • Past Projects: The following are projects we have worked on in the past:
    Logan: a management system to scalably grow, maintain, and evolve a heterogeneous archival storage system.
    Computation-Storage Trade-off: using provenance to reduce storage overhead by storing intermediate and initial inputs and recomputing a dataset on demand.
    Pergamum: long-term evolvable storage built from intelligent network-attached bricks with both disk and NVRAM such as flash.
    Deep Store: building more efficient archival storage using deduplication to take advantage of intra-file and inter-file redundancy.
    POTSHARDS: long-term secure storage, which allows the secure preservation of data for decades without relying upon traditional encryption to prevent information leakage.
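
The deduplication idea mentioned for Deep Store can be sketched with a toy content-addressed store. This is a minimal illustration, not Deep Store's actual design: the class `ChunkStore`, its methods, and the choice of fixed-size chunks identified by SHA-256 digests are all assumptions made for the example. Each distinct chunk is stored once, so repeated chunks within one file (intra-file redundancy) and chunks shared between files (inter-file redundancy) cost no extra space.

```python
import hashlib


class ChunkStore:
    """Toy deduplicating store using fixed-size chunks (hypothetical design)."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}   # SHA-256 hex digest -> chunk bytes, stored once
        self.recipes = {}  # file name -> ordered list of chunk digests

    def put(self, name, data):
        """Split data into chunks and store only chunks not seen before."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op for duplicates
            recipe.append(digest)
        self.recipes[name] = recipe

    def get(self, name):
        """Rebuild a file by concatenating its chunks in recipe order."""
        return b"".join(self.chunks[d] for d in self.recipes[name])

    def stored_bytes(self):
        """Total bytes physically held after deduplication."""
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore(chunk_size=4)
store.put("a", b"AAAAAAAABBBB")  # "AAAA" repeats within the file
store.put("b", b"BBBBCCCC")      # "BBBB" is shared with file "a"
print(store.stored_bytes())      # 12 bytes stored for 20 bytes of input
```

Fixed-size chunking keeps the sketch short; it misses redundancy when data shifts by a few bytes, which is one reason real systems often use other chunking or compression strategies.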

Last modified 17 Oct 2012