Archival Storage
Description
We have several active and past projects in archival storage, all of which contribute to building more efficient, reliable, and secure long-term storage systems. In addition, we maintain a wiki page with links to resources on archival storage systems.
- Archival Workload Studies: We have produced several detailed studies of archival storage user behavior and system evolution. Our studies provide relevant, up-to-date observations on archival system usage patterns to guide and validate future archival storage designs. Key results include weakening the oft-quoted "Write-Once, Read-Maybe" assumption and identifying that the vast majority of archival traffic comes from purely automated sources.
- Improving Trace Analysis: Our experience with analyzing long-term traces has highlighted shortcomings in current tracing and analysis techniques. We are using that experience to design new techniques and "best practices" to improve future traces and analyses. Examples include combining traces with metadata snapshots to better understand system state over time, and distinguishing logger failures from full system crashes when activity rates appear unusually low.
- Economic Modeling of Long-Term Storage: One of the most pressing current issues in archival storage is understanding what will influence the long-term total cost of operation (TCO) for storing data for decades or longer. Factors that influence the TCO include electricity, labor, shifting media costs, and disasters. An insufficiently funded archive may store data safely only to run out of funds at a critical juncture, such as a media cost spike like the one caused by the 2011 Thailand floods, or a slowdown in the growth of hard drive densities. Using a series of models and simulations, we aim to explore the factors that influence the long-term costs and survival of archives.
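To make the funding question concrete, here is a minimal sketch of a year-stepped cost model. It is not the project's simulator: the function name, all parameters, and the simplifying assumptions (every device is replaced at once when its lifetime expires, media prices fall geometrically, no interest on the endowment) are hypothetical and chosen only for illustration.

```python
from typing import Optional


def years_until_broke(
    endowment: float,
    data_tb: float,
    media_cost_per_tb: float,
    annual_cost_decline: float,
    device_lifetime_years: int,
    opex_per_tb_year: float,
    horizon_years: int = 100,
    price_spike_year: Optional[int] = None,
    spike_multiplier: float = 2.0,
) -> Optional[int]:
    """Return the first year the archive cannot pay its bills, or None if it
    survives the whole horizon. All inputs are illustrative assumptions."""
    funds = endowment
    for year in range(horizon_years):
        # Media price falls geometrically; a stand-in for density growth.
        price = media_cost_per_tb * (1.0 - annual_cost_decline) ** year
        # Optional one-year price spike, e.g. a supply disruption.
        if price_spike_year is not None and year == price_spike_year:
            price *= spike_multiplier
        # Running costs such as electricity and labor.
        cost = opex_per_tb_year * data_tb
        # All media is bought in year 0 and replaced every lifetime.
        if year % device_lifetime_years == 0:
            cost += price * data_tb
        if cost > funds:
            return year
        funds -= cost
    return None


if __name__ == "__main__":
    # Flat media prices: a longer device lifetime delays depletion.
    print(years_until_broke(1000, 10, 50, 0.0, 5, 0.0))   # 10
    print(years_until_broke(1000, 10, 50, 0.0, 10, 0.0))  # 20
```

In this toy setting, when media prices do not fall, doubling device lifetime doubles how long the endowment lasts, and a price spike that coincides with a replacement cycle can exhaust funds immediately. A realistic model would also need interest, data growth, and staggered failures.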
Status
- Archival Workload Studies: We have recently completed and published several studies of both private and public historical and scientific archives, and are looking towards analysis of a newer dataset obtained from the US Library of Congress.
- Improving Trace Analysis: We are in the midst of initial proof-of-concept simulations and analysis, creating artificial snapshots and workloads to better understand the strengths and limitations of our proposed techniques.
- Economic Modeling of Long-Term Storage: We have completed a working discrete event simulator, and are exploring a variety of questions. For example, what is the impact of increased device lifetime in scenarios with low overall device density growth?
- Past Projects: The following are projects we have worked on in the past:
Logan: A management system to scalably grow, maintain, and evolve a heterogeneous archival storage system
Computation-Storage Trade-off: Using provenance to reduce storage overhead by storing intermediate and initial inputs and recomputing a dataset on demand
Pergamum: long-term evolvable storage built from intelligent network-attached bricks with both disk and NVRAM such as flash.
Deep Store: building more efficient archival storage using deduplication to take advantage of intra-file and inter-file redundancy.
POTSHARDS: long-term secure storage, which allows the secure preservation of data for decades without relying upon traditional encryption to prevent information leakage.
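As a generic illustration of the deduplication idea mentioned for Deep Store, the sketch below stores each distinct chunk once and keeps a per-file list of chunk hashes. It is not Deep Store's design: the `ChunkStore` class, fixed-size chunking, and the use of SHA-256 are assumptions made for this example.

```python
import hashlib
from typing import Dict, List


class ChunkStore:
    """Toy content-addressed store; hypothetical, for illustration only."""

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size
        self.chunks: Dict[str, bytes] = {}  # digest -> chunk bytes

    def put(self, data: bytes) -> List[str]:
        """Split data into fixed-size chunks, store unseen chunks,
        and return the list of digests needed to rebuild the data."""
        recipe: List[str] = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Identical chunks share one stored copy.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe: List[str]) -> bytes:
        """Rebuild the original bytes from a list of digests."""
        return b"".join(self.chunks[d] for d in recipe)


if __name__ == "__main__":
    store = ChunkStore(chunk_size=4)
    file_a = store.put(b"AAAABBBBAAAA")  # repeated chunk within one file
    file_b = store.put(b"BBBBCCCC")      # chunk shared across files
    print(len(store.chunks))             # 3 unique chunks stored
    print(store.get(file_a))             # b'AAAABBBBAAAA'
```

Here five chunks are written but only three are stored: one duplicate is removed within the first file and one across the two files.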
Publications
2012
-
Jehan-François Pâris,
Thomas Schwarz,
Ahmed Amer,
Darrell D. E. Long,
Highly Reliable Two-Dimensional RAID Arrays for Archival Storage,
31st IEEE International Performance Computing and Communications Conference,
December 2012.
-
Ian Adams,
Brian Madden,
Joel Frank,
Mark W. Storer,
Ethan L. Miller,
Usage Behavior of a Large-Scale Scientific Archive,
Proceedings of the 2012 International Conference for High Performance Computing, Networking, Storage and Analysis (SC12),
November 2012.
-
David S.H. Rosenthal,
Daniel Rosenthal,
Ethan L. Miller,
Ian Adams,
Mark W. Storer,
Erez Zadok,
The Economics of Long-Term Digital Storage,
The Memory of the World in the Digital Age: Digitization and Preservation,
September 2012.
-
Joel Frank,
Ethan L. Miller,
Ian Adams,
Daniel Rosenthal,
Evolutionary Trends in a Supercomputing Tertiary Storage Environment,
Proceedings of the 20th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2012),
August 2012.
-
Yan Li,
Darrell D. E. Long,
Ethan L. Miller,
Understanding Data Survivability in Archival Storage Systems,
Proceedings of the 5th Annual International Systems and Storage Conference (SYSTOR 2012),
June 2012.
-
Ian Adams,
Mark W. Storer,
Ethan L. Miller,
Analysis of Workload Behavior in Scientific and Historical Long-Term Data Repositories,
ACM Transactions on Storage 8(2),
May 2012.
-
Avani Wildani,
Ethan L. Miller,
Ohad Rodeh,
HANDS: A Heuristically Arranged Non-Backup In-line Deduplication System,
Technical Report UCSC-SSRC-12-03,
March 2012.
-
Brian Madden,
Ian Adams,
Joel Frank,
Ethan L. Miller,
Analyzing User Behavior: A Trace Analysis of the NCAR Archival Storage System,
Technical Report UCSC-SSRC-ssrctr-12-02,
March 2012.
2011
-
Ian Adams,
Ethan L. Miller,
David S.H. Rosenthal,
Using Storage Class Memory for Archives with DAWN, a Durable Array of Wimpy Nodes,
Technical Report UCSC-SSRC-11-07,
October 2011.
-
Lawrence You,
Kristal Pollack,
Darrell D. E. Long,
Kanchi Gopinath,
PRESIDIO: A Framework for Efficient Archival Data Storage,
ACM Transactions on Storage 7(2),
July 2011.
-
Yulai Xie,
Kiran-Kumar Muniswamy-Reddy,
Darrell D. E. Long,
Ahmed Amer,
Dan Feng,
Zhipeng Tan,
Compressing Provenance Graphs,
3rd USENIX Workshop on the Theory and Practice of Provenance,
June 2011.
-
Ian Adams,
Ethan L. Miller,
David S.H. Rosenthal,
Using Storage Class Memory for Archives with DAWN, a Durable Array of Wimpy Nodes,
Technical Report UCSC-SSRC-11-05,
May 2011.
NOTE: This report has been superseded by Technical Report UCSC-SSRC-11-07; please refer to that version.
-
Ian Adams,
Ethan L. Miller,
Mark W. Storer,
Analysis of Workload Behavior in Scientific and Historical Long-Term Data Repositories,
Technical Report UCSC-SSRC-11-01,
March 2011.
2010
-
Avani Wildani,
Ethan L. Miller,
Semantic Data Placement for Power Management in Archival Storage,
Proceedings of the 5th International Workshop on Petascale Data Storage (PDSW10), held in conjunction with SC2010,
November 2010.
-
Ian Adams,
Ethan L. Miller,
Mark W. Storer,
Examining Energy Use in Heterogeneous Archival Storage Systems,
Proceedings of the 18th Annual Meeting of the IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2010),
August 2010, pages 297-306.
2009
-
Avani Wildani,
Thomas Schwarz,
Ethan L. Miller,
Darrell D. E. Long,
Protecting Against Rare Event Failures in Archival Systems,
Proceedings of the 17th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2009),
September 2009.
-
Mark W. Storer,
Kevin Greenan,
Ethan L. Miller,
Kaladhar Voruganti,
POTSHARDS—A Secure, Long-Term Storage System,
ACM Transactions on Storage 5(2),
June 2009.
-
Ian Adams,
Darrell D. E. Long,
Ethan L. Miller,
Shankar Pasupathy,
Mark W. Storer,
Maximizing Efficiency By Trading Storage for Computation,
Proceedings of the Workshop on Hot Topics in Cloud Computing (HotCloud ’09),
June 2009.
-
Avani Wildani,
Thomas Schwarz,
Ethan L. Miller,
Darrell D. E. Long,
Protecting Against Rare Event Failures in Archival Systems,
Technical Report UCSC-SSRC-09-03,
April 2009.
Preliminary version of a paper that appeared in MASCOTS 2009.
-
Mark W. Storer,
Secure, Energy-Efficient, Evolvable, Long-Term Archival Storage,
Technical Report UCSC-SSRC-09-01,
March 2009.
2008
-
Kevin Greenan,
Darrell D. E. Long,
Ethan L. Miller,
Thomas Schwarz,
Jay Wylie,
A Spin-Up Saved is Energy Earned: Achieving Power-Efficient, Erasure-Coded Storage,
Proceedings of the Fourth Workshop on Hot Topics in System Dependability (HotDep '08),
December 2008.
-
Mark W. Storer,
Kevin Greenan,
Ian Adams,
Ethan L. Miller,
Darrell D. E. Long,
Kaladhar Voruganti,
Logan: Automatic Management for Evolvable, Large-Scale, Archival Storage,
Proceedings of the 2008 Petascale Data Storage Workshop (PDSW 08),
November 2008.
-
Mark W. Storer,
Kevin Greenan,
Darrell D. E. Long,
Ethan L. Miller,
Secure Data Deduplication,
Proceedings of the 4th International Workshop on Storage Security and Survivability (StorageSS 2008), held in conjunction with the 15th ACM Conference on Computer and Communications Security (CCS 2008),
October 2008.
-
Kevin Greenan,
Ethan L. Miller,
Thomas Schwarz,
Optimizing Galois Field Arithmetic for Diverse Processor Architectures,
Proceedings of the 16th Annual IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2008),
September 2008.
-
Casey Marshall,
Efficient and safe data backup with Arrow,
Technical Report UCSC-SSRC-08-02,
June 2008.
Masters project report.
-
Mark W. Storer,
Kevin Greenan,
Ethan L. Miller,
Kaladhar Voruganti,
Pergamum: Energy-efficient Archival Storage with Disk Instead of Tape,
;login: — The USENIX Magazine 33(3),
June 2008.
-
Mark W. Storer,
Kevin Greenan,
Ethan L. Miller,
Kaladhar Voruganti,
Pergamum: Replacing Tape with Energy Efficient, Reliable, Disk-Based Archival Storage,
Proceedings of the 6th USENIX Conference on File and Storage Technologies (FAST '08),
February 2008, pages 1-16.
2007
-
Kevin Greenan,
Ethan L. Miller,
Thomas Schwarz,
Darrell D. E. Long,
Disaster Recovery Codes: Increasing Reliability with Large-Stripe Error Correction Codes,
Proceedings of the 3rd International Workshop on Storage Security and Survivability (StorageSS 2007), held in conjunction with the 14th ACM Conference on Computer and Communications Security (CCS 2007),
October 2007.
-
Kevin Greenan,
Ethan L. Miller,
Thomas Schwarz,
Analysis and Construction of Galois Fields for Efficient Storage Reliability,
Technical Report UCSC-SSRC-07-09,
August 2007.
Revised version published in MASCOTS 2008.
-
Deepavali Bhagwat,
Kave Eshghi,
Pankaj Mehra,
Content-based Document Routing and Index Partitioning for Scalable Similarity-based Searches in a Large Corpus,
Proceedings of the 13th ACM SIGKDD international conference on Knowledge Discovery and Data Mining (KDD '07),
August 2007, pages 105-112.
-
Mark W. Storer,
Kevin Greenan,
Ethan L. Miller,
Kaladhar Voruganti,
POTSHARDS: Secure Long-Term Storage Without Encryption,
Proceedings of the 2007 USENIX Technical Conference,
June 2007, pages 143-156.
-
Jehan-François Pâris,
Thomas Schwarz,
Darrell D. E. Long,
Self-Adaptive Two-Dimensional RAID Arrays,
Proceedings of the International Performance Conference on Computers and Communication (IPCCC '07),
April 2007.
2006
-
Mark W. Storer,
Kevin Greenan,
Ethan L. Miller,
Long-Term Threats to Secure Archives,
Proceedings of the 2nd ACM Workshop on Storage Security and Survivability (StorageSS 2006),
October 2006.
-
Mark W. Storer,
Kevin Greenan,
Ethan L. Miller,
Kaladhar Voruganti,
POTSHARDS: Secure Long-Term Archival Storage Without Encryption,
Technical Report UCSC-SSRC-06-03, Storage Systems Research Center, University of California, Santa Cruz,
September 2006.
Later version published in USENIX 2007.
-
Deepavali Bhagwat,
Kristal Pollack,
Darrell D. E. Long,
Thomas Schwarz,
Ethan L. Miller,
Jehan-François Pâris,
Providing High Reliability in a Minimum Redundancy Archival Storage System,
Proceedings of the 14th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '06),
September 2006, pages 413-421.
-
Thomas Schwarz,
Ethan L. Miller,
Store, forget, and check: Using algebraic signatures to check remotely administered storage,
Proceedings of the IEEE Int'l Conference on Distributed Computing Systems (ICDCS '06),
July 2006.
-
Lawrence You,
Efficient Archival Data Storage,
Technical Report UCSC-SSRC-06-04,
June 2006.
Ph.D. thesis.
2005
-
Mark W. Storer,
Kevin Greenan,
Ethan L. Miller,
Carlos Maltzahn,
POTSHARDS: Storing Data for the Long-Term Without Encryption,
Proceedings of the 3rd International IEEE Security in Storage Workshop,
December 2005.
-
Lawrence You,
Kristal Pollack,
Darrell D. E. Long,
Deep Store: An Archival Storage System Architecture,
Proceedings of the 21st International Conference on Data Engineering (ICDE '05),
April 2005.
-
Joerg Meyer,
Large-Scale Multi-Type Inverted List Indexing,
Masters thesis, University of California, Santa Cruz,
March 2005.
2004
-
Thomas Schwarz,
Qin Xin,
Ethan L. Miller,
Darrell D. E. Long,
Andy Hospodor,
Spencer Ng,
Disk Scrubbing in Large Archival Storage Systems,
Proceedings of the 12th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '04),
October 2004, pages 409-418.
Won Best Paper award.
-
Lawrence You,
Christos Karamanolis,
Evaluation of efficient archival storage techniques,
Proceedings of the 21st IEEE / 12th NASA Goddard Conference on Mass Storage Systems and Technologies,
April 2004.
Last modified 17 Oct 2012