In-line deduplication on primary storage is hampered by the disk bottleneck problem: to detect duplicate data without paying the performance penalty of disk paging, the index that maps portions of data to hash values must be kept in memory. Because the index size is proportional to the volume of unique data, placing the entire index in RAM is not cost effective when the deduplication ratio is below 45%.
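To make the proportionality concrete, the following back-of-envelope sketch estimates the size of a full in-memory index. The 4 KiB chunk size and the 32-byte index entry are illustrative assumptions, not figures from this work, and the function names are hypothetical.

```python
def index_entries(unique_bytes: int, chunk_size: int = 4096) -> int:
    """Number of index entries, one per unique chunk (ceiling division).

    chunk_size is an assumed value chosen for illustration only.
    """
    return -(-unique_bytes // chunk_size)


def index_size_bytes(unique_bytes: int,
                     chunk_size: int = 4096,
                     entry_bytes: int = 32) -> int:
    """Memory needed to hold every entry of the index in RAM.

    entry_bytes (hash value plus location metadata) is an assumed value.
    """
    return index_entries(unique_bytes, chunk_size) * entry_bytes
```

Under these assumptions, 1 TiB of unique data yields 2^28 entries and an 8 GiB index, and the figure grows linearly with the amount of unique data stored.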
HANDS is a framework that dynamically prefetches fingerprints from disk into an in-memory cache according to working sets statistically derived from access patterns. To demonstrate the effectiveness of the approach, we use a simple neighborhood grouping as the statistical technique. HANDS is modular and requires only spatio-temporal data, which makes it suitable for a wide range of storage systems without modifying host file systems. It reduces the in-memory index storage required by up to 99% while still achieving between 30% and 90% of the deduplication that a full memory-resident index provides, making primary deduplication cost effective for workloads with deduplication rates as low as 8%.
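The general idea of working-set prefetching can be sketched as follows. This is a minimal illustration under stated assumptions, not the HANDS implementation: the grouping rule (fixed-size windows over an access trace), the LRU eviction policy, and all names (`group_neighborhoods`, `PrefetchingCache`, `window`, `capacity`) are hypothetical stand-ins for whatever the framework actually uses.

```python
from collections import OrderedDict


def group_neighborhoods(trace, window=4):
    """Assign each fingerprint to the fixed-size window of its first access.

    This is an assumed, simplified grouping rule used only for illustration.
    Returns (set_of, members): fingerprint -> working-set id, and
    working-set id -> list of fingerprints in first-seen order.
    """
    set_of = {}
    members = {}
    for pos, fp in enumerate(trace):
        if fp not in set_of:
            ws = pos // window
            set_of[fp] = ws
            members.setdefault(ws, []).append(fp)
    return set_of, members


class PrefetchingCache:
    """Bounded in-memory fingerprint cache that loads whole working sets."""

    def __init__(self, set_of, members, capacity):
        self.set_of = set_of
        self.members = members
        self.capacity = capacity
        self.cache = OrderedDict()  # fingerprints in LRU order

    def _admit(self, fp):
        # Insert or refresh fp, then evict least recently used entries.
        self.cache[fp] = None
        self.cache.move_to_end(fp)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)

    def lookup(self, fp):
        """Return True if fp was already cached (a duplicate is detected).

        Either way, the fingerprints in fp's working set are prefetched,
        so that accesses to its neighbors are likely to hit.
        """
        hit = fp in self.cache
        ws = self.set_of.get(fp)
        if ws is not None:
            for neighbor in self.members[ws]:
                self._admit(neighbor)
        self._admit(fp)  # admit last so fp itself survives eviction
        return hit
```

For example, after grouping the trace `a b c d a b c d` with `window=4`, a cache of capacity 4 misses on `a` but then hits on `b`, `c`, and `d`, because the first lookup pulled in the whole working set. A miss in this sketch is simply treated as new data, which is one way to see why a partial index recovers only part of the deduplication a full index would.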
CRSS is a partnership between academia and industry that explores and develops new technologies and techniques to improve the manageability, scalability, security, reliability, longevity, and performance of storage systems.
CRSS facilitates collaboration in research and education, and provides pathways that simplify the direct transfer of university-developed ideas, research results, and technology to its industrial sponsors, helping them improve their competitive posture in the global marketplace.
CRSS conducts research in a wide range of storage-related fields and applied security, including archival storage, scalable distributed indexing and non-hierarchical file systems, large-scale distributed storage systems, file systems for next-generation storage devices, and data deduplication.
CRSS is looking for talented students who want to study for an M.S. or Ph.D. at UC Santa Cruz! Grad students at CRSS work closely with faculty and other students, as well as with local industry. Our graduates typically have multiple job offers, whether from industry, government, or academia.