Secure File and Storage Systems

We investigate the use of strong authentication, encryption, and other mechanisms to safeguard data stored in network-attached storage systems and long-term archival storage systems. Adding security to large storage systems presents a severe challenge to scalability that we are addressing using aggregate capabilities. We are also exploring protocols to verify remote storage and formal verification of secure network-attached storage.

Our secure storage research has several thrusts:

  • Secure archival storage systems. This thrust includes the POTSHARDS project, which was completed around 2010, as well as current research on security for archival storage systems using combinations of random blocks to provide a strong source of entropy, helping to guard against long-term "cracking".
  • Deniable storage systems. These systems store data in a way that prevents an attacker from even knowing that the data exists.
  • Secure deletion. The Lethe project is exploring techniques for securely deleting data from any file system by forgetting as little information as possible. Our current approach allows a user to "forget" data by securely disposing of a single 128-bit key. By explicitly carrying forward all retained data under a new key, this approach allows a storage system to securely delete information quickly, meeting the security requirements of regulations such as GDPR and CCPA and giving users privacy guarantees for deleted data.
  • Scalable security for HPC systems. This work developed techniques for using a single key to encrypt an entire file, while still allowing secure distribution of parts of the file to different compute nodes. A node with one part of the file cannot read any other part of the file, providing strong security while still allowing for highly parallel computation on stored data.
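
The key-disposal idea in the secure deletion item above can be illustrated with a small Python sketch. It is not the Lethe design: the class name ForgetfulStore, its methods, and the HMAC-based keystream are all hypothetical, and the keystream is a stand-in for a vetted cipher, used only to keep the example free of third-party dependencies. The sketch assumes that "forgetting" an item means re-encrypting everything else under a fresh 128-bit key and then discarding the old key.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key, nonce, data):
    """Toy stream cipher: XOR data with HMAC-SHA256(key, nonce || offset).

    Illustrative only; a real system would use a vetted cipher.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hmac.new(key, nonce + offset.to_bytes(8, "big"),
                       hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, pad))
    return bytes(out)


class ForgetfulStore:
    """Hypothetical store in which deletion is done by disposing of a key."""

    def __init__(self):
        self._key = secrets.token_bytes(16)  # single 128-bit key
        self._blobs = {}                     # name -> (nonce, ciphertext)

    def put(self, name, plaintext):
        nonce = secrets.token_bytes(16)
        self._blobs[name] = (nonce, _keystream_xor(self._key, nonce, plaintext))

    def get(self, name):
        nonce, ciphertext = self._blobs[name]
        return _keystream_xor(self._key, nonce, ciphertext)

    def forget(self, name):
        """Carry every other item forward under a new key, then drop the old key."""
        new_key = secrets.token_bytes(16)
        carried = {}
        for other, (nonce, ciphertext) in self._blobs.items():
            if other == name:
                continue  # not carried forward, so it becomes unreadable
            plaintext = _keystream_xor(self._key, nonce, ciphertext)
            new_nonce = secrets.token_bytes(16)
            carried[other] = (new_nonce,
                              _keystream_xor(new_key, new_nonce, plaintext))
        self._blobs = carried
        self._key = new_key  # the old key is no longer referenced anywhere
```

Once the old key is gone, any stale copy of the forgotten ciphertext (for example on a backup) can no longer be decrypted. In this naive sketch the carry-forward step touches every remaining item, so a practical design would need to bound that cost.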

Status

We have designed and implemented Horus, a system that offers fine-grained encryption-based security for large-scale storage. Horus encrypts large datasets using keyed hash trees (KHTs) to generate a different key for each region of the dataset, providing fine-grained security. KHTs also reduce key management and distribution overhead. The design of Horus provides end-to-end data encryption and can reduce the need to trust system operators or cloud service providers. Performance evaluation shows that our prototype’s key distribution is highly scalable and robust. A version of the library is available for download.
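
As a rough illustration of how a keyed hash tree can yield per-region keys, the following Python sketch derives each node's key from its parent's key with HMAC-SHA256. This is not the Horus library's API: the function names, the "level:index" labels, and the default branching factor of 2 are assumptions made for the example.

```python
import hashlib
import hmac


def child_key(parent_key, level, index):
    """Derive the key of tree node (level, index) from its parent's key."""
    label = f"{level}:{index}".encode()
    return hmac.new(parent_key, label, hashlib.sha256).digest()


def derive_from(node_key, node_level, node_index,
                target_level, target_index, branching=2):
    """Derive a descendant's key from the key of node (node_level, node_index).

    Raises ValueError if the target lies outside that node's subtree.
    """
    depth = target_level - node_level
    if depth < 0 or target_index // (branching ** depth) != node_index:
        raise ValueError("target is outside this node's subtree")
    # Collect the path from the target up to (but not including) the node.
    path = []
    index = target_index
    for level in range(target_level, node_level, -1):
        path.append((level, index))
        index //= branching
    # Walk back down, hashing one level at a time.
    key = node_key
    for level, index in reversed(path):
        key = child_key(key, level, index)
    return key


def derive_key(root_key, level, index, branching=2):
    """Derive the key of node (level, index) starting from the root key."""
    return derive_from(root_key, 0, 0, level, index, branching)
```

In this sketch, a client handed the key for an interior node can compute the keys for every region beneath that node, but it cannot recover keys for sibling subtrees or for the root, because that would require inverting the keyed hash. Only one key per granted range has to be distributed.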

We have integrated security into Ceph. Our approach to security in Ceph allows secure access by hundreds of thousands of clients to a single file spread across tens of thousands of object-based storage devices without taxing the metadata servers or any other part of the system. The prototype implementation we developed imposes only a 6–7% overhead on a metadata-heavy workload involving file opens spread across hundreds of clients. Building on this approach, we are investigating scalable encryption and limiting the effects of compromised computation nodes.

We are investigating a system that integrates the seemingly incompatible features of encryption and deduplication. Combining the two can allow for efficient storage of data under arbitrary classification. However, difficult issues arise in combining these features, such as safe data destruction and privacy preservation in the face of network analysis.

In our work on indexing, we are investigating making search both faster and more secure. We use index partitioning schemes based on file system security metadata. By creating partitions where users can see either every file or no files at all, we can prevent statistical attacks made possible in indexing systems that ignore security restrictions. In addition, the number of indexes we need to search is proportional to the number of files the searcher can see, making search more efficient. The indexing and HECURA pages have more information on the application of security and partitioning to large-scale file systems.
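
The partitioning idea can be sketched in a few lines of Python. The sketch simplifies file system security metadata to an explicit set of permitted readers per file, and the function names build_partitions and search are hypothetical. Files with identical reader sets share one partition, so any given user can see either all of a partition or none of it.

```python
from collections import defaultdict


def build_partitions(files):
    """Group files into index partitions keyed by their set of readers.

    files maps path -> (set of users allowed to read it, file text).
    Each partition maps path -> set of lowercase words in that file.
    """
    partitions = defaultdict(dict)
    for path, (readers, text) in files.items():
        partitions[frozenset(readers)][path] = set(text.lower().split())
    return dict(partitions)


def search(partitions, user, term):
    """Return matching paths, consulting only partitions the user may read."""
    hits = []
    for readers, index in partitions.items():
        if user not in readers:
            continue  # skip the whole partition without touching its contents
        hits.extend(path for path, words in index.items()
                    if term.lower() in words)
    return sorted(hits)
```

Because a query never consults a partition the searcher cannot read, term statistics from hidden files cannot leak into the results, and the work done tracks the number of partitions visible to that user.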

We are also implementing a secure long-term archival storage system, POTSHARDS, that does not rely on encryption. Instead, we use secret splitting and approximate pointers to keep data hidden. The archival storage project page has more details on POTSHARDS.
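
For readers unfamiliar with secret splitting, the simplest variant is an XOR-based n-of-n split, sketched below in Python. This is a generic illustration rather than the POTSHARDS implementation, which may use a different scheme; the function names are hypothetical and approximate pointers are not modeled.

```python
import secrets


def split_secret(data, n):
    """Split data into n shares; all n are required to rebuild it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # The first n - 1 shares are uniformly random bytes.
    shares = [secrets.token_bytes(len(data)) for _ in range(n - 1)]
    # The last share is the data XORed with every random share.
    last = data
    for share in shares:
        last = bytes(a ^ b for a, b in zip(last, share))
    shares.append(last)
    return shares


def combine_shares(shares):
    """XOR all shares together to recover the original data."""
    result = bytes(len(shares[0]))
    for share in shares:
        result = bytes(a ^ b for a, b in zip(result, share))
    return result
```

No key has to be kept secret for decades in this scheme: any subset of fewer than n shares is statistically independent of the data, so confidentiality rests on keeping the shares apart rather than on the long-term strength of a cipher.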

Publications

Last modified 7 Nov 2022