Making Sense of File Systems Through Provenance and Rich Metadata (Aleatha Parker-Wood)

Aleatha gives a practice run of her advancement talk.

Our proposal addresses the question of how users can quickly find and manage files without burdening the file system with expensive brute-force searches or requiring the user to become an expert in query languages. We propose a number of algorithms applicable to any searchable file system. By collecting new metadata, including file system provenance, we propose to provide new ranking algorithms that are efficient and effective on large multi-user file systems. By exploiting human-like distributions in rich metadata, we intend to reduce the burden of file naming, allowing the system to generate expressive, unique file names on the fly. Since security is a concern on many scientific computing systems, we also intend to analyze the security properties of the proposed ranking algorithms and demonstrate how these algorithms degrade gracefully from the ideal ranking when applied in a setting with restrictive security permissions. We will validate our results using real-world scientific data and provide statistical analyses of rich metadata and provenance from this data. Finally, we will validate our ranking and naming algorithms through a series of in situ user studies.

Monday, January 23, 2012 at 1:00 PM


SSRC Contact:
Parker-Wood, Aleatha

Last modified 24 May 2019