Dynamic Non-Hierarchical File Systems
Modern high-end computing (HEC) systems must manage petabytes (going on exabytes) of data stored in billions of files, yet current techniques for naming and managing files were developed 40 years ago for collections of thousands of files. HEC users are therefore forced to adapt their usage to fit an outdated file system model and interface that is unsuitable for exascale systems. We had the opportunity to meet with scientists at several of the national laboratories and talk with them about the science they do and how they use the supercomputers.
From these discussions we have learned several lessons: 1) Hierarchical namespaces have become a hindrance rather than a help; 2) Currently it is easier, and faster, for scientists to manage their own metadata than to search for data they have stored; 3) While finding some data can be easy, finding the right or good data is not. From these observations we can see that simply modifying existing high-performance file systems to store the requisite additional semantic metadata would be woefully inadequate.
We propose to develop a radically different file system structure that addresses these problems directly, and which will leverage provenance (a record of the data and processes that contributed to its creation), file content, and rich semantic metadata to provide a scalable and searchable file name space. Such a name space would allow the tracking of data as it moves through the scientific workflow. This allows scientists to better find and utilize the data they need, using both content and data history to identify and manage stored information. We take advantage of the familiar search-based metaphor to provide an initial easy-to-use interface that enables users to find the files they need and evaluate the authenticity and quality of those files. Realizing this vision requires research success in dynamic, nonhierarchical file system design and implementation, large-scale metadata management, efficient scalable indexing, and automatic provenance capture.
We propose a dynamic nonhierarchical file system that includes automatically collected information flow provenance in addition to traditional metadata. Information flow provenance will automatically create and track relationships among files, allowing a visualization file to be related to the input deck used to create it as well as the calculation that was run. This dynamic and automatic addition of relationships will not only allow the user to be presented with a personalized view of related data, but also potentially allow the user to make connections they would otherwise be unable to see.
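The relationship between a visualization file, its input deck, and the calculation that produced it can be pictured as a small directed graph. The following is a minimal sketch in Python, not the system's actual design: the class `ProvenanceGraph`, its methods, and the file and process names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class ProvenanceGraph:
    """Toy provenance store (hypothetical). Edges point from an item to
    the processes and files it was derived from."""

    def __init__(self):
        self.derived_from = defaultdict(set)

    def record(self, process, inputs, outputs):
        # A process reads its inputs and writes its outputs, so each
        # output depends on the process and on every input.
        for out in outputs:
            self.derived_from[out].add(process)
            self.derived_from[out].update(inputs)

    def ancestors(self, item):
        # Walk the graph to collect everything that contributed to `item`.
        seen, stack = set(), [item]
        while stack:
            for parent in self.derived_from.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Illustrative names only: an input deck feeds a simulation, whose
# output is rendered into a visualization file.
graph = ProvenanceGraph()
graph.record("simulate", inputs=["input.deck"], outputs=["run.dat"])
graph.record("render", inputs=["run.dat"], outputs=["plot.viz"])
related = graph.ancestors("plot.viz")
```

Here `related` contains the input deck, the intermediate data file, and both processes, which is the kind of link a user would otherwise have to track by hand.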
We are exploring the benefits to be gained by expanding on the functionality provided by file system indexes, providing features not typically available in current file systems and search indexes. We are currently working on creating a unified search space over traditional metadata, content-based metadata, and provenance that will help find relevant files regardless of where they are stored. We have examined scientific metadata from a variety of disciplines to better understand its properties. Most metadata studies have focused on POSIX metadata, which is homogeneous, low-dimensional, predominantly numeric, and has no missing values. However, we have discovered that scientific metadata is heterogeneous, high-dimensional, a mixture of numeric, textual, and categorical values, and very sparse (even within a single discipline and object type). We are using data from this study to inform choices in designing a new type of on-demand scientific data index.
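To make the contrast with POSIX metadata concrete, the sketch below stores only the attributes each file actually has and indexes them by (attribute, value) pair. It is an illustration under assumptions, not the index design described here: the records, attribute names, and the functions `build_index` and `lookup` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: each file carries a different, sparse set of
# attributes with mixed numeric, textual, and categorical values.
records = {
    "run42.h5": {"discipline": "climate", "resolution_km": 25, "model": "hypothetical-A"},
    "scan7.fits": {"discipline": "astronomy", "exposure_s": 300.0},
    "run43.h5": {"discipline": "climate", "resolution_km": 100},
}


def build_index(records):
    """Inverted index: (attribute, value) -> set of file names.
    Missing attributes simply produce no entries."""
    index = defaultdict(set)
    for name, attrs in records.items():
        for key, value in attrs.items():
            index[(key, value)].add(name)
    return index


def lookup(index, **wanted):
    """Return files matching every requested attribute=value pair."""
    sets = [index.get((k, v), set()) for k, v in wanted.items()]
    return set.intersection(*sets) if sets else set()


index = build_index(records)
climate_25km = lookup(index, discipline="climate", resolution_km=25)
```

Because absent attributes cost nothing, this layout tolerates sparsity in a way a fixed-column table would not; it handles only exact matches, so range queries would need a different structure.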
Additionally, search must enforce file security; however, doing so efficiently is not straightforward. Our techniques allow security information to be used during index partitioning and embedded within each partition. Doing so allows us to eliminate partitions with improper permissions from the search space, improving performance and potentially altering the ordering of returned results.
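The idea of pruning whole partitions by permission can be sketched as follows. This is a simplified illustration that assumes each partition holds files sharing exactly one reader set; the actual partitioning scheme is not specified here, and the function names and sample files are hypothetical.

```python
from collections import defaultdict


def build_partitions(files):
    """Group index entries by the set of users allowed to read them
    (assumed partitioning rule, for illustration only)."""
    partitions = defaultdict(list)
    for f in files:
        partitions[frozenset(f["readers"])].append(f)
    return partitions


def search(partitions, user, predicate):
    """Return matching file names plus the number of entries examined."""
    hits, scanned = [], 0
    for readers, entries in partitions.items():
        if user not in readers:
            continue  # whole partition skipped without touching its entries
        scanned += len(entries)
        hits.extend(e["name"] for e in entries if predicate(e))
    return hits, scanned


# Hypothetical sample data.
files = [
    {"name": "a.dat", "readers": ["alice"], "size": 10},
    {"name": "b.dat", "readers": ["alice", "bob"], "size": 500},
    {"name": "c.dat", "readers": ["bob"], "size": 900},
]
partitions = build_partitions(files)
hits, scanned = search(partitions, "alice", lambda e: e["size"] > 100)
```

For user `alice`, the partition containing `c.dat` is never scanned, so the permission check both enforces security and shrinks the search space.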
File system metadata should be treated as an aid to managing and accessing data and not a rigid and limited structure to which the user must conform. To this end we propose to enhance metadata management to provide seamless support for a search-based dynamic interface to the files. File system search provides a clean, powerful abstraction from the file system. It is often easier to specify what one wants using file metadata and extended attributes rather than specifying where to find it. Searchable metadata allows users and administrators to ask complex, ad hoc questions about the properties of the files being stored, helping them to locate, manage, and analyze their data.
Web users are familiar with the problem of “information overload” in response to a search query; we will reduce this problem in our system by adding importance ranking and facilitating searches that are restricted to a local region of the provenance and relationship graph. This combination of file relationship information and per-file metadata has strong promise to greatly improve the quality of searches, so we will explore approaches that allow queries to include this information.
To rank results, we are exploring eigenvector analysis on the provenance graph, similar to Google's PageRank. As with a web graph, provenance allows us to examine what files scientists think are useful and worth deriving from. However, naively applying PageRank to a provenance graph simply results in ranking frequently used roots (such as gcc) as most important. Instead, by modifying the PageRank transition function, we can favor newer, less ubiquitous, but still frequently used files.
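The effect of changing the transition function can be seen on a toy graph. The sketch below is a plain power-iteration PageRank with an optional per-node weight that biases each step toward some targets. Weighting by recency alone, the `1 / (1 + age)` formula, and the file names and ages are assumptions made for illustration; the modified transition function actually used is not specified here.

```python
def pagerank(edges, weight=None, damping=0.85, iterations=100):
    """Power-iteration PageRank. `edges` maps each node to the nodes it
    was derived from; `weight(node)` biases transitions toward targets
    with larger weight (uniform when omitted)."""
    nodes = set(edges) | {t for targets in edges.values() for t in targets}
    weight = weight or (lambda node: 1.0)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            targets = edges.get(src, [])
            if not targets:
                # Dangling node: spread its rank uniformly over all nodes.
                for n in nodes:
                    nxt[n] += damping * rank[src] / len(nodes)
                continue
            total = sum(weight(t) for t in targets)
            for t in targets:
                nxt[t] += damping * rank[src] * weight(t) / total
        rank = nxt
    return rank


# Hypothetical provenance: every binary derives from gcc plus one source file.
edges = {
    "sim_a": ["gcc", "solver.c"],
    "sim_b": ["gcc", "solver.c"],
    "sim_c": ["gcc", "solver.c"],
    "tool_d": ["gcc", "util.c"],
    "tool_e": ["gcc", "util.c"],
}
# Assumed ages in years; newer files receive a larger transition weight.
age = {"gcc": 10.0, "solver.c": 0.5, "util.c": 5.0}

naive = pagerank(edges)
biased = pagerank(edges, weight=lambda n: 1.0 / (1.0 + age.get(n, 0.0)))
top_naive = max(naive, key=naive.get)
top_biased = max(biased, key=biased.get)
```

On this graph the unweighted run ranks `gcc` highest because everything derives from it, while the recency-weighted run ranks `solver.c` highest, since it is both recent and used by several outputs.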
As we increase the amount of information we store and require access to, optimizing the computing system becomes increasingly important. Such a system must be fast enough to respond to the user, while maintaining an equilibrium between saving energy and using the system to its full potential. This must be accomplished without noticeable degradation in the reliability and security of the system.
We are developing tools to help achieve a balanced, reliable, and secure system of any scale. Horus, a keyed hash tree, encrypts data and supports a much finer-grained approach to security than can currently be achieved. We are also developing a data allocation algorithm that places data on devices while optimizing for multiple objectives, including energy, performance, and reliability.
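A keyed hash tree can give fine-grained access because the key for any subtree is enough to derive the keys of the blocks beneath it, but not those elsewhere in the file. The sketch below shows that general idea using Python's standard `hmac` and `hashlib` modules. It is not Horus's actual construction: the binary fanout, the `level:index` message format, the choice of SHA-256, and the function names are all assumptions.

```python
import hashlib
import hmac

FANOUT = 2  # assumed branching factor, for illustration only


def child_key(parent_key: bytes, level: int, index: int) -> bytes:
    """Derive the key of node (level, index) from its parent's key."""
    message = f"{level}:{index}".encode()
    return hmac.new(parent_key, message, hashlib.sha256).digest()


def block_key(key: bytes, level: int, index: int, depth: int, block: int) -> bytes:
    """Derive the leaf key for `block`, starting from the key of node
    (level, index). Fails if that node does not cover the block."""
    span = FANOUT ** (depth - level)  # number of blocks under this node
    if block // span != index:
        raise ValueError("node does not cover this block")
    while level < depth:
        level += 1
        span //= FANOUT
        index = block // span
        key = child_key(key, level, index)
    return key


root_key = b"example root key (not a real secret)"
depth = 3  # a tree over 8 blocks

# Key for the subtree covering blocks 4..7, i.e. node (level 1, index 1).
subtree_key = child_key(root_key, 1, 1)

from_root = block_key(root_key, 0, 0, depth, block=5)
from_subtree = block_key(subtree_key, 1, 1, depth, block=5)
```

A client handed only `subtree_key` derives the same key for block 5 as the holder of the root key, yet has no way to derive keys for blocks 0 through 3, since that would require inverting the keyed hash.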