Magellan: A Searchable Metadata Architecture for Large-Scale File Systems

Published as Storage Systems Research Center Technical Report UCSC-SSRC-09-07.

Abstract

As file systems continue to grow, metadata search is becoming an increasingly important way to access and manage files. However, existing solutions that build a separate metadata database outside of the file system face consistency and management challenges at large scales. To address these issues, we developed Magellan, a new large-scale file system metadata architecture that enables the file system’s metadata to be efficiently and directly searched. This allows Magellan to avoid the consistency and management challenges of a separate database, while providing performance comparable to that of other large file systems. Magellan enables metadata search by introducing several techniques to metadata server design. First, Magellan uses a new on-disk inode layout that makes metadata retrieval efficient for searches. Second, Magellan indexes inodes in data structures that enable fast, multi-attribute search and allow all metadata lookups, including directory searches, to be handled as queries. Third, a query routing technique helps keep the search space small, even at large scales. Fourth, a new journaling mechanism enables efficient update performance and metadata reliability. An evaluation with real-world metadata from a file system shows that, by combining these techniques, Magellan is capable of searching millions of files in under a second, while providing metadata performance comparable to, and sometimes better than, other large-scale file systems.
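
To illustrate the idea that every metadata lookup, including a directory listing, can be expressed as a multi-attribute query, here is a minimal sketch. It is not Magellan's implementation: the `Inode` fields, the `MetadataStore` class, and its `query` and `listdir` methods are hypothetical names chosen for this example, and a plain linear scan stands in for the index structures the report describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inode:
    # Hypothetical attribute set; the report's actual inode layout is not shown here.
    ino: int
    parent: int   # inode number of the containing directory
    name: str
    owner: str
    size: int     # bytes


class MetadataStore:
    """Toy in-memory store; a linear scan stands in for a real index."""

    def __init__(self):
        self._inodes = []

    def add(self, inode):
        self._inodes.append(inode)

    def query(self, **predicates):
        """Return inodes that satisfy every predicate.

        A predicate is either a plain value (equality test) or a
        callable that takes the attribute value and returns a bool.
        """
        def matches(inode):
            for attr, want in predicates.items():
                have = getattr(inode, attr)
                ok = want(have) if callable(want) else have == want
                if not ok:
                    return False
            return True

        return [i for i in self._inodes if matches(i)]

    def listdir(self, dir_ino):
        # A directory listing is just a query on the parent attribute.
        return sorted(i.name for i in self.query(parent=dir_ino))


store = MetadataStore()
store.add(Inode(ino=2, parent=1, name="a.txt", owner="alice", size=100))
store.add(Inode(ino=3, parent=1, name="b.log", owner="bob", size=5_000_000))
store.add(Inode(ino=4, parent=1, name="c.dat", owner="alice", size=9_000_000))

# Multi-attribute search: files owned by alice that are larger than 1 MB.
big_alice = store.query(owner="alice", size=lambda s: s > 1_000_000)
```

In this sketch `listdir` and the multi-attribute search share one code path, which is the property the abstract highlights; a real system would replace the scan with an index so that queries avoid touching every inode.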

Publication date:
November 2009

Authors:
Andrew Leung
Ian Adams
Ethan L. Miller

Projects:
Scalable File System Indexing
HECURA: Scalable Data Management
Ultra-Large Scale Storage

BibTeX entry

@techreport{leung09-ssrctr0907,
  author       = {Andrew Leung and Ian Adams and Ethan L. Miller},
  title        = {Magellan: A Searchable Metadata Architecture for Large-Scale File Systems},
  institution  = {University of California, Santa Cruz},
  number       = {UCSC-SSRC-09-07},
  month        = nov,
  year         = {2009},
}
Last modified 28 Jun 2010