Tracking Emigrant Data via Transient Provenance

Appeared in Proceedings of the 3rd USENIX Workshop on the Theory and Practice of Provenance (TaPP '11).


Information leaks are a constant worry for companies and government organizations. After a leak occurs, the data owner must determine not only the extent of the leak but also who originally leaked the information. We propose a technique that extends data provenance to help identify potential sources of information leaks. While data provenance is commonly defined as the ancestry of a file, the ancestry actually recorded depends on the provenance collector. Instead of recording only where a file came from, we propose to also track when and where a file leaves the system. To track these departures, we suggest the use of ghost objects whenever a file is either written to a mounted external storage device or copied to a client machine via NFS or any other network interface such as SSH or FTP. We present our solution for tracking emigrant data and explain the minor changes to current provenance-aware storage systems required to enable it.

Publication date: June 2011

Stephanie Jones
Christina Strong
Darrell D. E. Long
Ethan L. Miller

Secure File and Storage Systems
Scalable File System Indexing
Dynamic Non-Hierarchical File Systems

Available media

Full paper text: PDF

Bibtex entry

  author       = {Stephanie Jones and Christina Strong and Darrell D. E. Long and Ethan L. Miller},
  title        = {Tracking Emigrant Data via Transient Provenance},
  booktitle    = {Proceedings of the 3rd USENIX Workshop on the Theory and Practice of Provenance (TaPP '11)},
  month        = jun,
  year         = {2011},

Last modified 28 May 2019