Design and implementation of a predictive file prefetching algorithm

Appeared in Proceedings of the 2001 USENIX Annual Technical Conference.


We have previously shown that the patterns in which files are accessed offer information that can accurately predict upcoming file accesses. Most modern caches ignore these patterns, thereby failing to use information that enables significant reductions in I/O latency. While prefetching heuristics that expect sequential accesses are often effective methods to reduce I/O latency, they cannot be applied across files, because the abstraction of a file has no intrinsic concept of a successor. This limits the ability of modern file systems to prefetch. Here we presents our implementation of a predictive prefetching system, that makes use of file access patterns to reduce I/O latency.

Previously we developed a technique called Partitioned Context Modeling (PCM) [1] that efficiently models file accesses to reliably predict upcoming requests. We present our experiences in implementing predictive prefetching based on file access patterns. From the lessons learned we developed of a new technique Extended Partitioned Context Modeling (EPCM), which has even better performance.

We have modified the Linux kernel to prefetch file data based on Partitioned Context Modeling and Extended Partitioned Context Modeling. With this implementation we examine how a prefetching policy, that uses such models to predict upcoming accesses, can result in large reductions in I/O latencies. We tested our implementation with four different application-based benchmarks and saw I/O latency reduced by 31% to 90% and elapsed time reduced by 11% to 16%.

Publication date:
January 2001

Thomas Kroeger
Darrell D. E. Long

Prediction and Grouping

Available media

Full paper text: PDF

Bibtex entry

  author       = {Thomas Kroeger and Darrell D. E. Long},
  title        = {Design and implementation of a predictive file prefetching algorithm},
  booktitle    = {Proceedings of the 2001 USENIX Annual Technical Conference},
  pages        = {105–118},
  month        = jan,
  year         = {2001},
Last modified 27 Jan 2023