Data Intensive Computing & SSDs

Flash-based solid state drives (SSDs) can help improve directory lookup and checkpointing performance.

Fast Directory Accesses and Lookups. In current large-scale file systems with billions of files, directory accesses are well known to be a critical bottleneck. Because the directory is too large to fit into memory, disk access latency becomes the dominant factor in overall performance. The focus of our project is how to design a large-scale file directory, stored on a storage hierarchy consisting of both SSDs and disks, that meets this performance challenge. An important contribution of this project will be a demonstration of how flash memory can be used as part of this hierarchy to provide fast directory accesses and lookups.

Checkpointing. Flash memory opens the possibility of addressing specific performance problems in high-end computing. In a large-scale high-end computing environment, the number of CPUs and storage devices is extremely large. A critical large-scale simulation and modeling program may run for days or even weeks. It is likely that some component will fail during execution, which, without checkpoints, would require the entire simulation to be restarted from scratch. To recover quickly from a failed component, the state of the system is periodically checkpointed. Since the volume of checkpoint data to be saved can be extremely large, it is typically written directly to disk. Improving checkpointing performance will therefore have a large impact on the completion time of large-scale simulation and modeling applications. One of the primary goals of this project is to demonstrate how flash memory can be used to improve write performance for checkpointing. For example, checkpointing applications and database systems typically use dedicated disks as a log device to improve write performance. Due to the high latency of disks, however, this solution does not scale well.
Some systems use battery-backed DRAM as nonvolatile RAM (NVRAM) to speed up writes. Because of the cost of DRAM and limited battery power, this approach also does not scale well. Since flash memory is nonvolatile, it can replace the NVRAM. Moreover, the low cost of flash compared to battery-backed DRAM allows substantially larger log devices for applications that perform intensive synchronous writes. However, flash memory is not as fast as DRAM, so we need to investigate a new architecture that integrates SSDs into the storage hierarchy for fast checkpointing. In this architecture, SSDs will serve as temporary buffer space between CPUs and disks. Reducing the time needed to write checkpoint data to the SSDs, and to migrate that data from the SSDs to disks, remains very challenging.
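
As a rough illustration of the buffering idea above, the following Python sketch writes each checkpoint synchronously to a fast directory and migrates it to a slow directory in a background thread. It is only a sketch under stated assumptions, not the project's architecture: the class name StagedCheckpointer and the parameters fast_dir and slow_dir are hypothetical, and the two directories merely stand in for an SSD mount point and a disk mount point.

```python
import os
import queue
import shutil
import threading


class StagedCheckpointer:
    """Hypothetical two-tier checkpoint writer (illustrative names only).

    fast_dir stands in for an SSD mount point, slow_dir for a disk mount point.
    """

    def __init__(self, fast_dir, slow_dir):
        self.fast_dir = fast_dir
        self.slow_dir = slow_dir
        self._pending = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def save(self, name, payload):
        """Persist payload (bytes) on the fast tier, then queue it for migration."""
        path = os.path.join(self.fast_dir, name)
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # the caller waits only for the fast tier
        os.replace(tmp, path)  # atomic rename: no partially written checkpoint
        self._pending.put(name)
        return path

    def _drain(self):
        """Background loop: copy queued checkpoints to the slow tier."""
        while True:
            name = self._pending.get()
            if name is None:  # sentinel queued by close()
                break
            src = os.path.join(self.fast_dir, name)
            dst = os.path.join(self.slow_dir, name)
            tmp = dst + ".tmp"
            shutil.copyfile(src, tmp)
            with open(tmp, "rb+") as f:
                os.fsync(f.fileno())  # make the slow-tier copy durable
            os.replace(tmp, dst)
            os.remove(src)  # free the fast-tier buffer space

    def close(self):
        """Migrate everything still queued, then stop the worker."""
        self._pending.put(None)
        self._worker.join()
```

In this sketch the application blocks only for the write to the fast tier, while migration to the slow tier overlaps with further computation. The fast tier must be large enough to hold every checkpoint that has not yet been migrated.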

Status

Ongoing

Publications

Last modified 23 May 2019