Information Technology
Operating System Support for Restartable File Systems
WARF: P110059US01
Inventors: Michael Swift, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau, Swaminathan Sundararaman, Sriram Subramanian, Abhishek Rajimwale
The Wisconsin Alumni Research Foundation (WARF) is seeking commercial partners interested in developing an operating system framework that enables a restartable file system.
Overview
Operating systems, such as Microsoft Windows or Linux, provide one or more file systems to store and organize data. A file system stores file data on a storage device and makes the data available to the operating system and applications for manipulation and retrieval. A bug or software defect in a file system may cause the operating system and applications to experience faults that crash or otherwise prevent correct execution. Due to the importance of the data stored in a file system, bugs in file system code generally lead to a system crash, lost data or both.
Typically, operating systems respond to file system faults either by requesting that the computing device be restarted or by forcing a restart without warning. A forced shutdown or restart may result in the loss of the current state of running software applications as well as incremental user data not yet written to storage. To avoid damage to stored data, current file systems use careful data recovery processes that trade off the reliability of recently written data against performance, i.e., the more data recovered, the slower the system runs. Furthermore, current file systems require restarting the operating system or file system to recover, which is slow and results in lost application data. Improvements are needed to reduce the extended restart times and data loss associated with file system crash recovery.
The Invention
UW–Madison researchers have developed techniques that provide a restartable file system that allows the operating system to respond to faults or failures in the file system without restarting the whole system or losing data. The techniques create a “logical membrane” around the file system. When a file system failure occurs, the failure is isolated without significantly impacting the execution of the operating system or applications.
The four key components of the method are checkpoints, operation logging, unwinding and object tracking. During normal operation, the system logs file system operations, tracks file system objects and periodically performs lightweight checkpoints of the file system state. If a file system crash occurs, the system delays pending file system operations, halts in-progress file system operations and unwinds current operations to a safe state. After recovery, it restores the file system using the most recent checkpoint and rebuilds the file system state using inter-checkpoint logs. Applications are unaware of the crash and restart. Through isolation of the file system, this technique can avoid restarting the operating system in response to file system failures. This improves reliability by allowing applications to keep executing without losing state and improves the user experience.
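The checkpoint, log and replay cycle described above can be illustrated with a small in-memory sketch. This is not the inventors' implementation: `ToyFileSystem`, `Membrane`, `FileSystemCrash`, `fail_next` and `checkpoint_every` are hypothetical names chosen for illustration, the "file system" is a Python dictionary, object tracking is omitted, and unwinding is approximated by discarding the partial effects of the failed operation.

```python
import copy


class FileSystemCrash(Exception):
    """Stands in for a fault detected inside the file system (hypothetical)."""


class ToyFileSystem:
    """Hypothetical in-memory file system mapping path -> contents."""

    def __init__(self):
        self.files = {}
        self.fail_next = False  # hook used to inject a single fault

    def write(self, path, data):
        if self.fail_next:
            self.fail_next = False
            self.files[path] = "<corrupt>"  # simulate a half-finished update
            raise FileSystemCrash(path)
        self.files[path] = data

    def delete(self, path):
        self.files.pop(path, None)


class Membrane:
    """Wrapper that logs operations, checkpoints, and recovers on a crash."""

    def __init__(self, checkpoint_every=3):
        self.fs = ToyFileSystem()
        self.checkpoint = {}   # last known-good file system state
        self.log = []          # operations completed since that checkpoint
        self.checkpoint_every = checkpoint_every
        self.restarts = 0

    def _take_checkpoint(self):
        self.checkpoint = copy.deepcopy(self.fs.files)
        self.log.clear()

    def _recover(self):
        # Discard the faulty instance, restore the checkpoint, replay the log.
        self.fs = ToyFileSystem()
        self.fs.files = copy.deepcopy(self.checkpoint)
        for name, args in self.log:
            getattr(self.fs, name)(*args)
        self.restarts += 1

    def call(self, name, *args):
        """Run one operation; the caller never sees FileSystemCrash."""
        try:
            getattr(self.fs, name)(*args)
        except FileSystemCrash:
            self._recover()
            getattr(self.fs, name)(*args)  # retry the interrupted operation
        self.log.append((name, args))
        if len(self.log) >= self.checkpoint_every:
            self._take_checkpoint()


if __name__ == "__main__":
    m = Membrane(checkpoint_every=2)
    m.call("write", "/a", "1")
    m.call("write", "/b", "2")   # second logged operation triggers a checkpoint
    m.call("write", "/c", "3")
    m.fs.fail_next = True        # inject a fault into the next write
    m.call("write", "/d", "4")   # crash is absorbed; state is rebuilt
    print(m.fs.files, m.restarts)
```

In this sketch the caller of `call` plays the role of an application: the injected fault is caught inside the wrapper, the state is rebuilt from the checkpoint plus the log, and the interrupted operation is retried, so the caller sees only a successful write.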
Applications
- Operating system frameworks to provide more stable and functional performance
Key Benefits
- Reduces operating system restarts due to file system failure
- Improves data preservation after recovery
- Simplifies file system checkpoints
- Enables cleanup of resources using unwinding processes
Additional Information
For More Information About the Inventors
Publications
- Sundararaman S., Subramanian S., Rajimwale A., Arpaci-Dusseau A., Arpaci-Dusseau R. and Swift M. 2010. Membrane: Operating System Support for Restartable File Systems. Trans. Storage. 6, 1-30.
For current licensing status, please contact Emily Bauer at 608-960-9842.