Space reclamation for uncoordinated checkpointing in message-passing systems. Ph.D. Thesis
| Content Provider | NASA Technical Reports Server (NTRS) |
|---|---|
| Author | Wang, Yi-Min |
| Copyright Year | 1993 |
| Description | Checkpointing and rollback recovery are techniques that can provide efficient recovery from transient process failures. In a message-passing system, the rollback of a message sender may cause the rollback of the corresponding receiver, and the system needs to roll back to a consistent set of checkpoints called the recovery line. If the processes are allowed to take uncoordinated checkpoints, this rollback propagation may result in the domino effect, which prevents the recovery line from progressing. Traditionally, only obsolete checkpoints before the global recovery line can be discarded, and the necessary and sufficient condition for identifying all garbage checkpoints has remained an open problem. A necessary and sufficient condition for achieving optimal garbage collection is derived, and it is proved that the number of useful checkpoints is bounded by N(N+1)/2, where N is the number of processes. The approach is based on the maximum-sized antichain model of consistent global checkpoints and the technique of recovery line transformation and decomposition. It is also shown that, for systems requiring message logging to record in-transit messages, the same approach can be used to achieve optimal message log reclamation. As a final topic, a unifying framework is described by considering checkpoint coordination and exploiting piecewise determinism as mechanisms for bounding rollback propagation, and the applicability of the optimal garbage collection algorithm to domino-free recovery protocols is demonstrated. |
| File Size | 3275912 |
| Page Count | 119 |
| File Format | |
| Alternate Webpage(s) | http://archive.org/details/NASA_NTRS_Archive_19940025401 |
| Archival Resource Key | ark:/13960/t3kx0b14q |
| Language | English |
| Publisher Date | 1993-11-19 |
| Access Restriction | Open |
| Subject Keyword | Computer Programming And Software Distributed Processing Fault Tolerance Protocol Computers Algorithms Interprocessor Communication Error Detection Codes Message Processing Failure NTRS NASA Technical Reports Server (NTRS) NASA Technical Reports Server Aerodynamics Aircraft Aerospace Engineering Aerospace Aeronautic Space Science |
| Content Type | Text |
| Resource Type | Thesis |
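
The rollback propagation and the N(N+1)/2 bound mentioned in the description can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the message encoding, and the interval numbering are assumptions made for illustration. The sketch starts from each process's latest checkpoint and rolls a receiver back whenever a recorded receive has no recorded send (an orphan message), until the remaining set of checkpoints is consistent.

```python
from typing import List, Tuple

# Hypothetical encoding (not from the thesis):
# (sender, send_interval, receiver, recv_interval), where interval i of a
# process is the execution between its checkpoint i and its checkpoint i + 1.
Message = Tuple[int, int, int, int]


def recovery_line(latest: List[int], messages: List[Message]) -> List[int]:
    """Roll back from the latest checkpoints until no orphan message remains."""
    line = list(latest)
    changed = True
    while changed:
        changed = False
        for sender, s, receiver, r in messages:
            # Orphan: sent after the sender's chosen checkpoint (send undone)
            # but received before the receiver's chosen checkpoint (recorded).
            if s >= line[sender] and r < line[receiver]:
                line[receiver] = r  # latest checkpoint preceding the receive
                changed = True
    return line


def useful_checkpoint_bound(n: int) -> int:
    """Upper bound on useful checkpoints stated in the abstract: N(N+1)/2."""
    return n * (n + 1) // 2


# Two processes with checkpoints 0..2 and a zig-zag message pattern that
# forces both all the way back to their initial checkpoints (domino effect).
domino = [(0, 2, 1, 1), (1, 1, 0, 1), (0, 1, 1, 0), (1, 0, 0, 0)]
print(recovery_line([2, 2], domino))      # [0, 0]
print(recovery_line([2, 2], domino[1:]))  # [2, 2]
print(useful_checkpoint_bound(4))         # 10
```

Removing the first message from the pattern leaves the latest checkpoints consistent, which shows how a single message can trigger the cascade. The sketch only finds a recovery line; it does not implement the thesis's garbage collection condition.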