Transparent Fault Tolerance for Parallel Applications on Networks of Workstations (1996)
| Content Provider | CiteSeerX |
|---|---|
| Author | Scales, Daniel J.; Lam, Monica S. |
| Abstract | This paper describes a new method for providing transparent fault tolerance for parallel applications on a network of workstations. We have designed our method in the context of a shared object system called SAM, a portable run-time system which provides a global name space and automatic caching of shared data. SAM incorporates a novel design intended to address the problem of the high communication overheads in distributed memory environments and is implemented on a variety of distributed memory platforms. Our fundamental approach to providing fault tolerance is to ensure the replication of all data on more than one workstation using the dynamic caching already provided by SAM. The replicated data is accessible to the local processor like other cached data, making access to shared data faster and potentially offsetting some of the fault tolerance overhead. In addition, our method uses information available in SAM applications on how processes access shared data to enable several optimiza... |
| File Format | |
| Publisher Date | 1996-01-01 |
| Access Restriction | Open |
| Subject Keyword | Fault Tolerance; Transparent Fault Tolerance; Fault Tolerance Overhead; Process Access; Global Name Space; Automatic Caching; Parallel Application; Distributed Memory Platform; Distributed Memory Environment; Object System; Fundamental Approach; High Communication Overhead; SAM Application; Local Processor; Novel Design; Portable Run-time System; Replicated Data; Dynamic Caching |
| Content Type | Text |
| Resource Type | Article |