Tuning stationary iterative solvers for fault resilience
| Content Provider | ACM Digital Library |
|---|---|
| Author | Anzt, Hartwig; Dongarra, Jack; Quintana-Ortí, Enrique S. |
| Abstract | As the transistor's feature size decreases following Moore's Law, hardware will become more prone to permanent, intermittent, and transient errors, increasing the number of failures experienced by applications and diminishing the confidence of users. As a result, resilience is considered the most difficult under-addressed issue faced by the High Performance Computing community. In this paper, we address the design of error-resilient iterative solvers for sparse linear systems. Contrary to most previous approaches, which are based on Krylov subspace methods, we analyze stationary component-wise relaxation for this purpose. Concretely, starting from a plain implementation of the Jacobi iteration, we design a low-cost component-wise technique that elegantly handles bit-flips, turning the initial synchronized solver into an asynchronous iteration. Our experimental study employs sparse incomplete factorizations from several practical applications to expose the convergence delay incurred by the fault-tolerant implementation. |
| Starting Page | 1 |
| Ending Page | 8 |
| Page Count | 8 |
| File Format | |
| ISBN | 9781450340113 |
| DOI | 10.1145/2832080.2832081 |
| Language | English |
| Publisher | Association for Computing Machinery (ACM) |
| Publisher Date | 2015-11-15 |
| Publisher Place | New York |
| Access Restriction | Subscribed |
| Subject Keyword | Fault tolerance; High performance computing; Stationary (and asynchronous) iterative solvers; Resilience; Sparse linear systems |
| Content Type | Text |
| Resource Type | Article |