Loading...
Please wait, while we are loading the content...
Similar Documents
The Conjugate Gradient Method for Large Sparse Matrices on the Intel Ipsc/860 Hypercube
| Content Provider | Semantic Scholar |
|---|---|
| Author | Gerritsen, Margot |
| Copyright Year | 1994 |
| Abstract | For large sparse unstructured matrices, the critical parts of the Conjugate Gradient method on the iPSC/860 are the inter-processor communications needed for the matrix-vector multiplication and the vector-updates. In this work several implementations are tested and discussed in search for an optimal algorithm. They di er in distribution of the matrix and the various vectors over the processorgrid leading to di erent communication routines. Timings improve considerably if the communicationsmentioned above can be implemented using bidirectional forced messages between pairs of processors. This e ect was seen very clearly when the Intel global collocation algorithmGCOLX was replaced by a routine that utilizes these pairwise forced messages, leading to a sharp decrease in communication times. Mapping the matrix to a rectangular processorgrid leads to more e cient algorithms. In this work, the best timings were obtained for a mapping to a square pnp pnp grid, where np is the number of processors. The implementation is written for the special case in which the dimension of the matrix is a multiple of pnp. This distribution allows the use of bidirectional forced messages for all communications and it keeps the message lenghts short. Two of the matrices tested have a strongly nonuniform distribution of the nonzeros in the matrix. A routine that performs well on these matrices is one in which the distribution of the matrix over the processors is load balanced. This reduces the time required for the matrix-vector multiplication, with the penalty of increasing the communication times. For small, or relatively dense, matrices with a strong nonuniform distribution the gain in the matrix-vector multiplication outweighs the decreased communication performance. The stopcriterion used in the CG algorithm is the one developed by Kaasschieter [7]. Diagonal preconditioning is implemented, which lead to convergence of most of the test matrices considered. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://www-sccm.stanford.edu/pub/sccm/sccm94-02.ps.gz |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |