Loading...
Please wait, while we are loading the content...
EFFICIENT TASK PARTITIONING ALGORITHMS FOR DISTRIBUTED SHARED MEMORY SYSTEMS
| Content Provider | CiteSeerX |
|---|---|
| Author | Ray, Sibabrata Jiang, Hong |
| Abstract | In this paper, we consider the tree task graphs which arise from many important programming paradigms such as divide and conquer, branch and bound etc., and the linear task-graphs that stem from common computation schemes such as pipelining, iterative calculation etc.. The target architecture considered is a distributed shared memory architecture with indirect network or wormhole-routed direct network as its interconnection network (IN). The partitioning problem is formulated for systems where all or most of the communications are interleaved with task executions, in light of latency-hiding techniques employed in most of today’s multiprocessors. This is in contrast with previous research in this area where computation and communication are assumed to be completely non-overlapped. We propose a general method for solving the partitioning problem. This method involves three parameters, viz., bandwidth requirement, bottleneck, and load balancing factor that are used to judge the effectiveness of the partitioning and thus make the partitioning problem somewhat independent of mapping problem. Optimal sequential and parallel algorithms for partitioning tree task graphs such that the bottleneck is minimized with minimum number of processors are developed. It is shown that the bandwidth-requirement minimization problem is NP-complete for trees. However, the bandwidth-requirement minimization problem is known to be polynomial for linear task graphs, which are a special case of tree task graphs. An improved algorithm for obtaining the bandwidth-requirement minimizing partition of linear task graphs is presented. Furthermore, three effective, simple 2pass heuristics for computing bandwidth-requirement minimizing partition of tree task graphs are also presented. Parallel versions of the heuristics are developed as well. The effectiveness and efficiency of those heuristics are validated through extensive simulations. Key words. Distributed shared memory multiprocessors, Task graph partitioning, Bandwidth and bottleneck minimization, Load balancing. 1 1 |
| File Format | |
| Access Restriction | Open |
| Subject Keyword | Tree Task Graph Partitioning Problem Bandwidth-requirement Minimization Problem Linear Task Graph Linear Task-graphs Interconnection Network Improved Algorithm Distributed Shared Memory Architecture Latency-hiding Technique Target Architecture Wormhole-routed Direct Network Iterative Calculation Etc Parallel Version Common Computation Scheme Extensive Simulation Task Graph Partitioning General Method Bandwidth-requirement Minimizing Partition Load Balancing Factor Bandwidth Requirement Shared Memory Multiprocessor Task Execution Indirect Network Bound Etc Today Multiprocessor |
| Content Type | Text |