Loading...
Please wait, while we are loading the content...
Similar Documents
Unification of register allocation and instruction scheduling in compilers for fine-grain parallel architectures
| Content Provider | Semantic Scholar |
|---|---|
| Author | Berson, David A. |
| Copyright Year | 1996 |
| Abstract | The interaction between instruction scheduling and register allocation has significant impact on the quality of code generated, particularly in compilers targeting fine grain parallel architectures. The problem results from the fact that instruction scheduling and register allocation have conflicting goals. Instruction scheduling tries to maximize parallelism by scheduling as many instructions as possible in parallel, which requires a large number of values to be held in registers for short periods of time. On the other hand, register allocation attempts to hold a small number of values in registers for long periods of time, resulting in limiting the number of instructions that can be scheduled in parallel. This dissertation presents a method for unifying these tasks by allocating all needed registers and functional units to an instruction simultaneously. No previous technique has achieved this degree of integration between the two tasks. The work in this dissertation is based on a framework consisting of three components: a technique for measuring a program's demand for all resources, a single intermediate representation of the measured demands, and a set of transformations that perform resource allocation. The approach taken in this work is based on a new paradigm of resource allocation, called Measure and Reduce in which the resource requirements of the program are measured and excessive demands are removed by reduction transformations. The information computed during the measurement of the demands for each resource is incorporated into a single intermediate representation. The reduction transformations for all resources operate on this intermediate representation, allowing transformations for different types of resources to be performed simultaneously. Therefore, an instruction can allocate all resources it needs at once, resulting in unified resource allocation. The intermediate representation is based on a hierarchical form of dependence DAGs, enabling the transformations to naturally handle instruction level parallelism. In particular, the register transformations form a framework for live range splitting in the absence of a full ordering of the instructions, as required by previous splitting techniques. Application of the reduction transformations is first demonstrated by a heuristic for performing register allocation during local instruction scheduling. Global register allocation is performed by exploiting the hierarchical nature of the intermediate representation. Heuristics are also given for using the transformations during global code motion, resulting in unified allocation and a more flexible use of available resources than previous resource constrained techniques. The results of numerous experiments comparing the new techniques to previous attempts at phase integration are reported. The experiments indicate that the unified allocation of resources generates higher quality code than methods that partially integrate the allocation phases. In addition, while precise measurement of register requirements is NP-Complete, in practice precise measurements are obtained easily and efficiently. Thus, when these measurements are combined with traditional register allocation techniques in hybrid algorithms, the quality of code generated is improved. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://www.cs.virginia.edu/~soffa/research/dissertations/berson.pdf |
| Alternate Webpage(s) | http://www.cs.pitt.edu/~soffa/research/dissertations/berson.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |