Improving effective bandwidth through compiler enhancement of global cache reuse (2000)
| Content Provider | CiteSeerX |
|---|---|
| Researcher | Ding, Chen |
| Abstract | While CPU speed has been improved by a factor of 6400 over the past twenty years, memory bandwidth has increased by a factor of only 139 during the same period. Consequently, on modern machines the limited data supply simply cannot keep a CPU busy, and applications often utilize only a few percent of peak CPU performance. The hardware solution, which provides layers of high-bandwidth data cache, is not effective for large and complex applications primarily for two reasons: far-separated data reuse and large-stride data access. The first repeats unnecessary transfer and the second communicates useless data. Both waste memory bandwidth. This dissertation pursues a software remedy. It investigates the potential for compiler optimizations to alter program behavior and reduce its memory bandwidth consumption. To this end, this research has studied a two-step transformation strategy: first fuse computations on the same data and then group data used by the same computation. Existing techniques such as loop blocking can be viewed as an application of this strategy within a single loop nest. In order to carry out this strategy |
| Journal | Journal of Parallel and Distributed Computing |
| Publisher Date | 2000-01-01 |
| Access Restriction | Open |
| Subject Keyword | Memory Bandwidth Consumption, First Repeat Unnecessary Transfer, Compiler Enhancement, Effective Bandwidth, Past Twenty Year, CPU Speed, Waste Memory Bandwidth, High-bandwidth Data Cache, Peak CPU Performance, Software Remedy, Loop Blocking, Data Supply, Hardware Solution, Second Communicates Useless Data, Modern Machine, Two-step Transformation Strategy, Far-separated Data Reuse, Large-stride Data Access, Single Loop, First Fuse Computation, Global Cache Reuse, Group Data |
| Content Type | Text |
| Resource Type | Thesis, Article |
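
The abstract's two-step strategy (first fuse computations that touch the same data, then group the data those fused computations use together) can be illustrated with a minimal C sketch. The array names, loop bounds, and interleaved struct layout below are illustrative assumptions, not code from the dissertation; they only show how fusion removes a second pass over the same arrays and how grouping co-accessed values improves cache-line utilization.

```c
#include <stddef.h>

#define N 1000000  /* assumed array length, for illustration only */

/* Step 1: computation fusion. Before fusion, two loops each stream over
 * a[] and b[], so every element is transferred from memory twice. */
void unfused(const double *a, const double *b, double *c, double *d) {
    for (size_t i = 0; i < N; i++) c[i] = a[i] + b[i];
    for (size_t i = 0; i < N; i++) d[i] = a[i] * b[i];
}

/* After fusion, each element of a[] and b[] is loaded once and reused
 * while still in cache, roughly halving the traffic for those arrays. */
void fused(const double *a, const double *b, double *c, double *d) {
    for (size_t i = 0; i < N; i++) {
        c[i] = a[i] + b[i];
        d[i] = a[i] * b[i];
    }
}

/* Step 2: data grouping (an assumed layout). Values used together by the
 * fused computation are interleaved so each fetched cache line carries
 * both operands; this matters most when the original accesses were
 * strided or sparse and cache lines were only partly used. */
typedef struct { double a, b; } pair_t;

void fused_grouped(const pair_t *ab, double *c, double *d) {
    for (size_t i = 0; i < N; i++) {
        c[i] = ab[i].a + ab[i].b;
        d[i] = ab[i].a * ab[i].b;
    }
}
```

Loop blocking, mentioned in the abstract, applies the same idea within a single loop nest; the dissertation's contribution is carrying the fusion-then-grouping strategy across loop nests for global cache reuse.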