Coarse-grain coherence tracking
| Content Provider | Semantic Scholar |
|---|---|
| Author | Lipasti, Mikko H.; Smith, James Edward; Cantin, Jason F. |
| Copyright Year | 2006 |
| Abstract | To maintain coherence in conventional shared-memory multiprocessor systems, processors first check other processors' caches before obtaining data from memory. This coherence checking consumes considerable interconnect bandwidth in broadcast-based systems and, as a byproduct, increases access latency for nonshared data. Furthermore, it consumes substantial amounts of power, both in the system interconnect and in the cache tag arrays. Simulation results for a set of commercial, scientific, and multiprogrammed workloads running on a four-processor system show that an average of 71% (and up to 94%) of broadcasts are unnecessary, and that on average 89% of snoop-induced cache tag lookups miss in the L2 cache. This dissertation proposes Coarse-Grain Coherence Tracking (CGCT), a new technique that supplements a conventional coherence mechanism and optimizes the performance of conventional coherence enforcement. CGCT monitors the coherence status of large regions of memory and uses the status information to avoid unnecessary broadcasts and filter unnecessary snoop-induced cache tag lookups. Simulation results show that CGCT can eliminate 47--64% of the broadcasts, filter 71--87% of the snoop-induced cache tag lookups, and reduce average execution time by 7.3--10.9%. Moreover, CGCT does not affect system compatibility, does not violate cache coherence, and does not violate memory consistency. In addition to optimizing coherence enforcement, CGCT can enable new optimizations that further improve performance and power efficiency. In this dissertation I will show that CGCT can enable processors to prefetch data in a safe, efficient, and timely manner without disturbing other processors. I will also show that CGCT can be used to implement power-efficient DRAM speculation in the memory controllers by detecting regions shared by other processors and fetching lines from DRAM only if they are not likely to be sourced from another processor's cache. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.105.9766&rep=rep1&type=pdf |
| Alternate Webpage(s) | http://pharm.ece.wisc.edu/papers/cantin_thesis.pdf |
| Alternate Webpage(s) | http://www.jfred.org/cantin_thesis.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
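
The abstract describes CGCT as tracking the coherence status of large memory regions in order to skip unnecessary broadcasts and filter snoop-induced tag lookups. The following is a minimal Python sketch of that idea only, not the dissertation's actual protocol. The class `RegionTracker`, its method names, the 4 KiB region size, and the two-set state representation are all hypothetical and chosen for illustration.

```python
REGION_BYTES = 4096  # assumed region size; the abstract does not state one


class RegionTracker:
    """Toy per-processor table of coarse-grain region status (illustrative only)."""

    def __init__(self, region_bytes=REGION_BYTES):
        self.region_bytes = region_bytes
        self.exclusive = set()   # regions believed not cached by any other processor
        self.lines_cached = {}   # region -> number of lines this processor has fetched
        self.broadcasts = 0      # requests that had to be broadcast
        self.direct = 0          # requests sent straight to memory

    def region_of(self, addr):
        return addr // self.region_bytes

    def on_miss(self, addr, others_cache_region):
        """Handle a local cache miss.

        others_cache_region is a callable(region) -> bool standing in for the
        combined snoop response of the other processors (an assumption).
        """
        region = self.region_of(addr)
        self.lines_cached[region] = self.lines_cached.get(region, 0) + 1
        if region in self.exclusive:
            # Region is known to be unshared: the broadcast can be skipped.
            self.direct += 1
            return "direct"
        self.broadcasts += 1
        if not others_cache_region(region):
            # The snoop response shows nobody else caches this region.
            self.exclusive.add(region)
        return "broadcast"

    def on_external_request(self, addr):
        """Handle another processor's broadcast.

        Returns True if a local cache tag lookup is needed, False if the
        region-level status alone shows that no line can be present.
        """
        region = self.region_of(addr)
        self.exclusive.discard(region)  # the region is now potentially shared
        return self.lines_cached.get(region, 0) > 0
```

In this sketch, the first miss to a region is broadcast, later misses to the same region go directly to memory while the region stays unshared, and an external request to a region with no locally cached lines is filtered without a tag lookup. Evictions and write permissions are deliberately left out.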