Topology Reconfiguration Problem for Core-Level Redundancy in Homogeneous Chip Many-Core Processors
| Content Provider | Semantic Scholar |
|---|---|
| Author | Zhang, L. M.; Han, Yinhe; Xu, Qiang; Li, Xiaowei |
| Copyright Year | 2007 |
| Abstract | In recent years, significant research has been undertaken on tera-scale computing, which integrates tens to hundreds of homogeneous processor cores on a single chip to process massive amounts of information in parallel. For example, an 80-core teraflop processor prototype was demonstrated at Intel Developer Forum 2006 [1]. Effective defect tolerance techniques are essential for these complex chips, because profitability depends heavily on fabrication yield. Currently, most research efforts focus on microarchitecture-level redundancy, while core-level redundancy has received little attention. However, as shown in [2] [3], when the number of cores is large, core-level redundancy is considered more appropriate because a single core becomes small and inexpensive compared to the entire chip. There are two schemes for designing chip many-core processors with core-level redundancy, namely As Many As Available (AMAA) and As Many As Demand (AMAD). AMAA maximizes the use of operational cores by disabling only the faulty cores. This produces various types of degraded chips (with different numbers of faulty cores), and the yield of the demanded N-core processor cannot be guaranteed. In addition, many different degraded versions may cause confusion in marketing. In the AMAD scheme, an N-core processor is provided with M spare cores, and customers always receive N operational cores. As a consequence, some fault-free cores may be left unused. In this paper, we consider employing the AMAD scheme in chip many-core processors. For homogeneous chip many-core processors, on-chip communication greatly affects the performance of parallel applications, because the cores work cooperatively through on-chip networks. As a result, topology is an important factor in such architectures. The operating system (OS) must know how the underlying cores are connected in order to dispatch and schedule tasks, and parallel programmers must optimize applications for the topology. For example, Microsoft Windows Server 2003 [4] provides a hardware interface that passes the underlying physical topology information, which Windows uses to manage processors. Topology information is also supplied to programmers through API functions so that they can optimize program performance [4]. In the AMAD scheme, faulty cores are replaced by spare ones, so the topology of the target design may change and different chips may have different underlying topologies. Programmers would then have to optimize their parallel programs for many different topologies, which is a great burden. To address this problem, this short paper introduces the concept of a logical topology, a degraded version of the reference topology (discussed later). We then briefly introduce a topology reconfiguration problem based on two performance metrics of the system. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://www.carch.ac.cn/~LeiZhang/papers/DSN07FastAbs(ST7_P13).pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
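
The AMAD scheme described in the abstract (N required cores plus M spares) can be illustrated with a small yield calculation. The sketch below is not taken from the paper: it assumes, purely for illustration, that each core is fault-free independently with the same probability, and the function name `amad_yield` and its parameters are hypothetical. Under that assumption, a chip can be sold as an N-core part whenever at least N of its N + M cores are operational.

```python
from math import comb


def amad_yield(n_required: int, m_spare: int, p_core_ok: float) -> float:
    """Probability that at least n_required of (n_required + m_spare) cores work.

    Assumption (not from the source): cores fail independently, each being
    fault-free with probability p_core_ok.
    """
    total = n_required + m_spare
    # Sum the binomial probabilities of having exactly k working cores,
    # for every k that still leaves n_required operational cores.
    return sum(
        comb(total, k) * p_core_ok**k * (1.0 - p_core_ok) ** (total - k)
        for k in range(n_required, total + 1)
    )


if __name__ == "__main__":
    # Hypothetical numbers: a 64-core part with 0 to 4 spare cores.
    for spares in range(5):
        print(spares, amad_yield(64, spares, 0.98))
```

With no spares the result reduces to the probability that every core works, which is why adding even a few spare cores can raise the yield of the demanded N-core part noticeably when N is large.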