Seminar I Lecture

Date: Wednesday, June 26, 2013, 13:30 -- 15:00
Venue: L1

Speaker: Virendra Singh (Indian Institute of Technology (IIT) Bombay)
Title: Architecting a Reliable System
Abstract: Relentless scaling of silicon fabrication technology, coupled with lower design tolerances, is making ICs increasingly susceptible to wear-out-related permanent faults as well as transient faults (soft errors). A well-known technique for tackling both transient and permanent faults is redundant execution, specifically space redundancy, wherein a program is executed redundantly on different processors, pipelines, or functional units and the results are compared to detect faults.

This talk will describe a power-efficient architecture for redundant execution on chip multiprocessors (CMPs) which, when coupled with our per-core dynamic voltage and frequency scaling (DVFS) algorithm, significantly reduces the power overhead of redundant execution without sacrificing performance. Using cycle-accurate simulation combined with an architectural power model, we estimate that our architecture reduces dynamic power dissipation in the redundant core by a mean of 76%, with an associated mean performance penalty of only 1.2%. I will also present an extension to our architecture that enables the use of cores with faulty functional units for redundant execution without a reduction in transient fault coverage. This extension enables the usage of faulty cores, thereby increasing yield and reliability with only a modest power-performance penalty over fault-free execution.

Our second architecture addresses the issue of throughput loss in fault-tolerant CMPs by using coarse-grained multithreading to multiplex multiple trailing threads on a single core. Our evaluation shows that this architecture delivers higher throughput than previous proposals, including one configuration that uses simultaneous multithreading (SMT) to multiplex trailing threads. This increase in throughput comes at a modest cost in single-thread performance. Finally, circuit-level and device-level techniques for dealing with these reliability issues will be discussed briefly.
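
As a rough illustration of the redundant execution idea mentioned in the abstract (run the same work twice, then compare results to detect a fault), the following is a minimal software sketch. It is not the speaker's architecture: the names `checksum`, `flip_bit`, and `redundant_execute` are hypothetical, the two copies run sequentially in one process rather than on separate cores, and a transient fault is modeled simply as a single bit flip in the trailing copy's result.

```python
from typing import Callable, Optional


def checksum(data: list[int]) -> int:
    """Toy workload (an assumption for illustration): a rolling 32-bit checksum."""
    acc = 0
    for x in data:
        acc = (acc * 31 + x) & 0xFFFFFFFF
    return acc


def flip_bit(value: int, bit: int) -> int:
    """Model a transient fault (soft error) as a single bit flip in a result."""
    return value ^ (1 << bit)


def redundant_execute(
    work: Callable[[list[int]], int],
    data: list[int],
    fault_bit: Optional[int] = None,
) -> tuple[int, bool]:
    """Run `work` twice and compare the two results.

    Returns (leading_result, fault_detected). If `fault_bit` is given, a bit
    flip is injected into the trailing copy's result to simulate a fault.
    In hardware the two copies would run on different cores or pipelines;
    here they simply run one after the other.
    """
    leading = work(data)
    trailing = work(data)
    if fault_bit is not None:
        trailing = flip_bit(trailing, fault_bit)
    return leading, leading != trailing


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9]
    _, detected_clean = redundant_execute(checksum, data)
    _, detected_faulty = redundant_execute(checksum, data, fault_bit=7)
    print(detected_clean, detected_faulty)  # False True
```

Comparison of this kind only detects a mismatch; it does not say which copy is wrong, so recovery would need an additional mechanism that this sketch does not model.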
