Coarse-Grained Reconfigurable Architectures (CGRAs) are drawing significant attention because they promise both performance through parallelism and flexibility through reconfiguration. Soft errors (or transient faults) are becoming a serious design concern in embedded systems, including CGRAs, since the soft error rate is increasing exponentially as technology scales. A recently proposed software-based technique that implements TMR (Triple Modular Redundancy) on CGRAs incurs extreme overheads in runtime and energy consumption, mainly due to the expensive voting mechanisms applied to the outputs of every triplicated operation. In this article, we propose selective validation mechanisms for efficient modular redundancy techniques in the datapaths of CGRAs. Our techniques validate the results only at synchronous operations rather than at every operation, in order to reduce the expensive performance overhead of the validation mechanism. We also present an optimization technique that further improves runtime and energy consumption by minimizing the number of synchronous operations at which a validation mechanism needs to be applied. Our experimental results demonstrate that our selective validation-based TMR technique with our optimization on CGRAs can improve the runtime by 41.0% and the energy consumption by 26.2% on average over benchmarks, as compared to the recently proposed software-based TMR technique with full validation.
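The difference between full and selective validation can be illustrated with a minimal software sketch. It is not the article's implementation: the names `majority_vote` and `run_tmr` are hypothetical, the fault model (a single flipped bit in one replica) is a simplification, and the final operation of the chain merely stands in for a synchronous operation, which the abstract does not define.

```python
from typing import Callable, List, Optional, Tuple

Op = Callable[[int], int]


def majority_vote(a: int, b: int, c: int) -> int:
    """Return the value that at least two replicas agree on."""
    if a == b or a == c:
        return a
    if b == c:
        return b
    raise ValueError("no majority: all three replicas disagree")


def run_tmr(
    ops: List[Op],
    x: int,
    validate_every_op: bool,
    fault_at: Optional[Tuple[int, int]] = None,
) -> Tuple[int, int]:
    """Run a chain of operations on three replicas.

    fault_at = (op_index, replica_index) flips the lowest bit of that
    replica's result after the given operation (illustrative fault model).
    Returns (final value, number of votes performed).
    """
    replicas = [x, x, x]
    votes = 0
    for i, op in enumerate(ops):
        replicas = [op(v) for v in replicas]
        if fault_at is not None and fault_at[0] == i:
            replicas[fault_at[1]] ^= 1  # inject a single-bit error
        # Assumption: only the last operation is a validation point
        # when selective validation is used.
        is_validation_point = i == len(ops) - 1
        if validate_every_op or is_validation_point:
            voted = majority_vote(*replicas)
            replicas = [voted, voted, voted]
            votes += 1
    return replicas[0], votes


ops: List[Op] = [lambda v: v + 3, lambda v: v * 2, lambda v: v - 1]
full = run_tmr(ops, 5, validate_every_op=True, fault_at=(0, 1))
selective = run_tmr(ops, 5, validate_every_op=False, fault_at=(0, 1))
```

In this sketch both schemes mask the injected error and return the same final value, but full validation votes after each of the three operations while selective validation votes once. A single faulty replica stays in the minority as it propagates, so deferring the vote does not change the result here; the saving in a real CGRA mapping depends on the actual cost of voting operations, which this model does not capture.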
Bibliographical note
Funding Information:
This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (No. 2015R1A2A1A15053435); MSIP (Ministry of Science, ICT and Future Planning) under the Research Project on High Performance and Scalable Manycore Operating System (No. 14-824-09-011); Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (NRF-2015M3C4A7065522); the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIP) (No. 2014R1A2A1A10051792); IDEC, the Brain Korea Plus Project in 2015; ICT at Seoul National University, Inter-University Semiconductor Research Center (ISRC); Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korean government (MSIP) (No. R0190-15-2010, Development on the SW/HW Modules of Processor Monitor for System Intrusion Detection); the MSIP, Korea, under the ITRC (Information Technology Research Center) support program (IITP-2015-R0992-15-1006) supervised by the IITP (Institute for Information & Communications Technology Promotion); and IITP grant funded by the Korean government (MSIP) (No. B0101-15-0155, The Core Technology Development of SW-SoC Convergence Platform for Hyper-Connection Services Among Smart Devices Based on Heterogeneous Multi-core Clusters).
© 2016 ACM.
All Science Journal Classification (ASJC) codes
- Hardware and Architecture