Root cause analysis of soft-error-induced failures from hardware and software perspectives

Jinhyo Jung, Yohan Ko, Hwisoo So, Kyoungwoo Lee, Aviral Shrivastava

Research output: Contribution to journal › Article › peer-review


Because the dangers of soft errors are increasing with continued technology scaling, reliability against soft errors is becoming an important design concern for modern embedded systems. Various schemes have been proposed to protect embedded systems from the threat of soft errors, but they incur considerable overheads in terms of cost and performance. Selective protection techniques seem promising because they can achieve high levels of protection with low overhead. Though these techniques can be applied to any system, the most vulnerable parts must first be identified. We, therefore, present CFA, a comprehensive failure analysis framework that can analyze the vulnerability of microarchitectural components and software instructions through intensive fault injection campaigns. With CFA, we also explore the vulnerability of ten benchmarks from the MiBench benchmark suite. We found that protecting a part of the system heavily affects the reliability of the other parts. Therefore, all combinations of protection methods must be examined to present the most efficient and effective protection guidelines. Throughout the experiments, we observed that protection methods offered by single-perspective analyses are sub-optimal. On the other hand, CFA finds the optimal solution in every case, reducing the AVF of a system by up to 82% with minimal protection.
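The abstract's point that protecting one part shifts the vulnerability of others, so that combinations must be examined jointly, can be illustrated with a toy sketch. This is not the CFA framework or its data: the component names, the failure counts, and the interaction term are all hypothetical, chosen only to show why ranking components one at a time can miss the best choice.

```python
from itertools import combinations

# Hypothetical component names and toy counts; none of this is data from the paper.
INJECTIONS = 1000
BASE_FAILURES = {"register_file": 200, "load_store_queue": 150, "reorder_buffer": 100}


def failures(protected):
    """Toy count of injected faults (out of INJECTIONS) that end in a failure."""
    count = sum(n for comp, n in BASE_FAILURES.items() if comp not in protected)
    # Assumed interaction: protecting the register file alone exposes extra
    # failures in the unprotected load/store queue.
    if "register_file" in protected and "load_store_queue" not in protected:
        count += 80
    return count


def best_protection(budget):
    """Exhaustively try every subset of at most `budget` components."""
    candidates = [
        subset
        for k in range(budget + 1)
        for subset in combinations(BASE_FAILURES, k)
    ]
    best = min(candidates, key=failures)
    return best, failures(best)


if __name__ == "__main__":
    for budget in (1, 2):
        chosen, count = best_protection(budget)
        # AVF here is simply failures divided by injections in this toy model.
        print(budget, chosen, count / INJECTIONS)
```

In this toy model, the component with the largest standalone count (the register file) is not the best single choice once the interaction is counted, which is the kind of effect a single-perspective ranking would overlook.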

Original language: English
Article number: 102652
Journal: Journal of Systems Architecture
Publication status: Published - 2022 Sept

Bibliographical note

Funding Information:
This work was partially supported by funding from National Science Foundation Grants No. CNS 1525855, CPS 1646235, CCF 1723476 - the NSF/Intel joint research center for Computer Assisted Programming for Heterogeneous Architectures (CAPA), Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. 2021-0-00155, Context and Activity Analysis-based Solution for Safe Childcare), National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2022-00165225), and Samsung Electronics Co., Ltd (FOUNDRY-202108DD007F). We would like to thank Editage for English language editing.

Publisher Copyright:
© 2022 Elsevier B.V.

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture

