A software-level Redundant MultiThreading for Soft/Hard Error Detection and Recovery

Hwisoo So, Moslem Didehban, Aviral Shrivastava, Kyoungwoo Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

In this work, we investigate the potential of software-only RMT (Redundant MultiThreading) schemes for soft and hard error detection and recovery. We first implement and evaluate the error protection capability of basic software-level triple redundant multithreading (STRMT) and analyze its vulnerabilities. Then we introduce FISHER (FlexIble Soft and Hard Error Resiliency), a software RMT scheme that achieves a high degree of error resiliency and does not suffer from the vulnerability holes of STRMT. FISHER executes three threads and, rather than relying on a centralized voting mechanism, distributes and intertwines error detection and recovery operations among the redundant threads. We performed 135,000 soft/hard error injection experiments on different hardware components of an ARM Cortex-A53-like microprocessor simulated at the microarchitecture level. The results demonstrate that FISHER can reduce the failure rate of programs by around 42× and 26× compared to the original and basic STRMT-protected versions of the programs, respectively.
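To make the baseline idea concrete, the following is a minimal sketch of triple redundant execution with a centralized majority vote, the kind of mechanism the abstract contrasts FISHER with. It is not the authors' implementation and does not reproduce FISHER's distributed checking. The names `run_redundant`, `checksum`, and the `fault_in` parameter are hypothetical, and the "fault" is only a simulated bit flip in one replica's result.

```python
import threading
from collections import Counter


def checksum(data):
    """Toy workload (hypothetical): a small modular checksum."""
    return sum(data) % 65521


def run_redundant(task, args, fault_in=None):
    """Run `task` in three threads and majority-vote the three results.

    `fault_in` (hypothetical test hook) names the replica whose result
    gets its lowest bit flipped, imitating a soft error.
    Returns (voted_value, error_detected).
    """
    results = [None] * 3

    def worker(idx):
        value = task(*args)
        if idx == fault_in:
            value ^= 1  # simulated single-bit flip in this replica only
        results[idx] = value

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Centralized voter: detection is any mismatch, recovery is the majority.
    winner, votes = Counter(results).most_common(1)[0]
    if votes < 2:
        raise RuntimeError("no majority among replicas: cannot recover")
    return winner, votes < 3


if __name__ == "__main__":
    print(run_redundant(checksum, ([1, 2, 3],)))              # fault-free run
    print(run_redundant(checksum, ([1, 2, 3],), fault_in=1))  # one faulty replica
```

In this sketch a single corrupted replica is outvoted by the other two, so the caller still receives the correct value along with a flag saying an error was detected. How FISHER splits these detection and recovery steps across the threads is described in the paper, not here.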

Original language: English
Title of host publication: Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1559-1562
Number of pages: 4
ISBN (Electronic): 9783981926323
DOIs: https://doi.org/10.23919/DATE.2019.8715089
Publication status: Published - 2019 May 14
Event: 22nd Design, Automation and Test in Europe Conference and Exhibition, DATE 2019 - Florence, Italy
Duration: 2019 Mar 25 to 2019 Mar 29

Publication series

Name: Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019

Conference

Conference: 22nd Design, Automation and Test in Europe Conference and Exhibition, DATE 2019
Country: Italy
City: Florence
Period: 19/3/25 to 19/3/29

Fingerprint

Error Recovery
Multithreading
Error Detection
Resiliency
Software
Vulnerability
Thread
Failure Rate
Microprocessor
Voting
Microprocessor chips
Injection
Hardware
Evaluate
Demonstrate
Experiment

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Electrical and Electronic Engineering
  • Safety, Risk, Reliability and Quality
  • Control and Optimization

Cite this

So, H., Didehban, M., Shrivastava, A., & Lee, K. (2019). A software-level Redundant MultiThreading for Soft/Hard Error Detection and Recovery. In Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019 (pp. 1559-1562). [8715089] (Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.23919/DATE.2019.8715089
@inproceedings{fa93948e0009499a9afb1ced87464ef3,
title = "A software-level Redundant MultiThreading for Soft/Hard Error Detection and Recovery",
author = "Hwisoo So and Moslem Didehban and Aviral Shrivastava and Kyoungwoo Lee",
year = "2019",
month = "5",
day = "14",
doi = "10.23919/DATE.2019.8715089",
language = "English",
series = "Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "1559--1562",
booktitle = "Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019",
address = "United States",

}
