GemV

A validated toolset for the early exploration of system reliability

Karthik Tanikella, Yohan Koy, Reiley Jeyapaul, Kyoungwoo Lee, Aviral Shrivastava

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Citations (Scopus)

Abstract

Decades of technology scaling have brought the threat of soft errors to modern embedded processors. Though several methods have been proposed to protect systems from soft errors, their effectiveness in ensuring error-free computing cannot be guaranteed without accurate, quantitative estimation of system reliability. The vulnerability metric, which estimates the likelihood of device failure by accurately evaluating the time during which the device is exposed to soft errors, provides the most effective means to perform early design space explorations to estimate system reliability in the presence of transient soft errors. In this paper, we present gemV, the first accurate and comprehensive vulnerability estimation toolset, which is configurable and extensible for analysing future and novel architecture and microarchitecture designs. Key features of gemV are: (1) all microarchitecture components that store bits, even temporarily, are modeled for their vulnerability in the gem5 cycle-accurate simulation platform; (2) its models have been validated (<3% correlation error with 90% statistical confidence) through exhaustive bit-level fault injection experiments; (3) the analytical models incorporate microarchitecture-level masking effects such as speculative execution and flushes; and (4) the modular design of the vulnerability models makes them easy to extend and integrate when novel microarchitecture designs are explored. In addition to microarchitecture-level evaluation of system reliability, gemV provides a means to perform software-level design space explorations of the performance-vulnerability trade-offs of algorithm choices, compilers, compiler optimization levels, etc. A system designer can further use gemV to explore the performance-vulnerability trade-offs of choosing different ISAs.
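The idea of measuring vulnerability as exposure time can be illustrated with a small sketch. This is not gemV's model or API: it assumes a simplified textbook-style rule in which a storage element is vulnerable from a write (or read) until the next read of the same data, while an interval that ends in an overwrite is masked. The trace format and the names `vulnerable_cycles` and `vulnerability_bit_cycles` are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Tuple

# Hypothetical trace format: (cycle, kind), where kind is "W" (write) or "R" (read).
Access = Tuple[int, str]


def vulnerable_cycles(accesses: Iterable[Access]) -> int:
    """Count cycles during which one storage element holds data that is later read.

    Assumed rule (a simplification, not gemV's validated model):
    - write -> read and read -> read intervals are vulnerable;
    - any interval ending in a write is masked, because the data is overwritten;
    - a read before the first write is ignored, since the contents are unknown.
    """
    total = 0
    last_live = None  # cycle of the most recent write or read of live data
    for cycle, kind in sorted(accesses):
        if kind == "R":
            if last_live is not None:
                total += cycle - last_live
                last_live = cycle
        elif kind == "W":
            # Whatever was stored since last_live is overwritten, hence masked.
            last_live = cycle
        else:
            raise ValueError(f"unknown access kind: {kind!r}")
    return total


def vulnerability_bit_cycles(accesses: Iterable[Access], width_bits: int) -> int:
    """Scale the vulnerable time by the element width to get bit-cycles."""
    return vulnerable_cycles(accesses) * width_bits


# Illustrative trace for a single 32-bit register (values are made up).
example_trace = [(10, "W"), (15, "R"), (30, "R"), (40, "W"), (70, "W"), (75, "R")]
print(vulnerability_bit_cycles(example_trace, 32))
```

In this trace the intervals 10 to 15, 15 to 30, and 70 to 75 count as vulnerable, while 30 to 40 and 40 to 70 end in overwrites and are treated as masked. A real toolset would additionally have to account for effects such as squashed speculative instructions, which this sketch ignores.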

Original language: English
Title of host publication: 2016 IEEE 27th International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 159-163
Number of pages: 5
Volume: 2016-November
ISBN (Electronic): 9781509015030
DOIs: https://doi.org/10.1109/ASAP.2016.7760786
Publication status: Published - 2016 Nov 28
Event: 27th IEEE International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2016 - London, United Kingdom
Duration: 2016 Jul 6 to 2016 Jul 8

Other

Other: 27th IEEE International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2016
Country: United Kingdom
City: London
Period: 16/7/6 to 16/7/8

Fingerprint

Analytical models
Experiments

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Tanikella, K., Koy, Y., Jeyapaul, R., Lee, K., & Shrivastava, A. (2016). GemV: A validated toolset for the early exploration of system reliability. In 2016 IEEE 27th International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2016 (Vol. 2016-November, pp. 159-163). [7760786] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ASAP.2016.7760786
Tanikella, Karthik; Koy, Yohan; Jeyapaul, Reiley; Lee, Kyoungwoo; Shrivastava, Aviral. / GemV: A validated toolset for the early exploration of system reliability. 2016 IEEE 27th International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2016. Vol. 2016-November. Institute of Electrical and Electronics Engineers Inc., 2016. pp. 159-163
@inproceedings{6a25f070bf6742319dff7e391d609338,
title = "GemV: A validated toolset for the early exploration of system reliability",
abstract = "Decades of technology scaling have brought the threat of soft errors to modern embedded processors. Though several methods have been proposed to protect systems from soft errors, their effectiveness in ensuring error-free computing cannot be guaranteed without accurate, quantitative estimation of system reliability. The vulnerability metric, which estimates the likelihood of device failure by accurately evaluating the time during which the device is exposed to soft errors, provides the most effective means to perform early design space explorations to estimate system reliability in the presence of transient soft errors. In this paper, we present gemV, the first accurate and comprehensive vulnerability estimation toolset, which is configurable and extensible for analysing future and novel architecture and microarchitecture designs. Key features of gemV are: (1) all microarchitecture components that store bits, even temporarily, are modeled for their vulnerability in the gem5 cycle-accurate simulation platform; (2) its models have been validated (<3{\%} correlation error with 90{\%} statistical confidence) through exhaustive bit-level fault injection experiments; (3) the analytical models incorporate microarchitecture-level masking effects such as speculative execution and flushes; and (4) the modular design of the vulnerability models makes them easy to extend and integrate when novel microarchitecture designs are explored. In addition to microarchitecture-level evaluation of system reliability, gemV provides a means to perform software-level design space explorations of the performance-vulnerability trade-offs of algorithm choices, compilers, compiler optimization levels, etc. A system designer can further use gemV to explore the performance-vulnerability trade-offs of choosing different ISAs.",
author = "Karthik Tanikella and Yohan Koy and Reiley Jeyapaul and Kyoungwoo Lee and Aviral Shrivastava",
year = "2016",
month = "11",
day = "28",
doi = "10.1109/ASAP.2016.7760786",
language = "English",
volume = "2016-November",
pages = "159--163",
booktitle = "2016 IEEE 27th International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

Tanikella, K, Koy, Y, Jeyapaul, R, Lee, K & Shrivastava, A 2016, GemV: A validated toolset for the early exploration of system reliability. in 2016 IEEE 27th International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2016. vol. 2016-November, 7760786, Institute of Electrical and Electronics Engineers Inc., pp. 159-163, 27th IEEE International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2016, London, United Kingdom, 16/7/6. https://doi.org/10.1109/ASAP.2016.7760786

GemV: A validated toolset for the early exploration of system reliability. / Tanikella, Karthik; Koy, Yohan; Jeyapaul, Reiley; Lee, Kyoungwoo; Shrivastava, Aviral.

2016 IEEE 27th International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2016. Vol. 2016-November. Institute of Electrical and Electronics Engineers Inc., 2016. p. 159-163 7760786.

TY - GEN

T1 - GemV

T2 - A validated toolset for the early exploration of system reliability

AU - Tanikella, Karthik

AU - Koy, Yohan

AU - Jeyapaul, Reiley

AU - Lee, Kyoungwoo

AU - Shrivastava, Aviral

PY - 2016/11/28

Y1 - 2016/11/28

N2 - Decades of technology scaling have brought the threat of soft errors to modern embedded processors. Though several methods have been proposed to protect systems from soft errors, their effectiveness in ensuring error-free computing cannot be guaranteed without accurate, quantitative estimation of system reliability. The vulnerability metric, which estimates the likelihood of device failure by accurately evaluating the time during which the device is exposed to soft errors, provides the most effective means to perform early design space explorations to estimate system reliability in the presence of transient soft errors. In this paper, we present gemV, the first accurate and comprehensive vulnerability estimation toolset, which is configurable and extensible for analysing future and novel architecture and microarchitecture designs. Key features of gemV are: (1) all microarchitecture components that store bits, even temporarily, are modeled for their vulnerability in the gem5 cycle-accurate simulation platform; (2) its models have been validated (<3% correlation error with 90% statistical confidence) through exhaustive bit-level fault injection experiments; (3) the analytical models incorporate microarchitecture-level masking effects such as speculative execution and flushes; and (4) the modular design of the vulnerability models makes them easy to extend and integrate when novel microarchitecture designs are explored. In addition to microarchitecture-level evaluation of system reliability, gemV provides a means to perform software-level design space explorations of the performance-vulnerability trade-offs of algorithm choices, compilers, compiler optimization levels, etc. A system designer can further use gemV to explore the performance-vulnerability trade-offs of choosing different ISAs.

UR - http://www.scopus.com/inward/record.url?scp=85006918776&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85006918776&partnerID=8YFLogxK

U2 - 10.1109/ASAP.2016.7760786

DO - 10.1109/ASAP.2016.7760786

M3 - Conference contribution

VL - 2016-November

SP - 159

EP - 163

BT - 2016 IEEE 27th International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2016

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Tanikella K, Koy Y, Jeyapaul R, Lee K, Shrivastava A. GemV: A validated toolset for the early exploration of system reliability. In 2016 IEEE 27th International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2016. Vol. 2016-November. Institute of Electrical and Electronics Engineers Inc. 2016. p. 159-163. 7760786 https://doi.org/10.1109/ASAP.2016.7760786