Design techniques to improve the resilience of computing systems: Architectural layer

Aviral Shrivastava, Kyoungwoo Lee, Hwisoo So, Jinhyo Jung, Prudhvi Gali

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Unreliable hardware components affect a computing system at several levels: from incorrect transistor outputs, to incorrect values in memory elements, to incorrect program variables and control flow, and finally to application failure. Resilience is the ability of a system to tolerate errors when they occur, and it comprises two main aspects: (i) how to detect errors and (ii) how to recover from them. The lower the level of abstraction at which an error can be detected and corrected, the less disruption it causes to the upper layers of the computing abstraction. This chapter gives an overview of the techniques at the processor architecture level for detecting and correcting errors.

Original language: English
Title of host publication: Cross-Layer Reliability of Computing Systems
Publisher: Institution of Engineering and Technology
Pages: 43-94
Number of pages: 52
ISBN (Electronic): 9781785617973
Publication status: Published - 2020 Jan 1

Bibliographical note

Publisher Copyright:
© The Institution of Engineering and Technology 2020.

All Science Journal Classification (ASJC) codes

  • Engineering(all)
