Experimental evaluation of failure-detection schemes in real-time communication networks

Seungjae Han, Kang G. Shin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

11 Citations (Scopus)

Abstract

An effective failure-detection scheme is essential for reliable communication services. Most computer networks rely on behavior-based detection schemes: each node uses heartbeats to detect the failure of its neighbor nodes, and the transport protocol (such as TCP) achieves reliable communication through acknowledgment and retransmission. In this paper, we experimentally evaluate the effectiveness of such behavior-based detection schemes in real-time communication. Specifically, we measure and analyze the coverage and latency of two failure-detection schemes (neighbor detection and end-to-end detection) through fault-injection experiments. The experimental results show that a significant portion of failures can be detected very quickly by the neighbor detection scheme, while the end-to-end detection scheme uncovers the remaining failures with larger detection latencies.
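
As an illustration of the two behavior-based schemes named in the abstract, the following is a minimal sketch and not the authors' implementation. The class `NeighborDetector`, the function `end_to_end_failed`, the timeout value, and the retry limit are all hypothetical; the abstract does not specify any timing parameters or interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class NeighborDetector:
    """Neighbor detection: suspect a neighbor once its heartbeats stop.

    `timeout` is an assumed parameter; the abstract gives no timing values.
    """
    timeout: float
    last_seen: dict[str, float] = field(default_factory=dict)

    def heartbeat(self, neighbor: str, now: float) -> None:
        # Record the arrival time of the latest heartbeat from this neighbor.
        self.last_seen[neighbor] = now

    def failed(self, now: float) -> set[str]:
        # A neighbor is suspected when its last heartbeat is older than timeout.
        return {n for n, t in self.last_seen.items() if now - t > self.timeout}


def end_to_end_failed(acks: list[bool], max_retries: int) -> bool:
    """End-to-end detection from the sender's side (hypothetical helper).

    `acks[i]` says whether transmission attempt i was acknowledged. The path
    is declared failed only after the first send plus `max_retries`
    retransmissions have all gone unacknowledged.
    """
    attempts = acks[: max_retries + 1]
    return len(attempts) == max_retries + 1 and not any(attempts)


detector = NeighborDetector(timeout=3.0)
detector.heartbeat("B", now=0.0)
detector.heartbeat("C", now=0.0)
detector.heartbeat("B", now=2.0)  # B keeps sending; C has gone silent
suspected = detector.failed(now=4.0)
```

In this sketch the neighbor scheme reacts within one heartbeat timeout, whereas the end-to-end check must wait for every retransmission attempt to expire, which is one plausible reading of why the abstract reports larger latencies for end-to-end detection.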

Original language: English
Title of host publication: Digest of Papers - 27th Annual International Symposium on Fault-Tolerant Computing, FTCS 1997
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 122-131
Number of pages: 10
ISBN (Electronic): 0818678313, 9780818678318
Publication status: Published - 1997
Event: 27th Annual International Symposium on Fault-Tolerant Computing, FTCS 1997 - Seattle, United States
Duration: 1997 Jun 24 to 1997 Jun 27

Publication series

Name: Digest of Papers - 27th Annual International Symposium on Fault-Tolerant Computing, FTCS 1997

Other

Other: 27th Annual International Symposium on Fault-Tolerant Computing, FTCS 1997
Country: United States
City: Seattle
Period: 97/6/24 to 97/6/27

Bibliographical note

Funding Information:
The work reported in this paper was supported in part by the Advanced Research Projects Agency, monitored by the US Air Force Rome Laboratory under Grant F30602-95-1-0044, the National Science Foundation under Grant MIP-9203895, and the Office of Naval Research under Grant N00014-94-1-0229. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Hardware and Architecture
  • Software
  • Safety, Risk, Reliability and Quality
