An Incremental Clustering-Based Fault Detection Algorithm for Class-Imbalanced Process Data

Jueun Kwak, Taehyung Lee, Chang Ouk Kim

Research output: Contribution to journal: Article

20 Citations (Scopus)

Abstract

Training a fault detection model requires advanced data-mining algorithms when the growth rate of the process data is notably high and normal-class data overwhelm fault-class data in number. Most standard classification algorithms, such as support vector machines (SVMs), can handle moderate sizes of training data and assume balanced class distributions. When the class sizes are highly imbalanced, the standard algorithms tend to strongly favor the majority class and provide a notably low detection rate for the minority class as a result. In this paper, we propose an online fault detection algorithm based on incremental clustering. The algorithm accurately finds wafer faults even under severe class distribution skews and processes massive sensor data efficiently by reducing the required storage. We tested our algorithm on illustrative examples and an industrial example. The algorithm performed well with the illustrative examples that included imbalanced class distributions of Gaussian and non-Gaussian types and process drifts. In the industrial example, which simulated real data from a plasma etcher, we verified that the algorithm outperformed the standard SVM, the one-class SVM, and three instance-based fault detection algorithms that are typically used in the literature.

Original language: English
Article number: 7123674
Pages (from-to): 318-328
Number of pages: 11
Journal: IEEE Transactions on Semiconductor Manufacturing
Volume: 28
Issue number: 3
DOI: 10.1109/TSM.2015.2445380
Publication status: Published - 2015 Aug 1

Fingerprint

Fault detection
Support vector machines
Education
Data mining
Minorities
Wafers
Plasmas
Sensors

All Science Journal Classification (ASJC) codes

  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics
  • Industrial and Manufacturing Engineering
  • Electrical and Electronic Engineering

Cite this

@article{6f0d253822cc4815a8d02003532a0a54,
title = "An Incremental Clustering-Based Fault Detection Algorithm for Class-Imbalanced Process Data",
abstract = "Training a fault detection model requires advanced data-mining algorithms when the growth rate of the process data is notably high and normal-class data overwhelm fault-class data in number. Most standard classification algorithms, such as support vector machines (SVMs), can handle moderate sizes of training data and assume balanced class distributions. When the class sizes are highly imbalanced, the standard algorithms tend to strongly favor the majority class and provide a notably low detection rate for the minority class as a result. In this paper, we propose an online fault detection algorithm based on incremental clustering. The algorithm accurately finds wafer faults even under severe class distribution skews and processes massive sensor data efficiently by reducing the required storage. We tested our algorithm on illustrative examples and an industrial example. The algorithm performed well with the illustrative examples that included imbalanced class distributions of Gaussian and non-Gaussian types and process drifts. In the industrial example, which simulated real data from a plasma etcher, we verified that the algorithm outperformed the standard SVM, the one-class SVM, and three instance-based fault detection algorithms that are typically used in the literature.",
author = "Kwak, Jueun and Lee, Taehyung and Kim, {Chang Ouk}",
year = "2015",
month = "8",
day = "1",
doi = "10.1109/TSM.2015.2445380",
language = "English",
volume = "28",
pages = "318--328",
journal = "IEEE Transactions on Semiconductor Manufacturing",
issn = "0894-6507",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "3",
}

An Incremental Clustering-Based Fault Detection Algorithm for Class-Imbalanced Process Data. / Kwak, Jueun; Lee, Taehyung; Kim, Chang Ouk.

In: IEEE Transactions on Semiconductor Manufacturing, Vol. 28, No. 3, 7123674, 01.08.2015, p. 318-328.

TY - JOUR

T1 - An Incremental Clustering-Based Fault Detection Algorithm for Class-Imbalanced Process Data

AU - Kwak, Jueun

AU - Lee, Taehyung

AU - Kim, Chang Ouk

PY - 2015/8/1

Y1 - 2015/8/1

AB - Training a fault detection model requires advanced data-mining algorithms when the growth rate of the process data is notably high and normal-class data overwhelm fault-class data in number. Most standard classification algorithms, such as support vector machines (SVMs), can handle moderate sizes of training data and assume balanced class distributions. When the class sizes are highly imbalanced, the standard algorithms tend to strongly favor the majority class and provide a notably low detection rate for the minority class as a result. In this paper, we propose an online fault detection algorithm based on incremental clustering. The algorithm accurately finds wafer faults even under severe class distribution skews and processes massive sensor data efficiently by reducing the required storage. We tested our algorithm on illustrative examples and an industrial example. The algorithm performed well with the illustrative examples that included imbalanced class distributions of Gaussian and non-Gaussian types and process drifts. In the industrial example, which simulated real data from a plasma etcher, we verified that the algorithm outperformed the standard SVM, the one-class SVM, and three instance-based fault detection algorithms that are typically used in the literature.

UR - http://www.scopus.com/inward/record.url?scp=84938791486&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84938791486&partnerID=8YFLogxK

U2 - 10.1109/TSM.2015.2445380

DO - 10.1109/TSM.2015.2445380

M3 - Article

VL - 28

SP - 318

EP - 328

JO - IEEE Transactions on Semiconductor Manufacturing

JF - IEEE Transactions on Semiconductor Manufacturing

SN - 0894-6507

IS - 3

M1 - 7123674

ER -