Performance of Machine Learning Algorithms for Class-Imbalanced Process Fault Detection Problems

Taehyung Lee, Ki Bum Lee, Chang Ouk Kim

Research output: Contribution to journal › Article › peer-review

41 Citations (Scopus)


In recent years, the semiconductor manufacturing industry has recognized class imbalance as a major impediment to the development of high-performance fault detection (FD) models. Class imbalance refers to skews in class distribution in which normal wafer samples are considerably more abundant than fault samples. In such a situation, standard machine learning algorithms create FD models with classification boundaries that are biased toward majority-class data, resulting in high type II error rates. In this paper, we compare the performance of machine learning algorithms for class-imbalanced FD problems. We evaluate the performance of three sampling-based algorithms, four ensemble algorithms, four instance-based algorithms, and two support vector machine algorithms. Two experiments were conducted to compare algorithm performance using etching process data and chemical vapor deposition process data. Different data scenarios were considered by setting the imbalance ratio to three levels. The results of the experiments indicated that the instance-based algorithms presented excellent performance even when the imbalance ratio increased.
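As a minimal illustration of why class imbalance matters, the sketch below computes an imbalance ratio and a type II error rate on made-up labels. It is not taken from the paper: the function names, the toy data, and the definition of the imbalance ratio as the normal-to-fault count ratio are all assumptions made for this example.

```python
# Illustrative sketch only; the names, data, and definitions are assumptions.

def imbalance_ratio(labels):
    """Normal-to-fault count ratio (assumed definition; 0 = normal, 1 = fault)."""
    n_fault = sum(1 for y in labels if y == 1)
    n_normal = len(labels) - n_fault
    return n_normal / n_fault

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def type_ii_error_rate(y_true, y_pred):
    """Fraction of true fault samples that were predicted as normal."""
    fault_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
    missed = sum(1 for p in fault_preds if p == 0)
    return missed / len(fault_preds)

# Toy data: 95 normal wafers and 5 fault wafers (values are invented).
y_true = [0] * 95 + [1] * 5
# A degenerate model biased entirely toward the majority class.
y_pred = [0] * 100

ratio = imbalance_ratio(y_true)
acc = accuracy(y_true, y_pred)
miss = type_ii_error_rate(y_true, y_pred)
print(ratio, acc, miss)
```

On this toy data the always-normal model scores high accuracy while missing every fault, which is the kind of majority-class bias the abstract describes.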

Original language: English
Article number: 7549079
Pages (from-to): 436-445
Number of pages: 10
Journal: IEEE Transactions on Semiconductor Manufacturing
Issue number: 4
Publication status: Published - 2016 Nov

Bibliographical note

Publisher Copyright:
© 1988-2012 IEEE.

All Science Journal Classification (ASJC) codes

  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics
  • Industrial and Manufacturing Engineering
  • Electrical and Electronic Engineering

