In recent years, the semiconductor manufacturing industry has recognized class imbalance as a major impediment to the development of high-performance fault detection (FD) models. Class imbalance refers to a skewed class distribution in which normal wafer samples are considerably more abundant than fault samples. In such a situation, standard machine learning algorithms produce FD models whose classification boundaries are biased toward the majority class, resulting in high type II error rates, that is, many faulty wafers misclassified as normal. In this paper, we compare the performance of machine learning algorithms on class-imbalanced FD problems. We evaluate three sampling-based algorithms, four ensemble algorithms, four instance-based algorithms, and two support vector machine algorithms. We conducted two experiments to compare algorithm performance using etching process data and chemical vapor deposition process data. To cover different data scenarios, we set the imbalance ratio to three levels. The experimental results indicate that the instance-based algorithms maintained excellent performance even as the imbalance ratio increased.
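The effect of class imbalance on the type II error rate can be illustrated with a minimal sketch. It is not taken from the paper's experiments: the toy labels, the helper names `imbalance_ratio` and `type_ii_error_rate`, and the definition of the imbalance ratio as the number of normal samples divided by the number of fault samples are all assumptions made for illustration. The example shows how a model that always predicts "normal" can reach high accuracy while missing every fault.

```python
def imbalance_ratio(labels):
    """Normal-to-fault count ratio (assumed definition; 0 = normal, 1 = fault)."""
    n_fault = sum(1 for y in labels if y == 1)
    n_normal = len(labels) - n_fault
    if n_fault == 0:
        raise ValueError("no fault samples in labels")
    return n_normal / n_fault


def type_ii_error_rate(y_true, y_pred):
    """Fraction of true fault samples that were predicted as normal."""
    fault_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not fault_preds:
        raise ValueError("no fault samples in y_true")
    missed = sum(1 for p in fault_preds if p == 0)
    return missed / len(fault_preds)


# Hypothetical toy data: 95 normal wafers and 5 faulty wafers.
y_true = [0] * 95 + [1] * 5

# A degenerate model biased entirely toward the majority class:
# it labels every wafer as normal.
y_pred = [0] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print("imbalance ratio:", imbalance_ratio(y_true))
print("accuracy:", accuracy)
print("type II error rate:", type_ii_error_rate(y_true, y_pred))
```

On this toy data the accuracy looks high even though no fault is detected, which is why accuracy alone is a poor criterion for comparing FD models under class imbalance.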
All Science Journal Classification (ASJC) codes
- Electronic, Optical and Magnetic Materials
- Condensed Matter Physics
- Industrial and Manufacturing Engineering
- Electrical and Electronic Engineering