Because accurate classification of DNA microarray data is important for cancer treatment, it is preferable to combine the decisions of several expert classifiers rather than rely on a single classifier. Despite the many advantages of mutually error-correlated ensemble classifiers, their performance is limited. It is difficult to build an optimal ensemble for DNA analysis, where there are few samples and many features. Usually, different feature sets are used to train the components of the ensemble in the hope of improving on the individual classifiers. If the feature sets provide similar information, however, combining the classifiers trained on them cannot improve performance, because the classifiers make the same errors and cannot compensate for one another. In this paper, we use correlation analysis of feature selection methods as a guideline for partitioning features among the components of the ensemble. We propose two different correlation methods for generating the feature sets used to train ensemble classifiers. Each ensemble classifier combines several classifiers, each trained on a different feature set chosen by correlation analysis, to classify cancer precisely. We systematically evaluate the proposed method on three benchmark datasets. Experimental results show that the two ensemble classifiers whose components are trained on feature sets that are negatively or complementarily correlated with each other produce the best recognition rates on the three benchmark datasets.
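The idea of using correlation between feature selection methods to choose feature sets can be illustrated with a minimal sketch. The abstract does not name the feature selection methods or the two correlation methods, so everything below is an assumption made for illustration: the two scorers (`abs_pearson_score`, `signal_to_noise_score`), the use of Pearson correlation between per-feature score vectors, and the helper names are all hypothetical and are not the authors' procedure.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def abs_pearson_score(column, labels):
    """Hypothetical scorer 1: |correlation| between one gene and the 0/1 class label."""
    return abs(pearson(column, labels))


def signal_to_noise_score(column, labels):
    """Hypothetical scorer 2: class-mean gap over summed class spreads.

    Assumes binary labels (0/1) with at least one sample in each class.
    """
    pos = [x for x, y in zip(column, labels) if y == 1]
    neg = [x for x, y in zip(column, labels) if y == 0]
    spread = pstdev(pos) + pstdev(neg)
    return abs(mean(pos) - mean(neg)) / spread if spread > 0 else 0.0


def score_features(samples, labels, scorer):
    """Score every feature (column) of a row-major sample matrix."""
    return [scorer(list(col), labels) for col in zip(*samples)]


def least_correlated_pair(score_vectors):
    """Return the two method names whose per-feature scores correlate least.

    score_vectors maps a method name to its list of per-feature scores.
    A strongly negative value suggests the two methods rank features differently.
    """
    pairs = combinations(sorted(score_vectors), 2)
    return min(pairs, key=lambda p: pearson(score_vectors[p[0]], score_vectors[p[1]]))


def top_k_features(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```

Under these assumptions, each method in the selected pair would supply its own top-ranked features to a separate component classifier, and the component outputs would then be combined, for example by voting.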
Bibliographical note
Funding Information:
This research was supported by the Brain Science and Engineering Research Program sponsored by the Korean Ministry of Commerce, Industry and Energy.
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Cognitive Neuroscience
- Artificial Intelligence