Microarray technology has produced large volumes of data, turning many problems in biology into computational ones. As a result, techniques for extracting useful information from these data have been developed. In particular, microarray technology has been applied to the prediction and diagnosis of cancer, where it is expected to make both more accurate. To classify cancer precisely, we must select the genes related to cancer, because the gene data extracted from microarrays are noisy. In this paper, we explore seven feature selection methods and four classifiers, and propose ensemble classifiers, in order to systematically evaluate the performance of the feature selection methods and machine learning classifiers on three benchmark datasets: the leukemia, colon cancer, and lymphoma datasets. To improve classification performance, the classifiers are combined by majority voting, weighted voting, and a Bayesian approach. Experimental results show that an ensemble of several base classifiers produces the best recognition rate on the benchmark datasets.
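As an illustration of two of the combination schemes named above, the following is a minimal sketch of majority voting and weighted voting over the outputs of several base classifiers. It is not the paper's implementation: the function names, the class labels, and the weights (imagined here as per-classifier validation accuracies) are assumptions made for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the largest number of classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    Each classifier's vote counts in proportion to its weight
    (hypothetical here; e.g. an accuracy measured on held-out data).
    """
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


# Hypothetical outputs of three base classifiers for one sample.
predictions = ["AML", "ALL", "ALL"]
# Hypothetical weights: the first classifier is trusted far more.
weights = [0.9, 0.4, 0.4]

majority_label = majority_vote(predictions)
weighted_label = weighted_vote(predictions, weights)
```

With these made-up values the two schemes disagree: two of the three classifiers vote for `ALL`, but the single `AML` vote carries more total weight, which is the kind of case where the choice of combination method matters.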
|Number of pages||16|
|Journal||International Journal of Software Engineering and Knowledge Engineering|
|Publication status||Published - 2003 Dec|
Bibliographical note
Funding Information:
This work was supported by the Biometrics Engineering Research Center and by a grant from the Korea Health 21 R&D Project, Ministry of Health & Welfare, Republic of Korea.
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Computer Graphics and Computer-Aided Design
- Artificial Intelligence