Towards optimal feature and classifier for gene expression classification of cancer

Jungwon Ryu, Sung Bae Cho

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

Demand for tools that efficiently analyze biological genomic information has recently been on the rise. In this paper, we explore the optimal features and classifiers through a comparative study of the most promising feature selection methods and machine learning classifiers. To predict the cancer class, we use gene expression data, measured by DNA microarray, from the marrow of patients who have either acute myeloid leukemia or acute lymphoblastic leukemia. Pearson’s and Spearman’s correlation, Euclidean distance, cosine coefficient, information gain, mutual information and signal-to-noise ratio are used for feature selection. Backpropagation neural network, self-organizing map, structure-adaptive self-organizing map, support vector machine, inductive decision tree and k-nearest neighbor are used for classification. Experimental results indicate that the backpropagation neural network with Pearson’s correlation coefficient is the best method, achieving a recognition rate of 97.1% on the test data.
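
As a rough illustration of two of the feature selection criteria named in the abstract, the sketch below ranks genes by the absolute value of a score computed against the class labels. It is a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`pearson`, `signal_to_noise`, `top_genes`), the 0/1 label encoding, the particular signal-to-noise form (difference of class means over the sum of class standard deviations) and the toy expression values are all hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def signal_to_noise(x, labels):
    """(mean_1 - mean_0) / (std_1 + std_0), using per-class statistics.

    Assumed form of the signal-to-noise score; labels are 0 or 1.
    """
    pos = [v for v, c in zip(x, labels) if c == 1]
    neg = [v for v, c in zip(x, labels) if c == 0]
    denom = stdev(pos) + stdev(neg)
    return (mean(pos) - mean(neg)) / denom if denom else 0.0


def top_genes(expression, labels, score, k):
    """Return the names of the k genes with the largest |score|."""
    ranked = sorted(
        expression,
        key=lambda gene: abs(score(expression[gene], labels)),
        reverse=True,
    )
    return ranked[:k]


# Toy data, made up for illustration: three genes measured in six samples.
labels = [1, 1, 1, 0, 0, 0]  # hypothetical encoding of the two leukemia classes
expression = {
    "gene_a": [5.1, 4.9, 5.3, 1.0, 1.2, 0.9],  # high in class 1
    "gene_b": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # roughly uninformative
    "gene_c": [0.5, 0.7, 0.6, 3.9, 4.1, 4.0],  # high in class 0
}

selected = top_genes(expression, labels, pearson, k=2)
```

With this toy data, both scores favor `gene_a` and `gene_c` over `gene_b`; the selected genes would then be passed to a classifier of choice.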

Original language: English
Title of host publication: Advances in Soft Computing - AFSS 2002 - 2002 AFSS International Conference on Fuzzy Systems, Proceedings
Editors: Nikhil R. Pal, Michio Sugeno
Publisher: Springer Verlag
Pages: 310-317
Number of pages: 8
ISBN (Print): 9783540431503
Publication status: Published - 2002 Jan 1
Event: 5th International Conference on Asian Fuzzy Systems Society, AFSS 2002 - Calcutta, India
Duration: 2002 Feb 3 → 2002 Feb 6

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2275
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 5th International Conference on Asian Fuzzy Systems Society, AFSS 2002
Country: India
City: Calcutta
Period: 02/2/3 → 02/2/6

Fingerprint

Back-propagation Neural Network
Leukemia
Self organizing maps
Self-organizing Map
Backpropagation
Gene expression
Acute
Feature Selection
Gene Expression
Feature extraction
Cancer
Classifiers
Classifier
Neural networks
Pearson Correlation
DNA Microarray
Information Gain
Microarrays
Euclidean Distance
Decision trees

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Ryu, J., & Cho, S. B. (2002). Towards optimal feature and classifier for gene expression classification of cancer. In N. R. Pal, & M. Sugeno (Eds.), Advances in Soft Computing - AFSS 2002 - 2002 AFSS International Conference on Fuzzy Systems, Proceedings (pp. 310-317). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 2275). Springer Verlag.
@inproceedings{2747bf9a80fe4238b811fec8d9b7b764,
title = "Towards optimal feature and classifier for gene expression classification of cancer",
author = "Jungwon Ryu and Cho, {Sung Bae}",
year = "2002",
month = "1",
day = "1",
language = "English",
isbn = "9783540431503",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "310--317",
editor = "Pal, {Nikhil R.} and Michio Sugeno",
booktitle = "Advances in Soft Computing - AFSS 2002 - 2002 AFSS International Conference on Fuzzy Systems, Proceedings",
address = "Germany",

}

TY - GEN

T1 - Towards optimal feature and classifier for gene expression classification of cancer

AU - Ryu, Jungwon

AU - Cho, Sung Bae

PY - 2002/1/1

Y1 - 2002/1/1

UR - http://www.scopus.com/inward/record.url?scp=23044530914&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=23044530914&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:23044530914

SN - 9783540431503

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 310

EP - 317

BT - Advances in Soft Computing - AFSS 2002 - 2002 AFSS International Conference on Fuzzy Systems, Proceedings

A2 - Pal, Nikhil R.

A2 - Sugeno, Michio

PB - Springer Verlag

ER -
