Optimal gene selection for cancer classification with partial correlation and k-nearest neighbor classifier

Si Ho Yoo, Sung Bae Cho

Research output: Contribution to journal › Conference article

1 Citation (Scopus)

Abstract

High-density DNA microarrays are widely used in cancer research because they monitor thousands of genes at once. Because microarray experiments combine a small sample size with a large number of genes, selecting significant genes from their expression patterns is an important problem in cancer classification. Many gene selection methods have been investigated, but it is hard to identify a single best one. In this paper we propose a new gene selection method based on partial correlation in regression analysis to find informative genes for predicting cancer. The genes selected by this method tend to carry information about the cancer that does not overlap with the information in previously selected genes. We measured the sensitivity, specificity, and recognition rate of the selected genes with a k-nearest neighbor classifier on a colon cancer dataset. In most cases, the proposed method produced better results than gene selection methods based on correlation coefficients, reaching an accuracy of 90.3% on the colon cancer dataset.
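The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one way partial correlation can drive gene selection, assuming a greedy forward search: at each step, pick the gene whose correlation with the class label is largest after the already selected genes have been regressed out. The function names (`residualize`, `partial_corr`, `select_genes`) and the greedy scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def residualize(v, Z):
    """Residual of v after least-squares regression on the columns of Z plus an intercept."""
    A = np.column_stack([np.ones(len(v)), Z])
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef

def partial_corr(x, y, Z):
    """Correlation between x and y with the linear effect of Z removed from both."""
    rx = residualize(x, Z)
    ry = residualize(y, Z)
    denom = np.linalg.norm(rx) * np.linalg.norm(ry)
    # A (near-)zero residual means x or y is fully explained by Z.
    return 0.0 if denom < 1e-12 else float(rx @ ry / denom)

def select_genes(X, y, n_genes):
    """Greedy forward selection (an assumed scheme).

    X: samples x genes expression matrix, y: class labels coded 0/1.
    Returns the column indices of the selected genes, in selection order.
    """
    y = np.asarray(y, dtype=float)
    selected = []
    for _ in range(n_genes):
        Z = X[:, selected]  # shape (n_samples, 0) on the first pass
        scores = [
            -1.0 if j in selected else abs(partial_corr(X[:, j], y, Z))
            for j in range(X.shape[1])
        ]
        selected.append(int(np.argmax(scores)))
    return selected
```

With no genes selected yet, the partial correlation reduces to the ordinary correlation coefficient, so the first pick matches a plain correlation ranking. Later picks penalize genes that are redundant with earlier ones, which is the non-overlap property the abstract describes.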

Original language: English
Pages (from-to): 713-722
Number of pages: 10
Journal: Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)
Volume: 3157
Publication status: Published - 2004 Dec 1
Event: 8th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2004: Trends in Artificial Intelligence - Auckland, New Zealand
Duration: 2004 Aug 9 to 2004 Aug 13

Fingerprint

Cancer Classification
Partial Correlation
Gene Selection
Nearest Neighbor
Classifiers
Genes
Classifier
Cancer
Gene
DNA Microarray
Small Sample Size
Regression Analysis
Correlation coefficient
Gene Expression
Specificity
High Accuracy
Tend
Monitoring
Predict
Microarrays

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

@article{ecdc91c7c1e14b28828039e1f6f83e26,
title = "Optimal gene selection for cancer classification with partial correlation and k-nearest neighbor classifier",
abstract = "High density DNA microarrays are widely used in cancer research, monitoring thousands of genes at once. Due to small sample size and the large amount of genes in microarray experiments, selection of significant genes via expression patterns is an important matter in cancer classification. Many gene selection methods have been investigated, but it is hard to find out the perfect one. In this paper we propose a new gene selection method based on partial correlation in regression analysis to find the informative genes to predict cancer. The genes selected by this method tend to have information about the cancer that is not overlapped by the genes selected previously. We have measured the sensitivity, specificity, and recognition rate of the selected genes with k-nearest neighbor classifier for colon cancer dataset. In most of the cases, the proposed method has produced better results than the gene selection methods based on correlation coefficients, showing high accuracy of 90.3{\%} for colon cancer dataset.",
author = "Yoo, {Si Ho} and Cho, {Sung Bae}",
year = "2004",
month = "12",
day = "1",
language = "English",
volume = "3157",
pages = "713--722",
journal = "Lecture Notes in Computer Science",
issn = "0302-9743",
publisher = "Springer Verlag",

}

TY - JOUR

T1 - Optimal gene selection for cancer classification with partial correlation and k-nearest neighbor classifier

AU - Yoo, Si Ho

AU - Cho, Sung Bae

PY - 2004/12/1

Y1 - 2004/12/1

N2 - High density DNA microarrays are widely used in cancer research, monitoring thousands of genes at once. Due to small sample size and the large amount of genes in microarray experiments, selection of significant genes via expression patterns is an important matter in cancer classification. Many gene selection methods have been investigated, but it is hard to find out the perfect one. In this paper we propose a new gene selection method based on partial correlation in regression analysis to find the informative genes to predict cancer. The genes selected by this method tend to have information about the cancer that is not overlapped by the genes selected previously. We have measured the sensitivity, specificity, and recognition rate of the selected genes with k-nearest neighbor classifier for colon cancer dataset. In most of the cases, the proposed method has produced better results than the gene selection methods based on correlation coefficients, showing high accuracy of 90.3% for colon cancer dataset.

UR - http://www.scopus.com/inward/record.url?scp=22944446761&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=22944446761&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:22944446761

VL - 3157

SP - 713

EP - 722

JO - Lecture Notes in Computer Science

JF - Lecture Notes in Computer Science

SN - 0302-9743

ER -