Experimental study for the comparison of classifier combination methods

S. Y. Sohn, H. W. Shin

Research output: Contribution to journal (Article)

21 Citations (Scopus)

Abstract

In this paper, we compare the performance of classifier combination methods (bagging, the modified random subspace method, classifier selection, and parametric fusion) with that of logistic regression under various characteristics of the input data. Four factors are used to simulate the logistic model: (a) the combination function among input variables, (b) the correlation between input variables, (c) the variance of observations, and (d) the training data set size. Because the combination function among input variables is typically unknown, we use a Taguchi design and treat it as an uncontrollable factor, which improves the practicality of our results. Our experimental results indicate the following. When the training set is large, the performances of logistic regression and bagging are not significantly different. However, when the training set is small, logistic regression performs worse than bagging. When the training set is small and the correlation is strong, both the modified random subspace method and bagging perform better than the other three methods. When the correlation is weak and the variance is small, both parametric fusion and the classifier selection algorithm appear, to our disappointment, to perform the worst.
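The kind of simulation the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' procedure: the two-input model, the coefficients, the reading of "variance of observation" as Gaussian noise on the linear predictor, the use of logistic regression as the bagged base learner, and all function names (`simulate`, `fit_bagging`, `run_cell`) are assumptions made for the example. It uses only the Python standard library.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def simulate(n, rho, sigma, interaction, rng):
    """Draw n observations from a two-input logistic model (assumed setup)."""
    data = []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        # x2 has unit variance and correlation rho with x1 (factor b).
        x2 = rho * x1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        # Factor (a): linear vs. interaction combination; factor (c): noise.
        eta = x1 + x2 + (x1 * x2 if interaction else 0.0) + rng.gauss(0, sigma)
        y = 1 if rng.random() < sigmoid(eta) else 0
        data.append(((x1, x2), y))
    return data

def fit_logistic(data, epochs=200, lr=0.1):
    """Fit a main-effects logistic regression by batch gradient ascent."""
    w = [0.0, 0.0, 0.0]  # intercept, x1, x2
    n = len(data)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for (x1, x2), y in data:
            err = y - sigmoid(w[0] + w[1] * x1 + w[2] * x2)
            grad[0] += err
            grad[1] += err * x1
            grad[2] += err * x2
        w = [wi + lr * gi / n for wi, gi in zip(w, grad)]
    return w

def predict(w, x):
    return 1 if w[0] + w[1] * x[0] + w[2] * x[1] >= 0 else 0

def fit_bagging(data, n_models, rng):
    """Fit one logistic model per bootstrap resample of the training set."""
    return [fit_logistic([rng.choice(data) for _ in data])
            for _ in range(n_models)]

def predict_bagging(models, x):
    # Majority vote over the bootstrap models (ties go to class 1).
    votes = sum(predict(w, x) for w in models)
    return 1 if 2 * votes >= len(models) else 0

def accuracy(predict_fn, data):
    return sum(predict_fn(x) == y for x, y in data) / len(data)

def run_cell(n_train, rho, sigma, interaction, seed=0):
    """Return (logistic accuracy, bagging accuracy) for one design cell."""
    rng = random.Random(seed)
    train = simulate(n_train, rho, sigma, interaction, rng)  # factor (d)
    test = simulate(500, rho, sigma, interaction, rng)
    single = fit_logistic(train)
    bagged = fit_bagging(train, 15, rng)
    return (accuracy(lambda x: predict(single, x), test),
            accuracy(lambda x: predict_bagging(bagged, x), test))
```

Calling, for example, `run_cell(30, 0.8, 0.5, True)` and `run_cell(300, 0.8, 0.5, True)` gives one small-sample and one large-sample cell with strong correlation. A single seed says little on its own; a comparison in the spirit of the study would repeat each cell over many seeds and over the levels of all four factors.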

Original language: English
Pages (from-to): 33-40
Number of pages: 8
Journal: Pattern Recognition
Volume: 40
Issue number: 1
DOIs: https://doi.org/10.1016/j.patcog.2006.06.027
Publication status: Published - 2007 Jan 1

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

Cite this

@article{4705149adbd84ea1b6e3036fb33a7001,
title = "Experimental study for the comparison of classifier combination methods",
author = "Sohn, {S. Y.} and Shin, {H. W.}",
year = "2007",
month = "1",
day = "1",
doi = "10.1016/j.patcog.2006.06.027",
language = "English",
volume = "40",
pages = "33--40",
journal = "Pattern Recognition",
issn = "0031-3203",
publisher = "Elsevier Limited",
number = "1",
}

Experimental study for the comparison of classifier combination methods. / Sohn, S. Y.; Shin, H. W.

In: Pattern Recognition, Vol. 40, No. 1, 01.01.2007, p. 33-40.

TY - JOUR

T1 - Experimental study for the comparison of classifier combination methods

AU - Sohn, S. Y.

AU - Shin, H. W.

PY - 2007/1/1

Y1 - 2007/1/1

UR - http://www.scopus.com/inward/record.url?scp=33749250604&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33749250604&partnerID=8YFLogxK

U2 - 10.1016/j.patcog.2006.06.027

DO - 10.1016/j.patcog.2006.06.027

M3 - Article

AN - SCOPUS:33749250604

VL - 40

SP - 33

EP - 40

JO - Pattern Recognition

JF - Pattern Recognition

SN - 0031-3203

IS - 1

ER -