Exploiting the categorical reliability difference for binary classification

Lei Sun, Kar Ann Toh, Badong Chen, Zhiping Lin

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

In binary pattern classification, the reliabilities of statistics obtained from the samples of the two categories are generally different. When the statistics are used for modeling a classifier, such reliability difference could impact the generalization performance. We formulate a disparity index to show the statistical disparity based on the generalized eigenvalue decomposition of the categorical moment matrices. It is shown that this disparity index can effectively indicate the reliability difference between the two categories. The obtained reliability difference is subsequently utilized to adjust the regularization term of a classifier for effective learning generalization. Our experiments based on 10 real-world benchmark data sets validate the effectiveness of the proposed method.
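The abstract names a generalized eigenvalue decomposition of the two categorical moment matrices but does not give the disparity index itself. The following is a minimal sketch of that decomposition step only, assuming NumPy, second-order moment matrices with a bias column, and a small ridge for numerical stability. The function names and the `toy_disparity` summary are hypothetical illustrations, not the index defined in the article.

```python
import numpy as np


def moment_matrix(X):
    """Second-order moment matrix (1/n) * Xb^T Xb, where Xb is X with a leading bias column."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return Xb.T @ Xb / X.shape[0]


def generalized_eigenvalues(A, B, ridge=1e-8):
    """Eigenvalues lam of A v = lam (B + ridge * I) v for symmetric A and positive definite B."""
    d = B.shape[0]
    L = np.linalg.cholesky(B + ridge * np.eye(d))
    # Whiten A by B: C = L^{-1} A L^{-T} has the same eigenvalues as the matrix pencil.
    Y = np.linalg.solve(L, A)
    C = np.linalg.solve(L, Y.T).T
    # Symmetrize to remove round-off asymmetry before the symmetric eigen-solver.
    return np.linalg.eigvalsh((C + C.T) / 2)


def toy_disparity(lams):
    """Hypothetical summary: mean |log eigenvalue|; zero when the two matrices coincide."""
    return float(np.mean(np.abs(np.log(lams))))


# Synthetic two-category data (assumed for illustration): many samples vs. few samples.
rng = np.random.default_rng(0)
X_pos = rng.normal(size=(200, 3))
X_neg = 2.0 * rng.normal(size=(20, 3))

lams = generalized_eigenvalues(moment_matrix(X_pos), moment_matrix(X_neg))
print("generalized eigenvalues:", lams)
print("toy disparity:", toy_disparity(lams))
```

Eigenvalues far from 1 indicate directions in which the two categories' moment statistics differ strongly. How such a quantity would feed into the regularization term is specified in the article, not in this sketch.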

Original language: English
Pages (from-to): 2022-2040
Number of pages: 19
Journal: Journal of the Franklin Institute
Volume: 355
Issue number: 4
DOI: 10.1016/j.jfranklin.2017.11.024
Publication status: Published - 2018 Mar 1

Fingerprint

Binary Classification
Categorical
Classifiers
Classifier
Statistics
Eigenvalue Decomposition
Generalized Eigenvalue
Moment Matrix
Pattern Classification
Pattern recognition
Regularization
Benchmark
Decomposition
Term
Modeling
Experiment
Experiments
Generalization

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Signal Processing
  • Computer Networks and Communications
  • Applied Mathematics

Cite this

Sun, Lei; Toh, Kar Ann; Chen, Badong; Lin, Zhiping. / Exploiting the categorical reliability difference for binary classification. In: Journal of the Franklin Institute. 2018; Vol. 355, No. 4. pp. 2022-2040.
@article{3fde04d9445c43feb4e563f16d0640fd,
title = "Exploiting the categorical reliability difference for binary classification",
abstract = "In binary pattern classification, the reliabilities of statistics obtained from the samples of the two categories are generally different. When the statistics are used for modeling a classifier, such reliability difference could impact the generalization performance. We formulate a disparity index to show the statistical disparity based on the generalized eigenvalue decomposition of the categorical moment matrices. It is shown that this disparity index can effectively indicate the reliability difference between the two categories. The obtained reliability difference is subsequently utilized to adjust the regularization term of a classifier for effective learning generalization. Our experiments based on 10 real-world benchmark data sets validate the effectiveness of the proposed method.",
author = "Lei Sun and Toh, {Kar Ann} and Badong Chen and Zhiping Lin",
year = "2018",
month = "3",
day = "1",
doi = "10.1016/j.jfranklin.2017.11.024",
language = "English",
volume = "355",
pages = "2022--2040",
journal = "Journal of the Franklin Institute",
issn = "0016-0032",
publisher = "Elsevier Limited",
number = "4",

}

TY - JOUR

T1 - Exploiting the categorical reliability difference for binary classification

AU - Sun, Lei

AU - Toh, Kar Ann

AU - Chen, Badong

AU - Lin, Zhiping

PY - 2018/3/1

Y1 - 2018/3/1

AB - In binary pattern classification, the reliabilities of statistics obtained from the samples of the two categories are generally different. When the statistics are used for modeling a classifier, such reliability difference could impact the generalization performance. We formulate a disparity index to show the statistical disparity based on the generalized eigenvalue decomposition of the categorical moment matrices. It is shown that this disparity index can effectively indicate the reliability difference between the two categories. The obtained reliability difference is subsequently utilized to adjust the regularization term of a classifier for effective learning generalization. Our experiments based on 10 real-world benchmark data sets validate the effectiveness of the proposed method.

UR - http://www.scopus.com/inward/record.url?scp=85038849332&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85038849332&partnerID=8YFLogxK

U2 - 10.1016/j.jfranklin.2017.11.024

DO - 10.1016/j.jfranklin.2017.11.024

M3 - Article

AN - SCOPUS:85038849332

VL - 355

SP - 2022

EP - 2040

JO - Journal of the Franklin Institute

JF - Journal of the Franklin Institute

SN - 0016-0032

IS - 4

ER -