RHSBoost: Improving classification performance in imbalance data

Joonho Gong, Hyunjoong Kim

Research output: Contribution to journal (Article)

20 Citations (Scopus)

Abstract

Imbalanced data are datasets in which the class proportions are severely skewed. The classification performance of existing models tends to deteriorate under such class imbalance. In addition, over-representation of the majority classes prevents a classifier from paying attention to the minority classes, which are generally of more interest. An effective ensemble classification method called RHSBoost has been proposed to address the imbalanced classification problem. This classification rule uses random undersampling and ROSE sampling under a boosting scheme. According to the experimental results, RHSBoost appears to be an attractive classification model for imbalanced data.
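
The abstract names two resampling ingredients, random undersampling and ROSE sampling, without giving details. The sketch below is only an illustration of how such a hybrid resampling step might look. It is not the authors' implementation. The function names, the fixed `bandwidth` value, and the choice to give each class half of the original sample size are all assumptions made for this example. ROSE is represented here by its general idea, a smoothed bootstrap that draws an observed minority row and perturbs it with Gaussian noise.

```python
import random

def random_undersample(rows, n_keep, rng):
    """Keep n_keep rows drawn without replacement (random undersampling)."""
    return rng.sample(rows, min(n_keep, len(rows)))

def smoothed_bootstrap(rows, n_new, bandwidth, rng):
    """ROSE-style draw: pick an observed row, then jitter each feature
    with Gaussian noise centred on it (bandwidth is an assumed constant)."""
    out = []
    for _ in range(n_new):
        centre = rng.choice(rows)
        out.append([rng.gauss(x, bandwidth) for x in centre])
    return out

def hybrid_resample(X, y, minority_label, bandwidth=0.1, seed=0):
    """Hypothetical helper: return a roughly balanced sample built from
    undersampled majority rows and synthetic minority rows."""
    rng = random.Random(seed)
    minority = [x for x, lab in zip(X, y) if lab == minority_label]
    majority = [(x, lab) for x, lab in zip(X, y) if lab != minority_label]
    # Assumed target: each class receives half of the original sample size.
    target = (len(minority) + len(majority)) // 2
    kept = random_undersample(majority, target, rng)
    synthetic = smoothed_bootstrap(minority, target, bandwidth, rng)
    X_new = [list(x) for x, _ in kept] + synthetic
    y_new = [lab for _, lab in kept] + [minority_label] * target
    return X_new, y_new
```

In a boosting scheme, a step like this would presumably be repeated in every round before the weak learner is fitted. How the boosting weights interact with the sampling is not stated in the abstract and is left out of the sketch.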

Original language: English
Pages (from-to): 1-13
Number of pages: 13
Journal: Computational Statistics and Data Analysis
Volume: 111
DOI: 10.1016/j.csda.2017.01.005
Publication status: Published - 2017 Jul 1

Fingerprint

Classification Rules
Boosting
Classification Problems
Ensemble
Proportion
Classifier
Tend
Classifiers
Class
Sampling
Experimental Results
Model

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Computational Mathematics
  • Computational Theory and Mathematics
  • Applied Mathematics

Cite this

@article{7be7611ac8fd449898be2ac9761c998c,
title = "RHSBoost: Improving classification performance in imbalance data",
abstract = "Imbalance data are defined as a dataset whose proportion of classes is severely skewed. Classification performance of existing models tends to deteriorate due to class distribution imbalance. In addition, over-representation by majority classes prevents a classifier from paying attention to minority classes, which are generally more interesting. An effective ensemble classification method called RHSBoost has been proposed to address the imbalance classification problem. This classification rule uses random undersampling and ROSE sampling under a boosting scheme. According to the experimental results, RHSBoost appears to be an attractive classification model for imbalance data.",
author = "Joonho Gong and Hyunjoong Kim",
year = "2017",
month = "7",
day = "1",
doi = "10.1016/j.csda.2017.01.005",
language = "English",
volume = "111",
pages = "1--13",
journal = "Computational Statistics and Data Analysis",
issn = "0167-9473",
publisher = "Elsevier",

}

RHSBoost: Improving classification performance in imbalance data. / Gong, Joonho; Kim, Hyunjoong.

In: Computational Statistics and Data Analysis, Vol. 111, 01.07.2017, p. 1-13.

TY - JOUR

T1 - RHSBoost

T2 - Improving classification performance in imbalance data

AU - Gong, Joonho

AU - Kim, Hyunjoong

PY - 2017/7/1

Y1 - 2017/7/1

AB - Imbalance data are defined as a dataset whose proportion of classes is severely skewed. Classification performance of existing models tends to deteriorate due to class distribution imbalance. In addition, over-representation by majority classes prevents a classifier from paying attention to minority classes, which are generally more interesting. An effective ensemble classification method called RHSBoost has been proposed to address the imbalance classification problem. This classification rule uses random undersampling and ROSE sampling under a boosting scheme. According to the experimental results, RHSBoost appears to be an attractive classification model for imbalance data.

UR - http://www.scopus.com/inward/record.url?scp=85012075581&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85012075581&partnerID=8YFLogxK

U2 - 10.1016/j.csda.2017.01.005

DO - 10.1016/j.csda.2017.01.005

M3 - Article

VL - 111

SP - 1

EP - 13

JO - Computational Statistics and Data Analysis

JF - Computational Statistics and Data Analysis

SN - 0167-9473

ER -