Accurate ensemble pruning with PL-bagging

Dongjun Chung, Hyunjoong Kim

Research output: Contribution to journal › Article

6 Citations (Scopus)

Abstract

Ensemble pruning deals with the selection of base learners prior to combination in order to improve prediction accuracy and efficiency. The ensemble literature has pointed out that, for an ensemble classifier to achieve high prediction accuracy, it is critical that the ensemble consist of base classifiers that are accurate and, at the same time, as diverse as possible. In this paper, a novel ensemble pruning method, called PL-bagging, is proposed. To attain a balance between the diversity and accuracy of base learners, PL-bagging employs the positive Lasso to assign weights to base learners in the combination step. Simulation studies and theoretical investigation show that PL-bagging filters out redundant base learners while assigning higher weights to more accurate ones. This improved weighting scheme yields higher classification accuracy, and the improvement becomes even more pronounced as the ensemble size increases. The performance of PL-bagging was compared with that of state-of-the-art ensemble pruning methods for the aggregation of bootstrapped base learners on 22 real and 4 synthetic datasets. The results indicate that PL-bagging significantly outperforms state-of-the-art methods such as Boosting-based pruning and Trimmed bagging.
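The abstract describes the core idea: grow a bagged ensemble, then fit a positive (non-negatively constrained) Lasso on the base learners' predictions, so that redundant learners receive zero weight and are pruned while accurate ones are weighted up. The sketch below illustrates that idea only; it is not the authors' implementation. The decision-stump base learner, the toy data, the regularization level, and the coordinate-descent solver are all assumptions made here for self-containedness.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Toy binary classification data, labels coded as +/-1 ---
n, p = 200, 5
X = rng.normal(size=(n, p))
y = np.sign(X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=n))
y[y == 0] = 1

# --- Base learners: decision stumps fit on bootstrap samples ---
def fit_stump(Xb, yb):
    """Pick the (feature, threshold, sign) with the lowest training error."""
    best = (0, 0.0, 1, np.inf)
    for j in range(Xb.shape[1]):
        for t in np.quantile(Xb[:, j], [0.25, 0.5, 0.75]):
            for s in (1, -1):
                pred = s * np.where(Xb[:, j] > t, 1, -1)
                err = np.mean(pred != yb)
                if err < best[3]:
                    best = (j, t, s, err)
    return best[:3]

def stump_predict(stump, X):
    j, t, s = stump
    return s * np.where(X[:, j] > t, 1, -1)

B = 50  # ensemble size (number of bootstrapped base learners)
stumps = []
for _ in range(B):
    idx = rng.integers(0, n, size=n)         # bootstrap resample
    stumps.append(fit_stump(X[idx], y[idx]))

# n x B matrix of base-learner predictions
P = np.column_stack([stump_predict(s, X) for s in stumps]).astype(float)

# --- Positive Lasso via cyclic coordinate descent ---
# minimize 0.5 * ||y - P w||^2 + lam * sum(w)   subject to w >= 0
def positive_lasso(P, y, lam, n_iter=200):
    B = P.shape[1]
    w = np.zeros(B)
    col_norms = (P ** 2).sum(axis=0)
    r = y - P @ w                                # residual
    for _ in range(n_iter):
        for j in range(B):
            rho = P[:, j] @ r + col_norms[j] * w[j]
            w_new = max(0.0, (rho - lam) / col_norms[j])  # soft-threshold at 0
            r += P[:, j] * (w[j] - w_new)        # incremental residual update
            w[j] = w_new
    return w

w = positive_lasso(P, y.astype(float), lam=20.0)
pruned = int((w == 0).sum())                     # zero-weight learners are pruned
ensemble_pred = np.sign(P @ w)
acc = np.mean(ensemble_pred == y)
print(f"zero-weight (pruned) learners: {pruned} / {B}")
print(f"training accuracy of weighted ensemble: {acc:.3f}")
```

The positivity constraint is what makes the weights interpretable as a pruning: the soft-threshold clips at zero rather than shrinking through it, so correlated, redundant learners drop out of the combination entirely.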

Original language: English
Pages (from-to): 1-13
Number of pages: 13
Journal: Computational Statistics and Data Analysis
Volume: 83
DOI: 10.1016/j.csda.2014.09.003
Publication status: Published - 1 Jan 2015


All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Computational Theory and Mathematics
  • Computational Mathematics
  • Applied Mathematics

Cite this

@article{0ab9c53e303741c8835ad52adf8ac4be,
title = "Accurate ensemble pruning with PL-bagging",
abstract = "Ensemble pruning deals with the selection of base learners prior to combination in order to improve prediction accuracy and efficiency. The ensemble literature has pointed out that, for an ensemble classifier to achieve high prediction accuracy, it is critical that the ensemble consist of base classifiers that are accurate and, at the same time, as diverse as possible. In this paper, a novel ensemble pruning method, called PL-bagging, is proposed. To attain a balance between the diversity and accuracy of base learners, PL-bagging employs the positive Lasso to assign weights to base learners in the combination step. Simulation studies and theoretical investigation show that PL-bagging filters out redundant base learners while assigning higher weights to more accurate ones. This improved weighting scheme yields higher classification accuracy, and the improvement becomes even more pronounced as the ensemble size increases. The performance of PL-bagging was compared with that of state-of-the-art ensemble pruning methods for the aggregation of bootstrapped base learners on 22 real and 4 synthetic datasets. The results indicate that PL-bagging significantly outperforms state-of-the-art methods such as Boosting-based pruning and Trimmed bagging.",
author = "Dongjun Chung and Hyunjoong Kim",
year = "2015",
month = "1",
day = "1",
doi = "10.1016/j.csda.2014.09.003",
language = "English",
volume = "83",
pages = "1--13",
journal = "Computational Statistics and Data Analysis",
issn = "0167-9473",
publisher = "Elsevier",

}

Scopus record: http://www.scopus.com/inward/record.url?scp=84908377008&partnerID=8YFLogxK
Scopus cited-by: http://www.scopus.com/inward/citedby.url?scp=84908377008&partnerID=8YFLogxK