Sparse HDLSS discrimination with constrained data piling

Jeongyoun Ahn, Yongho Jeon

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

Regularization is a key component in high dimensional data analyses. In high dimensional discrimination with binary classes, the phenomenon of data piling occurs when the projections of the data onto a discriminant vector take only two distinct values, one for each class. Regularizing the degree of data piling yields a new class of discrimination rules for high-dimension, low-sample-size (HDLSS) data. A discrimination method that regularizes the degree of data piling while achieving sparsity is proposed and solved via linear programming. Computational efficiency is further improved by a sign-preserving regularization that forces the signs of the estimator to match those of the mean difference. The proposed classifier shows competitive performance for simulated and real data examples including speech recognition and gene expression.
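To make the notion of data piling concrete, the following is a minimal sketch of complete data piling on synthetic data. It is not the authors' method: it builds the dense minimum-norm direction w = X^T (X X^T)^{-1} y, which exists whenever the dimension d is at least the sample size n and the rows of X are linearly independent, and it involves neither the regularized degree of piling, nor sparsity, nor linear programming. The function names `solve` and `piling_direction`, the sample sizes, and the Gaussian data are all assumptions made for illustration; a practical implementation would use a numerical library rather than hand-written elimination.

```python
import random

def solve(A, b):
    """Solve the square system A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix [A | b]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def piling_direction(X, y):
    """Minimum-norm w with X w = y, i.e. w = X^T (X X^T)^{-1} y (requires d >= n)."""
    n, d = len(X), len(X[0])
    # Gram matrix G = X X^T (n x n), invertible when the rows of X are independent
    G = [[sum(X[i][k] * X[j][k] for k in range(d)) for j in range(n)] for i in range(n)]
    a = solve(G, y)
    return [sum(a[i] * X[i][k] for i in range(n)) for k in range(d)]

random.seed(0)
n, d = 8, 50  # hypothetical HDLSS setting: far more variables than samples
y = [1.0] * (n // 2) + [-1.0] * (n // 2)  # binary class labels coded as +1 / -1
# Synthetic data: the class means differ by a shift in the first coordinate only
X = [[random.gauss(0, 1) + (yi if k == 0 else 0.0) for k in range(d)] for yi in y]

w = piling_direction(X, y)
proj = [sum(wk * xk for wk, xk in zip(w, x)) for x in X]
print([round(p, 6) for p in proj])
```

Up to floating-point error, every training point from the first class projects to 1 and every point from the second class projects to -1, so the projections are dichotomous even though the 50 coordinates of each observation are mostly noise. This is the extreme case whose degree the abstract proposes to regularize.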

Original language: English
Article number: 6079
Pages (from-to): 74-83
Number of pages: 10
Journal: Computational Statistics and Data Analysis
Volume: 90
DOI: 10.1016/j.csda.2015.04.006
Publication status: Published - 2015 Apr 28

Fingerprint

Discrimination
Piles
Computational efficiency
Speech recognition
Gene expression
Linear programming
Regularization
Classifiers
High-dimensional Data
Sparsity
Discriminant
Higher Dimensions
Sample Size
High-dimensional
Classifier
Projection
Binary

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Computational Mathematics
  • Computational Theory and Mathematics
  • Applied Mathematics

Cite this

@article{51471b97b07f45fda4152e2928e6dcb7,
title = "Sparse HDLSS discrimination with constrained data piling",
abstract = "Regularization is a key component in high dimensional data analyses. In high dimensional discrimination with binary classes, the phenomenon of data piling occurs when the projections of the data onto a discriminant vector take only two distinct values, one for each class. Regularizing the degree of data piling yields a new class of discrimination rules for high-dimension, low-sample-size (HDLSS) data. A discrimination method that regularizes the degree of data piling while achieving sparsity is proposed and solved via linear programming. Computational efficiency is further improved by a sign-preserving regularization that forces the signs of the estimator to match those of the mean difference. The proposed classifier shows competitive performance for simulated and real data examples including speech recognition and gene expression.",
author = "Jeongyoun Ahn and Yongho Jeon",
year = "2015",
month = "4",
day = "28",
doi = "10.1016/j.csda.2015.04.006",
language = "English",
volume = "90",
pages = "74--83",
journal = "Computational Statistics and Data Analysis",
issn = "0167-9473",
publisher = "Elsevier",

}

Sparse HDLSS discrimination with constrained data piling. / Ahn, Jeongyoun; Jeon, Yongho.

In: Computational Statistics and Data Analysis, Vol. 90, 6079, 28.04.2015, p. 74-83.

TY - JOUR

T1 - Sparse HDLSS discrimination with constrained data piling

AU - Ahn, Jeongyoun

AU - Jeon, Yongho

PY - 2015/4/28

Y1 - 2015/4/28

N2 - Regularization is a key component in high dimensional data analyses. In high dimensional discrimination with binary classes, the phenomenon of data piling occurs when the projections of the data onto a discriminant vector take only two distinct values, one for each class. Regularizing the degree of data piling yields a new class of discrimination rules for high-dimension, low-sample-size (HDLSS) data. A discrimination method that regularizes the degree of data piling while achieving sparsity is proposed and solved via linear programming. Computational efficiency is further improved by a sign-preserving regularization that forces the signs of the estimator to match those of the mean difference. The proposed classifier shows competitive performance for simulated and real data examples including speech recognition and gene expression.

UR - http://www.scopus.com/inward/record.url?scp=84929193188&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84929193188&partnerID=8YFLogxK

U2 - 10.1016/j.csda.2015.04.006

DO - 10.1016/j.csda.2015.04.006

M3 - Article

VL - 90

SP - 74

EP - 83

JO - Computational Statistics and Data Analysis

JF - Computational Statistics and Data Analysis

SN - 0167-9473

M1 - 6079

ER -