Abstract
Regularization is a key component of high-dimensional data analysis. In high-dimensional discrimination with binary classes, data piling occurs when the projections of the data onto a discriminant vector take only two distinct values, one for each class. Regularizing the degree of data piling yields a new class of discrimination rules for high-dimension, low-sample-size data. A discrimination method that regularizes the degree of data piling while achieving sparsity is proposed and solved via linear programming. Computational efficiency is further improved by a sign-preserving regularization that forces the signs of the estimator to match those of the mean difference. The proposed classifier shows competitive performance on simulated and real data examples, including speech recognition and gene expression data.
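The data piling phenomenon itself can be illustrated with a short sketch. This is not the sparse linear-programming classifier proposed in the paper; it only shows complete data piling, assuming simulated Gaussian data and a minimum-norm least-squares direction. The function name `data_piling_direction`, the sample sizes, and the mean shift are hypothetical choices made for illustration.

```python
import numpy as np


def data_piling_direction(X, y):
    """Unit vector w for which the projections X @ w pile onto one value per class.

    Assumes more features than samples, so the centered system Xc @ w = yc
    can be solved exactly; lstsq returns the minimum-norm solution.
    """
    Xc = X - X.mean(axis=0)   # center the features
    yc = y - y.mean()         # center the +/-1 labels
    w, *_ = np.linalg.lstsq(Xc, yc, rcond=1e-10)
    return w / np.linalg.norm(w)


# Simulated high-dimension, low-sample-size data (illustrative values only).
rng = np.random.default_rng(0)
n, d = 20, 200
y = np.repeat([-1.0, 1.0], n // 2)
X = rng.normal(size=(n, d))
X[:, :5] += 0.5 * y[:, None]  # small mean difference in the first 5 features

w = data_piling_direction(X, y)
proj = X @ w

# Within-class spread of the projections (numerically zero under data piling).
spread_neg = np.ptp(proj[y < 0])
spread_pos = np.ptp(proj[y > 0])
# Distance between the two piling points.
gap = proj[y > 0].mean() - proj[y < 0].mean()

print(f"within-class spreads: {spread_neg:.2e}, {spread_pos:.2e}")
print(f"gap between piles:    {gap:.4f}")
```

In this sketch every training point of a class projects to the same value, which is the extreme case of data piling. The method described in the abstract instead regularizes the degree of piling rather than enforcing it completely.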
Bibliographical note
Funding Information:
The authors are grateful to the reviewers and the associate editor for many useful comments that improved the paper. Jeon’s research was supported by the Basic Science Research Program of the National Research Foundation of Korea (NRF-2012R1A1A1012043), funded by the Korean government.
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Computational Mathematics
- Computational Theory and Mathematics
- Applied Mathematics