Density-Dependent Quantized Least Squares Support Vector Machine for Large Data Sets

Shengyu Nan, Lei Sun, Badong Chen, Zhiping Lin, Kar Ann Toh

Research output: Contribution to journal › Article

23 Citations (Scopus)

Abstract

Based on the knowledge that input data distribution is important for learning, a data density-dependent quantization scheme (DQS) is proposed for sparse input data representation. The usefulness of the representation scheme is demonstrated by using it as a data preprocessing unit attached to the well-known least squares support vector machine (LS-SVM) for application on big data sets. Essentially, the proposed DQS adopts a single shrinkage threshold to obtain a simple quantization scheme, which adapts its outputs to input data density. With this quantization scheme, a large data set is quantized to a small subset where considerable sample size reduction is generally obtained. In particular, the sample size reduction can save significant computational cost when using the quantized subset for feature approximation via the Nyström method. Based on the quantized subset, the approximated features are incorporated into LS-SVM to develop a data density-dependent quantized LS-SVM (DQLS-SVM), where an analytic solution is obtained in the primal solution space. The developed DQLS-SVM is evaluated on synthetic and benchmark data with particular emphasis on large data sets. Extensive experimental results show that the learning machine incorporating DQS attains not only high computational efficiency but also good generalization performance.
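The abstract describes a three-stage pipeline: quantize the large training set with a single shrinkage threshold (DQS), approximate kernel features on the resulting small subset via the Nyström method, and solve LS-SVM analytically in the primal space. A minimal sketch of such a pipeline is given below, assuming an RBF kernel, a min-distance rule for the shrinkage threshold, and a ridge-style primal solve with the bias term omitted; the function names and parameter choices are illustrative assumptions, not the authors' exact DQLS-SVM algorithm.

```python
# Minimal sketch of a DQS -> Nystrom -> primal LS-SVM pipeline (assumptions:
# RBF kernel, min-distance shrinkage rule, bias term omitted).
import numpy as np

def dqs_quantize(X, threshold):
    """Keep a sample as a prototype only if it is farther than `threshold`
    from every prototype kept so far, so dense regions collapse onto few
    prototypes (hypothetical stand-in for the paper's DQS)."""
    codebook = [X[0]]
    for x in X[1:]:
        if np.linalg.norm(np.asarray(codebook) - x, axis=1).min() > threshold:
            codebook.append(x)
    return np.asarray(codebook)

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_features(X, landmarks, gamma):
    """Nystrom feature map phi(X) ~= K(X, L) U diag(lam)^(-1/2), where
    K(L, L) = U diag(lam) U^T is the small landmark kernel."""
    lam, U = np.linalg.eigh(rbf_kernel(landmarks, landmarks, gamma))
    lam = np.maximum(lam, 1e-10)              # guard tiny/negative eigenvalues
    return rbf_kernel(X, landmarks, gamma) @ U / np.sqrt(lam)

def lssvm_primal_fit(Phi, y, C):
    """Analytic LS-SVM solve in the primal (ridge form, bias omitted):
    minimizing ||w||^2/2 + (C/2)||y - Phi w||^2 gives
    w = (Phi^T Phi + I/C)^(-1) Phi^T y."""
    m = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + np.eye(m) / C, Phi.T @ y)

# Toy usage: quantize a large set, build Nystrom features from the small
# subset, then solve the primal system.
rng = np.random.default_rng(0)
X = rng.normal(size=(20000, 2))
y = np.sign(X[:, 0] * X[:, 1])                # XOR-like nonlinear labels
L = dqs_quantize(X, threshold=0.25)           # large set -> small subset
Phi = nystrom_features(X, L, gamma=1.0)
w = lssvm_primal_fit(Phi, y, C=100.0)
print(len(L), "prototypes, train acc:", np.mean(np.sign(Phi @ w) == y))
```

The computational point carried over from the abstract: the linear system solved at the end is m × m, where m is the number of quantized prototypes rather than the original sample size n, so the quantization step directly bounds the training cost.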

Original language: English
Pages (from-to): 94-106
Number of pages: 13
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 28
Issue number: 1
DOIs: 10.1109/TNNLS.2015.2504382
Publication status: Published - 2017 Jan

Fingerprint

  • Support vector machines
  • Computational efficiency
  • Learning systems
  • Costs
  • Big data

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Artificial Intelligence

Cite this

@article{c1211c2a3b2d4ed297bc00701e8f68ed,
title = "Density-Dependent Quantized Least Squares Support Vector Machine for Large Data Sets",
abstract = "Based on the knowledge that input data distribution is important for learning, a data density-dependent quantization scheme (DQS) is proposed for sparse input data representation. The usefulness of the representation scheme is demonstrated by using it as a data preprocessing unit attached to the well-known least squares support vector machine (LS-SVM) for application on big data sets. Essentially, the proposed DQS adopts a single shrinkage threshold to obtain a simple quantization scheme, which adapts its outputs to input data density. With this quantization scheme, a large data set is quantized to a small subset where considerable sample size reduction is generally obtained. In particular, the sample size reduction can save significant computational cost when using the quantized subset for feature approximation via the Nystrom method. Based on the quantized subset, the approximated features are incorporated into LS-SVM to develop a data density-dependent quantized LS-SVM (DQLS-SVM), where an analytic solution is obtained in the primal solution space. The developed DQLS-SVM is evaluated on synthetic and benchmark data with particular emphasis on large data sets. Extensive experimental results show that the learning machine incorporating DQS attains not only high computational efficiency but also good generalization performance.",
author = "Shengyu Nan and Lei Sun and Badong Chen and Zhiping Lin and Toh, {Kar Ann}",
year = "2017",
month = "1",
doi = "10.1109/TNNLS.2015.2504382",
language = "English",
volume = "28",
pages = "94--106",
journal = "IEEE Transactions on Neural Networks and Learning Systems",
issn = "2162-237X",
publisher = "IEEE Computational Intelligence Society",
number = "1",

}

Density-Dependent Quantized Least Squares Support Vector Machine for Large Data Sets. / Nan, Shengyu; Sun, Lei; Chen, Badong; Lin, Zhiping; Toh, Kar Ann.

In: IEEE Transactions on Neural Networks and Learning Systems, Vol. 28, No. 1, 01.2017, p. 94-106.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Density-Dependent Quantized Least Squares Support Vector Machine for Large Data Sets

AU - Nan, Shengyu

AU - Sun, Lei

AU - Chen, Badong

AU - Lin, Zhiping

AU - Toh, Kar Ann

PY - 2017/1

Y1 - 2017/1

N2 - Based on the knowledge that input data distribution is important for learning, a data density-dependent quantization scheme (DQS) is proposed for sparse input data representation. The usefulness of the representation scheme is demonstrated by using it as a data preprocessing unit attached to the well-known least squares support vector machine (LS-SVM) for application on big data sets. Essentially, the proposed DQS adopts a single shrinkage threshold to obtain a simple quantization scheme, which adapts its outputs to input data density. With this quantization scheme, a large data set is quantized to a small subset where considerable sample size reduction is generally obtained. In particular, the sample size reduction can save significant computational cost when using the quantized subset for feature approximation via the Nyström method. Based on the quantized subset, the approximated features are incorporated into LS-SVM to develop a data density-dependent quantized LS-SVM (DQLS-SVM), where an analytic solution is obtained in the primal solution space. The developed DQLS-SVM is evaluated on synthetic and benchmark data with particular emphasis on large data sets. Extensive experimental results show that the learning machine incorporating DQS attains not only high computational efficiency but also good generalization performance.

AB - Based on the knowledge that input data distribution is important for learning, a data density-dependent quantization scheme (DQS) is proposed for sparse input data representation. The usefulness of the representation scheme is demonstrated by using it as a data preprocessing unit attached to the well-known least squares support vector machine (LS-SVM) for application on big data sets. Essentially, the proposed DQS adopts a single shrinkage threshold to obtain a simple quantization scheme, which adapts its outputs to input data density. With this quantization scheme, a large data set is quantized to a small subset where considerable sample size reduction is generally obtained. In particular, the sample size reduction can save significant computational cost when using the quantized subset for feature approximation via the Nyström method. Based on the quantized subset, the approximated features are incorporated into LS-SVM to develop a data density-dependent quantized LS-SVM (DQLS-SVM), where an analytic solution is obtained in the primal solution space. The developed DQLS-SVM is evaluated on synthetic and benchmark data with particular emphasis on large data sets. Extensive experimental results show that the learning machine incorporating DQS attains not only high computational efficiency but also good generalization performance.

UR - http://www.scopus.com/inward/record.url?scp=85027484971&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85027484971&partnerID=8YFLogxK

U2 - 10.1109/TNNLS.2015.2504382

DO - 10.1109/TNNLS.2015.2504382

M3 - Article

C2 - 26685270

AN - SCOPUS:85027484971

VL - 28

SP - 94

EP - 106

JO - IEEE Transactions on Neural Networks and Learning Systems

JF - IEEE Transactions on Neural Networks and Learning Systems

SN - 2162-237X

IS - 1

ER -