Parallel Coordinate Descent Newton Method for Efficient L1-Regularized Loss Minimization

Yatao An Bian, Xiong Li, Yuncai Liu, Ming Hsuan Yang

Research output: Contribution to journal › Article

Abstract

Recent years have witnessed advances in parallel algorithms for large-scale optimization problems. Notwithstanding this demonstrated success, existing algorithms that parallelize over features are usually limited by divergence issues under high parallelism or require data preprocessing to alleviate these problems. In this paper, we propose a Parallel Coordinate Descent algorithm using approximate Newton steps (PCDN) that is guaranteed to converge globally without data preprocessing. The key component of the PCDN algorithm is a high-dimensional line search, which guarantees global convergence even under high parallelism. The PCDN algorithm randomly partitions the feature set into b subsets/bundles of size P and sequentially processes each bundle by first computing the descent directions for each feature in parallel and then conducting a P-dimensional line search to compute the step size. We show that: 1) the PCDN algorithm is guaranteed to converge globally despite increasing parallelism and 2) the PCDN algorithm converges to the specified accuracy ϵ within a bounded number of iterations Tϵ, and Tϵ decreases with increasing parallelism. In addition, the data transfer and synchronization cost of the P-dimensional line search can be minimized by maintaining intermediate quantities. For concreteness, the proposed PCDN algorithm is applied to L1-regularized logistic regression and L1-regularized L2-loss support vector machine problems. Experimental evaluations on seven benchmark data sets show that the PCDN algorithm exploits parallelism well and outperforms state-of-the-art methods.
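For readers who want a concrete picture of the bundle-wise update summarized above, the following NumPy sketch applies it to L1-regularized logistic regression. The function and parameter names (pcdn_epoch, bundle_size, sigma, beta) and the simple Armijo-style backtracking rule used for the P-dimensional line search are illustrative assumptions for exposition, not the authors' reference implementation.

# Minimal sketch of a bundle-wise approximate-Newton coordinate descent pass
# for L1-regularized logistic regression. Names and the backtracking rule are
# assumptions made for illustration; this is not the paper's reference code.
import numpy as np

def objective(X, y, w, lam):
    # F(w) = sum_i log(1 + exp(-y_i x_i^T w)) + lam * ||w||_1
    margins = y * (X @ w)
    return np.sum(np.logaddexp(0.0, -margins)) + lam * np.sum(np.abs(w))

def pcdn_epoch(X, y, w, lam, bundle_size, sigma=0.01, beta=0.5, rng=None):
    # One pass over all features, processed bundle by bundle. Within a bundle
    # the per-coordinate approximate Newton directions are mutually
    # independent, so the inner loop over j could run in parallel; it is
    # written sequentially here for clarity.
    rng = np.random.default_rng() if rng is None else rng
    n_features = X.shape[1]
    perm = rng.permutation(n_features)        # random partition of the feature set
    for start in range(0, n_features, bundle_size):
        bundle = perm[start:start + bundle_size]
        margins = y * (X @ w)
        p = 1.0 / (1.0 + np.exp(np.clip(margins, -30, 30)))  # sigma(-margin)
        d = np.zeros(n_features)
        delta = 0.0                            # predicted decrease for the line search
        for j in bundle:                       # parallelizable over coordinates
            g = -np.dot(y * X[:, j], p)        # dL/dw_j
            h = np.dot(X[:, j] ** 2, p * (1.0 - p)) + 1e-12  # d^2L/dw_j^2
            # Closed-form minimizer of g*d + 0.5*h*d^2 + lam*|w_j + d|
            # (soft-thresholded one-dimensional Newton step).
            if g + lam <= h * w[j]:
                d[j] = -(g + lam) / h
            elif g - lam >= h * w[j]:
                d[j] = -(g - lam) / h
            else:
                d[j] = -w[j]
            delta += g * d[j] + lam * (abs(w[j] + d[j]) - abs(w[j]))
        # P-dimensional line search: one shared step size for the whole
        # bundle direction, backtracked until sufficient decrease holds.
        f_old = objective(X, y, w, lam)
        alpha = 1.0
        while alpha > 1e-10:
            w_trial = w.copy()
            w_trial[bundle] += alpha * d[bundle]
            if objective(X, y, w_trial, lam) <= f_old + sigma * alpha * delta:
                w = w_trial
                break
            alpha *= beta
    return w

Repeatedly calling pcdn_epoch on (X, y, w) sketches the outer training loop; in an actual parallel implementation the inner loop over coordinates would run across workers, and intermediate quantities such as the margins would be maintained incrementally rather than recomputed, in line with the cost-reduction remark in the abstract.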

Original language: English
Article number: 8661743
Pages (from-to): 3233-3245
Number of pages: 13
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 30
Issue number: 11
DOI: 10.1109/TNNLS.2018.2889976
Publication status: Published - Nov 2019

Fingerprint

  • Newton-Raphson method
  • Data transfer
  • Set theory
  • Parallel algorithms
  • Support vector machines
  • Logistics
  • Synchronization
  • Costs

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Artificial Intelligence

Cite this

@article{3aa757644fb143b4b956dc01bc56fba5,
title = "Parallel Coordinate Descent Newton Method for Efficient L1-Regularized Loss Minimization",
author = "Bian, {Yatao An} and Xiong Li and Yuncai Liu and Yang, {Ming Hsuan}",
year = "2019",
month = "11",
doi = "10.1109/TNNLS.2018.2889976",
language = "English",
volume = "30",
pages = "3233--3245",
journal = "IEEE Transactions on Neural Networks and Learning Systems",
issn = "2162-237X",
publisher = "IEEE Computational Intelligence Society",
number = "11",

}

Parallel Coordinate Descent Newton Method for Efficient L1-Regularized Loss Minimization. / Bian, Yatao An; Li, Xiong; Liu, Yuncai; Yang, Ming Hsuan.

In: IEEE Transactions on Neural Networks and Learning Systems, Vol. 30, No. 11, 8661743, 11.2019, p. 3233-3245.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Parallel Coordinate Descent Newton Method for Efficient L1-Regularized Loss Minimization

AU - Bian, Yatao An

AU - Li, Xiong

AU - Liu, Yuncai

AU - Yang, Ming Hsuan

PY - 2019/11

Y1 - 2019/11

UR - http://www.scopus.com/inward/record.url?scp=85074320534&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85074320534&partnerID=8YFLogxK

U2 - 10.1109/TNNLS.2018.2889976

DO - 10.1109/TNNLS.2018.2889976

M3 - Article

C2 - 30843852

AN - SCOPUS:85074320534

VL - 30

SP - 3233

EP - 3245

JO - IEEE Transactions on Neural Networks and Learning Systems

JF - IEEE Transactions on Neural Networks and Learning Systems

SN - 2162-237X

IS - 11

M1 - 8661743

ER -