Structured Sparsification with Joint Optimization of Group Convolution and Channel Shuffle

Xin-Yu Zhang, Kai Zhao, Taihong Xiao, Ming-Ming Cheng, Ming-Hsuan Yang

Research output: Contribution to conference › Paper › peer-review

1 Citation (Scopus)

Abstract

Recent advances in convolutional neural networks (CNNs) usually come at the expense of excessive computational overhead and memory footprint. Network compression aims to alleviate this issue by training compact models with comparable performance. However, existing compression techniques either entail dedicated expert design or compromise with a moderate performance drop. In this paper, we propose a novel structured sparsification method for efficient network compression. The proposed method automatically induces structured sparsity on the convolutional weights, thereby facilitating the implementation of the compressed model with highly optimized group convolution. We further address the problem of inter-group communication with a learnable channel shuffle mechanism. The proposed approach can be easily applied to compress many network architectures with a negligible performance drop. Extensive experimental results and analysis demonstrate that our approach performs competitively against recent network compression methods, with a sound accuracy-complexity trade-off.
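The abstract's two core operations can be illustrated concretely. A minimal NumPy sketch (the paper itself learns the sparsity pattern and the shuffle; here both are fixed for illustration): a 1x1 convolution whose weight matrix is block-diagonal is exactly a group convolution, and a ShuffleNet-style channel shuffle restores communication between groups. The function names `channel_shuffle` and `conv1x1` are illustrative, not from the paper's code.

```python
import numpy as np

def channel_shuffle(x, groups):
    # ShuffleNet-style shuffle: (N, C, H, W) -> same shape, with channels
    # interleaved across groups so later group convs mix information.
    n, c, h, w = x.shape
    assert c % groups == 0
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)  # swap group axis and per-group channel axis
    return x.reshape(n, c, h, w)

def conv1x1(x, weight):
    # Dense 1x1 convolution as a matrix product over the channel axis;
    # weight has shape (C_out, C_in).
    return np.einsum('oi,nihw->nohw', weight, x)

# A block-diagonal weight (structured sparsity): zeros outside the two
# diagonal blocks mean each output group sees only its own input group,
# i.e. this dense conv is equivalent to a 2-group convolution.
w = np.zeros((4, 4))
w[:2, :2] = 1.0  # group 1 block
w[2:, 2:] = 1.0  # group 2 block

x = np.arange(4, dtype=float).reshape(1, 4, 1, 1)  # channels [0, 1, 2, 3]
grouped = conv1x1(x, w)          # output channels: [1, 1, 5, 5]
shuffled = channel_shuffle(x, 2)  # channel order becomes [0, 2, 1, 3]
```

Inserting the shuffle between two such block-diagonal convolutions lets the second one mix channels originating from both groups, which is the inter-group communication problem the learnable shuffle addresses.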

Original language: English
Pages: 440-450
Number of pages: 11
Publication status: Published - 2021
Event: 37th Conference on Uncertainty in Artificial Intelligence, UAI 2021 - Virtual, Online
Duration: 2021 Jul 27 - 2021 Jul 30

Conference

Conference: 37th Conference on Uncertainty in Artificial Intelligence, UAI 2021
City: Virtual, Online
Period: 21/7/27 - 21/7/30

Bibliographical note

Publisher Copyright:
© 2021 37th Conference on Uncertainty in Artificial Intelligence, UAI 2021. All Rights Reserved.

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Applied Mathematics
