Deep learning with deep neural networks is advancing machine intelligence in computer vision, speech recognition, natural language processing, and other domains. Brain-like hardware platforms for these brain-inspired computational models are under active study, but the maximum size of the neural networks they can evaluate is often limited by the number of neurons and synapses the hardware provides. This paper presents two techniques, factorization and pruning, that not only compress the models but also preserve a model form suitable for execution on neuromorphic architectures. We also propose a novel method to combine the two techniques. The proposed method significantly reduces the number of model parameters compared with standalone use of either technique while maintaining performance. Our experimental results show that the proposed method achieves a 30× parameter reduction within a 1% accuracy budget for the largest layer of AlexNet.
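The two compression techniques named in the abstract can be illustrated in a few lines. The sketch below is a hypothetical, minimal version, assuming truncated-SVD low-rank factorization and magnitude-based pruning; the rank, sparsity, and layer sizes are arbitrary illustrative choices, not the paper's actual configuration or combination method.

```python
import numpy as np

# Illustrative sketch (not the paper's method): compress a dense layer's
# weight matrix by (1) low-rank factorization via truncated SVD and
# (2) magnitude pruning of the resulting factors.

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))  # dense weight matrix of one layer

# Factorization: W ~ U_k @ V_k with rank k (k chosen arbitrarily here)
k = 32
U, s, Vt = np.linalg.svd(W, full_matrices=False)
U_k = U[:, :k] * s[:k]  # absorb singular values into the left factor
V_k = Vt[:k, :]

def prune(M, sparsity=0.5):
    """Zero out the smallest-magnitude fraction `sparsity` of entries."""
    thresh = np.quantile(np.abs(M), sparsity)
    return np.where(np.abs(M) >= thresh, M, 0.0)

U_p, V_p = prune(U_k), prune(V_k)

dense_params = W.size
compressed_params = np.count_nonzero(U_p) + np.count_nonzero(V_p)
print(f"dense: {dense_params}, compressed: {compressed_params}, "
      f"reduction: {dense_params / compressed_params:.1f}x")
```

The factorized form keeps the model as a pair of matrix multiplications, which is what allows it to remain executable on neuromorphic hardware that evaluates neurons and synapses directly; pruning then removes individual synapses from the factors.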
Number of pages: 11
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Publication status: Published - 2019 Nov
Bibliographical note
Funding Information:
Manuscript received January 30, 2018; revised April 18, 2018; accepted May 31, 2018. Date of publication October 19, 2018; date of current version October 16, 2019. This work was supported in part by the Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Education under Grant NRF-2017R1D1A1B03029103, and in part by the Institute for Information and Communications Technology Promotion funded by the Korea Government under Grant 1711073912. Preliminary results of this study were presented at DAC 2016. This paper was recommended by Associate Editor Y. Wang.
ACKNOWLEDGMENT The authors would like to thank IDEC for supporting the EDA tools used in this paper.
All Science Journal Classification (ASJC) codes
- Computer Graphics and Computer-Aided Design
- Electrical and Electronic Engineering