Deep learning with deep neural networks is advancing machine intelligence in computer vision, speech recognition, natural language processing, and other domains. Brain-like hardware platforms for these brain-inspired computational models are being studied, but the maximum size of the neural networks they can evaluate is often limited by the number of neurons and synapses the hardware provides. This paper presents two techniques, factorization and pruning, that not only compress the models but also preserve the model form required for execution on neuromorphic architectures. We also propose a novel method to combine the two techniques. The combined method reduces the number of model parameters significantly more than standalone use of either technique while maintaining performance. Our experimental results show that the proposed method achieves a 30× parameter reduction within a 1% accuracy budget for the largest layer of AlexNet.
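As an illustration of the two compression techniques named in the abstract, the sketch below applies truncated-SVD factorization and magnitude-based pruning to a dense weight matrix. This is a minimal NumPy example under assumed settings (matrix size, rank, and sparsity are arbitrary), not the paper's actual method or its neuromorphic-specific model form.

```python
import numpy as np

def factorize(W, rank):
    """Low-rank factorization: approximate W (m x n) as U @ V,
    with U: m x rank and V: rank x n, via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]   # fold singular values into U
    V_r = Vt[:rank, :]
    return U_r, V_r

def prune(W, sparsity):
    """Magnitude pruning: zero out the smallest-magnitude entries,
    keeping a (1 - sparsity) fraction of the weights."""
    k = int(W.size * sparsity)     # number of entries to zero
    if k == 0:
        return W.copy()
    thresh = np.partition(np.abs(W).ravel(), k - 1)[k - 1]
    return np.where(np.abs(W) > thresh, W, 0.0)

# Toy example: compress a 512 x 256 dense layer.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 256))

# Combine the two techniques: factorize first, then prune both factors.
U, V = factorize(W, rank=32)
U_p, V_p = prune(U, 0.5), prune(V, 0.5)

dense_params = W.size
compressed_params = np.count_nonzero(U_p) + np.count_nonzero(V_p)
print(f"compression ratio: {dense_params / compressed_params:.1f}x")
```

The ordering shown here (factorize, then prune the factors) is only one possible way to compose the techniques; the paper's contribution is a specific combination scheme whose details the abstract does not spell out.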
Number of pages: 11
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Publication status: Published - 2019 Nov
All Science Journal Classification (ASJC) codes
- Computer Graphics and Computer-Aided Design
- Electrical and Electronic Engineering