Abstract
As the effectiveness of Deep Neural Networks (DNNs) rises over time, so does the need for highly scalable and efficient hardware architectures to capitalize on this effectiveness in many practical applications. Emerging non-volatile Phase Change Memory (PCM) technology has been found to be a promising candidate for future memory systems due to its better scalability, non-volatility, and low leakage/dynamic power consumption compared to conventional charge-based memories. Additionally, with its cell's wide resistance span, PCM also has Flash-like Multi-Level Cell (MLC) capability, which enhances storage density and provides an opportunity to deploy data-intensive applications such as DNNs on resource-constrained edge devices. However, the practical deployment of MLC PCM is hampered by certain reliability challenges, among which resistance drift is considered a critical concern. In a DNN application, resistance drift in MLC PCM can severely degrade the DNN's accuracy if no drift-error-tolerance technique is used. This paper proposes DynaPAT, a low-cost and effective pattern-aware encoding technique to enhance the drift-error tolerance of MLC PCM-based Deep Neural Networks. DynaPAT is built on insight into the DNN's vulnerability to different data pattern switching. Based on this insight, DynaPAT efficiently maps the most frequent data pattern in the DNN's parameters to the least drift-prone level of the MLC PCM, thus significantly enhancing the robustness of the system against drift errors. Various experiments on different DNN models and configurations demonstrate the effectiveness of DynaPAT. The experimental results indicate that DynaPAT can achieve up to 500× enhancement in drift-error tolerance over the baseline MLC PCM-based DNN while requiring only negligible hardware overhead (below 1% storage overhead). Being orthogonal, DynaPAT can be integrated with existing drift-tolerance schemes for even higher gains in reliability.
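The core idea in the abstract, mapping the most frequent data pattern to the least drift-prone level, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of 2-bit cells with 8-bit weights, and in particular the ordering in `LEVELS_BY_DRIFT_RISK` are assumptions made for illustration, and the sketch ranks patterns by frequency only.

```python
from collections import Counter

# Hypothetical ordering of the four MLC levels, least drift-prone first.
# This ordering is an assumption for illustration, not taken from the paper.
LEVELS_BY_DRIFT_RISK = [0, 3, 1, 2]


def split_symbols(byte):
    """Split an 8-bit value into four 2-bit patterns, most significant first."""
    return [(byte >> shift) & 0b11 for shift in (6, 4, 2, 0)]


def build_mapping(weights):
    """Assign each 2-bit pattern a level: most frequent -> least drift-prone."""
    counts = Counter(s for w in weights for s in split_symbols(w))
    # Rank patterns by descending frequency; ties broken by pattern value.
    ranked = sorted(range(4), key=lambda p: (-counts[p], p))
    return dict(zip(ranked, LEVELS_BY_DRIFT_RISK))


def encode(weights, mapping):
    """Convert each weight into the four cell levels that would be programmed."""
    return [[mapping[s] for s in split_symbols(w)] for w in weights]


def decode(cells, mapping):
    """Recover the original weights from the stored cell levels."""
    inverse = {level: pattern for pattern, level in mapping.items()}
    weights = []
    for group in cells:
        byte = 0
        for level in group:
            byte = (byte << 2) | inverse[level]
        weights.append(byte)
    return weights
```

In this sketch the mapping is a permutation of four patterns, so the only metadata needed to decode is the mapping itself, which is in keeping with the small storage overhead the abstract reports.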
Original language | English |
---|---|
Title of host publication | Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2022 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781450392174 |
DOIs | |
Publication status | Published - 2022 Oct 30 |
Event | 41st IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2022 - San Diego, United States; Duration: 2022 Oct 30 → 2022 Nov 4 |
Publication series
Name | IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD |
---|---|
ISSN (Print) | 1092-3152 |
Conference
Conference | 41st IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2022 |
---|---|
Country/Territory | United States |
City | San Diego |
Period | 22/10/30 → 22/11/4 |
Bibliographical note
Funding Information: This work was supported in part by the Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korea Government (MSIT) under Grant 2022-0-00971; in part by the Next Generation Intelligent Semiconductor Development by the Ministry of Trade, Industry and Energy (MOTIE) under Grant 20011074; and in part by the Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Education under Grant NRF-2020M3F3A2A01082326.
Funding Information:
Fig. 12 shows the performance of DynaPAT compared with other methods when evaluated on the ImageNet dataset using ResNet-50, DenseNet-169, and Inception-v3. Note that ImageNet is a more difficult challenge than the CIFAR datasets, since it contains 1000 classes as opposed to 10/100 classes in the CIFAR datasets. Thus, the performance of DynaPAT, as well as of the other fault-tolerance techniques, is expected to be lower than in the case of CIFAR-10. Additionally, in our evaluation, the models evaluated on the ImageNet dataset are deeper (have more layers) than the models evaluated on the CIFAR datasets; hence, the fault-tolerance methods tend to be less effective. This observation is consistent with previous works [7, 18] that evaluate DNNs in the presence of hardware faults. Nevertheless, DynaPAT still shows a significant robustness improvement over the baseline as well as the other techniques. For example, for the 8-bit quantized DenseNet-169 network, DynaPAT enhances the robustness over the baseline and Flipcy by 32× and 8×, respectively. Being orthogonal, DynaPAT can perform even better if combined with any encoding scheme that reduces the frequency of error-prone patterns at the cost of additional storage.
5 Conclusion
Resistance drift severely impacts the accuracy of MLC PCM-based DNN systems. This paper analyzes the DNN's behavior under the different data pattern switching caused by resistance drift errors. Based on this analysis, this paper presents DynaPAT, a dynamic pattern-aware encoding technique for robust MLC PCM-based Deep Neural Networks. DynaPAT is lightweight yet effective in reducing the impact of drift errors in MLC PCM-based DNN systems. By efficiently mapping the most frequent patterns to the least error-prone levels of MLC PCM in the order of their criticality in DNNs, DynaPAT enhances the robustness of the system by orders of magnitude.
DynaPAT is evaluated and compared with other relevant state-of-the-art encoding techniques on DNN models. It is found that DynaPAT can achieve up to 500× higher robustness than the baseline models (without any fault-tolerance technique). The results are consistent across different DNN architectures, configurations, and datasets. DynaPAT adds negligible area, storage, and latency overhead compared to the existing techniques, while still being more effective in tolerating drift errors. Lastly, being orthogonal to other encoding methods, DynaPAT can be integrated with them to further enhance the robustness of MLC PCM-based DNN systems.
Publisher Copyright:
© 2022 Association for Computing Machinery.
All Science Journal Classification (ASJC) codes
- Software
- Computer Science Applications
- Computer Graphics and Computer-Aided Design