On-device deep neural network (DNN) training holds the potential to enable a rich set of privacy-aware and infrastructure-independent personalized mobile applications. However, despite advancements in mobile hardware, locally training a complex DNN is still a nontrivial task given its resource demands. In this work, we show that the limited memory resources on mobile devices are the main constraint and propose Sage as a framework for efficiently optimizing memory resources for on-device DNN training. Specifically, Sage configures a flexible computation graph for DNN gradient evaluation and reduces the memory footprint of the graph using operator- and graph-level optimizations. At run-time, Sage employs a hybrid of gradient checkpointing and micro-batching techniques to dynamically adjust its memory use to the available system memory budget. Using an implementation on off-the-shelf smartphones, we show that Sage enables local training of complex DNN models by reducing memory use by more than 20-fold compared to a baseline approach. We also show that Sage successfully adapts to run-time memory budget variations, and evaluate its energy consumption to show Sage's practical applicability.
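The micro-batching idea mentioned in the abstract can be illustrated with a minimal sketch. This is not Sage's implementation; it only assumes a toy one-parameter linear model with a mean squared error loss, and the function names (`grad_mse`, `grad_micro_batched`) and data are hypothetical. It shows that a gradient accumulated over small micro-batches matches the full-batch gradient, which is why a batch can be split to fit a tighter memory budget without changing the update.

```python
# Illustrative only: micro-batching (gradient accumulation) for a toy
# model y = w * x with mean squared error. Not taken from Sage.

def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def grad_micro_batched(w, xs, ys, micro_batch_size):
    """Same gradient, computed one micro-batch at a time.

    Only `micro_batch_size` samples are processed at once, so the
    per-step working set scales with the micro-batch, not the batch.
    """
    n = len(xs)
    total = 0.0
    for start in range(0, n, micro_batch_size):
        mx = xs[start:start + micro_batch_size]
        my = ys[start:start + micro_batch_size]
        # Weight each micro-batch mean by its share of the full batch.
        total += grad_mse(w, mx, my) * len(mx) / n
    return total


# Hypothetical data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]

full = grad_mse(0.5, xs, ys)
accumulated = grad_micro_batched(0.5, xs, ys, micro_batch_size=2)
print(abs(full - accumulated) < 1e-9)
```

In a real DNN, the per-sample intermediate activations dominate memory use, so shrinking the micro-batch trades extra passes for a smaller peak footprint; gradient checkpointing makes a complementary trade by recomputing activations instead of storing them.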
|Title of host publication||MobiSys 2022 - Proceedings of the 2022 20th Annual International Conference on Mobile Systems, Applications and Services|
|Publisher||Association for Computing Machinery, Inc|
|Number of pages||13|
|Publication status||Published - 2022 Jun 27|
|Event||20th ACM International Conference on Mobile Systems, Applications and Services, MobiSys 2022 - Portland, United States|
Duration: 2022 Jun 27 → 2022 Jul 1
Bibliographical note
Funding Information:
The authors would like to thank our shepherd, Professor Mi Zhang, and the anonymous reviewers for their valuable feedback on the work. This work was supported by the Ministry of Science and ICT’s NRF Basic Science Research Program (2021R1A2C4002380), IITP (IITP-2022-2022-0-00240), ITRC Program supervised by IITP (IITP-2021-2020-0-01461), Ministry of Culture, Sports and Tourism and Korea Creative Content Agency (R2021040018), and by the Ministry of Trade, Industry and Energy and KIAT through the International Cooperative R&D program (P0016150). In Gim submitted this work as Hyunjun Kim. JeongGil Ko is the corresponding author for this work (email@example.com).
© 2022 ACM.
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Computer Science Applications