Distributed learning plays a key role in reducing the training time of modern deep neural networks with massive datasets. In this article, we consider a distributed learning problem where gradient computation is carried out over a number of computing devices at the wireless edge. We propose hierarchical broadcast coding, a provable coding-theoretic framework to speed up distributed learning at the wireless edge. Our contributions are threefold. First, motivated by the hierarchical nature of real-world edge computing systems, we propose a layered code which mitigates the effects of not only packet losses at the wireless computing nodes but also straggling access points (APs) or small base stations. Second, by strategically allocating data partitions to nodes in the overlapping areas between cells, our technique achieves the fundamental lower bound on computational load to combat stragglers. Finally, we take advantage of the broadcast nature of wireless networks, whereby wireless devices in the overlapping cell coverage broadcast to more than one AP. This further reduces the overall training time in the presence of straggling APs. Experimental results on Amazon EC2 confirm the advantage of the proposed methods in speeding up learning. Our design targets any gradient-descent-based learning algorithm, including linear/logistic regression and deep learning.
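The straggler-tolerance idea that the abstract builds on can be illustrated with a toy, single-layer gradient code. The sketch below is not the hierarchical broadcast code proposed in the article: it assumes a flat setting with three workers and one tolerated straggler, and the names `B`, `DECODE`, `encode`, and `decode` are hypothetical, chosen only for illustration. Each worker processes two of the three data partitions and sends one coded vector; the sum of all partial gradients is recoverable from any two responses.

```python
# Toy flat gradient code: 3 workers, tolerates 1 straggler.
# Illustrative assumption only, not the article's hierarchical scheme.

# Encoding matrix B: row i holds the weights worker i applies to the partial
# gradients (g0, g1, g2). Each row has two nonzeros, so each worker processes
# two of the three data partitions.
B = [
    [0.5, 1.0, 0.0],
    [0.0, 1.0, -1.0],
    [0.5, 0.0, 1.0],
]

# Decoding weights for each possible pair of responding workers. For a pair
# (i, j) with weights (a, b), a * B[i] + b * B[j] equals [1, 1, 1].
DECODE = {
    (0, 1): (2.0, -1.0),
    (0, 2): (1.0, 1.0),
    (1, 2): (1.0, 2.0),
}


def encode(worker, partial_grads):
    """Return the coded vector sent by one worker (weighted sum of partials)."""
    dim = len(partial_grads[0])
    return [
        sum(B[worker][k] * partial_grads[k][d] for k in range(3))
        for d in range(dim)
    ]


def decode(messages):
    """Recover g0 + g1 + g2 from a dict {worker_id: coded_vector} of size 2."""
    i, j = sorted(messages)
    a, b = DECODE[(i, j)]
    return [a * x + b * y for x, y in zip(messages[i], messages[j])]
```

For example, if worker 1 straggles, calling `decode({0: encode(0, grads), 2: encode(2, grads)})` returns the same vector as summing the three partial gradients in `grads` directly.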
Number of pages: 16
Journal: IEEE Transactions on Wireless Communications
Publication status: Published - 2021 Apr
Bibliographical note
Funding Information:
Manuscript received March 24, 2020; revised July 7, 2020 and October 5, 2020; accepted November 19, 2020. Date of publication December 7, 2020; date of current version April 9, 2021. This work was supported by the National Research Foundation of Korea under Grant 2019R1I1A2A02061135. The associate editor coordinating the review of this article and approving it for publication was M. Kaneko. (Corresponding author: Dong-Jun Han.) The authors are with the School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, South Korea (e-mail: email@example.com; firstname.lastname@example.org; email@example.com).
© 2002-2012 IEEE.
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Electrical and Electronic Engineering
- Applied Mathematics