Graph convolutional networks (GCNs) are becoming increasingly popular as they can process a wide variety of data formats that prior deep neural networks cannot easily support. One key challenge in designing hardware accelerators for GCNs is the vast size and randomness in their data access patterns which greatly reduces the effectiveness of the limited on-chip cache. Aimed at improving the effectiveness of the cache by mitigating the irregular data accesses, prior studies often employ the vertex tiling techniques used in traditional graph processing applications. While being effective at enhancing the cache efficiency, those approaches are often sensitive to the tiling configurations where the optimal setting heavily depends on target input datasets. Furthermore, the existing solutions require manual tuning through trial-and-error or rely on sub-optimal analytical models. In this paper, we propose Slice-and-Forge (SnF), an efficient hardware accelerator for GCNs which greatly improves the effectiveness of the limited on-chip cache. SnF chooses a tiling strategy named feature slicing that splits the features into vertical slices and processes them in the outermost loop of the execution. This particular choice results in a repetition of the identical computational patterns over irregular graph data over multiple rounds. Taking advantage of such repetitions, SnF dynamically tunes its tile size. Our experimental results reveal that SnF can achieve 1.73× higher performance in geomean compared to prior work on multi-engine settings, and 1.46× higher performance in geomean on small scale settings, without the need for off-line analyses.
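The loop order that the abstract calls feature slicing can be illustrated with a minimal software sketch. This is not the SnF hardware design: it assumes a CSR graph layout, a plain sum aggregation, and a fixed slice width, and it omits the dynamic tile-size tuning entirely. The function name `aggregate_sliced`, its parameters, and the toy graph are all hypothetical.

```python
import numpy as np


def aggregate_sliced(indptr, indices, features, slice_width):
    """Sum neighbor features per vertex, one vertical feature slice at a time.

    Illustrative only: CSR layout, sum aggregation, and a fixed slice width
    are assumptions made for this sketch, not details taken from the paper.
    """
    num_vertices, num_features = features.shape
    out = np.zeros_like(features)
    # Outermost loop: vertical slices (column ranges) of the feature matrix.
    for start in range(0, num_features, slice_width):
        stop = min(start + slice_width, num_features)
        # Inner loop: the same irregular graph traversal repeats per slice.
        for v in range(num_vertices):
            neighbors = indices[indptr[v]:indptr[v + 1]]
            out[v, start:stop] = features[neighbors, start:stop].sum(axis=0)
    return out


# Toy 4-vertex graph in CSR form (hypothetical data).
indptr = np.array([0, 2, 3, 3, 5])
indices = np.array([1, 2, 0, 0, 2])
features = np.arange(24, dtype=np.float64).reshape(4, 6)

sliced = aggregate_sliced(indptr, indices, features, slice_width=4)
```

Because every slice walks the same edges in the same order, the access pattern seen in one round is the pattern of the next round, which is the repetition the abstract says SnF exploits when tuning its tile size.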
|Title of host publication||PACT 2022 - Proceedings of the 2022 International Conference on Parallel Architectures and Compilation Techniques|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Number of pages||14|
|Publication status||Published - 2022 Oct 8|
|Event||31st International Conference on Parallel Architectures and Compilation Techniques, PACT 2022 - Chicago, United States|
Duration: 2022 Oct 8 → 2022 Oct 10
|Name||Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT|
|Conference||31st International Conference on Parallel Architectures and Compilation Techniques, PACT 2022|
|Period||22/10/8 → 22/10/10|
Bibliographical note
Funding Information:
This work was partly supported by the National Research Foundation of Korea (NRF) grants (2022R1C1C1011307, 2022R1C1C1008131) and Institute of Information & communications Technology Planning & Evaluation (IITP) grants (2021-0-00853, 2020-0-01361) funded by the Korea government (MSIT). The EDA tool was supported by the IC Design Education Center (IDEC), Korea. Mingi Yoo, Jaeyong Song, Hyeyoon Lee, Jounghoo Lee, and Youngsok Kim are with the Department of Computer Science at Yonsei University and have been partly supported by the BK21 FOUR (Fostering Outstanding Universities for Research) funded by the Ministry of Education (MOE, Korea) and National Research Foundation of Korea (NRF).
© 2022 Association for Computing Machinery.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Hardware and Architecture