Dataflow Mirroring: Architectural Support for Highly Efficient Fine-Grained Spatial Multitasking on Systolic-Array NPUs

Jounghoo Lee, Jinwoo Choi, Jaeyeon Kim, Jinho Lee, Youngsok Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

We present dataflow mirroring, architectural support for low-overhead fine-grained systolic array allocation which overcomes the limitations of prior coarse-grained spatial-multitasking Neural Processing Unit (NPU) architectures. The key idea of dataflow mirroring is to reverse the dataflows of co-located Neural Networks (NNs) in horizontal and/or vertical directions, allowing allocation boundaries to be set between any adjacent rows and columns of a systolic array and supporting up to four-way spatial multitasking. Our detailed experiments using MLPerf NNs and an NPU prototype that extends Google's TPU with dataflow mirroring show that dataflow mirroring can improve multitasking performance by up to 46.4%.
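The allocation scheme described in the abstract can be illustrated with a small sketch. The code below is not taken from the paper; it is a hypothetical model that assumes one row boundary and one column boundary split the array into at most four regions, and that each region is fed from the array edge it touches, so regions on the far side of a boundary use a reversed (mirrored) dataflow. The names `Region`, `partition`, and the direction strings are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One tenant's share of the systolic array (hypothetical model)."""
    rows: range        # PE rows owned by this tenant
    cols: range        # PE columns owned by this tenant
    horizontal: str    # assumed direction of horizontal data movement
    vertical: str      # assumed direction of vertical data movement


def _split(n, cut, forward, mirrored):
    """Split indices 0..n-1 at `cut`; no split if the cut lies on an edge."""
    if cut <= 0 or cut >= n:
        return [(range(0, n), forward)]
    # The span before the cut keeps the conventional direction; the span
    # after the cut is assumed to be fed from the opposite edge (mirrored).
    return [(range(0, cut), forward), (range(cut, n), mirrored)]


def partition(n_rows, n_cols, row_cut, col_cut):
    """Partition an n_rows x n_cols array at arbitrary row/column boundaries.

    Returns between one and four regions, matching the "up to four-way"
    multitasking mentioned in the abstract.
    """
    if not (0 <= row_cut <= n_rows and 0 <= col_cut <= n_cols):
        raise ValueError("cut positions must lie within the array")
    row_spans = _split(n_rows, row_cut, "top_to_bottom", "bottom_to_top")
    col_spans = _split(n_cols, col_cut, "left_to_right", "right_to_left")
    return [
        Region(rows, cols, horizontal, vertical)
        for rows, vertical in row_spans
        for cols, horizontal in col_spans
    ]


if __name__ == "__main__":
    # Array size and cut positions are arbitrary illustrative values.
    for region in partition(128, 128, row_cut=37, col_cut=90):
        print(region)
```

Because the cuts may fall between any adjacent rows or columns, the regions can be sized unevenly (here 37 versus 91 rows), which is the fine-grained property the abstract contrasts with coarse-grained allocation.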

Original language: English
Title of host publication: 2021 58th ACM/IEEE Design Automation Conference, DAC 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 247-252
Number of pages: 6
ISBN (Electronic): 9781665432740
DOIs
Publication status: Published - 2021 Dec 5
Event: 58th ACM/IEEE Design Automation Conference, DAC 2021 - San Francisco, United States
Duration: 2021 Dec 5 to 2021 Dec 9

Publication series

Name: Proceedings - Design Automation Conference
Volume: 2021-December
ISSN (Print): 0738-100X

Conference

Conference: 58th ACM/IEEE Design Automation Conference, DAC 2021
Country/Territory: United States
City: San Francisco
Period: 21/12/5 to 21/12/9

Bibliographical note

Funding Information:
We proposed dataflow mirroring, lightweight architectural support for fine-grained systolic array allocation. By reversing the dataflows of co-located NNs, dataflow mirroring allows allocation boundaries to be set between any adjacent PE rows and columns. Then, we designed FGSpMt-NPU, a highly efficient spatial-multitasking NPU architecture which implements dataflow mirroring to achieve higher hardware utilization and performance over the existing coarse-grained spatial-multitasking NPU architecture. By enabling fine-grained distribution of the systolic array to co-located NNs, FGSpMt-NPU can greatly improve the multitasking performance over the state-of-the-art.

ACKNOWLEDGEMENTS

This work was supported by the National Research Foundation of Korea (NRF) grant (No. 2020R1F1A1069742) and Institute of Information & Communications Technology Planning & Evaluation (IITP) grant (No. 2020-0-01361, Artificial Intelligence Graduate School Program (Yonsei University)) funded by the Korea government (MSIT), and the Yonsei University Research Fund (2020-22-0511, 2021-22-0001). Youngsok Kim is the corresponding author of this paper.

Publisher Copyright:
© 2021 IEEE.

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Modelling and Simulation
