Although approximate computing is widely used, it requires substantial programming effort to find appropriate approximation patterns among multiple pre-defined patterns to achieve high performance. We therefore propose an automatic approximation framework called GATE that uncovers hidden opportunities in any data-parallel program, regardless of the code pattern or application characteristics, using two compiler techniques: subgraph-level approximation (SGLA) and approximate thread merge (ATM). GATE also features conservative/aggressive tuning and dynamic calibration to maximize performance while maintaining the target output quality (TOQ) during runtime. Our framework achieves an average performance gain of 2.54x over the baseline with minimal accuracy loss.
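The abstract names approximate thread merge (ATM) without defining it. As a rough, hypothetical sketch of the general idea (not GATE's actual implementation, whose details are in the paper), one can think of it as computing only one output per group of adjacent data-parallel work items and reusing that result for the merged neighbors:

```python
# Hypothetical sketch of the idea behind approximate thread merge (ATM):
# instead of computing every element of a data-parallel map, compute one
# element per group of `merge_factor` neighbors and reuse its result.
# All names and structure here are illustrative, not taken from GATE.

def exact_map(f, xs):
    """Baseline: compute every element exactly."""
    return [f(x) for x in xs]

def approximate_thread_merge(f, xs, merge_factor=2):
    """Compute one 'leader' per group and broadcast it to the group."""
    out = []
    for i in range(0, len(xs), merge_factor):
        y = f(xs[i])                  # compute only the group leader
        group = xs[i:i + merge_factor]
        out.extend([y] * len(group))  # neighbors reuse the leader's result
    return out

xs = list(range(8))
print(exact_map(lambda x: x * x, xs))                 # [0, 1, 4, 9, 16, 25, 36, 49]
print(approximate_thread_merge(lambda x: x * x, xs))  # [0, 0, 4, 4, 16, 16, 36, 36]
```

With `merge_factor=2`, roughly half the work is skipped at the cost of bounded per-element error, which is the performance/quality trade-off the tuning and calibration machinery then manages.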
Title of host publication: Proceedings of the 56th Annual Design Automation Conference 2019, DAC 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication status: Published - 2019 Jun 2
Event: 56th Annual Design Automation Conference, DAC 2019 - Las Vegas, United States
Duration: 2019 Jun 2 → 2019 Jun 6
Name: Proceedings - Design Automation Conference
Conference: 56th Annual Design Automation Conference, DAC 2019
Period: 19/6/2 → 19/6/6
Bibliographical note
Funding Information:

However, despite the performance variance, GATE shows satisfactory speedups on most benchmarks. GATE successfully tolerates architecture-specific features by using a generalized approach, achieving a speedup of 2.95x on a CPU.

6 RELATED WORKS

Approximate computing is a well-known technique in a wide range of applications, and automatic approximation with runtime quality monitoring has already been introduced in prior research [2, 15]. SAGE is the closest related work to ours because it also performs approximation on GPU systems. However, there are two noticeable differences between SAGE and GATE: 1) GATE attempts to maximize the coverage of approximation by using more generalized dataflow-graph analysis, whereas most approaches, including SAGE and Paraprox, depend on special patterns or hardware features; 2) GATE can guarantee a substantial minimum performance gain by detecting input-insensitive opportunities.

7 CONCLUSION

In this paper, we proposed a novel approximation framework called GATE with two general dataflow-level techniques, i.e., SGLA and ATM. By replacing instructions based on subgraphs, the GATE framework increases approximation coverage and achieves a higher performance gain than conventional pattern-based approaches. In addition, GATE provides high performance using aggressive tuning and runtime calibration, while ensuring a minimum performance gain using a conservative tuning process. On average, GATE shows a speedup of 2.54x with less than 10% quality degradation across 19 widely used data-parallel workloads.

ACKNOWLEDGMENTS

This work was supported in part by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. 2017R1A4A1015498), the ICT R&D program of MSIP/IITP (No. 2017-0-00142), and the R&D program of MOTIE/KEIT (No. 10077609). Yongjun Park is the corresponding author.
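The conclusion mentions aggressive tuning backed by runtime calibration against a target output quality (TOQ). As a loose, hypothetical sketch of such a scheme (the thresholds, sampling interval, and structure are assumptions for illustration, not GATE's design), a runtime can periodically re-run the exact kernel on a sampled batch and fall back to a more conservative approximation level when quality drops below the TOQ:

```python
# Hypothetical sketch of runtime quality calibration: periodically re-run the
# exact kernel on a sampled batch, compare against the approximate output, and
# back off to a less aggressive setting when the error exceeds the target
# output quality (TOQ). Names and thresholds are illustrative only.

def relative_error(exact, approx):
    """Sum of absolute differences, normalized by the exact magnitude."""
    num = sum(abs(e - a) for e, a in zip(exact, approx))
    den = sum(abs(e) for e in exact) or 1.0
    return num / den

def calibrated_run(exact_kernel, approx_kernels, batches, toq=0.10):
    """approx_kernels is ordered most-aggressive first; fall back on TOQ violation."""
    level = 0
    outputs = []
    for i, batch in enumerate(batches):
        out = approx_kernels[level](batch)
        if i % 4 == 0:  # periodic calibration on a sampled batch
            err = relative_error(exact_kernel(batch), out)
            if err > toq and level + 1 < len(approx_kernels):
                level += 1                        # too inaccurate: be more conservative
                out = approx_kernels[level](batch)
        outputs.append(out)
    return outputs
```

For example, if the most aggressive kernel is badly wrong on the first calibrated batch, the runtime demotes itself to the next level and all later batches use the safer kernel, which mirrors the conservative/aggressive split described above.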
© 2019 Association for Computing Machinery.
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Control and Systems Engineering
- Electrical and Electronic Engineering
- Modelling and Simulation