Abstract
Query processing for data analytics with machine learning scoring involves executing heterogeneous operations in a pipelined fashion. Hardware acceleration is one approach to improve the pipeline performance and free up processor resources by offloading computations to the accelerators. However, the performance benefits of accelerators can be limited by the compute and data offloading overheads. Although prior works have studied acceleration opportunities, including with accelerators for machine learning operations, an end-to-end application performance analysis has not been well studied, particularly for data analytics and model scoring pipelines. In this paper, we study speedups and overheads of using PCIe-based hardware accelerators in such pipelines. In particular, we analyze the effectiveness of using GPUs and FPGAs to accelerate scoring for random forest, a popular machine learning model, on tabular input data obtained from Microsoft SQL Server. We observe that the offloading decision as well as the choice of the optimal hardware backend should depend at least on the model complexity (e.g., number of features and tree depth), the scoring data size, and the overheads associated with data movement and invocation of the pipeline stages. We also highlight potential future research explorations based on our findings.
Original language | English |
---|---|
Title of host publication | Proceedings - 2021 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2021 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 243-253 |
Number of pages | 11 |
ISBN (Electronic) | 9781728186436 |
Publication status | Published - 2021 Mar |
Event | 2021 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2021 - Virtual, Stony Brook, United States Duration: 2021 Mar 28 → 2021 Mar 30 |
Publication series
Name | Proceedings - 2021 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2021 |
---|
Conference
Conference | 2021 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2021 |
---|---|
Country/Territory | United States |
City | Virtual, Stony Brook |
Period | 2021 Mar 28 → 2021 Mar 30 |
Bibliographical note
Publisher Copyright: © 2021 IEEE.
All Science Journal Classification (ASJC) codes
- Hardware and Architecture
- Information Systems
- Software
- Safety, Risk, Reliability and Quality
- Artificial Intelligence
- Computer Science Applications