Hardware Acceleration for DBMS Machine Learning Scoring: Is It Worth the Overheads?

Zahra Azad, Rathijit Sen, Kwanghyun Park, Ajay Joshi

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

    Abstract

    Query processing for data analytics with machine learning scoring involves executing heterogeneous operations in a pipelined fashion. Hardware acceleration is one approach to improve the pipeline performance and free up processor resources by offloading computations to the accelerators. However, the performance benefits of accelerators can be limited by the compute and data offloading overheads. Although prior works have studied acceleration opportunities, including with accelerators for machine learning operations, an end-to-end application performance analysis has not been well studied, particularly for data analytics and model scoring pipelines. In this paper, we study speedups and overheads of using PCIe-based hardware accelerators in such pipelines. In particular, we analyze the effectiveness of using GPUs and FPGAs to accelerate scoring for random forest, a popular machine learning model, on tabular input data obtained from Microsoft SQL Server. We observe that the offloading decision as well as the choice of the optimal hardware backend should depend at least on the model complexity (e.g., number of features and tree depth), the scoring data size, and the overheads associated with data movement and invocation of the pipeline stages. We also highlight potential future research explorations based on our findings.

    Original language: English
    Title of host publication: Proceedings - 2021 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2021
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 243-253
    Number of pages: 11
    ISBN (Electronic): 9781728186436
    DOIs
    Publication status: Published - 2021 Mar
    Event: 2021 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2021 - Virtual, Stony Brook, United States
    Duration: 2021 Mar 28 to 2021 Mar 30

    Publication series

    Name: Proceedings - 2021 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2021

    Conference

    Conference: 2021 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2021
    Country/Territory: United States
    City: Virtual, Stony Brook
    Period: 21/3/28 to 21/3/30

    Bibliographical note

    Publisher Copyright:
    © 2021 IEEE.

    All Science Journal Classification (ASJC) codes

    • Hardware and Architecture
    • Information Systems
    • Software
    • Safety, Risk, Reliability and Quality
    • Artificial Intelligence
    • Computer Science Applications
