Deep learning with GPUs

Won Jeon, Gun Ko, Jiwon Lee, Hyunwuk Lee, Dongho Ha, Won Woo Ro

Research output: Chapter in Book/Report/Conference proceeding › Chapter

4 Citations (Scopus)


Deep learning has been extensively researched in many areas and has scaled up rapidly over the last decade. It has deeply permeated our daily lives through applications such as image classification, video synthesis, autonomous driving, voice recognition, and personalized recommendation systems. The main challenge for most deep learning models, including convolutional neural networks, recurrent neural networks, and recommendation models, is the large amount of computation they require. Fortunately, most computations in deep learning applications are parallelizable, so they can be handled effectively by throughput processors such as Graphics Processing Units (GPUs). GPUs offer high-throughput parallel processing and high memory bandwidth, and have become the most widely adopted devices for deep learning. In fact, many deep learning workloads, from mobile devices to data centers, run on GPUs. In particular, modern GPU systems provide specialized hardware modules and software stacks for deep learning workloads. In this chapter, we present a detailed analysis of the evolution of GPU architectures and of recent hardware and software support for more efficient acceleration of deep learning on GPUs. Furthermore, we introduce leading-edge research, challenges, and opportunities in running deep learning workloads on GPUs.

Original language: English
Title of host publication: Hardware Accelerator Systems for Artificial Intelligence and Machine Learning
Editors: Shiho Kim, Ganesh Chandra Deka
Publisher: Academic Press Inc.
Number of pages: 49
ISBN (Print): 9780128231234
Publication status: Published - 2021 Jan

Publication series

Name: Advances in Computers
ISSN (Print): 0065-2458

Bibliographical note

Funding Information:
This research was supported by the MOTIE (Ministry of Trade, Industry & Energy) (No. 10080674, Development of Reconfigurable Artificial Neural Network Accelerator and Instruction Set Architecture) and KSRC (Korea Semiconductor Research Consortium) support program for the development of the future semiconductor device and also supported by the Super Computer Development Leading Program of the National Research Foundation of Korea (NRF) funded by the Korean government (Ministry of Science and ICT (MSIT)) (NRF-2020M3H6A1084852).

Publisher Copyright:
© 2021 Elsevier Inc.

All Science Journal Classification (ASJC) codes

  • Computer Science (all)

