Google workloads for consumer devices: Mitigating data movement bottlenecks

Amirali Boroumand, Saugata Ghose, Youngsok Kim, Rachata Ausavarungnirun, Eric Shiu, Rahul Thakur, Daehyun Kim, Aki Kuusela, Allan Knies, Parthasarathy Ranganathan, Onur Mutlu

Research output: Chapter in Book/Report/Conference proceedingConference contribution

42 Citations (Scopus)

Abstract

We are experiencing an explosive growth in the number of consumer devices, including smartphones, tablets, web-based computers such as Chromebooks, and wearable devices. For this class of devices, energy efficiency is a first-class concern due to the limited battery capacity and thermal power budget. We find that data movement is a major contributor to the total system energy and execution time in consumer devices. The energy and performance costs of moving data between the memory system and the compute units are significantly higher than the costs of computation. As a result, addressing data movement is crucial for consumer devices. In this work, we comprehensively analyze the energy and performance impact of data movement for several widely-used Google consumer workloads: (1) the Chrome web browser; (2) TensorFlow Mobile, Google's machine learning framework; (3) video playback, and (4) video capture, both of which are used in many video services such as YouTube and Google Hangouts. We find that processing-inmemory (PIM) can significantly reduce data movement for all of these workloads, by performing part of the computation close to memory. Each workload contains simple primitives and functions that contribute to a significant amount of the overall data movement. We investigate whether these primitives and functions are feasible to implement using PIM, given the limited area and power constraints of consumer devices. Our analysis shows that offloading these primitives to PIM logic, consisting of either simple cores or specialized accelerators, eliminates a large amount of data movement, and significantly reduces total system energy (by an average of 55.4% across the workloads) and execution time (by an average of 54.2%).

Original languageEnglish
Title of host publicationASPLOS 2018 - 23rd International Conference on Architectural Support for Programming Languages and Operating Systems
PublisherAssociation for Computing Machinery
Pages316-331
Number of pages16
ISBN (Electronic)9781450349116
DOIs
Publication statusPublished - 2018 Mar 19
Event23rd International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2018 - Williamsburg, United States
Duration: 2018 Mar 242018 Mar 28

Publication series

NameInternational Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS

Conference

Conference23rd International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2018
Country/TerritoryUnited States
CityWilliamsburg
Period18/3/2418/3/28

Bibliographical note

Funding Information:
This research started at Google, during Amirali Boroumand’s extended internship and Onur Mutlu’s stay as a visiting researcher. We thank our industrial partners, especially Google, Huawei, Intel, and VMware, for their generous support. This research was partially supported by the Semiconductor Research Corporation.

Publisher Copyright:
© 2018 Copyright held by the owner/author(s).

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
  • Hardware and Architecture

Fingerprint

Dive into the research topics of 'Google workloads for consumer devices: Mitigating data movement bottlenecks'. Together they form a unique fingerprint.

Cite this