Efficient cache tag management is a primary design objective for large, in-package DRAM caches. Recently, Tagless DRAM Caches (TDCs) have been proposed to completely eliminate tagging structures from both on-die SRAM and in-package DRAM, which are a major scalability bottleneck for future multi-gigabyte DRAM caches. However, TDC constrains the DRAM cache block size to equal the OS page size (e.g., 4KB) because it takes a unified approach to address translation and cache tag management. Caching at a page granularity, or page-based caching, wastes significant off-package DRAM bandwidth by over-fetching blocks within a page that are never actually used. Footprint caching is an effective solution to this problem: it fetches only those blocks that will likely be touched during the page's lifetime in the DRAM cache, referred to as the page's footprint. In this paper, we demonstrate that TDC opens up unique opportunities to realize efficient footprint caching with higher prediction accuracy and a lower hardware cost than the original footprint caching scheme. Since there are no cache tags in TDC, the footprints of cached pages are tracked in the TLB instead of the cache tag array, incurring much lower on-die storage overhead than the original design. In addition, when a cached page is evicted, its footprint is stored in the corresponding page table entry instead of an auxiliary on-die structure (i.e., the Footprint History Table), which prevents footprint thrashing among different pages and thus yields higher accuracy in footprint prediction. The resulting design, called Footprint-augmented Tagless DRAM Cache (F-TDC), significantly improves the bandwidth efficiency of TDC, and hence its performance and energy efficiency. Our evaluation with 3D Through-Silicon-Via-based in-package DRAM demonstrates an average reduction of off-package bandwidth by 32.0%, which, in turn, improves IPC and EDP by 17.7% and 25.4%, respectively, over the state-of-the-art TDC with no footprint caching.
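The footprint mechanism described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the 64-byte block size, the `FootprintModel` class, its field names, and the fall-back of fetching a whole page when no history exists are all hypothetical choices made for illustration. It shows a per-page bit vector being recorded while the page is cached (the TLB-side role), saved at eviction (the page-table-entry role), and used to limit fetches on the next allocation.

```python
PAGE_SIZE = 4096                            # OS page size from the abstract (4KB)
BLOCK_SIZE = 64                             # assumed block size; not given in the abstract
BLOCKS_PER_PAGE = PAGE_SIZE // BLOCK_SIZE   # 64 blocks, so a footprint fits in 64 bits
FULL_PAGE = (1 << BLOCKS_PER_PAGE) - 1      # bit vector with every block set


class FootprintModel:
    """Toy model of footprint tracking; all names here are hypothetical."""

    def __init__(self):
        self.touched = {}        # TLB-side: page -> blocks touched while cached
        self.present = {}        # page -> blocks currently held in the DRAM cache
        self.saved = {}          # PTE-side: page -> footprint saved at last eviction
        self.fetched_blocks = 0  # blocks fetched from off-package DRAM

    def allocate(self, page):
        """Cache a page, fetching only the blocks in its predicted footprint."""
        # Assumption: with no saved footprint, fetch the whole page.
        predicted = self.saved.get(page, FULL_PAGE)
        self.fetched_blocks += bin(predicted).count("1")
        self.present[page] = predicted
        self.touched[page] = 0

    def touch(self, page, offset):
        """Record an access at a byte offset within a cached page."""
        bit = 1 << (offset // BLOCK_SIZE)
        if not (self.present[page] & bit):
            # Underprediction: the block was not fetched, so fetch it now.
            self.fetched_blocks += 1
            self.present[page] |= bit
        self.touched[page] |= bit

    def evict(self, page):
        """Evict a page and keep its observed footprint for the next allocation."""
        self.saved[page] = self.touched.pop(page)
        del self.present[page]


model = FootprintModel()
model.allocate(7)        # no history yet: all 64 blocks are fetched
model.touch(7, 0)        # block 0
model.touch(7, 130)      # block 2
model.evict(7)           # footprint 0b101 is saved
model.allocate(7)        # only the 2 predicted blocks are fetched
```

In this toy run the second allocation fetches 2 blocks instead of 64, which is the kind of over-fetch reduction the abstract attributes to footprint caching. Keeping one saved footprint per page, rather than sharing a small history table across pages, is what the model's `saved` dictionary is meant to mirror.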
|Title of host publication||Proceedings of the 2016 IEEE International Symposium on High-Performance Computer Architecture, HPCA 2016|
|Publisher||IEEE Computer Society|
|Number of pages||12|
|Publication status||Published - 2016 Apr 1|
|Event||22nd IEEE International Symposium on High Performance Computer Architecture, HPCA 2016 - Barcelona, Spain (2016 Mar 12 → 2016 Mar 16)|
|Name||Proceedings - International Symposium on High-Performance Computer Architecture|
Bibliographical note
Funding Information:
This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (NRF-2014R1A1A1005894 and NRF-2014R1A1A1003746) and the Ministry of Education (NRF-2014R1A1A2054658).
© 2016 IEEE.
All Science Journal Classification (ASJC) codes
- Hardware and Architecture