Efficient Data Supply for Parallel Heterogeneous Architectures
Author(s): Ham, Tae J; Aragón, Juan L; Martonosi, Margaret
To refer to this page use:
http://arks.princeton.edu/ark:/88435/pr1s242
Abstract: Decoupling techniques have been proposed to reduce the amount of memory latency exposed to high-performance accelerators as they fetch data. Although decoupled access-execute (DAE) and more recent decoupled data supply approaches offer promising single-threaded performance improvements, little work has considered how to extend them into parallel scenarios. This article explores the opportunities and challenges of designing parallel, high-performance, resource-efficient decoupled data supply systems. We propose Mercury, a parallel decoupled data supply system that utilizes thread-level parallelism for high-throughput data supply with good portability attributes. Additionally, we introduce some microarchitectural improvements for data supply units to efficiently handle long-latency indirect loads.
Publication Date: Apr-2019
Citation: Ham, Tae Jun, Juan L. Aragón, and Margaret Martonosi. "Efficient Data Supply for Parallel Heterogeneous Architectures." ACM Transactions on Architecture and Code Optimization (TACO) 16, no. 2 (2019): 9:1-9:23. doi:10.1145/3310332
DOI: 10.1145/3310332
ISSN: 1544-3566
EISSN: 1544-3973
Pages: 9:1 - 9:23
Type of Material: Journal Article
Journal/Proceeding Title: ACM Transactions on Architecture and Code Optimization
Version: Final published version. This is an open access article.