Efficient Data Supply for Parallel Heterogeneous Architectures

Author(s): Ham, Tae J; Aragón, Juan L; Martonosi, Margaret

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1s242
Abstract: Decoupling techniques have been proposed to reduce the amount of memory latency exposed to high-performance accelerators as they fetch data. Although decoupled access-execute (DAE) and more recent decoupled data supply approaches offer promising single-threaded performance improvements, little work has considered how to extend them into parallel scenarios. This article explores the opportunities and challenges of designing parallel, high-performance, resource-efficient decoupled data supply systems. We propose Mercury, a parallel decoupled data supply system that utilizes thread-level parallelism for high-throughput data supply with good portability attributes. Additionally, we introduce some microarchitectural improvements for data supply units to efficiently handle long-latency indirect loads.
Publication Date: Apr-2019
Citation: Ham, Tae Jun, Juan L. Aragón, and Margaret Martonosi. "Efficient Data Supply for Parallel Heterogeneous Architectures." ACM Transactions on Architecture and Code Optimization (TACO) 16, no. 2 (2019): 9:1-9:23. doi:10.1145/3310332
DOI: 10.1145/3310332
ISSN: 1544-3566
EISSN: 1544-3973
Pages: 9:1 - 9:23
Type of Material: Journal Article
Journal/Proceeding Title: ACM Transactions on Architecture and Code Optimization
Version: Final published version. This is an open access article.

Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.