
Automatically exploiting cross-invocation parallelism using runtime information

Author(s): Huang, Jialu; Jablin, Thomas B; Beard, Stephen R; Johnson, Nick P; August, David I

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1p812
Full metadata record
DC Field | Value | Language
dc.contributor.author | Huang, Jialu | -
dc.contributor.author | Jablin, Thomas B | -
dc.contributor.author | Beard, Stephen R | -
dc.contributor.author | Johnson, Nick P | -
dc.contributor.author | August, David I | -
dc.date.accessioned | 2021-10-08T19:45:20Z | -
dc.date.available | 2021-10-08T19:45:20Z | -
dc.date.issued | 2013 | en_US
dc.identifier.citation | Huang, Jialu, Thomas B. Jablin, Stephen R. Beard, Nick P. Johnson, and David I. August. "Automatically exploiting cross-invocation parallelism using runtime information." Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) (2013): pp. 1-11. doi:10.1109/CGO.2013.6495001 | en_US
dc.identifier.issn | 2164-2397 | -
dc.identifier.uri | https://liberty.cs.princeton.edu/Publications/phdthesis_jialuh.pdf | -
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr1p812 | -
dc.description.abstract | Automatic parallelization is a promising approach to producing scalable multi-threaded programs for multicore architectures. Many existing automatic techniques only parallelize iterations within a loop invocation and synchronize threads at the end of each loop invocation. When parallel code contains many loop invocations, synchronization can easily become a performance bottleneck. Some automatic techniques address this problem by exploiting cross-invocation parallelism. These techniques use static analysis to partition iterations among threads to avoid cross-thread dependences. However, this partitioning is not always achievable at compile-time, because program input determines dependence patterns at run-time. By contrast, this paper proposes DOMORE, the first automatic parallelization technique that uses runtime information to exploit additional cross-invocation parallelism. Instead of partitioning iterations statically, DOMORE dynamically detects cross-thread dependences and synchronizes only when necessary. DOMORE consists of a compiler and a runtime library. At compile time, DOMORE automatically parallelizes loops and inserts a custom runtime engine into programs. At run-time, the engine observes dependences and synchronizes iterations only when necessary. For six programs, DOMORE achieves a geomean loop speedup of 2.1× over parallel execution without cross-invocation parallelization and of 3.2× over sequential execution on eight cores. | en_US
dc.format.extent | 1 - 11 | en_US
dc.language.iso | en_US | en_US
dc.relation.ispartof | Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) | en_US
dc.rights | Author's manuscript | en_US
dc.title | Automatically exploiting cross-invocation parallelism using runtime information | en_US
dc.type | Conference Article | en_US
dc.identifier.doi | 10.1109/CGO.2013.6495001 | -
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding | en_US
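
Illustrative note: the abstract describes detecting cross-thread dependences at run time and synchronizing only when necessary, rather than placing a barrier after every loop invocation. The sketch below is one way to picture that idea; it is not DOMORE's compiler or runtime engine. The function name `plan_synchronization`, the round-robin thread assignment, and the treatment of every access to an address as a potential conflict are all assumptions made for this example.

```python
def plan_synchronization(invocations, num_threads):
    """Decide, per iteration, which earlier iterations it must wait for.

    invocations: list of loop invocations; each invocation is a list of
                 iterations; each iteration is a set of addresses it touches.
    Returns a list of (iteration_id, thread_id, iterations_to_wait_for).

    Hypothetical sketch: iterations are numbered across invocations and
    assigned round-robin, so no barrier separates one invocation from the next.
    """
    last_access = {}  # address -> (iteration_id, thread_id) of most recent access
    plan = []
    iteration_id = 0
    for invocation in invocations:
        for addresses in invocation:
            thread_id = iteration_id % num_threads  # assumed round-robin policy
            waits = set()
            for addr in addresses:
                prev = last_access.get(addr)
                # Synchronize only if another thread touched this address last.
                if prev is not None and prev[1] != thread_id:
                    waits.add(prev[0])
                last_access[addr] = (iteration_id, thread_id)
            plan.append((iteration_id, thread_id, sorted(waits)))
            iteration_id += 1
    return plan


# Two invocations of two iterations each, run on two threads.
example_plan = plan_synchronization([[{0}, {1}], [{1}, {2}]], num_threads=2)
for entry in example_plan:
    print(entry)
```

In this input only iteration 2 waits (on iteration 1, which touched the same address on the other thread); the remaining iterations proceed without synchronization, whereas a per-invocation barrier would stall both threads between the two invocations.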

Files in This Item:
File | Description | Size | Format
CrossInvocationParallelismRuntimeInformation.pdf | | 2.17 MB | Adobe PDF


Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.