AsmDB: understanding and mitigating front-end stalls in warehouse-scale computers
Author(s): Ayers, Grant; Nagendra, Nayana P; August, David I; Cho, Hyoun K; Kanev, Svilen; et al
To refer to this page use:
http://arks.princeton.edu/ark:/88435/pr1xr6v
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ayers, Grant | - |
dc.contributor.author | Nagendra, Nayana P | - |
dc.contributor.author | August, David I | - |
dc.contributor.author | Cho, Hyoun K | - |
dc.contributor.author | Kanev, Svilen | - |
dc.contributor.author | Kozyrakis, Christos | - |
dc.contributor.author | Krishnamurthy, Trivikram | - |
dc.contributor.author | Litz, Heiner | - |
dc.contributor.author | Moseley, Tipp | - |
dc.contributor.author | Ranganathan, Parthasarathy | - |
dc.date.accessioned | 2021-10-08T19:45:19Z | - |
dc.date.available | 2021-10-08T19:45:19Z | - |
dc.date.issued | 2019-06 | en_US |
dc.identifier.citation | Ayers, Grant, Nayana Prasad Nagendra, David I. August, Hyoun Kyu Cho, Svilen Kanev, Christos Kozyrakis, Trivikram Krishnamurthy, Heiner Litz, Tipp Moseley, and Parthasarathy Ranganathan. "AsmDB: understanding and mitigating front-end stalls in warehouse-scale computers." Proceedings of the 46th International Symposium on Computer Architecture (2019): pp. 462-473. doi:10.1145/3307650.3322234 | en_US |
dc.identifier.issn | 1063-6897 | - |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr1xr6v | - |
dc.description.abstract | The large instruction working sets of private and public cloud workloads lead to frequent instruction cache misses and costs in the millions of dollars. While prior work has identified the growing importance of this problem, to date, there has been little analysis of where the misses come from, and what the opportunities are to improve them. To address this challenge, this paper makes three contributions. First, we present the design and deployment of a new, always-on, fleet-wide monitoring system, AsmDB, that tracks front-end bottlenecks. AsmDB uses hardware support to collect bursty execution traces, fleet-wide temporal and spatial sampling, and sophisticated offline post-processing to construct full-program dynamic control-flow graphs. Second, based on a longitudinal analysis of AsmDB data from real-world online services, we present two detailed insights on the sources of front-end stalls: (1) cold code that is brought in along with hot code leads to significant cache fragmentation and a corresponding large number of instruction cache misses; (2) distant branches and calls that are not amenable to traditional cache locality or next-line prefetching strategies account for a large fraction of cache misses. Third, we prototype two optimizations that target these insights. For misses caused by fragmentation, we focus on memcmp, one of the hottest functions contributing to cache misses, and show how fine-grained layout optimizations lead to significant benefits. For misses at the targets of distant jumps, we propose new hardware support for software code prefetching and prototype a new feedback-directed compiler optimization that combines static program flow analysis with dynamic miss profiles to demonstrate significant benefits for several large warehouse-scale workloads. Improving upon prior work, our proposal avoids invasive hardware modifications by prefetching via software in an efficient and scalable way. Simulation results show that such an approach can eliminate up to 96% of instruction cache misses with negligible overheads. | en_US |
dc.format.extent | 462 - 473 | en_US |
dc.language.iso | en_US | en_US |
dc.relation.ispartof | Proceedings of the 46th International Symposium on Computer Architecture | en_US |
dc.rights | Final published version. This is an open access article. | en_US |
dc.title | AsmDB: understanding and mitigating front-end stalls in warehouse-scale computers | en_US |
dc.type | Conference Article | en_US |
dc.identifier.doi | 10.1145/3307650.3322234 | - |
dc.identifier.eissn | 2575-713X | - |
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding | en_US |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
FrontEndStallsWarehouseScaleComputers.pdf | | 686.52 kB | Adobe PDF | View/Download |
Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.