Runtime asynchronous fault tolerance via speculation
Author(s): Zhang, Yun; Ghosh, Soumyadeep; Huang, Jialu; Lee, Jae W; Mahlke, Scott A; August, David I
To refer to this page use:
http://arks.princeton.edu/ark:/88435/pr1dr6p
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Zhang, Yun | - |
dc.contributor.author | Ghosh, Soumyadeep | - |
dc.contributor.author | Huang, Jialu | - |
dc.contributor.author | Lee, Jae W | - |
dc.contributor.author | Mahlke, Scott A | - |
dc.contributor.author | August, David I | - |
dc.date.accessioned | 2021-10-08T19:45:21Z | - |
dc.date.available | 2021-10-08T19:45:21Z | - |
dc.date.issued | 2012 | en_US |
dc.identifier.citation | Zhang, Yun, Soumyadeep Ghosh, Jialu Huang, Jae W. Lee, Scott A. Mahlke, and David I. August. "Runtime asynchronous fault tolerance via speculation." Proceedings of the Tenth International Symposium on Code Generation and Optimization (2012): pp. 145-154. doi:10.1145/2259016.2259035 | en_US |
dc.identifier.issn | 2164-2397 | - |
dc.identifier.uri | https://liberty.princeton.edu/Publications/cgo12_raft.pdf | - |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr1dr6p | - |
dc.description.abstract | Transient faults are emerging as a critical reliability concern in modern microprocessors. Redundant hardware solutions are commonly deployed to detect transient faults, but they are less flexible and cost-effective than software solutions. However, software solutions are rendered impractical because of high performance overheads. To address this problem, this paper presents Runtime Asynchronous Fault Tolerance via Speculation (RAFT), the fastest transient fault detection technique known to date. Serving as a layer between the application and the underlying platform, RAFT automatically generates two symmetric program instances from a program binary. It detects transient faults in a non-invasive way and exploits high-confidence value speculation to achieve low runtime overhead. Evaluation on a commodity multicore system demonstrates that RAFT delivers a geomean performance overhead of 2.83% on a set of 30 SPEC CPU benchmarks and STAMP benchmarks. Compared with existing transient fault detection techniques, RAFT exhibits the best performance and fault coverage, without requiring any change to the hardware or the software applications. | en_US |
dc.format.extent | 145 - 154 | en_US |
dc.language.iso | en_US | en_US |
dc.relation.ispartof | Proceedings of the Tenth International Symposium on Code Generation and Optimization | en_US |
dc.rights | Author's manuscript | en_US |
dc.title | Runtime asynchronous fault tolerance via speculation | en_US |
dc.type | Conference Article | en_US |
dc.identifier.doi | 10.1145/2259016.2259035 | - |
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding | en_US |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
RuntimeAsynchronousFaultToleranceSpeculation.pdf | | 303.12 kB | Adobe PDF | View/Download |
Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.