Serializing Instructions in System-Intensive Workloads: Amdahl's Law Strikes Again
Authors
Wells, Philip
Sohi, Gurindar
Type
Technical Report
Publisher
University of Wisconsin-Madison Department of Computer Sciences
Abstract
To maintain a reasonable level of complexity, processor implementations contain Serializing Instructions (SIs): instructions, such as those that write control registers, that cannot be executed out-of-order (OoO). Maintaining sequential semantics may force SIs to serialize the pipeline and execute as the only instruction in the window.
We examine the frequency of SIs in three ISAs, SPARC V9, X86-64, and PowerPC, for several system-intensive workloads. Across ISAs, we observe 2–8 SIs per thousand instructions for most workloads. As explained by Amdahl's Law, such frequent SIs, which create serial regions within the instruction-level parallel execution of a single thread, can have a significant impact on performance. For the SPARC ISA (after removing TLB and register window effects), we observe a 4–17% performance difference between a modest out-of-order processor and a hypothetical processor which idealizes serializing instructions.
We examine the consumption of values produced by several SIs, and observe that most values are consumed, but that the values are Effectively Useless (EU), i.e., they do not actually change the execution of the consuming instructions. To improve the performance of such SIs, we propose EU prediction, which can allow younger instructions to proceed, possibly reading a stale value, and yet still correctly execute. This simple technique improves the performance of five of our seven workloads by 8–12%.
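The Amdahl's Law argument above can be made concrete with a rough back-of-the-envelope model. The sketch below is illustrative only and is not taken from the report: the function name `si_slowdown`, the base CPI, and the per-SI pipeline-drain penalty are all hypothetical assumptions. Only the range of 2 to 8 SIs per thousand instructions comes from the abstract.

```python
def si_slowdown(sis_per_kilo_instr: float,
                base_cpi: float,
                drain_penalty_cycles: float) -> float:
    """Return cycles-with-serialization divided by ideal cycles.

    Assumed model (not from the report): every SI adds a fixed
    pipeline-drain penalty on top of an otherwise ideal execution.
    """
    ideal_cycles = 1000.0 * base_cpi  # cycles for 1000 instructions, no SIs
    serial_cycles = sis_per_kilo_instr * drain_penalty_cycles
    return (ideal_cycles + serial_cycles) / ideal_cycles


if __name__ == "__main__":
    # base_cpi and drain_penalty_cycles are hypothetical placeholder values.
    for rate in (2, 8):
        ratio = si_slowdown(rate, base_cpi=1.0, drain_penalty_cycles=25.0)
        print(f"{rate} SIs per 1000 instructions -> slowdown factor {ratio:.3f}")
```

Even under these assumed parameters, a handful of serializing events per thousand instructions adds a noticeable serial fraction, which is the effect the abstract attributes to Amdahl's Law. The measured figures in the report come from simulation, not from a model like this one.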
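The notion of an Effectively Useless value can be illustrated with a toy check. This is a minimal sketch of the definition only, not the predictor proposed in the report: the helper `is_effectively_useless`, the consumer `reads_low_bit`, and the register values are hypothetical, and a consuming instruction is modeled simply as a function of the control-register value.

```python
from typing import Callable


def is_effectively_useless(consumer: Callable[[int], int],
                           stale_value: int,
                           new_value: int) -> bool:
    """True if the consumer behaves the same with the stale and new values.

    Hypothetical helper: a consumer is modeled as a pure function of the
    control-register contents.
    """
    return consumer(stale_value) == consumer(new_value)


def reads_low_bit(control_reg: int) -> int:
    """Hypothetical consumer that depends only on bit 0 of the register."""
    return control_reg & 0x1


if __name__ == "__main__":
    stale = 0b0101
    # An SI that changes only bit 3 leaves this consumer's result unchanged.
    print(is_effectively_useless(reads_low_bit, stale, 0b1101))
    # An SI that changes bit 0 alters the result, so the write matters.
    print(is_effectively_useless(reads_low_bit, stale, 0b0100))
```

In the first case a younger instruction could have read the stale value and still executed correctly, which is the situation EU prediction aims to exploit. How the report's mechanism predicts and verifies such cases is not described in the abstract and is not modeled here.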
Citation
TR1606