Serializing Instructions in System-Intensive Workloads: Amdahl's Law Strikes Again

Authors

Wells, Philip
Sohi, Gurindar

Type

Technical Report

Publisher

University of Wisconsin-Madison Department of Computer Sciences

Abstract

To maintain a reasonable level of complexity, processor implementations contain Serializing Instructions (SIs): instructions, such as those that write control registers, that cannot be executed out-of-order (OoO). Maintaining sequential semantics may force SIs to serialize the pipeline and execute as the only instruction in the window. We examine the frequency of SIs in three ISAs, SPARC V9, X86-64, and PowerPC, for several system-intensive workloads. Across ISAs, we observe 2–8 SIs per thousand instructions for most workloads. As explained by Amdahl's Law, such frequent SIs, which create serial regions within the instruction-level parallel execution of a single thread, can have a significant impact on performance. For the SPARC ISA (after removing TLB and register window effects), we observe a 4–17% performance difference between a modest out-of-order processor and a hypothetical processor which idealizes serializing instructions. We examine the consumption of values produced by several SIs, and observe that most values are consumed, but that the values are Effectively Useless (EU), i.e., they do not actually change the execution of the consuming instructions. To improve the performance of such SIs, we propose EU prediction, which can allow younger instructions to proceed, possibly reading a stale value, and yet still correctly execute. This simple technique improves the performance of five of our seven workloads by 8–12%.
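
The Amdahl's Law argument in the abstract can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from the report: the function names, the per-SI stall cost, and the base CPI are hypothetical values chosen only for illustration. It treats the cycles lost to SI-induced serialization as the fraction of execution time that an idealized processor would eliminate.

```python
import math


def amdahl_speedup(improved_fraction: float, improvement_factor: float) -> float:
    """Amdahl's Law: overall speedup when a fraction of execution time
    is sped up by improvement_factor and the rest is unchanged."""
    if not 0.0 <= improved_fraction <= 1.0:
        raise ValueError("improved_fraction must be in [0, 1]")
    if improvement_factor <= 0.0:
        raise ValueError("improvement_factor must be positive")
    return 1.0 / ((1.0 - improved_fraction)
                  + improved_fraction / improvement_factor)


def si_time_fraction(si_per_kilo_instr: float,
                     stall_cycles_per_si: float,
                     base_cpi: float) -> float:
    """Fraction of total cycles spent stalled on serializing instructions.

    All three parameters are illustrative assumptions, not measurements
    from the report.
    """
    base_cycles = 1000.0 * base_cpi                      # cycles without SI stalls
    stall_cycles = si_per_kilo_instr * stall_cycles_per_si
    return stall_cycles / (base_cycles + stall_cycles)


# Hypothetical inputs: 5 SIs per 1000 instructions (inside the 2-8 range
# quoted in the abstract), an assumed 30-cycle stall per SI, and an
# assumed base CPI of 1.0.
fraction = si_time_fraction(si_per_kilo_instr=5,
                            stall_cycles_per_si=30,
                            base_cpi=1.0)

# Idealizing SIs corresponds to an unbounded speedup of the stalled fraction.
ideal = amdahl_speedup(fraction, math.inf)
print(f"stall fraction: {fraction:.3f}, idealized speedup: {ideal:.2f}x")
```

Under these assumed inputs, a handful of SIs per thousand instructions already accounts for a double-digit percentage of execution time, which is the qualitative point the abstract makes. The specific numbers carry no weight beyond that.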

Citation

TR1606
