CMP Directory Coherence: One Granularity Does Not Fit All

Authors

Reinhardt, Steven K.
Hill, Mark
Beckmann, Bradford M.
Basu, Arkaprava

Type

Technical Report

Abstract

To support legacy software, large CMPs often provide cache coherence via an on-chip directory rather than snooping. In those designs, a key challenge is maximizing the effectiveness of precious on-chip directory state. Most current directory protocols miss an opportunity by organizing all state in per-block records. To increase the "reach" of on-chip directory state, we apply ideas from snooping region coherence to develop a dual-grain CMP directory protocol. First, we enable a tradeoff between unnecessary probes (e.g., invalidations) and on-chip directory storage size by organizing a directory entry with both per-1KB-region state and per-64B-block state. Second, to optimize for sparsely accessed regions, we evaluate an asymmetric dual-granularity directory, wherein some entries are smaller and can hold only one block per region rather than as many as 16. Results with commercial and PARSEC workloads on a 16-node CMP show that the new dual-grain CMP directory design uses less space, or usually performs fewer unnecessary probes, than conventional designs, and that it eliminates directory accesses for many private blocks.
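The granularities named in the abstract imply a simple address split: a 1KB region holds 16 blocks of 64B each. The sketch below illustrates that arithmetic and one possible shape for a dual-grain entry. Only the 1KB, 64B, and 16-blocks-per-region figures come from the abstract. The class `DualGrainEntry`, its fields, and the fallback to region-level sharers are hypothetical illustrations, not the protocol described in the report.

```python
BLOCK_BYTES = 64      # per-block granularity from the abstract
REGION_BYTES = 1024   # per-region granularity from the abstract
BLOCKS_PER_REGION = REGION_BYTES // BLOCK_BYTES  # 16


def split_address(addr):
    """Split a byte address into (region number, block index in region, byte offset)."""
    region = addr // REGION_BYTES
    block_in_region = (addr % REGION_BYTES) // BLOCK_BYTES
    offset = addr % BLOCK_BYTES
    return region, block_in_region, offset


class DualGrainEntry:
    """Hypothetical directory entry: coarse region sharers plus fine per-block sharers.

    max_blocks=16 models a full entry; max_blocks=1 models the smaller
    entries of the asymmetric design (an assumption about their shape).
    """

    def __init__(self, region, max_blocks=BLOCKS_PER_REGION):
        self.region = region
        self.max_blocks = max_blocks
        self.region_sharers = set()   # nodes that touched any block in the region
        self.block_sharers = {}       # block index -> set of node ids

    def record_access(self, block, node):
        """Record an access; return False if no per-block slot was available."""
        self.region_sharers.add(node)
        if block not in self.block_sharers and len(self.block_sharers) >= self.max_blocks:
            return False
        self.block_sharers.setdefault(block, set()).add(node)
        return True

    def probe_targets(self, block, writer):
        """Nodes to probe on a write: precise if the block is tracked,
        otherwise the conservative region-level superset (assumed policy)."""
        sharers = self.block_sharers.get(block, self.region_sharers)
        return sharers - {writer}
```

In this sketch, a block without per-block state falls back to the region-level sharer set, which may trigger unnecessary probes. That is the storage-versus-probes tradeoff the abstract describes, shown here only under the stated assumptions.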

Citation

TR1798
