CMP Directory Coherence: One Granularity Does Not Fit All
Authors
Reinhardt, Steven K.
Hill, Mark
Beckmann, Bradford M.
Basu, Arkaprava
Type
Technical Report
Abstract
To support legacy software, large CMPs often provide cache coherence via an on-chip
directory rather than snooping. In those designs, a key challenge is maximizing the effectiveness
of precious on-chip directory state. Most current directory protocols miss an opportunity by
organizing all state in per-block records.
To increase the "reach" of on-chip directory state, we apply ideas from snooping region
coherence to develop a dual-grain CMP directory protocol. First, we enable a tradeoff
between unnecessary probes (e.g., invalidations) and on-chip directory storage size by
organizing a directory entry with both per-1KB-region state and per-64B-block state. Second, to
optimize for sparsely accessed regions, we evaluate an asymmetric dual-granularity directory,
wherein some entries are smaller and can hold only one block per region rather than as many as
16.
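
The granularities named above imply that a 1KB region covers 16 blocks of 64B each. The following is a minimal Python sketch of that address arithmetic and of an entry that keeps both region-level and block-level sharer state. The class name, method names, and fallback policy are hypothetical illustrations, not the report's protocol; only the 1KB, 64B, and 16-versus-1 block capacities come from the abstract.

```python
BLOCK_BYTES = 64      # per-block granularity (from the abstract)
REGION_BYTES = 1024   # per-region granularity (from the abstract)
BLOCKS_PER_REGION = REGION_BYTES // BLOCK_BYTES  # 16


def split_address(addr):
    """Return (region number, block index within that region)."""
    region = addr // REGION_BYTES
    block_in_region = (addr % REGION_BYTES) // BLOCK_BYTES
    return region, block_in_region


class DualGrainEntry:
    """Hypothetical directory entry with region- and block-level state.

    capacity=16 models a full-size entry; capacity=1 models the smaller
    entry of the asymmetric design. The tracking and fallback policy
    below is an assumption made for illustration only.
    """

    def __init__(self, region, capacity=BLOCKS_PER_REGION):
        self.region = region
        self.capacity = capacity
        self.region_sharers = set()   # nodes that may cache any block of the region
        self.block_sharers = {}       # block index -> nodes tracked precisely

    def track_block(self, block, node):
        """Record a sharer; return True if block-level state was available."""
        self.region_sharers.add(node)
        if block in self.block_sharers or len(self.block_sharers) < self.capacity:
            self.block_sharers.setdefault(block, set()).add(node)
            return True
        return False

    def probe_targets(self, block):
        """Use precise sharers when tracked; otherwise fall back to the
        coarser region sharers, which may cause unnecessary probes."""
        return self.block_sharers.get(block, self.region_sharers)
```

Under these assumptions, an entry created with capacity=1 tracks its first block precisely and answers queries for any other block in the region with the coarser region-level sharer set, which illustrates the storage-versus-probe tradeoff the abstract describes.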
Results with commercial and PARSEC workloads on a 16-node CMP show that the new dual-grain
CMP directory design uses less space, or usually issues fewer unnecessary probes, than
conventional designs and eliminates directory accesses for many private blocks.
Citation
TR1798