CMP Directory Coherence: One Granularity Does Not Fit All
| dc.contributor.author | Reinhardt, Steven K. | |
| dc.contributor.author | Hill, Mark | |
| dc.contributor.author | Beckmann, Bradford M. | |
| dc.contributor.author | Basu, Arkaprava | |
| dc.date.accessioned | 2013-07-12T16:27:20Z | |
| dc.date.available | 2013-07-12T16:27:20Z | |
| dc.date.issued | 2013-07-11 | |
| dc.description.abstract | To support legacy software, large CMPs often provide cache coherence via an on-chip directory rather than snooping. In those designs, a key challenge is maximizing the effectiveness of precious on-chip directory state. Most current directory protocols miss an opportunity by organizing all state in per-block records. To increase the "reach" of on-chip directory state, we apply ideas from snooping region coherence to develop a dual-grain CMP directory protocol. First, we enable a tradeoff between unnecessary probes (e.g., invalidations) and on-chip directory storage size by organizing a directory entry with both per-1KB-region state and per-64B-block state. Second, to optimize for sparsely accessed regions, we evaluate an asymmetric dual-granularity directory, wherein some entries are smaller and can hold only one block per region rather than as many as 16. Results with commercial and PARSEC workloads on a 16-node CMP show that the new dual-grain CMP directory design uses less space, or usually does fewer unnecessary probes, than conventional designs and eliminates directory accesses for many private blocks. | en |
| dc.identifier.citation | TR1798 | en |
| dc.identifier.uri | http://digital.library.wisc.edu/1793/66144 | |
| dc.subject | Chip Multiprocessors | en |
| dc.subject | Energy Efficiency | en |
| dc.subject | Cache Coherence | en |
| dc.title | CMP Directory Coherence: One Granularity Does Not Fit All | en |
| dc.type | Technical Report | en |
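
The two granularities named in the abstract (1KB regions and 64B blocks) imply 16 blocks per region, which matches the "as many as 16" figure for a full entry. The sketch below only illustrates that arithmetic and how an address would split into a region number and a block index under those sizes. The function name `split_address` and the flat physical-address assumption are hypothetical and are not taken from the report.

```python
# Sizes taken from the abstract; everything else here is an illustrative assumption.
REGION_BYTES = 1024  # per-1KB-region state
BLOCK_BYTES = 64     # per-64B-block state
BLOCKS_PER_REGION = REGION_BYTES // BLOCK_BYTES  # 1024 / 64 = 16


def split_address(addr):
    """Split a byte address into (region number, block index in region, byte offset in block).

    Hypothetical helper: assumes a flat, non-negative integer address.
    """
    region = addr // REGION_BYTES
    block_in_region = (addr % REGION_BYTES) // BLOCK_BYTES
    offset = addr % BLOCK_BYTES
    return region, block_in_region, offset


if __name__ == "__main__":
    print(BLOCKS_PER_REGION)
    print(split_address(0x12345))
```

Under these assumptions, a region-level entry covers all addresses that share the same region number, while the block index selects one of the 16 per-block slots within it.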