CPL - Chalmers Publication Library
| Utbildning | Forskning | Styrkeområden | Om Chalmers | In English In English Ej inloggad.

Improving Data Access Efficiency by Using a Tagless Access Buffer (TAB)

Alen Bardizbanyan (Institutionen för data- och informationsteknik, Datorteknik (Chalmers)) ; Peter Gavin ; David Whalley ; Magnus Själander (Institutionen för data- och informationsteknik, Datorteknik (Chalmers)) ; Per Larsson-Edefors (Institutionen för data- och informationsteknik, Datorteknik (Chalmers)) ; Sally A McKee (Institutionen för data- och informationsteknik, Datorteknik (Chalmers)) ; Per Stenström (Institutionen för data- och informationsteknik, Datorteknik (Chalmers))
Proceedings of the International Symposium on Code Generation and Optimization (CGO), Shenzhen, China, Feb. 23-27 (2164-2397). p. 269-279. (2013)
[Konferensbidrag, refereegranskat]

The need for energy efficiency continues to grow for many classes of processors, including those for which performance remains vital. Data cache is crucial for good performance, but it also represents a significant portion of the processor's energy expenditure. We describe the implementation and use of a tagless access buffer (TAB) that greatly improves data access energy efficiency while slightly improving performance. The compiler recognizes memory reference patterns within loops and allocates these references to a TAB. This combined hardware/software approach reduces energy usage by (1) replacing many level-one data cache (L1D) accesses with accesses to the smaller, more power-efficient TAB; (2) removing the need to perform tag checks or data translation lookaside buffer (DTLB) lookups for TAB accesses; and (3) reducing DTLB lookups when transferring data between the L1D and the TAB. Accesses to the TAB occur earlier in the pipeline, and data lines are prefetched from lower memory levels, which result in asmall performance improvement. In addition, we can avoid many unnecessary block transfers between other memory hierarchy levels by characterizing how data in the TAB are used. With a combined size equal to that of a conventional 32-entry register file, a four-entry TAB eliminates 40% of L1D accesses and 42% of DTLB accesses, on average. This configuration reduces data-access related energy by 35% while simultaneously decreasing execution time by 3%.

Nyckelord: energy, memory hierarchy, strided access



Den här publikationen ingår i följande styrkeområden:

Läs mer om Chalmers styrkeområden  

Denna post skapades 2013-02-25. Senast ändrad 2016-09-14.
CPL Pubid: 174063

 

Läs direkt!

Lokal fulltext (fritt tillgänglig)

Länk till annan sajt (kan kräva inloggning)