CPL - Chalmers Publication Library

Runtime-guided cache coherence optimizations in multi-core architectures

Madhavan Manivannan (Department of Computer Science and Engineering, Computer Engineering (Chalmers)) ; Per Stenström (Department of Computer Science and Engineering, Computer Engineering (Chalmers))
Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS (1530-2075). p. 625-636. (2014)
[Conference paper, peer-reviewed]

Emerging task-based parallel programming models shield programmers from the daunting task of parallelism management by delegating the mapping and scheduling of individual tasks to the runtime system. The runtime system can use programmer-supplied semantic information about task dependencies, together with task mapping information, to enable optimizations such as data-flow-based execution and locality-aware task scheduling. If the cache coherence substrate also had access to this information, it could aggressively optimize prevailing access patterns such as one-to-many producer-consumer sharing and migratory sharing. Such a linkage has, however, not been studied before. We present a family of runtime-guided cache coherence optimizations enabled by passing dependency and mapping information from the runtime system to the cache coherence substrate. With this information available, we show that optimizations such as downgrading and self-invalidation, which help reduce the overheads associated with producer-consumer and migratory sharing, can be supported with reasonable extensions to the baseline cache coherence protocol. Our experimental results establish that each optimization provides a significant performance gain in isolation and can provide additional gains when combined. Finally, we evaluate these optimizations in the context of previously proposed runtime-guided prefetching schemes and show that they can have synergistic effects.
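As a rough illustration of why downgrading and self-invalidation can help with one-to-many producer-consumer sharing, the sketch below models a single cache block in a toy MSI-style directory. It is not the protocol from the paper: the names `Block`, `Stats`, `downgrade`, `self_invalidate`, and `run` are hypothetical, and the model assumes the runtime issues a downgrade hint when the producer task ends and a self-invalidation hint when each consumer task ends. It only counts reads that must be forwarded to a remote owner and invalidation messages sent on writes.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Block:
    """Directory entry for one cache block (toy MSI model, illustrative only)."""
    owner: Optional[int] = None                     # core holding the block in Modified state
    sharers: Set[int] = field(default_factory=set)  # cores holding it in Shared state


@dataclass
class Stats:
    three_hop_reads: int = 0  # reads that had to be forwarded to a remote owner
    invalidations: int = 0    # invalidation messages sent on behalf of a write


def write(block: Block, core: int, stats: Stats) -> None:
    # A write must first invalidate every copy held by another core.
    stats.invalidations += len(block.sharers - {core})
    if block.owner is not None and block.owner != core:
        stats.invalidations += 1
    block.sharers.clear()
    block.owner = core


def read(block: Block, core: int, stats: Stats) -> None:
    if block.owner is not None and block.owner != core:
        # The dirty copy lives in another cache: the request is forwarded there.
        stats.three_hop_reads += 1
        block.sharers.add(block.owner)  # the old owner keeps a Shared copy
        block.owner = None
    if block.owner != core:
        block.sharers.add(core)


def downgrade(block: Block, core: int) -> None:
    # Assumed runtime hint at producer task end: write back, keep a Shared copy.
    if block.owner == core:
        block.owner = None
        block.sharers.add(core)


def self_invalidate(block: Block, core: int) -> None:
    # Assumed runtime hint at consumer task end: drop the local Shared copy.
    block.sharers.discard(core)


def run(rounds: int, consumers: list, guided: bool) -> Stats:
    """Producer (core 0) writes the block, then every consumer reads it, per round."""
    block, stats = Block(), Stats()
    producer = 0
    for _ in range(rounds):
        write(block, producer, stats)
        if guided:
            downgrade(block, producer)
        for c in consumers:
            read(block, c, stats)
        if guided:
            for c in consumers:
                self_invalidate(block, c)
    return stats


if __name__ == "__main__":
    print("baseline:", run(3, [1, 2, 3], guided=False))
    print("guided:  ", run(3, [1, 2, 3], guided=True))
```

In this toy model the baseline pays one forwarded read per round (the first consumer finds the block dirty in the producer's cache) and one invalidation per consumer on every subsequent producer write, while the guided variant avoids both. Real protocols have additional costs, such as write-back traffic and directory updates, that this sketch ignores.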

Keywords: cache coherence, prefetching, runtime system, self-invalidation, task parallelism


Article number 6877295




This record was created 2014-09-11. Last modified 2015-01-15.
CPL Pubid: 202653

 
