CPL - Chalmers Publication Library

# NB-FEB: A Universal Scalable Easy-to-Use Synchronization Primitive for Manycore Architectures

Phuong Ha ; Philippas Tsigas (Institutionen för data- och informationsteknik, Nätverk och system, Datakommunikation och distribuerade system (Chalmers)) ; Otto Anshus
Proceedings of the 13th International Conference on Principle of Distributed Systems (OPODIS 2009), Lecture Notes in Computer Science (1611-3349). Vol. 5923 (2009), p. 189-203.
[Konferensbidrag, refereegranskat]

his paper addresses the problem of universal synchronization primitives that can support scalable thread synchronization for large-scale manycore architectures. The universal synchronization primitives that have been deployed widely in conventional architectures, are the compare-and-swap (CAS) and load-linked/store-conditional (LL/SC) primitives. However, such synchronization primitives are expected to reach their scalability limits in the evolution to manycore architectures with thousands of cores. We introduce a non-blocking full/empty bit primitive, or NB-FEB for short, as a promising synchronization primitive for parallel programming on manycore architectures. We show that the NB-FEB primitive is universal, scalable, feasible and easy to use. NB-FEB, together with registers, can solve the consensus problem for an arbitrary number of processes (universality). NB-FEB is combinable, namely its memory requests to the same memory location can be combined into only one memory request, which consequently makes NB-FEB scalable (scalability). Since NB-FEB is a variant of the original full/empty bit that always returns a value instead of waiting for a conditional flag, it is as feasible as the original full/empty bit, which has been implemented in many computer systems (feasibility). We construct, on top of NB-FEB, a non-blocking software transactional memory system called NBFEB-STM, which can be used as an abstraction to handle concurrent threads easily. NBFEB-STM is space efficient: the space complexity of each object updated by N concurrent threads/transactions is ${\it \Theta}(N)$, which is optimal.

CPL Pubid: 106277

# Läs direkt!

Länk till annan sajt (kan kräva inloggning)