CPL - Chalmers Publication Library
| Utbildning | Forskning | Styrkeområden | Om Chalmers | In English In English Ej inloggad.

On Sorting and Load-Balancing on GPUs

Daniel Cederman (Institutionen för data- och informationsteknik, Nätverk och system, Datakommunikation och distribuerade system (Chalmers)) ; Philippas Tsigas (Institutionen för data- och informationsteknik, Nätverk och system, Datakommunikation och distribuerade system (Chalmers))
Proceedings of the 1st Swedish Workshop on Multi-Core Computing (1103-1581). (2008)
[Konferensbidrag, refereegranskat]

In this paper we present GPU-Quicksort, an efficient Quicksort algorithm suitable for highly parallel multi-core graphics processors. Quicksort has previously been considered as an inefficient sorting solution for graphics processors, but we show that GPU-Quicksort often performs better than the fastest known sorting implementations for graphics processors, such as radix and bitonic sort. Quicksort can thus be seen as a viable alternative for sorting large quantities of data on graphics processors. We also present a comparison of different load balancing schemes. To get maximum performance on the manycore graphics processors it is important to have an even balance of the workload so that all processing units contribute equally to the task at hand. This can be hard to achieve when the cost of a task is not known beforehand and when new sub-tasks are created dynamically during execution. With the recent advent of scatter operations and atomic hardware primitives it is now possible to bring some of the more elaborate dynamic load balancing schemes from the conventional SMP systems domain to the graphics processor domain.



Denna post skapades 2009-05-08.
CPL Pubid: 93713

 

Läs direkt!


Länk till annan sajt (kan kräva inloggning)