CPL - Chalmers Publication Library

Concurrent Data Structures for Efficient Streaming Aggregation

Daniel Cederman (Department of Computer Science and Engineering, Networks and Systems, Computer Communication and Distributed Systems (Chalmers)) ; Vincenzo Gulisano (Department of Computer Science and Engineering, Networks and Systems, Computer Communication and Distributed Systems (Chalmers)) ; Yiannis Nikolakopoulos (Department of Computer Science and Engineering, Networks and Systems, Computer Communication and Distributed Systems (Chalmers)) ; Marina Papatriantafilou (Department of Computer Science and Engineering, Networks and Systems, Computer Communication and Distributed Systems (Chalmers)) ; Philippas Tsigas (Department of Computer Science and Engineering, Networks and Systems, Computer Communication and Distributed Systems (Chalmers))
Göteborg : Chalmers University of Technology, 2013. - 13 pp.
[Report]

In many data gathering applications, information arrives as continuous streams rather than finite data sets, and efficient one-pass algorithms are required to cope with high input loads. Stream processing engines support continuous queries that process data in real time, and they have evolved rapidly from centralized to distributed, parallel and elastic solutions. While considerable effort has gone into leveraging the processing capacity of clusters of machines, less work has focused on exploiting the parallelism of multi-core architectures through concurrent and lock-free data structures that support the processing pipeline. This paper explores this aspect, focusing on multiway aggregation, where large data volumes are received from multiple input streams. Multiway aggregation is crucial in contexts such as sensor networks, social media and clickstream analysis applications. We provide three enhanced aggregate operators that rely on two new concurrent data structures and their lock-free implementations, supporting both order-sensitive and order-insensitive aggregation functions. We provide an extensive study of the properties of the proposed aggregate operators and the new data structures. We also present an extensive experimental evaluation of the proposed methods, giving empirical evidence of their superiority. In this evaluation we run a variety of aggregation queries on two large datasets, one with data extracted from SoundCloud, a music social network, and one with data from a smart grid metering network. In all the experiments, the new data structures improved aggregation performance significantly, by up to one order of magnitude, in terms of both processing throughput and latency.
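
To make the notion of multiway aggregation more concrete, the following is a minimal sequential sketch, not the operators or lock-free data structures proposed in the report. It assumes each input stream is a list of (timestamp, value) tuples already sorted by timestamp, merges them into one timestamp-ordered sequence, and applies an aggregation function per fixed-size (tumbling) window. The function name `multiway_aggregate`, the tuple layout and the window scheme are all hypothetical choices made for illustration.

```python
import heapq


def multiway_aggregate(streams, window_size, aggregate=sum):
    """Merge timestamp-sorted streams and aggregate values per tumbling window.

    Hypothetical illustration only: a single-threaded baseline, not the
    concurrent, lock-free design described in the abstract.
    """
    # heapq.merge lazily combines already-sorted inputs into one sorted
    # iterator; this gives the ordering that order-sensitive functions need.
    merged = heapq.merge(*streams, key=lambda item: item[0])

    results = []
    current_window = None
    values = []
    for timestamp, value in merged:
        window = timestamp // window_size
        if current_window is not None and window != current_window:
            # The window is complete: emit (window start, aggregated value).
            results.append((current_window * window_size, aggregate(values)))
            values = []
        current_window = window
        values.append(value)
    if values:
        results.append((current_window * window_size, aggregate(values)))
    return results


stream_a = [(0, 1), (5, 2), (12, 3)]
stream_b = [(3, 10), (11, 20), (25, 30)]

# Order-insensitive aggregation: the sum does not depend on arrival order.
sums = multiway_aggregate([stream_a, stream_b], window_size=10)

# Order-sensitive aggregation: "last value in the window" depends on the
# merged timestamp order across the input streams.
lasts = multiway_aggregate([stream_a, stream_b], window_size=10,
                           aggregate=lambda vs: vs[-1])
```

In this sketch the merge step is the sequential bottleneck when several threads feed the inputs, which is the kind of step the abstract says concurrent data structures are meant to address.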




This record was created 2013-12-20. Last modified 2015-03-30.
CPL Pubid: 190340