CPL - Chalmers Publication Library

Towards Understanding the Social Structure of Email and Spam Traffic

Farnaz Moradi (Department of Computer Science and Engineering, Networks and Systems (Chalmers))
Göteborg: Chalmers University of Technology, 2012. 123 pp.

Email is a pervasive means of communication on the Internet. Email exchanges between individuals can be seen as social interactions between email sender(s) and receiver(s), and can thus be represented as a network. Networks of human interactions such as friendship relations, research collaborations, and phone calls have been widely studied in order to understand the characteristics, structure, and dynamics of such social interactions. In this thesis, we look into the social network properties of email networks generated from real traffic, and investigate how a vast amount of unsolicited email traffic (spam) affects these properties. Recent advances in Internet data collection and processing have facilitated the study of the characteristics of email traffic observed on the Internet. In our study, we have collected large-scale email datasets from traffic traversing a high-speed Internet backbone link and have generated email networks from the observed communications to analyze the structure and dynamics of these social interactions. Moreover, we aim to unveil the distinguishing characteristics of legitimate and unsolicited email communications. We show that the networks of legitimate email traffic have the same structural and temporal properties that other social networks exhibit, and can therefore be modeled as small-world scale-free networks. However, unsolicited email communications cause deviations and anomalies in the structure of email networks, and these deviations from the expected social structural properties can be used to find the sources of spam email. We also show that email networks, similar to other social networks, have a community structure which can be found using different community detection algorithms. However, not all community detection algorithms can identify structural communities that coincide with the true logical communities of email networks, i.e., distinct communities of legitimate and unsolicited email. Our study shows that a link-based community detection algorithm is more suitable for this purpose than the more widely used node-based algorithms. The ability to identify the sources of spam and to separate unsolicited email from legitimate email using only the social structure of email traffic can potentially be used to improve protection against spam and other types of malicious activities on the Internet.
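The idea of representing email exchanges as a network and looking for structural deviations can be illustrated with a small sketch. This is not the method used in the thesis: the function names, the choice of out-degree and local clustering coefficient as indicators, and the threshold values are all assumptions made only for illustration. The sketch builds a network from (sender, receiver) pairs and flags senders that contact many distinct receivers who have no links among themselves.

```python
from collections import defaultdict
from itertools import combinations


def build_email_network(messages):
    """Build an undirected adjacency map and per-sender out-degree
    from an iterable of (sender, receiver) pairs."""
    neighbors = defaultdict(set)
    out_degree = defaultdict(int)
    seen_edges = set()
    for sender, receiver in messages:
        if sender == receiver:
            continue  # ignore self-addressed email
        neighbors[sender].add(receiver)
        neighbors[receiver].add(sender)
        if (sender, receiver) not in seen_edges:
            seen_edges.add((sender, receiver))
            out_degree[sender] += 1  # count distinct receivers only
    return neighbors, out_degree


def local_clustering(neighbors, node):
    """Fraction of pairs of a node's neighbors that are linked to each other."""
    nbrs = neighbors[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
    return 2.0 * links / (k * (k - 1))


def flag_suspects(messages, min_out_degree=3, max_clustering=0.1):
    """Hypothetical heuristic (thresholds are illustrative assumptions):
    flag senders with many distinct receivers and almost no clustering."""
    neighbors, out_degree = build_email_network(messages)
    return sorted(
        node
        for node, degree in out_degree.items()
        if degree >= min_out_degree
        and local_clustering(neighbors, node) <= max_clustering
    )


if __name__ == "__main__":
    sample = [
        ("alice", "bob"), ("bob", "carol"), ("carol", "alice"),
        ("alice", "carol"),
        ("spammer", "x1"), ("spammer", "x2"), ("spammer", "x3"),
    ]
    print(flag_suspects(sample))
```

In the sample data, the three legitimate users form a triangle, so their neighbors are linked to each other, while the receivers of the hypothetical spammer have no links among themselves. A real analysis would need calibrated indicators and thresholds rather than the fixed values assumed here.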

Keywords: Internet Backbone Traffic, Email Networks, Social Network Analysis, Spam, Community Detection, Anomaly Detection

This record was created 2012-10-18. Last modified 2013-03-28.
CPL Pubid: 164884



Departments (Chalmers)

Department of Computer Science and Engineering, Networks and Systems (Chalmers)




Related publications

Included papers:

On Collection of Large-Scale Multi-Purpose Datasets on Internet Backbone Links

Towards Modeling Legitimate and Unsolicited Email Traffic Using Social Network Properties

An Evaluation of Community Detection Algorithms on Large-Scale Email Traffic


Date: 2012-11-14
Time: 13:15
Room: HC3, Hörsalsvägen 14, Chalmers University of Technology
Opponent: Dr. Thomas Karagiannis, Microsoft Research, Cambridge

Part of series

Technical report L - Department of Computer Science and Engineering, Chalmers University of Technology and Göteborg University 98L