A synthetic fraud data generation methodology

Emilie Lundin (Department of Computer Engineering); Håkan Kvarnström (Department of Computer Engineering); Erland Jonsson (Department of Computer Engineering)
Lecture Notes in Computer Science (ISSN 0302-9743), Vol. 2513 (2002), pp. 265-277. Proceedings of the 4th International Conference on Information and Communications Security (ICICS 2002), Singapore, 9-12 December 2002.
In many cases, synthetic data is more suitable than authentic data for testing and training fraud detection systems. However, synthetic data has drawbacks of its own: because it is artificial, it may lack the realism of authentic data. To counter this disadvantage, we have developed a method for generating synthetic data that is derived from authentic data. We identify the important characteristics of the authentic data and of the frauds we want to detect, and generate synthetic data with these properties.
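The general idea in the abstract (extract characteristics from authentic data, then generate synthetic data with those characteristics plus the frauds of interest) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the single modelled characteristic (an event duration), the Gaussian model, the way fraud is injected by scaling durations, and the hypothetical names `fit_profile`, `generate_synthetic`, `fraud_rate`, and `fraud_scale`.

```python
import random
import statistics


def fit_profile(authentic_durations):
    """Summarize authentic data by its mean and sample standard deviation.

    Assumption: one numeric characteristic is enough for illustration.
    """
    return {
        "mean": statistics.mean(authentic_durations),
        "stdev": statistics.stdev(authentic_durations),
    }


def generate_synthetic(profile, n_events, fraud_rate, fraud_scale, seed=0):
    """Draw synthetic events that follow the fitted profile.

    A fraction `fraud_rate` of events is labeled fraudulent, and their
    duration is multiplied by `fraud_scale` (a hypothetical fraud model).
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    events = []
    for _ in range(n_events):
        is_fraud = rng.random() < fraud_rate
        # Clamp at zero because a Gaussian draw can be negative.
        duration = max(0.0, rng.gauss(profile["mean"], profile["stdev"]))
        if is_fraud:
            duration *= fraud_scale
        events.append({"duration": duration, "fraud": is_fraud})
    return events


# Usage with made-up "authentic" durations (illustrative values only).
profile = fit_profile([10, 20, 30])
synthetic = generate_synthetic(profile, n_events=1000, fraud_rate=0.05, fraud_scale=5.0)
```

Because the generator labels each event itself, the resulting data set comes with ground truth for fraud, which is one practical reason synthetic data can be convenient for training and testing a detector.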

Keywords: fraud detection, synthetic test data, data generation methodology, user simulation, system simulation

