A machine learning-based framework for anomaly detection

Ball, Richard Steven

Date

2023

Author

Ball, Richard Steven

Abstract

With the expansion of modern technology and the increased adoption of digital payment methods, fraudsters are becoming more sophisticated in their approach. Organisations attempting to mitigate fraudulent attacks can leverage their data to drive systems dedicated to detecting anomalous behaviour. Unfortunately, the techniques used to detect anomalies are often fragmented, focussing on solving a very specific detection problem. In this study, a machine learning-based framework for anomaly detection is proposed that aims to combine multiple machine learning approaches to detect anomalies in transactional systems. The proposed framework includes a unified approach for combining aspects of the anomaly detection process into a single pipeline. The approach begins with a neural architecture search, where multiple candidate autoencoder architectures are randomly simulated to determine an optimal architecture configuration. An optimal architecture is determined by evaluating each model with a proposed thresholding algorithm, which calculates the optimal threshold using a balanced score. The output of the optimal architecture is then transformed from raw anomaly scores into a more manageable score in the range [0,1] through the application of Gaussian scaling. This approach is validated on a public data set for illustration purposes, followed by the application of the approach to a real-world transactional data set. Social network analysis is then introduced to the study by taking network data and calculating network metrics to generate a feature set for augmenting the initial real-world transactional features. These network metrics capture the behaviours that link users together within a transactional system. Computing the Shapley values of the autoencoder output returns the feature contribution of the network metrics to determine their impact on the model's ability to detect anomalies.
The combined network metrics and transactional feature set are then used to train a self-organising map to surface clusters of anomalous activity not detectable through the autoencoder, as well as to provide an additional approach to feature contribution through the exploration of the anomalous cluster weights. Finally, the various modelling approaches are combined into a proposed anomaly detection framework. The framework includes aspects of visual analysis and what-if analysis, which provide practitioners with an interface to analyse the outputs of the various machine learning techniques and to perform sensitivity analysis by inputting various classification costs and conducting threshold changes. The final proposed framework was implemented in a real-world setting to illustrate its practical applicability. The results of the implementation achieved the main objective, which was to improve the detection of anomalies in a transactional setting.
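
The score scaling and threshold selection steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes one common formulation of Gaussian scaling, max(0, erf((s - mu) / (sigma * sqrt(2)))), and it assumes balanced accuracy as a stand-in for the "balanced score", which the abstract does not define. The function names and the sample data are hypothetical.

```python
import math
import statistics


def gaussian_scale(scores):
    """Map raw anomaly scores to [0, 1] (assumed formulation, see lead-in)."""
    mu = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)
    if sigma == 0:
        # All scores identical: nothing stands out as anomalous.
        return [0.0 for _ in scores]
    return [
        max(0.0, math.erf((s - mu) / (sigma * math.sqrt(2))))
        for s in scores
    ]


def balanced_accuracy(labels, predictions):
    """Mean of the true-positive rate and the true-negative rate."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    tpr = tp / positives if positives else 0.0
    tnr = tn / negatives if negatives else 0.0
    return (tpr + tnr) / 2


def best_threshold(scores, labels):
    """Try each distinct score as a threshold; flag s >= threshold as anomalous."""
    best = (None, -1.0)
    for t in sorted(set(scores)):
        preds = [1 if s >= t else 0 for s in scores]
        value = balanced_accuracy(labels, preds)
        if value > best[1]:
            best = (t, value)
    return best


# Hypothetical reconstruction errors (raw anomaly scores) and known labels.
raw_scores = [0.10, 0.12, 0.11, 0.09, 0.13, 0.95, 0.10, 1.20]
labels = [0, 0, 0, 0, 0, 1, 0, 1]

scaled = gaussian_scale(raw_scores)
threshold, score = best_threshold(scaled, labels)
```

With this scaling, scores at or below the mean map to 0 and only unusually high scores approach 1, which makes a single threshold easier to interpret across models. In a setting without labels, the threshold search would need a different selection criterion.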

URI

https://orcid.org/0000-0001-7076-6157
http://hdl.handle.net/10394/42111

Collections

Natural and Agricultural Sciences [2757]