20th AIAI 2024, 27 - 30 June 2024, Corfu, Greece

Synthetic Data Generation and Impact Analysis of Machine Learning Models for Enhanced Credit Card Fraud Detection

Ahmed Abdullah Khaled, Md Mahmudul Hasan, Shareeful Islam, Spyridon Papastergiou, Haralambos Mouratidis

Abstract:

  The financial industry is undergoing a substantial shift in its operating landscape because of the rapid integration of technology, and this transformation brings potential risks and challenges. One of the key concerns for the sector is the heightened occurrence of online fraud, which has been exacerbated by the growing prevalence of online payment methods on e-commerce platforms and other websites. Identifying credit card fraud is challenging because transactional data is highly imbalanced, which makes fraudulent activity difficult to detect and predict. In this context, this paper presents a unique approach for creating a synthetic dataset to tackle the class imbalance issue in credit card fraud detection. The approach adopts the Synthetic Minority Over-sampling Technique (SMOTE) to balance the dataset. An experiment is performed using several ML models, including SVM (Support Vector Machines), KNN (K-Nearest Neighbours), and Random Forest, to demonstrate the feasibility of using synthetic data. In this study, we combine resampling techniques such as SMOTE, which oversamples the minority class, with ensemble methods and appropriate evaluation metrics such as the F1-score to handle the imbalanced data. The results of the experiment are compared against widely used public datasets to evaluate model performance. The analysis reveals an imbalance in the real ULB (Université Libre de Bruxelles) dataset, with the positive class (frauds) comprising a mere 0.172% of all transactions. The findings clearly show that the Random Forest model performs better than the other models, achieving outstanding precision, recall, accuracy, and F1-score values in detecting fraudulent transactions and reducing false positives.
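
  The core idea behind SMOTE-style oversampling mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote_sketch`, its parameters, and the toy data are hypothetical, and the sketch assumes numeric feature vectors and at least two minority samples. Each synthetic point is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration only; assumes len(minority) >= 2.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        base = rng.choice(minority)
        # Rank the remaining minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        # Choose one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Interpolate: gap in [0, 1) places the new point between the two.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority class (made-up values, not taken from the ULB dataset).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_sketch(minority, n_new=6, k=2)
```

  Because every synthetic point is an interpolation between two existing minority samples, the new points stay inside the region already occupied by the minority class. In practice a maintained library implementation would normally be preferred over a hand-written sketch like this one.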

*** Title, author list and abstract as seen in the Camera-Ready version of the paper that was provided to Conference Committee. Small changes that may have occurred during processing by Springer may not appear in this window.