Financial Fraud Detection Dataset

Medical Sphere

Verified Data Provider

£0

Fraud Detection & Risk Management

Tags and Keywords: Credit, Card, Fraud, Detection, Financial

Financial Fraud Detection Dataset on the Opendatabay data marketplace

About

This dataset is designed to support research and model development in the area of fraud detection. It consists of real-world credit card transactions made by European cardholders over a two-day period in September 2013. Out of 284,807 transactions, 492 are labeled as fraudulent (positive class), making this a highly imbalanced classification problem.

Performance Note:

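Because of the extreme class imbalance, plain accuracy is not informative: a model that labels every transaction as legitimate is already about 99.83% accurate while catching no fraud. We recommend using the Area Under the Precision-Recall Curve (AUPRC) or F1-score for model evaluation.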
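As a rough illustration of the recommendation above, the following pure-Python sketch computes average precision (a common step-wise estimate of AUPRC) and the F1-score on a small made-up set of labels and scores. The helper names `average_precision` and `f1`, the 0.5 threshold, and the toy data are illustrative assumptions, not part of the dataset; the average-precision helper also assumes there are no tied scores.

```python
def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank transactions from most to least suspicious.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            ap += hits / rank  # precision at the rank of each true positive
    return ap / total_pos


def f1(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (fraud) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (made up): 2 frauds among 10 transactions.
y_true = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.10, 0.40, 0.90, 0.20, 0.30, 0.05, 0.60, 0.15, 0.12, 0.08]

thresholded = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 is an arbitrary cut-off
always_legit = [0] * len(y_true)

ap = average_precision(y_true, scores)
f1_model = f1(y_true, thresholded)
f1_baseline = f1(y_true, always_legit)
```

On this toy data the thresholded model and the "always legitimate" baseline have the same accuracy (8 of 10 correct), yet only the F1-score separates them, which is the behaviour the note above warns about.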

Features:

Time Series Data: Each row represents a transaction, with the Time feature indicating the number of seconds elapsed since the first transaction.
Dimensionality Reduction Applied: Features V1 through V28 are anonymized principal components derived from a PCA transformation due to confidentiality constraints.
Raw Transaction Amount: The Amount field reflects the transaction value, useful for cost-sensitive modeling.
Binary Classification Target: The Class label is 1 for fraud and 0 for legitimate transactions.
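The column layout described above can be checked with a short loader. This is a minimal sketch using only the standard library; the function name `read_transactions`, the output field names, and the in-memory sample rows are assumptions made for illustration, and the header is validated as a set because the listing does not state the column order.

```python
import csv
import io

PCA_COLUMNS = [f"V{i}" for i in range(1, 29)]  # V1 .. V28
EXPECTED_COLUMNS = {"Time", "Amount", "Class", *PCA_COLUMNS}  # 31 columns


def read_transactions(csv_file):
    """Yield one parsed dict per transaction, validating the header first."""
    reader = csv.DictReader(csv_file)
    missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    for row in reader:
        label = int(float(row["Class"]))
        if label not in (0, 1):
            raise ValueError(f"unexpected Class value: {row['Class']!r}")
        yield {
            "seconds_since_first": float(row["Time"]),
            "components": [float(row[name]) for name in PCA_COLUMNS],
            "amount": float(row["Amount"]),
            "is_fraud": label == 1,
        }


# Tiny in-memory stand-in for the real CSV (all values are made up).
header = ["Time", *PCA_COLUMNS, "Amount", "Class"]
rows = [
    ["0", *["0.1"] * 28, "149.62", "0"],
    ["406", *["-1.5"] * 28, "1.00", "1"],
]
sample = io.StringIO("\n".join(",".join(r) for r in [header, *rows]) + "\n")
transactions = list(read_transactions(sample))
```

For the downloaded file, the same function can be given a handle opened with `open(path, newline="")`, where `path` is wherever the CSV was saved.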

Usage:

Machine learning model training for fraud detection.
Evaluation of anomaly detection and imbalanced classification methods.
Development of cost-sensitive learning approaches using the Amount variable.
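One way to read the cost-sensitive use case above is to score predictions by money rather than by error count. The sketch below is an assumption-laden example, not a method prescribed by the dataset authors: it charges the full `Amount` for every missed fraud and a flat, hypothetical `review_cost` for every flagged transaction.

```python
def misclassification_cost(amounts, y_true, y_pred, review_cost=5.0):
    """Total cost of a set of predictions under a simple, assumed cost model.

    - Missed fraud (truth 1, predicted 0): the transaction amount is lost.
    - Any flagged transaction (predicted 1): a fixed manual-review cost.
    """
    total = 0.0
    for amount, truth, pred in zip(amounts, y_true, y_pred):
        if pred == 1:
            total += review_cost
        elif truth == 1:
            total += amount
    return total


# Made-up example: four transactions, two of them fraudulent.
amounts = [20.0, 500.0, 35.0, 1200.0]
y_true = [0, 1, 0, 1]

cautious = [0, 1, 0, 0]    # one error: misses the 1200.00 fraud
aggressive = [1, 0, 1, 1]  # three errors: two false alarms, misses the 500.00 fraud

cost_cautious = misclassification_cost(amounts, y_true, cautious)
cost_aggressive = misclassification_cost(amounts, y_true, aggressive)
```

Under these assumed costs the second model makes more mistakes but is cheaper overall, because the fraud it misses is smaller than the one missed by the first model.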

Data Summary:

Total Records: 284,807
Fraud Cases: 492
Imbalance Ratio: Fraudulent transactions account for just 0.172% of the dataset.
Columns: 31 total (28 PCA features, plus Time, Amount, and Class)
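The figures in the summary can be turned into class weights for training. The sketch below recomputes the fraud rate from the stated counts and derives inverse-frequency weights with the formula n_samples / (n_classes * class_count), the same heuristic scikit-learn applies for `class_weight="balanced"`. The function name `balanced_weights` is an illustrative assumption, and this weighting is only one of several ways to handle the imbalance.

```python
TOTAL_RECORDS = 284_807
FRAUD_CASES = 492
LEGIT_CASES = TOTAL_RECORDS - FRAUD_CASES

fraud_rate = FRAUD_CASES / TOTAL_RECORDS


def balanced_weights(class_counts):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: n_samples / (n_classes * count)
            for label, count in class_counts.items()}


weights = balanced_weights({0: LEGIT_CASES, 1: FRAUD_CASES})

print(f"fraud rate: {fraud_rate:.4%}")
print(f"class weights: legitimate={weights[0]:.3f}, fraud={weights[1]:.1f}")
```

With these counts each fraudulent transaction is weighted several hundred times more heavily than a legitimate one, so both classes contribute equally to a weighted loss.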

License:

The dataset is provided under the CC0 (Public Domain) license, allowing users to freely use, modify, and distribute the data without any restrictions.

Acknowledgements

The dataset was collected and analysed during a research collaboration between Worldline and the Machine Learning Group (http://mlg.ulb.ac.be) of ULB (Université Libre de Bruxelles) on big data mining and fraud detection. More details on current and past projects on related topics are available at https://www.researchgate.net/project/Fraud-detection-5 and on the page of the DefeatFraud project.

Please cite the following works:

Dal Pozzolo, Andrea; Caelen, Olivier; Johnson, Reid A.; Bontempi, Gianluca. Calibrating Probability with Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining (CIDM), IEEE, 2015.

Dal Pozzolo, Andrea; Caelen, Olivier; Le Borgne, Yann-Aël; Waterschoot, Serge; Bontempi, Gianluca. Learned lessons in credit card fraud detection from a practitioner perspective. Expert Systems with Applications, 41(10), 4915-4928, 2014, Pergamon.

Dal Pozzolo, Andrea; Boracchi, Giacomo; Caelen, Olivier; Alippi, Cesare; Bontempi, Gianluca. Credit card fraud detection: a realistic modeling and a novel learning strategy. IEEE Transactions on Neural Networks and Learning Systems, 29(8), 3784-3797, 2018, IEEE.

Dal Pozzolo, Andrea. Adaptive Machine Learning for Credit Card Fraud Detection. ULB MLG PhD thesis (supervised by G. Bontempi).

Carcillo, Fabrizio; Dal Pozzolo, Andrea; Le Borgne, Yann-Aël; Caelen, Olivier; Mazzer, Yannis; Bontempi, Gianluca. Scarff: a scalable framework for streaming credit card fraud detection with Spark. Information Fusion, 41, 182-194, 2018, Elsevier.

Carcillo, Fabrizio; Le Borgne, Yann-Aël; Caelen, Olivier; Bontempi, Gianluca. Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization. International Journal of Data Science and Analytics, 5(4), 285-300, 2018, Springer International Publishing.

Lebichot, Bertrand; Le Borgne, Yann-Aël; He, Liyun; Oblé, Frederic; Bontempi, Gianluca. Deep-Learning Domain Adaptation Techniques for Credit Cards Fraud Detection. INNSBDDL 2019: Recent Advances in Big Data and Deep Learning, pp. 78-88, 2019.

Carcillo, Fabrizio; Le Borgne, Yann-Aël; Caelen, Olivier; Oblé, Frederic; Bontempi, Gianluca. Combining Unsupervised and Supervised Learning in Credit Card Fraud Detection. Information Sciences, 2019.

Le Borgne, Yann-Aël; Bontempi, Gianluca. Reproducible Machine Learning for Credit Card Fraud Detection - Practical Handbook.

Lebichot, Bertrand; Paldino, Gianmarco; Siblini, Wissam; He, Liyun; Oblé, Frederic; Bontempi, Gianluca. Incremental learning strategies for credit cards fraud detection. International Journal of Data Science and Analytics.

Listing Stats

Views: 473
Listed: 23/08/2024
Region: Global
Quality: 5 / 5
Version: 1.0

Download Dataset in CSV Format
