Synthetic Financial Fraud Transactions

FREE DATASET LIBRARY

Verified Data Provider

£0

Synthetic Data Generation

Tags and Keywords

Fraud

Banking

Synthetic

Transactions

Xgboost

Free

About

This synthetic dataset is designed to help data scientists and machine learning enthusiasts develop fraud detection models. It contains synthetic transaction data covering user details, transaction types, and calculated risk scores. It suits binary classification tasks, for example with gradient-boosted models such as XGBoost and LightGBM, and can also be used for anomaly detection, risk analysis, and security research. It has 21 attributes that mix numerical, categorical, and temporal data, including a binary fraud label (0 for not fraud, 1 for fraud).

Columns

Transaction_ID: A unique identifier for each individual transaction.
User_ID: A distinct identifier assigned to each user.
Transaction_Amount: The monetary value involved in the transaction, with amounts typically ranging from approximately 28.7 to 1,170.
Transaction_Type: Categorical variable indicating the nature of the transaction, such as Online, In-Store, ATM, or POS.
Timestamp: The date and time when the transaction occurred, covering a period from 1st January 2023 to 31st December 2023.
Account_Balance: The user's account balance immediately prior to the transaction, ranging from approximately 500 to 100,000.
Device_Type: Indicates the type of device used for the transaction, including Mobile, Desktop, and Tablet.
Location: The geographical location where the transaction took place, with examples including Tokyo and Mumbai.
Merchant_Category: The type of merchant involved in the transaction, such as Retail, Food, Travel, Clothing, or Groceries.
IP_Address_Flag: A binary indicator (0 or 1) denoting whether the IP address used for the transaction was flagged as suspicious.
Previous_Fraudulent_Activity: The count of past fraudulent activities associated with the user.
Daily_Transaction_Count: The total number of transactions made by the user on that particular day, typically ranging from 1 to 14.
Avg_Transaction_Amount_7d: The user's average transaction amount over the preceding 7 days, typically ranging from 10 to 500.
Failed_Transaction_Count_7d: The number of failed transactions by the user within the last 7 days, typically ranging from 0 to 4.
Card_Type: The type of payment card utilised, such as Credit, Debit, Prepaid, Mastercard, or Visa.
Card_Age: The age of the payment card in months, typically ranging from 1 to 239 months.
Transaction_Distance: The geographical distance between the user's usual location and the transaction location, typically ranging from 0.25 to 5,000 units.
Authentication_Method: The method employed by the user for authentication, including PIN or Biometric.
Risk_Score: A calculated fraud risk score for the transaction, ranging from 0 to 1.
Is_Weekend: A binary indicator (0 or 1) specifying whether the transaction occurred on a weekend.
Fraud_Label: The target variable, indicating whether the transaction is fraudulent (1) or not fraudulent (0).
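
The column list above can be turned into a quick sanity check before any modelling. The following is a minimal sketch using only the Python standard library. The function name `validate_rows` is hypothetical, and the check assumes the binary columns are stored as the literal strings 0 and 1, which the listing does not confirm.

```python
import csv
import io

# Column names exactly as given in the listing (21 in total).
EXPECTED_COLUMNS = [
    "Transaction_ID", "User_ID", "Transaction_Amount", "Transaction_Type",
    "Timestamp", "Account_Balance", "Device_Type", "Location",
    "Merchant_Category", "IP_Address_Flag", "Previous_Fraudulent_Activity",
    "Daily_Transaction_Count", "Avg_Transaction_Amount_7d",
    "Failed_Transaction_Count_7d", "Card_Type", "Card_Age",
    "Transaction_Distance", "Authentication_Method", "Risk_Score",
    "Is_Weekend", "Fraud_Label",
]
# Columns the listing describes as binary indicators.
BINARY_COLUMNS = ("IP_Address_Flag", "Is_Weekend", "Fraud_Label")


def validate_rows(handle):
    """Return a list of problems found in a CSV file-like object."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for col in BINARY_COLUMNS:
            # Assumption: flags are written as "0" or "1".
            if row[col].strip() not in ("0", "1"):
                problems.append(f"line {line_no}: {col}={row[col]!r} is not 0/1")
        try:
            score = float(row["Risk_Score"])
        except ValueError:
            problems.append(f"line {line_no}: Risk_Score is not numeric")
            continue
        if not 0.0 <= score <= 1.0:
            problems.append(f"line {line_no}: Risk_Score={score} outside [0, 1]")
    return problems


# Demo on an in-memory file with placeholder values (not real dataset rows).
demo = io.StringIO()
writer = csv.DictWriter(demo, fieldnames=EXPECTED_COLUMNS)
writer.writeheader()
writer.writerow(dict.fromkeys(EXPECTED_COLUMNS, "0"))
demo.seek(0)
print(validate_rows(demo))
```

To run it against the downloaded file, pass an open file handle instead of the in-memory buffer, for example `open(path, newline="")` where `path` is wherever the CSV was saved.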

Distribution

The dataset is provided in CSV format with a file size of 7.02 MB. It contains 50,000 records (rows) and 21 columns, with a mix of numerical, categorical, and temporal fields.

Usage

This dataset suits a variety of applications, including:

Training fraud detection models, particularly for binary classification.
Anomaly detection within financial transactions.
Developing and evaluating risk scoring systems for financial institutions such as banks and fintech companies.
Feature engineering and model explainability research in the domain of financial security.
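
As one illustration of evaluating a risk score, the supplied Risk_Score column can be compared against Fraud_Label at a chosen cut-off. The sketch below is a baseline under stated assumptions: the helper names are hypothetical, the 0.5 threshold is arbitrary, and the listing does not say how Risk_Score was calculated or what threshold, if any, is meaningful. The sample values are made up for the demo and are not taken from the dataset.

```python
def confusion_counts(scores, labels, threshold):
    """Count true/false positives and negatives for score >= threshold."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(scores, labels, threshold=0.5):
    """Return (precision, recall); each is 0.0 when its denominator is zero."""
    tp, fp, fn, _ = confusion_counts(scores, labels, threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up Risk_Score and Fraud_Label values, for illustration only.
sample_scores = [0.9, 0.8, 0.4, 0.2, 0.7]
sample_labels = [1, 0, 1, 0, 1]
print(precision_recall(sample_scores, sample_labels, threshold=0.5))
```

Sweeping the threshold over several values gives a rough precision/recall trade-off curve, which is a useful reference point before training a classifier such as XGBoost or LightGBM on the remaining columns.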

Coverage

The dataset covers transactions from 1st January 2023 to 31st December 2023. Transaction locations include cities such as Tokyo and Mumbai. No demographic information is provided; user-level detail is limited to transactional patterns and device usage (Mobile, Desktop, Tablet). The data spans several merchant categories and authentication methods.
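
The stated time range can be checked per row, along with whether Is_Weekend agrees with the Timestamp. This is a sketch only: the timestamp format string is an assumption (the listing does not give the format), the weekend is assumed to mean Saturday and Sunday, and `check_timestamp` is a hypothetical helper name.

```python
from datetime import datetime

# Assumption: timestamps look like "2023-06-14 09:30:00". Adjust if the
# actual file uses a different layout.
ASSUMED_FORMAT = "%Y-%m-%d %H:%M:%S"
RANGE_START = datetime(2023, 1, 1)
RANGE_END = datetime(2024, 1, 1)  # exclusive upper bound


def check_timestamp(raw_timestamp, is_weekend_flag):
    """Return (in_stated_range, weekend_flag_is_consistent) for one row."""
    ts = datetime.strptime(raw_timestamp, ASSUMED_FORMAT)
    in_range = RANGE_START <= ts < RANGE_END
    # weekday() returns 5 for Saturday and 6 for Sunday.
    is_weekend = ts.weekday() >= 5
    return in_range, is_weekend == bool(int(is_weekend_flag))


print(check_timestamp("2023-06-14 09:30:00", "0"))
```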

License

CC0: Public Domain

Who Can Use It

This dataset is primarily intended for data scientists and machine learning enthusiasts. It is especially useful for those looking to:

Build and test robust fraud detection models.
Perform binary classification tasks.
Conduct anomaly detection, risk analysis, and security research related to financial transactions.

Dataset Name Suggestions

Synthetic Financial Fraud Transactions
ML Fraud Detection Dataset 2023
Digital Transaction Risk Model Data
Fraudulent Transaction Simulation
Financial Security Analysis Dataset

Attributes

Original Data Source: Synthetic Financial Fraud Transactions

Listing Stats

Views: 274
Downloads: (not shown)
Listed: 22/08/2025
Region: Global
Quality: 5 / 5
Version: 1.0
