Financial Delinquency Prediction Data

Finance & Banking Analytics

Tags and Keywords

Credit

Risk

Finance

Delinquency

Automl

About

This resource is designed as a structured benchmark for automated machine learning (AutoML) and predictive modeling, specifically focusing on the financial domain. The main objective is credit risk assessment, which involves predicting whether a borrower will experience serious delinquency (90 days or more late on a payment) within the upcoming two years. The data incorporates a blend of crucial personal attributes and financial metrics, enabling users to develop, test, and rigorously evaluate various credit scoring models.

Columns

The dataset includes 10 predictor variables and 1 target variable:

rev_util: The ratio of the total outstanding balance on revolving credit lines relative to the total credit limit available on those accounts. This reflects the borrower's utilization of available credit.
age: The age of the borrower, measured in years.
late_30_59: The count of instances where the borrower was 30 to 59 days past due on a payment, but not worse. This measures short-term delinquency behaviour.
debt_ratio: The ratio of the borrower’s monthly debt payments (including alimony and loans) to their monthly gross income, indicating overall debt burden.
monthly_inc: The gross income the borrower receives each month.
open_credit: The total number of open instalment loans and revolving credit lines the borrower possesses.
late_90: The count of times the borrower has been 90 days or more late on a payment, signifying severe delinquency issues.
real_estate: The count of credit lines or loans secured by real estate, such as mortgages or home equity lines.
late_60_89: The count of times the borrower was 60 to 89 days past due on a payment, providing insight into mid-term delinquency behaviour.
dependents: The count of individuals who are financially dependent on the borrower.
dlq_2yrs: The binary target variable: 1 if the borrower experienced a serious delinquency in the next two years, and 0 otherwise.
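
As a rough illustration of the schema above, the sketch below reads a CSV with Python's standard csv module and checks that every listed column is present. The header spelling and column order in the real file are assumptions, load_rows is a hypothetical helper name, and the two sample rows are made up so the example runs without the download.

```python
import csv
import io

# Column names as listed above. Whether the file's header uses exactly
# these spellings and this order is an assumption.
PREDICTORS = [
    "rev_util", "age", "late_30_59", "debt_ratio", "monthly_inc",
    "open_credit", "late_90", "real_estate", "late_60_89", "dependents",
]
TARGET = "dlq_2yrs"


def load_rows(handle):
    """Read CSV rows, check the header, and convert values to float.

    Blank cells become None so that missing values can be handled later.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in PREDICTORS + [TARGET] if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for line_no, raw in enumerate(reader, start=2):
        row = {}
        for col in PREDICTORS + [TARGET]:
            row[col] = float(raw[col]) if raw[col] != "" else None
        if row[TARGET] not in (0.0, 1.0):
            raise ValueError(f"line {line_no}: target must be 0 or 1")
        rows.append(row)
    return rows


# Two made-up rows standing in for the real file.
SAMPLE = io.StringIO(
    "rev_util,age,late_30_59,debt_ratio,monthly_inc,open_credit,"
    "late_90,real_estate,late_60_89,dependents,dlq_2yrs\n"
    "0.35,45,0,0.28,5400,8,0,1,0,2,0\n"
    "0.97,31,2,0.61,2100,4,1,0,1,0,1\n"
)
rows = load_rows(SAMPLE)
```

With the downloaded file, the same function could be called on a handle opened with open("Credit Risk Benchmark Dataset.csv", newline=""), provided the header matches the names above.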

Distribution

The dataset is structured for binary classification, with 10 numerical predictors and 1 binary target variable. The data file is named Credit Risk Benchmark Dataset.csv and is 1.02 MB in size. The exact row count is not specified, but the file contains approximately 16.7 thousand valid records. Users are advised to perform exploratory data analysis, handle any missing values or outliers, and experiment with feature engineering techniques such as scaling and transformation.
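
The preparation steps mentioned above (checking for missing values, then scaling and transforming a feature) might look like the following minimal sketch. The records are made up, and median imputation, the log1p transform and standardization are illustrative choices, not steps prescribed by the listing.

```python
import math
import statistics

# Made-up records for illustration; None marks a missing value.
records = [
    {"monthly_inc": 5400.0, "debt_ratio": 0.28, "dlq_2yrs": 0},
    {"monthly_inc": None, "debt_ratio": 0.61, "dlq_2yrs": 1},
    {"monthly_inc": 2100.0, "debt_ratio": 0.45, "dlq_2yrs": 0},
    {"monthly_inc": 98000.0, "debt_ratio": 0.10, "dlq_2yrs": 0},
]


def missing_counts(rows):
    """Count None values per column."""
    counts = {}
    for row in rows:
        for col, val in row.items():
            counts[col] = counts.get(col, 0) + (val is None)
    return counts


def impute_median(rows, col):
    """Return the column with None replaced by the median of observed values."""
    observed = [r[col] for r in rows if r[col] is not None]
    med = statistics.median(observed)
    return [med if r[col] is None else r[col] for r in rows]


def standardize(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


income = impute_median(records, "monthly_inc")
# log1p compresses large values, which can help when a feature has a
# long right tail (an assumption about monthly_inc, not a stated fact).
income_scaled = standardize([math.log1p(v) for v in income])
# Share of rows with the positive class, a quick class-balance check.
positive_rate = sum(r["dlq_2yrs"] for r in records) / len(records)
```

Checking positive_rate early is useful because a heavily skewed target changes which evaluation metrics are informative.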

Usage

This data product is highly suitable for several predictive and analytical applications:

Risk Management: Developing and validating robust credit scoring models aimed at forecasting borrower default risks accurately.
AutoML Benchmarking: Evaluating and comparing the efficiency and performance of diverse AutoML frameworks on a standardised, industry-relevant financial dataset.
Academic Research: Conducting investigations into trends and underlying relationships in credit behaviour, alongside analysing the predictive utility of various financial indicators.
Model Interpretability: Given that financial models are heavily regulated, the dataset offers an excellent foundation for testing feature importance and generating explainable AI (XAI) solutions that ensure necessary transparency.
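
Comparing models or AutoML frameworks on this data requires a shared evaluation metric. The listing does not name one; the sketch below assumes ROC AUC, a widely used choice for binary classifiers, and computes it directly from its pairwise definition. The labels and the scores for the two hypothetical models are made up.

```python
def roc_auc(labels, scores):
    """ROC AUC from its pairwise definition.

    Returns the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half.
    Runs in O(positives * negatives) time, which is fine for a sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up dlq_2yrs labels and predicted scores from two hypothetical models.
y_true = [0, 0, 1, 0, 1, 1]
model_a = [0.10, 0.30, 0.80, 0.20, 0.60, 0.90]
model_b = [0.40, 0.70, 0.65, 0.20, 0.30, 0.90]

auc_a = roc_auc(y_true, model_a)  # every positive outranks every negative
auc_b = roc_auc(y_true, model_b)  # 6 of the 9 pairs are ordered correctly
```

For the full file, a sort-based implementation from an established library would be faster than this quadratic loop.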

Coverage

The dataset captures key demographic attributes (age, dependents) and financial indicators (income, credit usage, debt burden) of individual borrowers. It also tracks delinquency history at three levels of severity: 30 to 59, 60 to 89, and 90 or more days past due. The prediction task is the likelihood of serious delinquency within the following two years.

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

Data Scientists: For testing machine learning algorithms, including modern approaches like neural networks and gradient boosting, against classical methods like logistic regression.
Financial Analysts: For building models used in internal risk systems and determining optimal lending practices.
Researchers: To study the factors that drive serious credit default and the relationships between personal finance variables.
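
To make the classical baseline mentioned above concrete, here is a minimal logistic regression fitted by plain batch gradient descent. It is a teaching sketch under stated assumptions, not any library's implementation: the six rows are made up, only two of the listed columns are used, and the learning rate and epoch count are arbitrary choices.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Estimated probability that dlq_2yrs == 1 for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Made-up rows: [rev_util, late_90], with dlq_2yrs as the label.
X = [[0.10, 0], [0.20, 0], [0.30, 0], [0.90, 1], [0.80, 2], [0.95, 1]]
y = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(X, y)
p_low = predict_proba(w, b, [0.10, 0])   # low utilization, no severe lates
p_high = predict_proba(w, b, [0.90, 1])  # high utilization, one severe late
```

A baseline like this gives neural networks and gradient boosting a reference point: if they do not beat it on held-out data, the added complexity is hard to justify.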

Dataset Name Suggestions

Credit Default Risk Benchmark
Financial Delinquency Prediction Data
Borrower Credit Scoring Indicators
Two-Year Delinquency Forecast Data

Attributes

Original Data Source: Financial Delinquency Prediction Data

Listing Stats

Listed: 10/10/2025
Region: Global
Quality: 5 / 5
Version: 1.0
