Financial Credit Risk Data

Finance & Banking Analytics

Tags and Keywords

Loan, Default, Finance, Banking, Prediction

Free

About

This dataset supports predicting which borrowers are most likely to default on their loan payments, a common machine learning problem in the financial services industry. Banks and other lenders use data of this kind to reduce payment defaults and keep repayments on schedule: a model that identifies high-risk borrowers lets them direct interventions where they are most needed. The dataset originates from Coursera's Loan Default Prediction Challenge and is suited to practising and benchmarking modelling skills.

Columns

The dataset contains 18 distinct columns, each providing specific details about the loan and the borrower:
  • LoanID: A unique identifier for each loan.
  • Age: The age of the borrower, ranging from 18 to 69.
  • Income: The annual income of the borrower, varying from £15,000 to £150,000.
  • LoanAmount: The specific amount of money borrowed, ranging from £5,000 to £250,000.
  • CreditScore: The credit score of the borrower, from 300 to 849.
  • MonthsEmployed: The number of months the borrower has been employed, from 0 to 119 months.
  • NumCreditLines: The total number of open credit lines the borrower possesses, from 1 to 4.
  • InterestRate: The interest rate applied to the loan, between 2% and 25%.
  • LoanTerm: The duration of the loan in months, with terms of 12, 24, 36, 48, or 60 months.
  • DTIRatio: The Debt-to-Income ratio, ranging from 0.1 to 0.9.
  • Education: The highest level of education attained by the borrower, for example Bachelor's, High School, or Other.
  • EmploymentType: The borrower's employment status, such as Part-time, Unemployed, or Other.
  • MaritalStatus: The marital status of the borrower, e.g., Married, Divorced, or Other.
  • HasMortgage: A boolean indicating whether the borrower has a mortgage (true/false).
  • HasDependents: A boolean indicating whether the borrower has dependents (true/false).
  • LoanPurpose: The stated purpose for which the loan was taken, such as Business, Home, or Other.
  • HasCoSigner: A boolean indicating if the loan has a co-signer (true/false).
  • Default: The target variable, indicating whether the loan defaulted or not (0 or 1).
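
The ranges listed above can be turned into a simple row-level check. The following is a minimal sketch using only the Python standard library; it assumes that InterestRate is stored in percentage points (2 to 25) and that numeric cells parse as plain numbers, neither of which the listing confirms. The function name `validate_row` and the two sample rows are made up for illustration and are not records from the dataset.

```python
import csv
import io

# Numeric ranges copied from the column descriptions: (type, low, high).
# Assumption: InterestRate is in percentage points, not a fraction.
NUMERIC_RANGES = {
    "Age": (int, 18, 69),
    "Income": (float, 15_000, 150_000),
    "LoanAmount": (float, 5_000, 250_000),
    "CreditScore": (int, 300, 849),
    "MonthsEmployed": (int, 0, 119),
    "NumCreditLines": (int, 1, 4),
    "InterestRate": (float, 2, 25),
    "DTIRatio": (float, 0.1, 0.9),
}
# Columns restricted to a fixed set of values (compared as raw CSV strings).
ALLOWED_VALUES = {
    "LoanTerm": {"12", "24", "36", "48", "60"},
    "Default": {"0", "1"},
}


def validate_row(row):
    """Return a list of human-readable problems found in one CSV row."""
    problems = []
    for column, (cast, low, high) in NUMERIC_RANGES.items():
        try:
            value = cast(row[column])
        except (KeyError, TypeError, ValueError):
            problems.append(f"{column}: missing or not numeric")
            continue
        if not low <= value <= high:
            problems.append(f"{column}: {value} outside [{low}, {high}]")
    for column, allowed in ALLOWED_VALUES.items():
        if row.get(column) not in allowed:
            problems.append(f"{column}: unexpected value {row.get(column)!r}")
    return problems


# Two made-up rows (not real records); the second breaks two rules on purpose.
SAMPLE = """Age,Income,LoanAmount,CreditScore,MonthsEmployed,NumCreditLines,InterestRate,LoanTerm,DTIRatio,Default
35,52000,18000,640,48,2,11.5,36,0.42,0
17,52000,18000,640,48,2,11.5,30,0.42,1
"""

reports = [validate_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for line_no, problems in enumerate(reports, start=1):
    print(line_no, problems or "ok")
```

The categorical and boolean columns are left out of the check because the listing gives only example values for them, not the full set.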

Distribution

The dataset is provided as a CSV file, Loan_default.csv, and has a size of 24.83 MB. It contains 255,347 rows and 18 columns. All data points within the dataset are valid, with no mismatched or missing values reported across any of the columns.
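
The row count, column count, and absence of missing values stated above can be checked after download. This is a rough sketch using the standard `csv` module; `summarize_csv` is a hypothetical helper, the in-memory sample is made up, and the commented-out check assumes the 255,347 figure counts data rows excluding the header.

```python
import csv
import io


def summarize_csv(handle):
    """Count data rows, columns, and empty cells per column in an open CSV."""
    reader = csv.DictReader(handle)
    columns = reader.fieldnames or []
    missing = {name: 0 for name in columns}
    n_rows = 0
    for row in reader:
        n_rows += 1
        for name in columns:
            cell = row.get(name)
            if cell is None or cell.strip() == "":
                missing[name] += 1
    return n_rows, len(columns), missing


# Tiny made-up sample with one blank Income cell, to show the check firing.
SAMPLE = "LoanID,Age,Income,Default\nA1,35,52000,0\nA2,41,,1\nA3,29,61000,0\n"
n_rows, n_cols, missing = summarize_csv(io.StringIO(SAMPLE))
print(n_rows, n_cols, missing)

# Against the real file, the listing's figures could be checked like this:
# with open("Loan_default.csv", newline="") as f:
#     n_rows, n_cols, missing = summarize_csv(f)
# assert (n_rows, n_cols) == (255_347, 18) and not any(missing.values())
```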

Usage

This dataset suits a range of applications in financial risk assessment and machine learning:
  • Predicting Loan Defaults: Develop and evaluate machine learning models to forecast which individuals are at the highest risk of defaulting on their loan payments.
  • Risk Management: Aid financial institutions in identifying and managing credit risk more efficiently.
  • Customer Segmentation: Segment borrowers based on their likelihood of default for targeted interventions.
  • Model Development: Serve as a robust dataset for training and testing binary classification models.
  • Academic Research: Support research into factors influencing loan repayment behaviour and financial stability.
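
As a starting point for the segmentation and modelling uses above, it can help to look at the observed default rate per segment and the overall class balance before training anything. The sketch below assumes that a Default value of 1 marks a defaulted loan, which the listing does not state explicitly; `default_rate_by` is a hypothetical helper and the rows are invented for illustration.

```python
from collections import defaultdict


def default_rate_by(rows, key):
    """Map each value of `key` to (number of loans, share with Default == 1)."""
    counts = defaultdict(lambda: [0, 0])  # value -> [loans, defaults]
    for row in rows:
        bucket = counts[row[key]]
        bucket[0] += 1
        bucket[1] += int(row["Default"])
    return {value: (n, d / n) for value, (n, d) in counts.items()}


# Made-up rows for illustration only; real rows would come from csv.DictReader.
rows = [
    {"LoanPurpose": "Business", "Default": "1"},
    {"LoanPurpose": "Business", "Default": "0"},
    {"LoanPurpose": "Home", "Default": "0"},
    {"LoanPurpose": "Home", "Default": "0"},
    {"LoanPurpose": "Other", "Default": "1"},
]
rates = default_rate_by(rows, "LoanPurpose")
# Overall share of defaults, i.e. the class balance a classifier must beat.
overall = sum(int(r["Default"]) for r in rows) / len(rows)
print(rates, overall)
```

The same function works for any categorical column in the listing, such as Education or EmploymentType, by changing the `key` argument.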

Coverage

The dataset covers borrower attributes including age (18-69), annual income (£15,000-£150,000), credit score (300-849), and employment history (0-119 months). It also records education level, employment type, and marital status. The dataset is sourced from Coursera's Loan Default Prediction Challenge; the available information does not specify the geographic area or time period the loans cover.

License

CC0: Public Domain

Who Can Use It

This dataset is particularly useful for:
  • Financial Analysts: To assess and mitigate loan default risks.
  • Data Scientists and Machine Learning Engineers: To build predictive models for credit risk.
  • Banks and Lending Institutions: To improve decision-making processes for loan approvals and debt recovery strategies.
  • Academic Researchers and Students: For studies on financial behaviour, risk assessment, and machine learning applications in finance.

Dataset Name Suggestions

  • Loan Default Prediction Dataset
  • Financial Credit Risk Data
  • Borrower Default Analytics
  • Loan Repayment Predictor
  • Credit Default Assessment

Attributes

Original Data Source: Financial Credit Risk Data

Listing Stats

  • Views: 11
  • Downloads: 6
  • Listed: 22/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
