Financial Creditworthiness Analysis

Finance & Banking Analytics

Tags and Keywords

Loan, Credit, Underwriting, Finance, Prediction

Financial Creditworthiness Analysis Dataset on Opendatabay data marketplace

Free

About

The data focuses on the personal loan underwriting process, providing detailed attributes necessary to assess the creditworthiness of individual applicants. It is derived from LoanTap, an online platform dedicated to offering flexible and customized loan products primarily to salaried professionals and businessmen. The objective is to build a robust underwriting layer capable of determining whether a credit line should be extended to an applicant and, if so, what the appropriate repayment terms should be. This dataset is a foundation for predictive modelling concerning credit risk.

Columns

The dataset contains 27 fields that capture various financial and biographical details of the borrowers. Key variables include:
  • loan_amnt: The principal amount of the loan applied for by the borrower. Amounts range from 500 to 40,000.
  • term: The duration over which the loan is scheduled to be repaid, typically 36 months or 60 months.
  • int_rate: The annual interest rate applicable to the loan.
  • installment: The calculated monthly repayment amount owed by the borrower.
  • grade and sub_grade: Internal risk classifications of the loan assigned by the institution.
  • emp_title: The job title of the borrower.
  • emp_length: The duration of employment, with '10+ years' being the most common category.
  • home_ownership: Describes the borrower’s home ownership status, with MORTGAGE and RENT being the dominant statuses.
  • annual_inc: The annual income reported by the borrower (verification_status records whether it was verified).
  • verification_status: Indicates whether the income was verified.
  • issue_d: The date the loan was issued.
  • loan_status: The final outcome of the loan (e.g., Fully Paid or Charged Off). This serves as the primary target variable for prediction models.
  • purpose: The stated reason for obtaining the loan, with debt consolidation being the most frequent.
  • dti: The borrower’s debt-to-income ratio.
  • earliest_cr_line: The date the borrower's earliest reported credit line was opened.
  • open_acc: The number of open credit accounts in the borrower's name.
  • pub_rec: The number of derogatory public records on file for the borrower.
  • mort_acc: The number of mortgage accounts the borrower has.
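
Several of the fields above arrive as text and usually need converting before modelling. The following is a minimal sketch of that step, not the publisher's own preprocessing. The raw formats are assumptions based only on the examples quoted in the list ("36 months", "10+ years", "Fully Paid", "Charged Off"), and the helper names parse_term, parse_emp_length, and encode_target are hypothetical.

```python
import re

# Assumed raw formats (only the quoted examples come from the listing):
#   term        -> "36 months" / "60 months"
#   emp_length  -> "10+ years", "3 years", "< 1 year", or empty when missing
#   loan_status -> "Fully Paid" / "Charged Off"

def parse_term(value):
    """Return the loan term in months as an int, or None if no number is found."""
    match = re.search(r"\d+", value or "")
    return int(match.group()) if match else None

def parse_emp_length(value):
    """Return employment length in whole years, or None when missing.

    "10+ years" is capped at 10 and "< 1 year" maps to 0 (a modelling choice).
    """
    text = (value or "").strip()
    if not text:
        return None
    if text.startswith("<"):
        return 0
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None

def encode_target(loan_status):
    """Map loan_status to 1 (Charged Off) or 0 (Fully Paid); None otherwise."""
    return {"Charged Off": 1, "Fully Paid": 0}.get((loan_status or "").strip())

# Made-up example row, not taken from the dataset.
row = {"term": " 36 months", "emp_length": "10+ years", "loan_status": "Charged Off"}
cleaned = {
    "term_months": parse_term(row["term"]),
    "emp_length_years": parse_emp_length(row["emp_length"]),
    "default": encode_target(row["loan_status"]),
}
print(cleaned)
```

Returning None for unrecognised values keeps missing or unexpected entries visible, so they can be counted or imputed later instead of being coerced silently.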

Distribution

The dataset is distributed as a single CSV file, logistic_regression.csv (100.35 MB), containing approximately 396,000 records across 27 columns. Most fields are fully populated; critical fields such as loan_amnt, term, and loan_status are 100% valid. A small percentage of values is missing for employment title (emp_title) and employment length (emp_length). The standard application type is INDIVIDUAL.
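
The completeness figures above can be checked per column before modelling. The sketch below counts empty values using only the standard library. The function name missing_share is hypothetical, and the in-memory sample is made up to stand in for logistic_regression.csv, with a reduced set of columns.

```python
import csv
import io

def missing_share(rows, columns):
    """Return {column: fraction of rows whose value is empty or absent}."""
    total = 0
    missing = {col: 0 for col in columns}
    for row in rows:
        total += 1
        for col in columns:
            if not (row.get(col) or "").strip():
                missing[col] += 1
    return {col: (missing[col] / total if total else 0.0) for col in columns}

# Tiny made-up sample; the real file has 27 columns and ~396,000 rows.
sample = io.StringIO(
    "loan_amnt,term,emp_title,emp_length,loan_status\n"
    "10000,36 months,Teacher,10+ years,Fully Paid\n"
    "8000,36 months,,,Charged Off\n"
    "15000,60 months,Engineer,,Fully Paid\n"
    "5000,36 months,Nurse,2 years,Fully Paid\n"
)
shares = missing_share(csv.DictReader(sample), ["loan_amnt", "emp_title", "emp_length"])
for column, share in shares.items():
    print(f"{column}: {share:.0%} missing")
```

For the real file, the same function can be fed csv.DictReader(open("logistic_regression.csv", newline="")), which streams rows and avoids loading all 100 MB into memory at once.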

Usage

This dataset is ideal for:
  • Developing predictive models (e.g., Logistic Regression) to forecast individual credit default risk.
  • Performing Exploratory Data Analysis (EDA) on lending trends and borrower characteristics.
  • Training machine learning models to classify borrowers as creditworthy or non-creditworthy.
  • Formulating specific business recommendations related to setting appropriate repayment terms for various applicant profiles.
  • Studying the relationship between borrower variables (such as DTI, income, and public records) and loan repayment success.
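
To illustrate the logistic regression use case listed above, here is a minimal pure-Python sketch trained with batch gradient descent. It is an illustration under stated assumptions, not a recommended production approach: the six training rows are made up, the choice of dti and int_rate as features is arbitrary, and a library implementation would normally be preferred on the full dataset.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def standardize(X):
    """Scale each feature column to zero mean and unit variance."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) or 1.0
            for j in range(d)]
    scaled = [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in X]
    return scaled, means, stds

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(x, w, b):
    """Estimated probability of the positive class (Charged Off)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Made-up rows: [dti, int_rate]; label 1 = Charged Off, 0 = Fully Paid.
X_raw = [[8.0, 7.5], [12.0, 9.0], [15.0, 10.5],
         [28.0, 18.0], [32.0, 21.0], [25.0, 16.5]]
y = [0, 0, 0, 1, 1, 1]

X, means, stds = standardize(X_raw)
w, b = fit_logistic(X, y)

def score(dti, int_rate):
    """Apply the training-set scaling, then return the default probability."""
    x = [(dti - means[0]) / stds[0], (int_rate - means[1]) / stds[1]]
    return predict_proba(x, w, b)

print(f"low-risk profile:  {score(10.0, 8.0):.2f}")
print(f"high-risk profile: {score(30.0, 20.0):.2f}")
```

Standardizing the features first keeps gradient descent stable with a single learning rate. New applicants must be scaled with the means and standard deviations from the training data, which is why score reuses them.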

Coverage

The data covers loans issued from June 2007 through December 2016. Borrowers' credit histories reach back as far as January 1944, the date of the earliest credit line reported. Coverage is primarily of individual applicants, with loan purposes heavily concentrated in debt consolidation and credit card refinancing.
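
The two date fields behind this coverage, issue_d and earliest_cr_line, can be combined into a credit-history-length feature. The sketch below assumes both are stored as "Mon-YYYY" strings such as "Jun-2007". The listing does not state the date format, so the parser may need adjusting. The function names are hypothetical.

```python
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

def parse_month_year(value):
    """Parse an assumed 'Mon-YYYY' string into the first day of that month."""
    month, year = value.strip().split("-")
    return date(int(year), MONTHS[month], 1)

COVERAGE_START = date(2007, 6, 1)   # June 2007, per the listing
COVERAGE_END = date(2016, 12, 1)    # December 2016, per the listing

def in_coverage(issue_d):
    """True if the issue month falls inside the stated coverage window."""
    return COVERAGE_START <= parse_month_year(issue_d) <= COVERAGE_END

def credit_history_months(earliest_cr_line, issue_d):
    """Whole months between the earliest credit line and loan issue."""
    start = parse_month_year(earliest_cr_line)
    end = parse_month_year(issue_d)
    return (end.year - start.year) * 12 + (end.month - start.month)

print(in_coverage("Jan-2015"))
print(credit_history_months("Jan-1944", "Dec-2016"))
```

A fixed month table is used instead of locale-dependent date parsing so that the result does not change with system language settings.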

License

Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Who Can Use It

  • Data Science Professionals: Building and testing financial underwriting models.
  • Credit Risk Analysts: Evaluating risk factors and thresholds for personal loan products.
  • Financial Institutions: Benchmarking internal underwriting processes and improving loan decision speed.
  • Academic Researchers: Studying consumer lending behaviour, economic impact, and financial instrument performance.

Dataset Name Suggestions

  1. Credit Risk Prediction Dataset
  2. Loan Underwriting Features for Individuals
  3. Financial Creditworthiness Analysis
  4. LoanTap Personal Loan Risk Profile

Listing Stats

  • Views: 4
  • Downloads: 1
  • Listed: 28/11/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
