Financial Loan Approval Data
Finance & Banking Analytics
Free
About
This dataset supports predictive modelling of bank loan approvals, framed as a binary classification problem. It is intended to help financial institutions understand which customer attributes influence loan acceptance. Analysis of customer demographics and financial behaviour suggests that customers with higher incomes tend to have fewer family members and are more likely to have a personal loan approved. Income levels are similar regardless of credit card ownership, and wealthier customers (the 150-200 income group) typically hold the largest mortgages. The dataset is heavily imbalanced: 4520 of the 5000 customers (90.4%) were not approved for a loan, while 480 (9.6%) were approved.
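The income-group comparison mentioned above can be sketched in a few lines of Python. This is only an illustration: the band width of 50, the half-open band boundaries, the helper names `income_band` and `mean_by_band`, and the four sample rows are all assumptions made for the example, not values taken from the dataset.

```python
from collections import defaultdict

def income_band(income, width=50):
    """Label an income with a half-open band, e.g. 160 -> '150-200'.

    Assumption: bands are [low, low + width), so an income of exactly 200
    falls into '200-250'. The source does not say how its groups were cut.
    """
    low = (income // width) * width
    return f"{low}-{low + width}"

def mean_by_band(rows, value_key):
    """Average the column `value_key` within each income band."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[income_band(row["Income"])].append(row[value_key])
    return {band: sum(values) / len(values) for band, values in grouped.items()}

# Invented rows for illustration only (not real records from the dataset).
rows = [
    {"Income": 40, "Mortgage": 0},
    {"Income": 45, "Mortgage": 100},
    {"Income": 160, "Mortgage": 300},
    {"Income": 180, "Mortgage": 200},
]

mortgage_by_band = mean_by_band(rows, "Mortgage")
print(mortgage_by_band)
```

Passing a different `value_key` (for example "Family") to the same helper would give the per-band average of that column instead.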
Columns
- ID: A unique identifier for each customer.
- Age: The customer's age in years.
- Experience: The customer's years of professional experience.
- Income: The customer's annual income.
- ZIP.Code: The customer's address postcode.
- Family: The number of family members in the customer's household.
- CCAvg: The customer's average monthly credit card spending.
- Education: The customer's education level.
- Mortgage: The amount of the customer's mortgage.
- Personal.Loan: The dependent variable, indicating whether a personal loan was approved (1) or not (0).
- Securities.Account: Indicates if the customer has a securities account (1) or not (0).
- CD.Account: Indicates if the customer has a Certificate of Deposit (CD) account (1) or not (0).
- Online: Indicates if the customer uses online banking services (1) or not (0).
- CreditCard: Indicates if the customer holds a credit card with the bank (1) or not (0).
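As a rough sketch of how these columns might be read in Python, the snippet below parses CSV text with the standard library and casts each field to a suitable type. The two sample rows are invented purely for illustration and are not records from the dataset. Treating CCAvg as a float, keeping ZIP.Code as text, and the helper names `parse_row` and `load_customers` are assumptions of this example.

```python
import csv
import io

# Binary 0/1 indicator columns, as described in the column list.
BINARY_COLUMNS = {
    "Personal.Loan", "Securities.Account", "CD.Account", "Online", "CreditCard",
}
# Assumption: CCAvg may contain decimals; the other numeric columns are integers.
FLOAT_COLUMNS = {"CCAvg"}

# Invented rows for illustration only (not real records from the dataset).
SAMPLE_CSV = """ID,Age,Experience,Income,ZIP.Code,Family,CCAvg,Education,Mortgage,Personal.Loan,Securities.Account,CD.Account,Online,CreditCard
1,30,5,60,90001,3,1.5,1,0,0,1,0,1,0
2,45,20,160,90002,1,4.0,3,250,1,0,1,1,1
"""

def parse_row(row):
    """Cast one CSV row (a dict of strings) to typed Python values."""
    typed = {}
    for name, value in row.items():
        if name in BINARY_COLUMNS:
            typed[name] = bool(int(value))
        elif name in FLOAT_COLUMNS:
            typed[name] = float(value)
        elif name == "ZIP.Code":
            typed[name] = value  # a postcode is a label, not a quantity
        else:
            typed[name] = int(value)
    return typed

def load_customers(text):
    """Parse CSV text into a list of typed customer dicts."""
    return [parse_row(row) for row in csv.DictReader(io.StringIO(text))]

customers = load_customers(SAMPLE_CSV)
print(len(customers), "rows,", len(customers[0]), "columns")
```

To read an actual file, the same `parse_row` helper could be applied to `csv.DictReader(open(path, newline=""))`, where `path` is wherever the CSV has been saved.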
Distribution
The dataset is provided in CSV format and contains 5000 records (one per customer) across 14 columns. The target variable is imbalanced, with 4520 records where a personal loan was not approved and 480 where it was. Some variables may need to be converted to categorical types (factors in R) before analysis.
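The practical effect of the imbalance can be worked out directly from the two class counts stated above. The sketch below computes the class proportions, the accuracy of a trivial model that always predicts "not approved", and inverse-frequency class weights. The weighting formula (total divided by the number of classes times the class count) is one common convention and is offered here as an assumption, not as something the source prescribes.

```python
# Class counts as stated in the dataset description.
counts = {0: 4520, 1: 480}  # 0 = loan not approved, 1 = loan approved
total = sum(counts.values())

# Share of each class in the data.
proportions = {label: n / total for label, n in counts.items()}

# Accuracy of a trivial model that always predicts the majority class.
majority_baseline = max(counts.values()) / total

# Inverse-frequency class weights: total / (number of classes * class count).
class_weights = {label: total / (len(counts) * n) for label, n in counts.items()}

print(f"proportions: {proportions}")
print(f"majority baseline accuracy: {majority_baseline:.3f}")
print(f"class weights: {class_weights}")
```

Because always predicting "not approved" already scores about 90% accuracy, any model trained on this data should be judged against that baseline and not against 50%.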
Usage
This dataset is ideal for:
- Exploratory Data Analysis (EDA) to uncover patterns and relationships within customer data.
- Developing and evaluating machine learning models such as Logistic Regression, Decision Trees, and Random Forests for predicting loan approval.
- Analysing customer characteristics that contribute to loan approval or rejection, supporting strategic decision-making for bank management.
- Studying the impact of various financial and demographic factors on personal loan outcomes.
- Demonstrating and comparing the performance of different classification algorithms using metrics like Accuracy, Sensitivity, Specificity, and Area Under the Curve (AUC).
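The evaluation metrics named in the list above can be computed from predictions without any external library. The following is a minimal sketch: the function names are hypothetical, the toy labels and scores are invented for illustration, and AUC is computed by pairwise comparison (the probability that a randomly chosen approved customer receives a higher score than a randomly chosen non-approved one), which is simple but quadratic in the number of records.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = loan approved."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate) and specificity (true negative rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }

def auc(y_true, scores):
    """AUC by pairwise comparison of positive and negative scores; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (invented labels and scores, thresholded at 0.5).
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.6, 0.3, 0.4, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
toy_metrics = classification_metrics(y_true, y_pred)
toy_auc = auc(y_true, scores)

# A model that rejects everyone, evaluated on the dataset's stated class counts.
all_true = [0] * 4520 + [1] * 480
reject_all = classification_metrics(all_true, [0] * 5000)

print(toy_metrics, toy_auc)
print(reject_all)
```

The second call shows why accuracy alone is a weak measure on this dataset: a model that rejects every applicant scores high on accuracy and specificity while its sensitivity is zero.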
Coverage
The dataset covers 5000 individual customers with details on their age, experience, income, family size, credit card spending, education, and mortgage amount. A postcode variable is included, but the source material does not specify the geographical area or the time period covered.
License
CC0: Public Domain
Who Can Use It
- Data Scientists and Machine Learning Engineers: To build, train, and validate predictive models for loan approval, especially classification algorithms.
- Financial Analysts and Bank Management: To gain insights into customer behaviour, identify key drivers for loan approval, and inform business strategies.
- Academics and Students: For research, educational purposes, and hands-on practice with real-world classification problems and R programming techniques.
- Risk Management Professionals: To develop and refine credit risk assessment models and policies.
Dataset Name Suggestions
- Bank Loan Prediction Dataset
- Customer Personal Loan Status
- Financial Loan Approval Data
- Banking Customer Classification
- Loan Decisioning Factors
Attributes
Original Data Source: Financial Loan Approval Data