Simple Loan Classification Dataset
Finance & Banking Analytics
Free
About
This dataset is designed for classification tasks, focusing on predicting loan approval or denial. It incorporates various demographic and financial characteristics of individuals, with the 'loan_status' column serving as the primary target variable for classification modelling.
Columns
- age (int): Represents the age of the individual. Values in the dataset range from 24 to 55, with a mean age of 37.1 and a standard deviation of 8.36.
- gender (string): Indicates the gender of the individual, categorised as either 'Male' or 'Female'. In the dataset, 51% are Male and 49% are Female.
- occupation (string): Describes the individual's occupation. The dataset includes 38 unique occupations. 'Engineer' is the most common at 8% and 'Teacher' accounts for 3%, while the remaining 89% are spread across the other 36 occupations.
- education_level (string): Details the highest level of education attained. Levels include "High School", "Associate's", "Bachelor's", "Master's", and "Doctoral". "Bachelor's" is the most frequent at 38%, followed by "Master's" at 25%.
- marital_status (string): Specifies the individual's marital status, categorised as 'Single' or 'Married'. 'Married' individuals account for 61% of the dataset, with 'Single' individuals making up 39%.
- income (int): Represents the individual's annual income in dollars. Incomes range from $25,000 to $180,000, with a mean of $79,000 and a standard deviation of $33,500.
- credit_score (int): The individual's credit score, which varies from 560 to 830 in this dataset. The mean credit score is 710, with a standard deviation of 72.1.
- loan_status (string): This is the target variable for classification, indicating whether a loan application was 'Approved' or 'Denied'. In the dataset, 74% of applications were 'Approved' and 26% were 'Denied'.
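The column list above can be turned into a small typed loader. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses exactly the column names listed above. The `SCHEMA` mapping and the `read_loans` helper are hypothetical names introduced for illustration, and the two sample rows are made up rather than taken from the real file.

```python
import csv
import io

# Column names and types as listed in the Columns section.
# Assumption: the CSV header matches these names exactly.
SCHEMA = {
    "age": int,
    "gender": str,
    "occupation": str,
    "education_level": str,
    "marital_status": str,
    "income": int,
    "credit_score": int,
    "loan_status": str,
}


def read_loans(handle):
    """Parse CSV text from an open handle into a list of typed dicts."""
    reader = csv.DictReader(handle)
    rows = []
    for raw in reader:
        # Cast each field with the type declared in SCHEMA.
        rows.append({name: cast(raw[name].strip()) for name, cast in SCHEMA.items()})
    return rows


# Made-up rows for illustration only; they are not records from loan.csv.
sample = io.StringIO(
    "age,gender,occupation,education_level,marital_status,income,credit_score,loan_status\n"
    "32,Male,Engineer,Bachelor's,Married,85000,720,Approved\n"
    "25,Female,Teacher,Master's,Single,30000,580,Denied\n"
)
rows = read_loans(sample)
print(rows[0]["income"], rows[1]["loan_status"])
```

With the actual file, the same helper could be called as `with open("loan.csv", newline="") as f: rows = read_loans(f)`.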
Distribution
The dataset is provided as a CSV file, named 'loan.csv', and has a file size of 3.45 kB. It consists of 8 columns and contains 61 records. Each column is fully populated, with no missing values observed.
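The figures above (8 columns, 61 records, no missing values) are easy to verify after download. Here is a minimal sketch using only the standard library. `describe_csv` is a hypothetical helper name, and the inline sample is made up to show how a blank field would be counted.

```python
import csv
import io


def describe_csv(handle):
    """Return (record_count, column_names, missing_per_column) for CSV text."""
    reader = csv.DictReader(handle)
    columns = list(reader.fieldnames or [])
    missing = {name: 0 for name in columns}
    count = 0
    for row in reader:
        count += 1
        for name in columns:
            value = row.get(name)
            # Treat absent or blank fields as missing.
            if value is None or value.strip() == "":
                missing[name] += 1
    return count, columns, missing


# Made-up sample with one blank income field; not taken from loan.csv.
sample = io.StringIO("age,income,loan_status\n32,85000,Approved\n25,,Denied\n")
count, columns, missing = describe_csv(sample)
print(count, columns, missing)
```

If the description above is accurate, running `describe_csv` on `loan.csv` should report 61 records, 8 column names, and a zero count for every column.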
Usage
This dataset is ideal for developing and evaluating classification models. Potential applications include predicting loan approval or denial, assessing credit risk, and understanding the factors influencing lending decisions. It is suitable for machine learning techniques such as Logistic Regression and Decision Tree models.
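As an illustration of the Logistic Regression approach mentioned above, the sketch below fits a two-feature model (income and credit score) with plain batch gradient descent, using only the standard library. It is a teaching sketch under stated assumptions, not a recommended pipeline: the six applicants are made up, the learning rate and epoch count are arbitrary choices, and in practice a library implementation such as scikit-learn's `LogisticRegression` would be the more usual tool.

```python
import math


def standardize(column):
    """Scale a list of numbers to zero mean and unit variance."""
    mean = sum(column) / len(column)
    var = sum((x - mean) ** 2 for x in column) / len(column)
    std = math.sqrt(var) or 1.0  # avoid division by zero for constant columns
    return [(x - mean) / std for x in column]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(features, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(features), len(features[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(d):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (Approved) if the predicted probability is at least 0.5, else 0."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


# Made-up applicants for illustration; not records from loan.csv.
incomes = [30000, 45000, 60000, 85000, 120000, 150000]
scores = [570, 600, 650, 720, 780, 810]
labels = [0, 0, 0, 1, 1, 1]  # 1 = Approved, 0 = Denied

features = list(zip(standardize(incomes), standardize(scores)))
weights, bias = fit_logistic(features, labels)
predictions = [predict(weights, bias, x) for x in features]
print(predictions)
```

Because the real dataset has only 61 records, any accuracy measured on it will be noisy. Cross-validation rather than a single train/test split is a reasonable choice at that size.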
Coverage
The dataset focuses on demographic and financial characteristics of individuals, including their age, gender, occupation, education level, marital status, income, and credit score. No specific geographic or time range is noted in the provided details.
License
CC BY-SA 4.0
Who Can Use It
This dataset is particularly useful for data scientists, machine learning engineers, financial analysts, and researchers interested in credit risk assessment and predictive modelling in the banking sector. It can support tasks such as building automated loan approval systems or conducting studies on lending patterns.
Dataset Name Suggestions
- Simple Loan Classification Dataset
- Loan Approval Prediction Dataset
- Credit Risk Classification Data
- Personal Loan Status Dataset
- Financial Demographics for Lending
Attributes
Original Data Source: Simple Loan Classification Dataset