US Small Business Loans Dataset
Finance & Banking Analytics
Free
About
This dataset contains historical loan performance data from the U.S. Small Business Administration (SBA). Established in 1953, the SBA supports and promotes small enterprises within the U.S. credit market. Because small businesses are a key driver of job creation in the United States, fostering their formation and growth benefits employment and economic stability. The data can be used to analyse both successful SBA loan guarantees, of the kind received by FedEx and Apple Computer, and cases where small businesses defaulted on their guaranteed loans. It is designed to help determine whether a loan should be approved or denied, making it a useful resource for risk assessment and predictive modelling.
Columns
- LoanNr_ChkDgt: Identifier, primary key.
- Name: Borrower name.
- City: Borrower city.
- State: Borrower state.
- Zip: Borrower zip code.
- Bank: Bank name.
- BankState: Bank state.
- NAICS: North American Industry Classification System code (first two digits describe the sector, e.g., 11 for Agriculture, 22 for Utilities, 31-33 for Manufacturing, 51 for Information, 52 for Finance and Insurance, 54 for Professional, scientific, and technical services, 72 for Accommodation and food services).
- ApprovalDate: Date SBA commitment issued.
- ApprovalFY: Fiscal year of commitment.
- Term: Loan term in months.
- NoEmp: Number of business employees.
- NewExist: 1 = Existing business, 2 = New business.
- CreateJob: Number of jobs created.
- RetainedJob: Number of jobs retained.
- FranchiseCode: Franchise code; 00000 or 00001 = No franchise.
- UrbanRural: 1 = Urban, 2 = Rural, 0 = Undefined.
- RevLineCr: Revolving line of credit: Y = Yes, N = No.
- LowDoc: LowDoc Loan Program: Y = Yes, N = No.
- ChgOffDate: Date the loan was declared to be in default.
- DisbursementDate: Disbursement date.
- DisbursementGross: Amount disbursed.
- BalanceGross: Gross amount outstanding.
- MIS_Status: Loan status: CHGOFF = Charged off, PIF = Paid in full.
- ChgOffPrinGr: Charged-off amount.
- GrAppv: Gross amount of loan approved by bank.
- SBA_Appv: SBA’s guaranteed amount of approved loan.
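The coded columns above can be decoded with a few lookups. The following is a minimal Python sketch, not part of the original dataset documentation. The helper names (`naics_sector`, `is_franchise`, `decode_row`) are hypothetical, the sector table covers only the NAICS codes listed above, and the code assumes each row arrives as a dictionary of strings (for example from `csv.DictReader`).

```python
# Sector labels copied from the NAICS column description above.
# Codes not listed there are reported as "Unmapped" rather than guessed.
NAICS_SECTORS = {
    "11": "Agriculture",
    "22": "Utilities",
    "31": "Manufacturing",
    "32": "Manufacturing",
    "33": "Manufacturing",
    "51": "Information",
    "52": "Finance and Insurance",
    "54": "Professional, scientific, and technical services",
    "72": "Accommodation and food services",
}
NEW_EXIST = {"1": "Existing business", "2": "New business"}
URBAN_RURAL = {"1": "Urban", "2": "Rural", "0": "Undefined"}


def _code(value):
    """Normalise a coded value to a plain digit string.

    Assumption: codes may be stored as "1", "1.0" or " 1 "; all become "1".
    """
    return str(value).strip().split(".")[0]


def naics_sector(naics):
    """Return the sector label for the first two digits of a NAICS code."""
    return NAICS_SECTORS.get(_code(naics)[:2], "Unmapped")


def is_franchise(franchise_code):
    """Codes 00000 and 00001 mean no franchise; any other code is a franchise.

    Assumption: a blank code is treated as no franchise.
    """
    digits = _code(franchise_code).lstrip("0")
    return digits not in ("", "1")


def decode_row(row):
    """Translate the coded fields of one loan record into readable labels."""
    return {
        "sector": naics_sector(row["NAICS"]),
        "business_type": NEW_EXIST.get(_code(row["NewExist"]), "Unknown"),
        "area": URBAN_RURAL.get(_code(row["UrbanRural"]), "Unknown"),
        "is_franchise": is_franchise(row["FranchiseCode"]),
    }


# Made-up row for illustration only; it is not taken from the dataset.
example = {"NAICS": "722410", "NewExist": "2", "UrbanRural": "1", "FranchiseCode": "00001"}
print(decode_row(example))
```

Values outside the documented codes fall back to "Unmapped" or "Unknown" so that unexpected entries stay visible instead of being silently mislabelled.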
Distribution
This dataset contains 899,164 rows and 27 columns. It is typically available as a CSV file of approximately 179.43 MB.
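A downloaded copy can be checked against the stated row and column counts without loading the whole file into memory. This is a minimal sketch using only the Python standard library. The function name `csv_shape` and the file path in the comment are hypothetical, and the file's text encoding is not documented here, so it may need to be passed to `open` explicitly.

```python
import csv
import io


def csv_shape(lines):
    """Return (data_rows, columns) for CSV text, reading one row at a time."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return 0, 0  # empty input: no header, no rows
    rows = sum(1 for _ in reader)  # count data rows without storing them
    return rows, len(header)


# Tiny in-memory stand-in for the real file (made-up values).
sample = io.StringIO("LoanNr_ChkDgt,State,Term\n1,CA,84\n2,NY,60\n")
print(csv_shape(sample))  # (2, 3)

# With the real file (hypothetical path), the figures above suggest (899164, 27):
# with open("sba_loans.csv", newline="") as f:
#     print(csv_shape(f))
```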
Usage
This dataset is ideal for:
- Developing machine learning models to predict loan default risk.
- Analysing factors contributing to small business success or failure.
- Understanding the impact of SBA loan guarantees on job creation and retention.
- Conducting economic research on small business finance and credit markets.
- Supporting policy decisions related to small business support programmes.
- Performing detailed financial analysis of loan portfolios.
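For the default-risk use case, a common first step is to turn MIS_Status into a binary target and look at default rates by group. The sketch below shows one way this might be done in plain Python. The function names are hypothetical, the rows are made up for illustration, and the normalisation of casing and spaces is a defensive assumption rather than a documented property of the data.

```python
from collections import defaultdict


def default_flag(mis_status):
    """Map MIS_Status to 1 (charged off), 0 (paid in full) or None (unknown)."""
    # Assumption: casing and stray spaces may vary, so normalise before comparing.
    status = "".join(str(mis_status).split()).upper()
    if status == "CHGOFF":
        return 1
    if status == "PIF":
        return 0
    return None


def default_rate_by(rows, key):
    """Return {group: share of labelled loans that were charged off}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [charged_off, labelled]
    for row in rows:
        flag = default_flag(row.get("MIS_Status", ""))
        if flag is None:
            continue  # skip rows without a usable status
        counts[row[key]][0] += flag
        counts[row[key]][1] += 1
    return {group: bad / total for group, (bad, total) in counts.items()}


# Made-up rows for illustration only; they are not taken from the dataset.
rows = [
    {"State": "CA", "MIS_Status": "PIF"},
    {"State": "CA", "MIS_Status": "CHGOFF"},
    {"State": "NY", "MIS_Status": "PIF"},
    {"State": "NY", "MIS_Status": ""},
]
print(default_rate_by(rows, "State"))  # {'CA': 0.5, 'NY': 0.0}
```

When modelling approval decisions, columns that are only known after the outcome, such as ChgOffDate and ChgOffPrinGr, would normally be excluded from the features to avoid leaking the target.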
Coverage
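The dataset focuses on small business loans in the United States. Loan approval dates range from 7th December 1961 to 25th June 2014, while disbursement dates span from 17th September 1948 to 18th June 2028. Default dates (ChgOffDate) are recorded from 3rd October 1988 to 22nd October 2026. Some of these extremes look implausible: the earliest disbursement date precedes the SBA's founding in 1953, and the latest disbursement and default dates fall more than a decade after the last approval. Such values may reflect data-entry or date-parsing errors and should be validated before use. The data covers locations across the U.S. (City, State, Zip) and businesses in different North American Industry Classification System (NAICS) sectors.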
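Because some of the date ranges above extend before the SBA existed or well past the latest approval date, a plausibility check before analysis may be worthwhile. The sketch below flags dates outside a chosen window. The function name `check_date` is hypothetical, and the default bounds are assumptions: the lower bound uses the SBA's founding year with a placeholder month and day, and the upper bound uses the latest approval date, which is conservative and could be replaced by the actual extract date if known. Parsing the raw date strings is left out because their format is not documented here.

```python
from collections import Counter
from datetime import date

SBA_FOUNDED = date(1953, 1, 1)       # founding year from the About section; month and day are placeholders
LATEST_APPROVAL = date(2014, 6, 25)  # latest approval date stated above


def check_date(value, earliest=SBA_FOUNDED, latest=LATEST_APPROVAL):
    """Classify a date as 'missing', 'too_early', 'too_late' or 'ok'."""
    if value is None:
        return "missing"
    if value < earliest:
        return "too_early"
    if value > latest:
        return "too_late"
    return "ok"


# The two extremes quoted above, plus a made-up in-range date and a missing value.
disbursement_dates = [date(1948, 9, 17), date(2028, 6, 18), date(1997, 2, 28), None]
print(Counter(check_date(d) for d in disbursement_dates))
```

Flagged rows can then be inspected, corrected or excluded, depending on how the analysis treats uncertain dates.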
License
CC BY-SA 4.0
Who Can Use It
This dataset is valuable for a wide range of users, including:
- Data Scientists and Machine Learning Engineers: For building predictive models on loan performance and risk assessment.
- Financial Analysts and Credit Risk Managers: For evaluating loan portfolios and understanding default patterns.
- Economists and Researchers: For studying small business dynamics, job creation, and government lending programmes.
- Policymakers and Government Agencies: For informing decisions on small business support and economic development initiatives.
- Academics and Students: For educational purposes and academic research in finance, economics, and data science.
Dataset Name Suggestions
- SBA Loan Performance Data
- US Small Business Loans Dataset
- Small Business Loan Default Records
- SBA Guaranteed Loan Analysis Data
- US Business Credit Performance
Attributes
Original Data Source: US Small Business Loans Dataset