California SBA Guaranteed Loan Outcomes
Finance & Banking Analytics
Free
About
This dataset covers Small Business Administration (SBA) guaranteed loans made to California businesses in the Real Estate and Rental and Leasing industry. The SBA, established in 1953, supports small businesses in recognition of their vital role in job creation and economic growth in the United States. One of its primary tools is guaranteeing bank loans, which reduces the lender's risk and encourages lending to small enterprises. Each record reports the loan's repayment status: either paid in full or charged off (defaulted), in which case the SBA covers the guaranteed amount. The data can be used to examine the factors that influence loan outcomes and to build predictive models that inform loan approval decisions. The original, broader dataset spans 1987 through 2014; this subset contains 2,102 observations and 35 variables. The 'Default' column, originally an integer, has been converted to a factor for ease of use.
Columns
- LoanNr_ChkDgt: Unique Identifier – Primary key for the loan.
- Name: The name of the borrower.
- City: The borrower's city.
- State: The borrower's state.
- Zip: The borrower's ZIP Code.
- Bank: The name of the bank that issued the loan.
- BankState: The state where the bank is located.
- NAICS: The North American Industry Classification System code for the borrower's industry.
- ApprovalDate: The date the SBA commitment was issued.
- ApprovalFY: The fiscal year in which the commitment was approved.
- Term: The loan term in months.
- NoEmp: The number of employees in the business.
- NewExist: Indicates if the business is existing (1) or new (2).
- CreateJob: The number of jobs created by the loan.
- RetainedJob: The number of jobs retained due to the loan.
- FranchiseCode: The franchise code (00000 or 00001 typically signifies no franchise).
- UrbanRural: Denotes the business location as Urban (1), Rural (2), or Undefined (0).
- RevLineCr: Indicates if a revolving line of credit was used (Y = Yes, N = No).
- LowDoc: Indicates participation in the LowDoc Loan Program (Y = Yes, N = No).
- ChgOffDate: The date when a loan was declared to be in default.
- DisbursementDate: The date when the loan amount was disbursed.
- DisbursementGross: The total amount disbursed for the loan.
- BalanceGross: The gross amount outstanding (typically 0 for this dataset, indicating loans are resolved).
- MIS_Status: The loan status, either Charged Off (CHGOFF) or Paid In Full (PIF).
- ChgOffPrinGr: The principal amount charged off.
- GrAppv: The gross amount of the loan approved by the bank.
- SBA_Appv: The SBA's guaranteed amount of the approved loan.
- New: Indicates whether the business is new (1) or existing (0); a 0/1 recoding of NewExist.
- RealEstate: Indicates whether real estate was used as collateral (1 = Yes, 0 = No).
- Portion: The portion of the loan guaranteed by the SBA.
- Recession: Indicates if the loan was made during a recession (1 = Yes, 0 = No).
- daysterm: The total number of days in the loan term.
- xx: An auxiliary variable created while deriving Recession; the charged-off amount itself is recorded in ChgOffPrinGr.
- Default: A binary indicator (1 = Yes, 0 = No) of whether the loan defaulted, transformed into a factor.
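Several of the columns above appear to be derived from others. The following is a minimal sketch, not the dataset authors' code, that recomputes three of them from raw fields so they can be cross-checked. It assumes that Default is 1 exactly when MIS_Status is CHGOFF, that Portion equals SBA_Appv divided by GrAppv, that New is 1 exactly when NewExist is 2, and that amounts are stored as plain numbers. The function name `derive_fields` and the sample rows are hypothetical.

```python
import csv
import io


def derive_fields(row):
    """Recompute derived columns for one record (a dict of strings).

    Assumed relationships, to be verified against the real file:
      Default = 1 if MIS_Status is CHGOFF, else 0
      Portion = SBA_Appv / GrAppv
      New     = 1 if NewExist is 2, else 0
    """
    gr_appv = float(row["GrAppv"])
    sba_appv = float(row["SBA_Appv"])
    return {
        "Default": 1 if row["MIS_Status"].strip().upper() == "CHGOFF" else 0,
        "Portion": sba_appv / gr_appv if gr_appv else None,
        "New": 1 if int(float(row["NewExist"])) == 2 else 0,
    }


# Synthetic rows for illustration only; these are not records from the dataset.
SAMPLE_CSV = """LoanNr_ChkDgt,NewExist,MIS_Status,GrAppv,SBA_Appv
1000000001,1,PIF,100000,50000
1000000002,2,CHGOFF,200000,150000
"""

derived = [derive_fields(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
for d in derived:
    print(d)
```

Comparing these recomputed values with the stored Default, Portion, and New columns is a quick way to confirm the assumed definitions before modelling.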
Distribution
The dataset is provided as a CSV file (SBAcase.11.13.17.csv) of approximately 389.43 kB. It contains 2,102 observations (rows) and 35 variables (columns) of small business loan performance records.
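As a sanity check after download, the file's shape can be compared with the figures stated here. This sketch uses only the Python standard library; the helper names `load_loans` and `shape_report` are hypothetical, and the UTF-8 encoding is an assumption that may need adjusting for the actual file.

```python
import csv

# Figures taken from the listing; the file itself may differ.
EXPECTED_ROWS = 2102
EXPECTED_COLS = 35


def load_loans(path, encoding="utf-8"):
    """Read the CSV and return (column_names, rows); all values stay strings."""
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        columns = list(reader.fieldnames or [])
    return columns, rows


def shape_report(columns, rows):
    """Summarise the loaded shape and whether it matches the listing."""
    return {
        "n_rows": len(rows),
        "n_cols": len(columns),
        "matches_listing": (len(rows), len(columns)) == (EXPECTED_ROWS, EXPECTED_COLS),
    }
```

A typical call would be `columns, rows = load_loans("SBAcase.11.13.17.csv")` followed by `shape_report(columns, rows)`, assuming the file sits in the working directory.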
Usage
This dataset is ideal for:
- Developing predictive models to determine the likelihood of a small business loan defaulting.
- Conducting risk assessment for loan portfolios.
- Analysing the effectiveness of government loan guarantee programmes.
- Researching the socio-economic impact of small business lending and job creation.
- Gaining insights into factors influencing loan outcomes in the real estate and rental industry.
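To illustrate the first use case, here is a minimal sketch of a default-probability model: a logistic regression fitted by batch gradient descent in plain Python. The choice of RealEstate and Recession as features, the learning rate, the epoch count, and the toy data are all assumptions made for illustration; a real analysis would use the actual records, a train/test split, and likely an established library.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, x):
    """Probability of default for features x, given weights w = [bias, w1, ...]."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=3000):
    """Fit weights by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Synthetic toy data, NOT drawn from the dataset.
# Each row is [RealEstate, Recession]; y is Default (1 = defaulted).
X = [[1, 0], [1, 0], [1, 0], [1, 0],
     [1, 1], [1, 1],
     [0, 0], [0, 0],
     [0, 1], [0, 1], [0, 1], [0, 1]]
y = [0, 0, 0, 1,
     0, 1,
     0, 1,
     1, 1, 1, 0]

w = fit_logistic(X, y)
p_backed = predict_proba(w, [1, 0])      # real-estate-backed, outside recession
p_unbacked = predict_proba(w, [0, 1])    # not backed, during recession
print(round(p_backed, 3), round(p_unbacked, 3))
```

On this toy data the fitted model assigns a lower default probability to the real-estate-backed loan than to the unbacked loan made during a recession; whether the real data shows the same pattern is something to test, not assume.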
Coverage
The dataset is geographically limited to California and covers only loans in the Real Estate and Rental and Leasing industry. It includes historical data from 1987 through 2014, providing loan-level performance records across this period.
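Because the records span many approval years, a natural first look is the default rate per fiscal year. The sketch below groups rows by ApprovalFY and averages the Default flag. The function name `default_rate_by` and the sample rows are hypothetical, and Default is assumed to be stored as 0 or 1.

```python
from collections import defaultdict


def default_rate_by(rows, key, target="Default"):
    """Return {group value: share of rows with target == 1}, sorted by group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [defaults, total]
    for r in rows:
        counts[r[key]][0] += int(r[target])
        counts[r[key]][1] += 1
    return {k: d / n for k, (d, n) in sorted(counts.items())}


# Synthetic rows for illustration only; these are not records from the dataset.
sample_rows = [
    {"ApprovalFY": 1998, "Default": 0},
    {"ApprovalFY": 1998, "Default": 0},
    {"ApprovalFY": 2006, "Default": 1},
    {"ApprovalFY": 2006, "Default": 0},
    {"ApprovalFY": 2007, "Default": 1},
]

rates = default_rate_by(sample_rows, "ApprovalFY")
for year, rate in rates.items():
    print(year, rate)
```

The same helper works for any categorical column, for example `default_rate_by(rows, "UrbanRural")`.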
License
CC0: Public Domain
Who Can Use It
This dataset is suitable for a wide range of users, including:
- Financial analysts and risk managers looking to understand and model loan default probabilities.
- Economists and policy makers interested in the impact of SBA programmes on economic growth and employment.
- Data scientists and machine learning practitioners seeking to build and test predictive models for loan performance.
- Academics and researchers studying small business finance and credit risk.
- Lenders who wish to refine their loan approval processes.
Dataset Name Suggestions
- California Small Business Loan Defaults
- SBA Loan Performance: Real Estate & Leasing (CA)
- US Small Business Credit Risk Dataset
- California SBA Guaranteed Loan Outcomes
- Real Estate Sector Loan Default Analysis
Attributes
Original Data Source: California SBA Guaranteed Loan Outcomes