Medical Insurance Dataset
Healthcare Insurance & Costs
Free
About
This dataset contains detailed demographic and health-related information for individuals alongside their corresponding medical insurance charges. It includes features such as age, sex, BMI, number of children, smoking status, region, and total insurance cost. The data covers individuals in the United States.
The dataset is ideal for building and evaluating machine learning models that predict healthcare costs based on personal and lifestyle factors.
Dataset Features
1. age:
Age of the individual in years.
2. sex:
Biological sex of the individual (male or female).
3. BMI:
Body Mass Index: weight in kilograms divided by the square of height in meters (kg/m²), commonly used as a rough proxy for body fat.
4. children:
Number of dependent children covered by the insurance plan.
5. smoker:
Smoking status of the individual (yes or no).
6. region:
Geographic region of the individual within the United States (northeast, northwest, southeast, or southwest).
7. charges:
Individual medical insurance cost billed by the insurer.
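The feature list above can be turned into a small typed loader. The following is a minimal sketch, not part of the dataset itself: the function names `parse_record` and `load_records` are hypothetical, the header names are assumed to match the list above (case-insensitively, so `BMI` and `bmi` both work), and the two sample rows are made up for illustration.

```python
import csv
import io

# Allowed category values, taken from the feature descriptions above.
SEXES = {"male", "female"}
SMOKER_VALUES = {"yes", "no"}
REGIONS = {"northeast", "northwest", "southeast", "southwest"}


def parse_record(raw):
    """Convert one CSV row (a dict of strings) into typed Python values."""
    # Lowercase the header names so both "BMI" and "bmi" are accepted.
    row = {key.strip().lower(): value.strip() for key, value in raw.items()}
    record = {
        "age": int(row["age"]),
        "sex": row["sex"].lower(),
        "bmi": float(row["bmi"]),
        "children": int(row["children"]),
        "smoker": row["smoker"].lower(),
        "region": row["region"].lower(),
        "charges": float(row["charges"]),
    }
    # Reject category values that the feature list does not mention.
    if record["sex"] not in SEXES:
        raise ValueError(f"unexpected sex: {record['sex']!r}")
    if record["smoker"] not in SMOKER_VALUES:
        raise ValueError(f"unexpected smoker value: {record['smoker']!r}")
    if record["region"] not in REGIONS:
        raise ValueError(f"unexpected region: {record['region']!r}")
    return record


def load_records(fileobj):
    """Read every data row from an open CSV file that has a header line."""
    return [parse_record(raw) for raw in csv.DictReader(fileobj)]


# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE_CSV = """age,sex,BMI,children,smoker,region,charges
30,female,25.0,0,no,northeast,4000.00
45,male,31.5,2,yes,southwest,30000.00
"""

records = load_records(io.StringIO(SAMPLE_CSV))
print(records[0])
```

With a downloaded copy, the same function should accept a real file handle, for example `load_records(open(path, newline=""))`, where `path` is wherever the CSV was saved.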
Distribution
- Format: CSV (Comma-Separated Values)
- Data Volume: 1,338 rows (records)
- Columns: 7 (age, sex, BMI, children, smoker, region, charges)
- File Size: Approximately 56 KB
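After downloading, the row and column counts listed above can be checked in a few lines. This is a rough sketch under stated assumptions: `summarize_csv` and `matches_listing` are hypothetical helper names, the file is assumed to have a single header line, and the inline sample is a one-row stand-in rather than real data.

```python
import csv
import io

EXPECTED_ROWS = 1338     # row count stated in the listing
EXPECTED_COLUMNS = 7     # column count stated in the listing


def summarize_csv(fileobj):
    """Return (row_count, column_names) for a CSV with one header line."""
    reader = csv.reader(fileobj)
    header = next(reader)
    # csv.reader yields an empty list for blank lines, so skip those.
    row_count = sum(1 for row in reader if row)
    return row_count, header


def matches_listing(row_count, header):
    """True when the counts agree with the figures in the listing."""
    return row_count == EXPECTED_ROWS and len(header) == EXPECTED_COLUMNS


# One made-up row standing in for the real file.
sample = io.StringIO(
    "age,sex,BMI,children,smoker,region,charges\n"
    "30,female,25.0,0,no,northeast,4000.00\n"
)
rows, columns = summarize_csv(sample)
print(rows, columns)
print(matches_listing(rows, columns))
```

For the real file, pass `open(path, newline="")` instead of the `io.StringIO` stand-in; a mismatch would suggest a truncated download or a different version of the dataset.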
Usage
This dataset is ideal for a variety of applications:
- Medical Cost Prediction: Train regression models to estimate insurance charges based on demographic and lifestyle factors.
- Health Economics Research: Analyze how factors like smoking, BMI, and age impact healthcare costs.
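Both uses above can be sketched with the standard library alone. The example below is illustrative only: `mean_charges_by` and `fit_line` are hypothetical helpers, the single-feature least-squares fit is a simplified stand-in for whatever regression model is actually chosen, and the four records are made up, so the numbers say nothing about the real data.

```python
from collections import defaultdict


def mean_charges_by(records, key):
    """Average the 'charges' value within each group defined by `key`."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        totals[record[key]] += record["charges"]
        counts[record[key]] += 1
    return {group: totals[group] / counts[group] for group in totals}


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up records for illustration only; not taken from the dataset.
records = [
    {"age": 20, "smoker": "no", "charges": 3000.0},
    {"age": 40, "smoker": "no", "charges": 5000.0},
    {"age": 20, "smoker": "yes", "charges": 15000.0},
    {"age": 40, "smoker": "yes", "charges": 25000.0},
]

# Health economics style question: how do average charges differ by group?
print(mean_charges_by(records, "smoker"))

# Cost prediction style question: how do charges change with age?
non_smokers = [r for r in records if r["smoker"] == "no"]
slope, intercept = fit_line(
    [r["age"] for r in non_smokers],
    [r["charges"] for r in non_smokers],
)
print(slope, intercept)
```

A practical model would likely use all six predictors, with the categorical columns (sex, smoker, region) encoded numerically, and would be evaluated on held-out rows rather than on the data it was fitted to.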
Geographic Coverage:
- United States: The dataset includes individuals from four regions: northeast, northwest, southeast, and southwest.
- Time Range: The exact dates of data collection are not specified; the data is described as reflecting typical insurance and demographic patterns of recent years.
- Demographics:
  - Age Range: 18 to 64 years old
  - Gender: Male and female
  - Lifestyle Factors: Smoking status and BMI
  - Dependents: Number of children covered by the insurance
License
CC0
Who Can Use It
- Data Scientists: For training machine learning models.
- Researchers: For academic or scientific studies.
- Businesses: For analysis, insights, or AI development.