Opendatabay APP

Medical Insurance Dataset

Healthcare Insurance & Costs

Free

About

This dataset contains demographic and health-related information for individuals alongside their corresponding medical insurance charges. Its seven features are age, sex, BMI, number of children, smoking status, region, and total insurance cost. The data covers individuals in the United States.
The dataset is ideal for building and evaluating machine learning models that predict healthcare costs based on personal and lifestyle factors.

Dataset Features

1. age: Age of the individual in years.
2. sex: Biological sex of the individual (male or female).
3. BMI: Body Mass Index, a numeric measure of body weight relative to height (kg/m²), commonly used as a proxy for body fat.
4. children: Number of dependent children covered by the insurance plan.
5. smoker: Smoking status of the individual (yes or no).
6. region: Geographic region of the individual within the United States (northeast, northwest, southeast, or southwest).
7. charges: Individual medical insurance cost billed by the insurer.
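The feature list above can be read as a small schema. The following is a minimal sketch of how one row might be parsed and validated against it, using only the Python standard library. The names InsuranceRecord and parse_record are hypothetical, the lowercase column keys are an assumption (the listing writes "BMI", so keys are matched case-insensitively), and the sample values in the usage note are made up for illustration.

```python
from dataclasses import dataclass

# Allowed categorical values, taken from the feature descriptions in the listing.
SEXES = {"male", "female"}
SMOKER_VALUES = {"yes", "no"}
REGIONS = {"northeast", "northwest", "southeast", "southwest"}


@dataclass(frozen=True)
class InsuranceRecord:
    """One row of the dataset (hypothetical type, not part of the listing)."""
    age: int
    sex: str
    bmi: float
    children: int
    smoker: bool
    region: str
    charges: float


def parse_record(row):
    """Convert a dict of strings (e.g. from csv.DictReader) into a typed record.

    Raises ValueError if a categorical field holds an unexpected value.
    """
    # Assumption: column names may differ in case ("BMI" vs "bmi"), so normalise.
    row = {key.strip().lower(): value.strip() for key, value in row.items()}

    sex = row["sex"].lower()
    smoker = row["smoker"].lower()
    region = row["region"].lower()
    if sex not in SEXES:
        raise ValueError(f"unexpected sex: {sex!r}")
    if smoker not in SMOKER_VALUES:
        raise ValueError(f"unexpected smoker value: {smoker!r}")
    if region not in REGIONS:
        raise ValueError(f"unexpected region: {region!r}")

    return InsuranceRecord(
        age=int(row["age"]),
        sex=sex,
        bmi=float(row["bmi"]),
        children=int(row["children"]),
        smoker=(smoker == "yes"),
        region=region,
        charges=float(row["charges"]),
    )
```

A call such as parse_record({"age": "30", "sex": "female", "BMI": "25.5", "children": "2", "smoker": "no", "region": "southeast", "charges": "4500.75"}) would return a typed record, while a region outside the four listed values raises ValueError.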

Distribution

  • Format: CSV (Comma-Separated Values)
  • Rows: 1,338 records
  • Columns: 7 (age, sex, BMI, children, smoker, region, charges)
  • File Size: Approximately 56 KB
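As a quick sanity check after download, the figures above (7 columns, 1,338 rows) can be verified with the standard library. This is a sketch under assumptions: the function name check_csv and the file name insurance.csv are hypothetical, the file is assumed to have a header row, and column names are compared case-insensitively and without regard to order.

```python
import csv

# Column names from the listing, lowercased for comparison.
EXPECTED_COLUMNS = ["age", "sex", "bmi", "children", "smoker", "region", "charges"]


def check_csv(fileobj, expected_rows=None):
    """Validate the header and count data rows in an open CSV file.

    Returns the number of non-empty data rows. Raises ValueError if the
    header does not match the listed columns or, when expected_rows is
    given, if the row count differs from it.
    """
    reader = csv.reader(fileobj)
    header = [name.strip().lower() for name in next(reader)]
    if sorted(header) != sorted(EXPECTED_COLUMNS):
        raise ValueError(f"unexpected columns: {header}")

    # csv.reader yields an empty list for blank lines; skip those.
    n_rows = sum(1 for row in reader if row)
    if expected_rows is not None and n_rows != expected_rows:
        raise ValueError(f"expected {expected_rows} rows, found {n_rows}")
    return n_rows


# Example usage (file name is an assumption, not given in the listing):
# with open("insurance.csv", newline="") as f:
#     check_csv(f, expected_rows=1338)
```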

Usage

This dataset suits several applications:
  • Medical Cost Prediction: Train regression models to estimate insurance charges based on demographic and lifestyle factors.
  • Health Economics Research: Analyze how factors like smoking, BMI, and age impact healthcare costs.
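Both uses can be sketched without third-party libraries. The example below is a minimal baseline, not a recommended model: mean_charges_by compares average charges across a categorical factor such as smoking status, and fit_line fits a one-feature least-squares line (for example, charges against age). The function names are hypothetical, and rows are assumed to be dicts with numeric "charges" values already parsed. A realistic model would use all seven features and a held-out test set.

```python
from collections import defaultdict


def mean_charges_by(rows, key):
    """Average the 'charges' value of rows grouped by rows[key]."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum of charges, count]
    for row in rows:
        group = totals[row[key]]
        group[0] += row["charges"]
        group[1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Estimate charges for a single feature value using the fitted line."""
    return slope * x + intercept
```

With real data, mean_charges_by(rows, "smoker") gives a first look at how smoking status relates to cost, and fit_line([r["age"] for r in rows], [r["charges"] for r in rows]) gives a simple age-only baseline to compare richer regression models against.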

Coverage

  • United States: the dataset includes individuals from four regions: northeast, northwest, southeast, and southwest.
  • Time Range: The exact dates of data collection are not specified, but the data reflects typical insurance and demographic patterns observed in recent years.
  • Demographics: Includes a diverse range of individuals:
      • Age Range: 18 to 64 years old
      • Sex: Male and female
      • Lifestyle Factors: Smoking status and BMI
      • Dependents: Number of children covered by the insurance

License

CC0

Who Can Use It

  • Data Scientists: For training machine learning models.
  • Researchers: For academic or scientific studies.
  • Businesses: For analysis, insights, or AI development.

Listing Stats

  • Views: 14
  • Downloads: 1
  • Listed: 02/06/2025
  • Region: North America
  • UDQSS Quality: 5 / 5
  • Version: 1.0
