Medical Cost Personal Datasets
Public Health & Epidemiology
Free
About
This dataset contains medical cost records for individuals covered by personal health insurance, together with demographic and health-related attributes: age, sex, BMI, number of children, smoking status, and geographic region. It is well suited to regression analysis, where the goal is to predict the individual medical costs billed by health insurance providers.
Features:
- Age: Age of the primary insurance holder (numeric).
- Sex: Gender of the insurance holder (categorical: 'male', 'female').
- BMI: Body Mass Index, weight in kilograms divided by the square of height in metres (kg/m²); a rough proxy for body fat (numeric).
- Children: Number of dependents covered by the insurance (numeric).
- Smoker: Smoking status of the insurance holder (categorical: 'yes', 'no').
- Region: Residential region of the insurance holder in the US (categorical: 'northeast', 'southeast', 'southwest', 'northwest').
- Charges: Individual medical costs billed by health insurance (numeric); the target variable for regression.
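The column list above maps naturally onto a typed record. The following is a minimal sketch, assuming the data ships as a CSV file with a lowercase header row (`age,sex,bmi,children,smoker,region,charges`). The header names, the `InsuranceRecord` class, and the `parse_rows` helper are assumptions made for illustration, and the two sample rows are made-up values, not taken from the dataset.

```python
import csv
import io
from dataclasses import dataclass

# Allowed category values, as listed in the feature descriptions.
SEXES = {"male", "female"}
SMOKER_VALUES = {"yes", "no"}
REGIONS = {"northeast", "southeast", "southwest", "northwest"}


@dataclass(frozen=True)
class InsuranceRecord:
    age: int
    sex: str
    bmi: float
    children: int
    smoker: bool
    region: str
    charges: float


def parse_rows(text_stream):
    """Yield one InsuranceRecord per CSV row; raise ValueError on unknown categories."""
    for row in csv.DictReader(text_stream):
        if row["sex"] not in SEXES:
            raise ValueError(f"unexpected sex: {row['sex']!r}")
        if row["smoker"] not in SMOKER_VALUES:
            raise ValueError(f"unexpected smoker value: {row['smoker']!r}")
        if row["region"] not in REGIONS:
            raise ValueError(f"unexpected region: {row['region']!r}")
        yield InsuranceRecord(
            age=int(row["age"]),
            sex=row["sex"],
            bmi=float(row["bmi"]),
            children=int(row["children"]),
            smoker=(row["smoker"] == "yes"),
            region=row["region"],
            charges=float(row["charges"]),
        )


# Made-up sample rows, for illustration only (assumed column order and header).
SAMPLE = """age,sex,bmi,children,smoker,region,charges
30,female,25.0,1,no,northeast,4000.00
45,male,31.5,2,yes,southeast,30000.00
"""

records = list(parse_rows(io.StringIO(SAMPLE)))
```

To read a real file instead, pass an open file handle (for example `open(path, newline="")`) to `parse_rows`; the path is whatever name the downloaded file has.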
Usage:
This dataset can be used for:
- Predictive modeling of medical insurance costs using demographic and lifestyle factors.
- Regression analysis to understand the impact of factors like age, BMI, and smoking on insurance charges.
- Training machine learning algorithms to assist insurance providers in estimating premiums.
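Most regression and machine learning methods expect numeric inputs, so the categorical columns (sex, smoker, region) usually need encoding first. One common option is one-hot encoding with one level dropped as a baseline. The sketch below assumes that approach; the `encode` function and the choice of baselines are hypothetical, not something the dataset prescribes.

```python
REGIONS = ("northeast", "southeast", "southwest", "northwest")


def encode(age, sex, bmi, children, smoker, region):
    """Return a numeric feature vector for one insurance holder.

    Layout: [age, bmi, children, is_male, is_smoker,
             region_southeast, region_southwest, region_northwest]
    'female', 'no' and 'northeast' serve as baselines (an arbitrary choice).
    """
    if sex not in ("male", "female"):
        raise ValueError(f"unexpected sex: {sex!r}")
    if smoker not in ("yes", "no"):
        raise ValueError(f"unexpected smoker value: {smoker!r}")
    if region not in REGIONS:
        raise ValueError(f"unexpected region: {region!r}")
    # One flag per non-baseline region; all zeros means 'northeast'.
    region_flags = [1.0 if region == r else 0.0 for r in REGIONS[1:]]
    return [
        float(age),
        float(bmi),
        float(children),
        1.0 if sex == "male" else 0.0,
        1.0 if smoker == "yes" else 0.0,
    ] + region_flags


# Made-up example values, for illustration only.
example = encode(45, "male", 31.5, 2, "yes", "southeast")
```

Dropping one level per category avoids perfectly collinear columns when the model also includes an intercept.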
Coverage:
The dataset covers individuals in the United States, grouped into four regions (northeast, southeast, southwest, northwest), and focuses on attributes that influence health insurance costs.
License:
CC0 (Public Domain)
Who can use it:
Data scientists, health insurance providers, actuaries, and researchers interested in predictive modeling of insurance costs.
How to use it:
- Use regression models, such as linear regression, to predict insurance costs.
- Analyze the effect of variables like age, BMI, and smoking status on insurance charges.
- Use the resulting model to help insurance companies personalize premiums based on demographic and health factors.
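As an illustration of the linear regression step, here is a minimal sketch of ordinary least squares written in plain Python, solving the normal equations directly. In practice a statistics or machine learning library would normally be used instead. The function names (`solve`, `fit_linear`, `predict`) are hypothetical, and the training rows are synthetic: the charges are generated from an invented linear rule so that the fit can be checked, and they are not values from the dataset.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; features may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear(features, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + [float(v) for v in f] for f in features]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, feature_row):
    """Predicted charges for one row of features."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature_row))


# Synthetic (age, bmi, is_smoker) rows. Charges follow an invented rule:
# 1000 + 200 * age + 100 * bmi + 20000 * is_smoker.
X = [(20, 22.0, 0), (35, 27.5, 1), (50, 30.0, 0),
     (42, 24.0, 1), (60, 33.0, 0), (28, 29.0, 1)]
y = [1000 + 200 * age + 100 * bmi + 20000 * smoker for age, bmi, smoker in X]

weights = fit_linear(X, y)
estimate = predict(weights, (40, 25.0, 0))
```

Because the synthetic targets are exactly linear, the fitted weights recover the invented rule up to floating-point error. On the real data the fit will not be exact, and the size of each coefficient indicates how strongly that factor is associated with charges under a linear model.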