Opendatabay APP

Stroke Risk Prediction Metrics

Patient Health Records & Digital Health

Tags and Keywords

Health

Stroke

Prediction

Risk

Cardiovascular


Free

About

A collection of patient health metrics and lifestyle information for assessing and predicting stroke occurrence. The data covers key medical conditions, biometric indicators, and demographic details relevant to healthcare analytics and medical research. It is suited to building and testing predictive machine learning models that identify stroke risk factors and support cardiovascular health diagnostics.

Columns

The dataset contains 10 attributes detailing patient health status:
  • Age: The patient’s age, provided as a numeric value, with recorded ages ranging from approximately 27 to 99.
  • Gender: The patient's biological sex, categorized as Male or Female, with equal representation in the current sample.
  • SES (Socioeconomic Status): A categorical value (Low, Medium, High). Medium is the most frequently observed category.
  • Hypertension: A binary indicator (1 or 0) showing if the patient has been diagnosed with hypertension. Approximately 61% of records indicate a positive diagnosis.
  • Heart_Disease: A binary indicator (1 or 0) showing if the patient has a pre-existing heart disease.
  • BMI (Body Mass Index): A numeric value ranging from roughly 15 to 47.5, with a mean of 28.
  • Avg_Glucose (Average Glucose Level): A numeric value ranging from approximately 45 to 176. The unit of measurement is not stated.
  • Diabetes: A binary indicator (1 or 0) noting whether the patient has diabetes.
  • Smoking_Status: Categorical data indicating smoking habits (Never, Former, Current). The majority of patients report never having smoked.
  • Stroke: The target variable, a binary indicator (1 or 0) showing if the patient has had a stroke.
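The column descriptions above can be turned into a simple schema check. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses exactly the column names listed above and that binary indicators are stored as the text values 0 and 1; neither detail is confirmed by the listing. The function name validate_rows is hypothetical.

```python
import csv
import io

# Column names and allowed values as described in the listing.
# ASSUMPTION: the header in stroke_data.csv is spelled exactly like this.
EXPECTED_COLUMNS = (
    "Age", "Gender", "SES", "Hypertension", "Heart_Disease",
    "BMI", "Avg_Glucose", "Diabetes", "Smoking_Status", "Stroke",
)
BINARY_COLUMNS = ("Hypertension", "Heart_Disease", "Diabetes", "Stroke")
CATEGORICAL_COLUMNS = {
    "Gender": {"Male", "Female"},
    "SES": {"Low", "Medium", "High"},
    "Smoking_Status": {"Never", "Former", "Current"},
}
NUMERIC_COLUMNS = ("Age", "BMI", "Avg_Glucose")


def validate_rows(handle):
    """Return (row_number, column, value) for each value that breaks the schema.

    Row numbers count data rows only, starting at 1 (the header is not counted).
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    problems = []
    for row_number, row in enumerate(reader, start=1):
        # ASSUMPTION: binary flags are written as "0" or "1", not "0.0" or "True".
        for column in BINARY_COLUMNS:
            if row[column] not in {"0", "1"}:
                problems.append((row_number, column, row[column]))
        for column, allowed in CATEGORICAL_COLUMNS.items():
            if row[column] not in allowed:
                problems.append((row_number, column, row[column]))
        for column in NUMERIC_COLUMNS:
            try:
                float(row[column])
            except (TypeError, ValueError):
                problems.append((row_number, column, row[column]))
    return problems
```

To run it against the downloaded file, open stroke_data.csv with open("stroke_data.csv", newline="") and pass the file object to validate_rows. An empty list would be consistent with the column descriptions above.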

Distribution

The data is provided as a single CSV file, stroke_data.csv (813.6 kB), containing 10,000 records across 10 columns. No missing or mismatched values are reported in the current file.
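The figures reported here (record count, column count, no missing values) can be checked after download. The sketch below is one way to do that with the standard library; the function name summarize_csv is hypothetical, and treating a whitespace-only cell as missing is an assumption about how gaps would appear in the file.

```python
import csv
import io


def summarize_csv(handle):
    """Count data rows, header columns, empty cells, and rows of the wrong width."""
    reader = csv.reader(handle)
    header = next(reader, [])  # first line is assumed to be the header
    rows = 0
    empty_cells = 0
    ragged_rows = 0
    for record in reader:
        rows += 1
        # A row with more or fewer cells than the header is structurally broken.
        if len(record) != len(header):
            ragged_rows += 1
        # ASSUMPTION: a missing value shows up as an empty or whitespace-only cell.
        empty_cells += sum(1 for cell in record if cell.strip() == "")
    return {
        "rows": rows,
        "columns": len(header),
        "empty_cells": empty_cells,
        "ragged_rows": ragged_rows,
    }
```

If the listing is accurate, calling summarize_csv on stroke_data.csv (opened with newline="") should report 10,000 rows, 10 columns, and zero empty cells and ragged rows. Missing values encoded as text such as NA would not be caught by this check.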

Usage

The data is suited to several tasks, including:
  • Training machine learning models for stroke prediction.
  • Performing exploratory data analysis (EDA) focused on cardiovascular health metrics.
  • Identifying risk factors that contribute to stroke occurrence.
  • Conducting correlation analysis between lifestyle indicators (such as smoking status and SES) and biometric health data.
  • Developing and testing classification models for healthcare applications.
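As a small illustration of the exploratory analysis described in this list, the sketch below computes the share of records with a stroke within each category of a chosen column, for example Smoking_Status or SES. It uses only the standard library. The function name stroke_rate_by is hypothetical, and the code assumes the Stroke column holds the text values 0 and 1 under the column names given in the listing.

```python
import csv
import io
from collections import defaultdict


def stroke_rate_by(handle, column, target="Stroke"):
    """Return {category: fraction of records with target == 1} for one column."""
    # category -> [number of stroke cases, number of records]
    counts = defaultdict(lambda: [0, 0])
    for row in csv.DictReader(handle):
        # ASSUMPTION: the target is stored as "0" or "1".
        counts[row[column]][0] += int(row[target])
        counts[row[column]][1] += 1
    return {category: cases / total for category, (cases, total) in counts.items()}
```

A per-category rate like this is only a descriptive starting point. It does not adjust for age or other factors, so differences between categories should not be read as causal effects.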

Coverage

The dataset documents health metrics across a wide age span (27 to 99), with balanced gender representation and three socioeconomic classifications. It includes common health metrics such as glucose and BMI, along with the presence of chronic conditions (hypertension, heart disease, diabetes). The listing does not specify where or when the patient records were collected.

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

  • Data Scientists and ML Engineers: For developing and refining high-performance algorithms for medical diagnostics and risk assessment.
  • Public Health Researchers: To study population-level patterns in stroke risk and understand the influence of lifestyle on health outcomes.
  • Clinical Informaticists: For integrating predictive risk models into existing clinical decision support systems.

Dataset Name Suggestions

  • Stroke Risk Prediction Metrics
  • Patient Cardiovascular Health Factors
  • Biometric Data for Stroke Analytics
  • Health Metrics and Lifestyle Dataset

Attributes

Original Data Source: Stroke Risk Prediction Metrics

Listing Stats

Views: 5
Downloads: 2
Listed: 23/11/2025
Region: Global
UDQS Quality: 5 / 5
Version: 1.0
