Opendatabay APP

Diabetes Risk Factor Prediction Records

Patient Health Records & Digital Health

Tags and Keywords

Health

Diabetes

Prediction

Clinical

Risk

Free

About

This dataset contains clinical data collected from patients and is designed to support the prediction of diabetes risk. It is intended for research and for developing machine learning models that detect diabetes. It combines a small set of clinical features with a binary diagnostic label, which makes it suitable for building and testing prediction tools. The emphasis is on simple feature analysis for diabetes risk detection.

Columns

The listing states that the dataset contains nine columns of patient health metrics; the eight described here are:
  • Age: The patient’s age in years. Age is a relevant risk factor for diabetes.
  • Gender: Whether the patient is Male or Female, a factor considered in diabetes prediction.
  • BMI: Body Mass Index, a measurement used to classify a person as normal weight, overweight, or obese.
  • High_BP: Whether the patient has high blood pressure (1: Yes / 0: No).
  • FBS: The patient’s fasting blood sugar, that is, the blood glucose level measured after an overnight fast (in mg/dL).
  • HbA1c_level: A measurement reflecting the patient’s average blood sugar level over the preceding two to three months.
  • Smoking: Whether the patient is a smoker (1: Yes / 0: No).
  • Diagnosis: The target variable, indicating whether the individual has received a diabetes diagnosis (1: Yes / 0: No).
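The column descriptions above can be turned into a small validation step before any modelling. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses exactly the column names listed above, that Gender is stored as text, and that the three indicator columns are stored as the integers 0 and 1; none of this is confirmed by the listing. The sample rows are invented for illustration and are not taken from the dataset.

```python
import csv
import io

# Column names as documented in the listing; the real CSV header may differ.
DOCUMENTED_COLUMNS = [
    "Age", "Gender", "BMI", "High_BP", "FBS", "HbA1c_level", "Smoking", "Diagnosis",
]
NUMERIC_COLUMNS = ("Age", "BMI", "FBS", "HbA1c_level")
BINARY_COLUMNS = ("High_BP", "Smoking", "Diagnosis")


def parse_row(raw):
    """Convert one CSV row (a dict of strings) into typed values.

    Raises ValueError if a documented column is missing, a numeric field
    cannot be parsed, or an indicator field is not 0 or 1.
    """
    missing = [col for col in DOCUMENTED_COLUMNS if col not in raw]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    row = {"Gender": raw["Gender"].strip()}
    for col in NUMERIC_COLUMNS:
        row[col] = float(raw[col])
    for col in BINARY_COLUMNS:
        value = int(raw[col])
        if value not in (0, 1):
            raise ValueError(f"{col} must be 0 or 1, got {value}")
        row[col] = value
    return row


def load_records(handle):
    """Read every row from an open CSV file handle and validate it."""
    return [parse_row(raw) for raw in csv.DictReader(handle)]


# Hypothetical sample rows, made up for this example.
SAMPLE_CSV = """Age,Gender,BMI,High_BP,FBS,HbA1c_level,Smoking,Diagnosis
54,Male,31.2,1,140,7.1,0,1
29,Female,22.4,0,88,5.2,0,0
"""

records = load_records(io.StringIO(SAMPLE_CSV))
print(len(records), "records parsed")
```

To run this against the downloaded file, replace the `io.StringIO` handle with `open(path, newline="")`, where `path` is wherever the CSV was saved.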

Distribution

The data is distributed as a CSV file of about 2.8 MB. The listing reports nine columns and approximately 88.4 thousand valid records; exact row counts are available in the distribution statistics provided with the listing. The dataset is not expected to be updated (update frequency: 'Never').

Usage

This data product is suited for the development and validation of machine learning algorithms for diabetes risk prediction. It can be used by researchers and modellers to assess the correlation between various clinical factors and a diabetes diagnosis. It is particularly useful for building simple diagnostic tools.
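As one illustration of assessing the correlation between clinical factors and the diagnosis, the sketch below computes the Pearson correlation between each numeric feature and the binary Diagnosis label (with a 0/1 label this is the point-biserial correlation). It is a hedged example, not the listing's method: the function names are hypothetical, the column names are assumed to match the documented ones, and the four sample records are invented, so the printed values say nothing about the real data.

```python
import math

FEATURES = ("Age", "BMI", "FBS", "HbA1c_level")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlations_with_diagnosis(records, features=FEATURES):
    """Correlate each numeric feature with the 0/1 Diagnosis label."""
    labels = [r["Diagnosis"] for r in records]
    return {f: pearson([r[f] for r in records], labels) for f in features}


# Hypothetical records, made up for this example.
SAMPLE = [
    {"Age": 29, "BMI": 22.4, "FBS": 88, "HbA1c_level": 5.2, "Diagnosis": 0},
    {"Age": 41, "BMI": 27.9, "FBS": 95, "HbA1c_level": 5.6, "Diagnosis": 0},
    {"Age": 54, "BMI": 31.2, "FBS": 140, "HbA1c_level": 7.1, "Diagnosis": 1},
    {"Age": 63, "BMI": 29.5, "FBS": 160, "HbA1c_level": 8.0, "Diagnosis": 1},
]

correlations = correlations_with_diagnosis(SAMPLE)
for feature, value in correlations.items():
    print(f"{feature}: {value:+.3f}")
```

A correlation near zero does not rule out a useful feature, since Pearson correlation only captures linear association; it is a first screening step rather than a substitute for fitting and validating a model.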

Coverage

The dataset covers clinical and demographic metrics (Age, Gender, BMI, etc.). Although it includes detailed physiological measures, the listing does not specify the geographic location or the time frame over which the patient data was collected.

License

CC0: Public Domain

Who Can Use It

This product is intended for data scientists, machine learning engineers focusing on healthcare applications, and public health researchers. Users can apply the dataset to train diagnostic models, analyse health trends related to chronic conditions, or perform statistical analysis on risk factors associated with diabetes.

Dataset Name Suggestions

  • Diabetes Risk Factor Prediction Records
  • Simple Clinical Diagnosis Dataset
  • Patient Health Metrics for Diabetes Screening
  • Predictive Health Condition Dataset

Attributes

Listing Stats

VIEWS

13

DOWNLOADS

4

LISTED

17/11/2025

REGION

GLOBAL

UDQS QUALITY (Universal Data Quality Score)

5 / 5

VERSION

1.0
