Clinical Diabetes Status Dataset
Patient Health Records & Digital Health
Free
About
A multi-faceted dataset for understanding and predicting diabetes, containing various biomedical measurements and patient characteristics. This dataset is designed for classification tasks, enabling the development of models to distinguish between non-diabetic, prediabetic, and diabetic individuals based on health metrics. It provides valuable insights for healthcare analytics, medical research, and predictive modelling.
Columns
- ID: A unique identifier for each record in the dataset.
- No_Pation: A second patient identifier, likely a patient number or record ID. The column name appears to be a misspelling of "Patient" and is given here as listed.
- Gender: The gender of the patient (e.g., F for Female, M for Male).
- AGE: The age of the patient in years.
- Urea: The level of urea in the blood, which can indicate kidney function.
- Cr: The level of creatinine in the blood, another indicator of kidney function.
- HbA1c: Glycated haemoglobin, representing average blood sugar levels over the past 2-3 months.
- Chol: Total cholesterol level in the blood.
- TG: Triglycerides level, a type of fat found in the blood.
- HDL: High-density lipoprotein cholesterol level, often called "good" cholesterol.
- LDL: Low-density lipoprotein cholesterol level, often called "bad" cholesterol.
- VLDL: Very low-density lipoprotein cholesterol level.
- BMI: Body Mass Index, a measure of body fat based on height and weight.
- CLASS: The classification label indicating the patient's diabetes status: 'N' for Non-diabetic, 'P' for Prediabetic, and 'Y' for Diabetic.
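The column list above can be turned into a simple schema check before any analysis. The following is a minimal sketch using only the Python standard library. The function name read_records, the choice of which columns to treat as numeric, and the whitespace stripping are assumptions made for illustration, and the three sample rows are made up and are not taken from the dataset.

```python
import csv
import io

# Column names as listed in the Columns section.
EXPECTED_COLUMNS = [
    "ID", "No_Pation", "Gender", "AGE", "Urea", "Cr", "HbA1c",
    "Chol", "TG", "HDL", "LDL", "VLDL", "BMI", "CLASS",
]
# Assumption: these columns hold numeric measurements.
NUMERIC_COLUMNS = [
    "AGE", "Urea", "Cr", "HbA1c", "Chol", "TG", "HDL", "LDL", "VLDL", "BMI",
]
VALID_CLASSES = {"N", "P", "Y"}


def read_records(handle):
    """Parse CSV text from an open handle into a list of validated dicts."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    records = []
    for row in reader:
        # Strip stray whitespace defensively (an assumption, not a documented quirk).
        record = {name: row[name].strip() for name in EXPECTED_COLUMNS}
        for name in NUMERIC_COLUMNS:
            record[name] = float(record[name])
        if record["CLASS"] not in VALID_CLASSES:
            raise ValueError(f"unexpected CLASS value: {record['CLASS']!r}")
        records.append(record)
    return records


# Made-up rows for illustration only; they do not come from the real file.
SAMPLE = (
    "ID,No_Pation,Gender,AGE,Urea,Cr,HbA1c,Chol,TG,HDL,LDL,VLDL,BMI,CLASS\n"
    "1,1001,F,50,4.7,46,4.9,4.2,0.9,2.4,1.4,0.5,24.0,N\n"
    "2,1002,M,45,5.1,60,6.0,4.9,1.3,1.1,3.0,0.6,27.0,P\n"
    "3,1003,F,61,6.3,72,8.4,5.6,2.1,0.9,3.5,1.0,31.0, Y\n"
)

records = read_records(io.StringIO(SAMPLE))
print(len(records), sorted({r["CLASS"] for r in records}))
```

To run the same check on the downloaded file, pass an open file handle instead of the in-memory sample, for example open("Dataset of Diabetes .csv", newline="").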
Distribution
The dataset is provided as a CSV file named "Dataset of Diabetes .csv". It contains 1,000 records (rows) and 14 columns, with no missing values. The file is approximately 49.51 kB in size.
Usage
Ideal applications for this dataset include:
- Developing and training machine learning models for multiclass classification to predict diabetes.
- Conducting exploratory data analysis to identify key biomedical markers associated with diabetes.
- Educational purposes for teaching data science and healthcare analytics.
- Medical research into the relationships between different health metrics and diabetes progression.
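For the multiclass classification use case above, a three-class label usually calls for a train/test split that preserves the proportion of each class. The sketch below shows one way to do a stratified split with the standard library. The function name stratified_split, the 20% test fraction, and the label counts are illustrative assumptions, since this listing does not state the class distribution.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each class at roughly test_fraction."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        # Keep at least one test row per class when the class has 2+ rows.
        if len(indices) > 1:
            n_test = max(1, n_test)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Made-up label counts for illustration; the real class balance is not stated here.
labels = ["N"] * 10 + ["P"] * 5 + ["Y"] * 35
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

The returned index lists can be used to select rows from the CLASS column and the feature columns before fitting any classifier.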
Coverage
The dataset's geographic scope, time range, and demographic scope are not specified. It does include patients aged 20 to 79 years, both male and female. The data is expected to be updated annually.
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
- Data Scientists and Machine Learning Engineers: To build and benchmark predictive models for health outcomes.
- Healthcare Researchers: To analyse patient data and uncover correlations between biomarkers and diabetes.
- Students and Academics: As a practical dataset for projects in statistics, machine learning, and public health courses.
- Healthcare Providers and Analysts: To understand patient populations and risk factors for diabetes.
Dataset Name Suggestions
- Diabetes Prediction Health Indicators
- Biomedical Markers for Diabetes Classification
- Patient Health Metrics for Diabetes Prediction
- Clinical Diabetes Status Dataset
Attributes
Original Data Source: Clinical Diabetes Status Dataset