Diabetes Diagnostic Measures
Patient Health Records & Digital Health
Free
About
This dataset is intended for predicting whether a patient has diabetes from a set of diagnostic measurements. Originally sourced from the National Institute of Diabetes and Digestive and Kidney Diseases, the data covers only females of Pima Indian heritage who are at least 21 years old. The primary objective is to support diabetes prediction and analysis by offering insight into key medical predictor variables.
Columns
- Pregnancies: Represents the number of pregnancies a patient has had.
- Glucose: Plasma glucose concentration measured 2 hours into an oral glucose tolerance test.
- BloodPressure: Diastolic blood pressure (mm Hg).
- SkinThickness: Triceps skin fold thickness (mm).
- Insulin: 2-hour serum insulin (mu U/ml).
- BMI: Body Mass Index (weight in kg divided by the square of height in m).
- DiabetesPedigreeFunction: A score for the likelihood of diabetes based on family history. It is not a percentage.
- Age: Indicates the patient's age.
- Outcome: The target variable, where '1' signifies a positive diabetes diagnosis and '0' signifies no diabetes.
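The column list above can be checked against a file before any analysis. The following is a minimal sketch using only the Python standard library; the helper name load_records and the two sample rows are hypothetical and made up for illustration, and it assumes the CSV header uses exactly the column names listed above.

```python
import csv
import io

# Column names as listed above; assumed to match the CSV header exactly.
EXPECTED_COLUMNS = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin",
    "BMI", "DiabetesPedigreeFunction", "Age", "Outcome",
]


def load_records(fileobj):
    """Parse CSV text into a list of dicts with numeric values (hypothetical helper)."""
    reader = csv.DictReader(fileobj)
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    records = []
    for row in reader:
        record = {name: float(row[name]) for name in EXPECTED_COLUMNS}
        # Outcome is the binary target: 1 = diabetes, 0 = no diabetes.
        if record["Outcome"] not in (0.0, 1.0):
            raise ValueError(f"unexpected Outcome value: {row['Outcome']}")
        record["Outcome"] = int(record["Outcome"])
        records.append(record)
    return records


# Two made-up rows for illustration only; they are NOT taken from the dataset.
SAMPLE = """Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
2,120,70,25,80,28.5,0.4,30,0
5,160,85,32,150,34.1,0.7,45,1
"""

records = load_records(io.StringIO(SAMPLE))
print(len(records), [r["Outcome"] for r in records])
```

To read the actual file, the same helper could be called as load_records(open("diabetes.csv", newline="")), in which case the listing's stated figure of 768 records gives a quick sanity check on len(records).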
Distribution
The dataset is provided in CSV format (diabetes.csv), with a size of 23.88 kB. It contains 9 columns and consists of 768 valid records across all variables. The structure includes several independent medical predictor variables and a single dependent target variable, 'Outcome'. There are no missing or mismatched values reported within the dataset.
Usage
This dataset is ideal for developing and evaluating predictive models for diabetes diagnosis. It can be utilised for exploratory data analysis to understand the relationships between medical indicators and diabetes outcomes. Researchers and data scientists can apply classification algorithms to forecast diabetes based on the provided diagnostic measurements.
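As one rough illustration of both uses, the sketch below computes per-class feature means (a simple exploratory summary) and then applies a nearest-centroid rule as a stand-in classifier. The nearest-centroid rule, the three-feature subset, the function names, and the sample rows are all assumptions chosen for brevity; they are not a method prescribed by the dataset, and the rows are made up rather than taken from diabetes.csv.

```python
from statistics import mean

FEATURES = ["Glucose", "BMI", "Age"]  # subset chosen only to keep the sketch short


def class_means(rows, features=FEATURES):
    """Return {outcome: {feature: mean}} for each Outcome value present."""
    means = {}
    for outcome in sorted({r["Outcome"] for r in rows}):
        group = [r for r in rows if r["Outcome"] == outcome]
        means[outcome] = {f: mean(r[f] for r in group) for f in features}
    return means


def predict(row, means, features=FEATURES):
    """Nearest-centroid rule: pick the class whose feature means are closest."""
    def sq_dist(centroid):
        return sum((row[f] - centroid[f]) ** 2 for f in features)
    return min(means, key=lambda outcome: sq_dist(means[outcome]))


# Made-up rows for illustration only; NOT taken from diabetes.csv.
rows = [
    {"Glucose": 95, "BMI": 24.0, "Age": 25, "Outcome": 0},
    {"Glucose": 105, "BMI": 27.0, "Age": 31, "Outcome": 0},
    {"Glucose": 160, "BMI": 35.0, "Age": 48, "Outcome": 1},
    {"Glucose": 170, "BMI": 33.0, "Age": 52, "Outcome": 1},
]

means = class_means(rows)
print(means)
print(predict({"Glucose": 150, "BMI": 32.0, "Age": 45}, means))
```

The features here are left unscaled, so columns with larger numeric ranges dominate the distance. A real evaluation would likely standardise the features, hold out a test split, and compare against an established classification library.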
Coverage
The dataset's scope is limited to females of Pima Indian heritage, all of whom are at least 21 years old. It does not specify a geographic region or a time range for data collection. The records were selected from a larger database under constraints that enforce this demographic focus.
License
CC0: Public Domain
Who Can Use It
This dataset is valuable for individuals and organisations engaged in medical research, public health analysis, and the development of healthcare diagnostic tools. Data scientists and machine learning practitioners can use it to build and test predictive models for disease detection. Students and educators in fields like bioinformatics, statistics, and artificial intelligence would also find it useful for learning and practical application.
Dataset Name Suggestions
- Pima Indian Diabetes Prediction Dataset
- Diabetes Diagnostic Measures
- Pima Female Diabetes Health Data
- Predictive Diabetes Analytics
- Medical Diabetes Indicators
Attributes
Original Data Source: Diabetes Diagnostic Measures