
Female Pima Diabetes Study Dataset

Clinical Trials & Research

Tags and Keywords

Diabetes

Pima

Diagnostic

Health

Prediction

Free

About

This dataset supports diagnostic prediction of diabetes from a set of medical measurements. It was originally compiled by the National Institute of Diabetes and Digestive and Kidney Diseases. All patients included are females of Pima Indian heritage aged 21 or older. Although the data may not be up to date, it remains a useful resource for practicing data analysis and for students who want to test study ideas on a substantive dataset. It contains eight independent medical predictor variables and one dependent target variable, 'Outcome'.

Columns

  • Pregnancies: Number of times the patient has been pregnant.
  • Glucose: Plasma glucose concentration two hours into an oral glucose tolerance test.
  • BloodPressure: Diastolic blood pressure (mm Hg).
  • SkinThickness: Triceps skin fold thickness (mm).
  • Insulin: Two-hour serum insulin (mu U/ml).
  • BMI: Body Mass Index (weight in kg divided by the square of height in m).
  • DiabetesPedigreeFunction: A score summarizing the patient's family history of diabetes; it is not a percentage.
  • Age: The patient's age in years.
  • Outcome: The dependent variable, giving the diabetes diagnosis (1 for YES, 0 for NO).
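
As a minimal sketch of how the column list above might be checked when reading the file, the following uses only the Python standard library. The function name load_records and the constant EXPECTED_COLUMNS are hypothetical, and the sketch assumes every column can be parsed as a number, which the listing implies but does not state.

```python
import csv
from typing import Dict, Iterable, List

# Column names as given in the listing; order is not assumed.
EXPECTED_COLUMNS = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin",
    "BMI", "DiabetesPedigreeFunction", "Age", "Outcome",
]


def load_records(lines: Iterable[str]) -> List[Dict[str, float]]:
    """Parse CSV lines into numeric records, validating the header.

    `lines` can be an open text file or any iterable of CSV lines.
    """
    reader = csv.DictReader(lines)
    header = reader.fieldnames or []
    if set(header) != set(EXPECTED_COLUMNS):
        raise ValueError(f"unexpected columns: {header}")

    records = []
    for row in reader:
        record = {}
        for name in EXPECTED_COLUMNS:
            value = row[name]
            if value is None or value.strip() == "":
                raise ValueError(f"empty value in column {name}")
            record[name] = float(value)  # assumption: all columns are numeric
        if record["Outcome"] not in (0.0, 1.0):
            raise ValueError("Outcome must be 0 or 1")
        records.append(record)
    return records
```

With the file downloaded, the function could be called as load_records(open("diabetes.csv", newline="")).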

Distribution

The dataset is provided as a single CSV file, diabetes.csv, with a file size of 23.88 kB. It has 9 columns (eight medical predictor variables plus the target variable, Outcome) and 768 records, each with a valid value in every column. No missing or mismatched data is reported for the listed columns. Note, however, that zero values in Glucose, BloodPressure, SkinThickness, Insulin, and BMI are physiologically implausible, and analysts commonly treat them as placeholders for missing measurements.
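
The figures above (768 records, 9 columns) can be verified after loading. The sketch below also counts zero values in five measurement columns. Treating those zeros as missing measurements is a common analyst convention and an assumption here, not something the listing states. The names ZERO_AS_MISSING, zero_counts, and check_shape are hypothetical.

```python
from typing import Dict, List, Mapping, Sequence

# Assumption: a zero in these columns is physiologically implausible
# and likely stands in for a missing measurement.
ZERO_AS_MISSING = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]


def zero_counts(records: Sequence[Mapping[str, object]],
                columns: Sequence[str] = ZERO_AS_MISSING) -> Dict[str, int]:
    """Count how many records hold a zero in each of the given columns."""
    counts = {name: 0 for name in columns}
    for record in records:
        for name in columns:
            if float(record[name]) == 0:
                counts[name] += 1
    return counts


def check_shape(records: Sequence[Mapping[str, object]],
                expected_rows: int = 768,
                expected_columns: int = 9) -> bool:
    """Return True if the row and column counts match the listing."""
    rows: List[Mapping[str, object]] = list(records)
    return (len(rows) == expected_rows
            and all(len(row) == expected_columns for row in rows))
```

Both functions accept the rows produced by csv.DictReader directly, since string values are converted with float() before comparison.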

Usage

This dataset is ideal for:
  • Practicing data analysis and developing analytical skills.
  • Training students on data concepts with a substantial dataset.
  • Building and testing machine learning models for diabetes prediction.
  • Conducting statistical analysis on factors related to diabetes onset.
  • Researching diagnostic prediction of diabetes.
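
For the model-building use case, a simple reference point helps before trying anything more elaborate. The sketch below is one possible baseline, not a method prescribed by the listing: it splits the records, computes the majority class, and finds a single-feature cut-off that maximizes training accuracy. All function names are hypothetical, and Glucose is only an assumed default feature.

```python
import random
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, float]


def split_records(records: Sequence[Record], test_fraction: float = 0.25,
                  seed: int = 0) -> Tuple[List[Record], List[Record]]:
    """Shuffle reproducibly and return (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def majority_class(train: Sequence[Record]) -> int:
    """Return 1 if positives are a strict majority of train, else 0."""
    positives = sum(1 for r in train if r["Outcome"] == 1)
    return 1 if positives * 2 > len(train) else 0


def accuracy(predictions: Sequence[int], records: Sequence[Record]) -> float:
    """Fraction of predictions that match the Outcome column."""
    if not records:
        raise ValueError("no records to score")
    correct = sum(1 for p, r in zip(predictions, records)
                  if p == int(r["Outcome"]))
    return correct / len(records)


def best_threshold(train: Sequence[Record], feature: str = "Glucose") -> float:
    """Pick the cut-off t (predict 1 when feature >= t) with the highest
    training accuracy; ties go to the lowest candidate."""
    candidates = sorted({r[feature] for r in train})
    return max(
        candidates,
        key=lambda t: accuracy([int(r[feature] >= t) for r in train], train),
    )
```

Any trained model should be compared against the majority-class accuracy on the held-out test records, since a model that does no better has learned little from the predictors.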

Coverage

The dataset covers only females of Pima Indian heritage who are at least 21 years old. The collection years are not stated in the source material, and the dataset may not be up to date. The geographic scope is not stated explicitly; it is implied by the Pima Indian community from which the patients were drawn.

License

CC0: Public Domain

Who Can Use It

  • Students: For educational purposes, data analysis practice, and training.
  • Data Analysts: For exploring health-related data and statistical insights.
  • Machine Learning Engineers: For developing and evaluating predictive models for diabetes.
  • Medical Researchers: For studies on diabetes diagnostics and population health.
  • Academics: For teaching and research projects in health informatics.

Dataset Name Suggestions

  • Pima Indian Diabetes Prediction Data
  • Diabetes Diagnostic for Pima Women
  • Pima Heritage Diabetes Health Metrics
  • Female Pima Diabetes Study Dataset
  • Pima Indian Diabetes Prognosis Data

Attributes

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 14/07/2025
  • Region: Asia
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
