Pima Female Diabetes Health Data

Patient Health Records & Digital Health

Tags and Keywords

Diabetes, Health, Prediction, Medical, Pima

Free

About

This dataset supports predicting diabetes from diagnostic measurements. It originates from the National Institute of Diabetes and Digestive and Kidney Diseases and covers female patients aged 21 or older of Pima Indian heritage. It is a common starting point for developing and testing predictive models for diabetes.

Columns

  • Pregnancies: The number of times pregnant.
  • Glucose: Plasma glucose concentration measured 2 hours into an oral glucose tolerance test.
  • BloodPressure: Diastolic blood pressure, recorded in mm Hg.
  • SkinThickness: Triceps skin fold thickness, in mm.
  • Insulin: 2-hour serum insulin, in μU/mL.
  • BMI: Body mass index, calculated as weight in kg/(height in m)^2.
  • DiabetesPedigreeFunction: A function that quantifies the likelihood of diabetes based on family history.
  • Age: The patient's age in years.
  • Outcome: The class variable, indicating whether the patient has diabetes (1) or not (0).
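
As a minimal sketch of how the column list above might be checked when reading the file, the snippet below compares a CSV header against the nine documented names and computes BMI from the stated formula. The names `EXPECTED_COLUMNS`, `check_header`, and `bmi`, and the one-row inline sample, are illustrative assumptions and not part of the dataset; the column order in the real file is also assumed.

```python
import csv
import io

# Column names as documented in this listing (file order is an assumption).
EXPECTED_COLUMNS = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome",
]


def check_header(header):
    """Return the documented columns that are missing from a CSV header."""
    present = set(header)
    return [name for name in EXPECTED_COLUMNS if name not in present]


def bmi(weight_kg, height_m):
    """Body mass index: weight in kg divided by height in metres, squared."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / height_m ** 2


# Hypothetical one-row sample standing in for diabetes.csv.
SAMPLE = (
    "Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,"
    "DiabetesPedigreeFunction,Age,Outcome\n"
    "2,120,70,25,100,28.5,0.4,33,0\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE))
missing = check_header(reader.fieldnames)  # empty list when all columns exist
rows = list(reader)  # values are strings until converted
```

To run the same check against the real file, replace `io.StringIO(SAMPLE)` with an open handle to `diabetes.csv`.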

Distribution

The dataset is provided as a single CSV file, diabetes.csv, with a file size of 23.87 kB. It contains 768 instances (rows) and 9 columns: 8 diagnostic attributes and one class variable. All attribute values are numeric. The column details report no missing values, although zero entries in measurement columns such as Glucose, BloodPressure, SkinThickness, Insulin, and BMI are physiologically implausible and may stand in for unrecorded values. The classes are imbalanced: 500 instances have Outcome 0 (no diabetes) and 268 have Outcome 1 (diabetes).
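
The class counts stated above fix the accuracy that a trivial "always predict 0" classifier would reach, which is a useful reference point for any model. The sketch below works that out, and adds a hypothetical helper, `count_zeros`, on the assumption that zero entries in measurement columns deserve a closer look before modelling. All function and variable names are illustrative.

```python
from collections import Counter

# Class counts as stated in the listing.
CLASS_COUNTS = {0: 500, 1: 268}


def class_shares(counts):
    """Convert per-class counts into fractions of the total."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def count_zeros(rows, columns):
    """Count zero entries per column in an iterable of dict-like rows."""
    zeros = Counter()
    for row in rows:
        for col in columns:
            if float(row[col]) == 0:
                zeros[col] += 1
    return dict(zeros)


total = sum(CLASS_COUNTS.values())         # 500 + 268
shares = class_shares(CLASS_COUNTS)
majority_baseline = max(shares.values())   # accuracy of always predicting 0
```

With these counts, the majority-class baseline is 500/768, or about 65.1%, so a classifier should be judged against that figure and not against 50%.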

Usage

This dataset is ideal for various applications, including:
  • Developing and evaluating machine learning models for diabetes prediction.
  • Research into diagnostic measurements and their correlation with diabetes.
  • Educational purposes in data science and medical informatics.
  • Projects involving deep learning techniques for health outcome prediction.
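
For the model evaluation use case, one common precaution with an imbalanced target is a stratified train/test split, so that both parts keep roughly the same share of each class. The following is a minimal standard-library sketch under stated assumptions: `stratified_split` is a hypothetical helper, the 20% test fraction and the seed are arbitrary choices, and the label list is synthetic, built only from the class counts given in this listing.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling the test part per class."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic labels matching the stated class counts (500 zeros, 268 ones).
labels = [0] * 500 + [1] * 268
train_idx, test_idx = stratified_split(labels)
```

In practice the `labels` list would come from the Outcome column of diabetes.csv, and the returned indices would select the matching rows.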

Coverage

The dataset covers only females aged 21 or older of Pima Indian heritage. The data was received on 9 May 1990. No geographic coverage is explicitly detailed, and the dataset is not expected to be updated.

License

CC0: Public Domain

Who Can Use It

This dataset is suitable for:
  • Data scientists and machine learning engineers building predictive models for health conditions.
  • Medical researchers exploring diabetes risk factors and diagnostic indicators.
  • Students and academics learning about classification algorithms and health analytics.

Dataset Name Suggestions

  • Pima Indian Diabetes Prediction Dataset
  • Diabetes Disease Prediction Dataset
  • Pima Female Diabetes Health Data
  • Indigenous Diabetes Risk Factors

Attributes

Original Data Source: Pima Female Diabetes Health Data

Listing Stats

LISTED

08/08/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0
