Opendatabay APP

Diabetes Research Patient Profiles

Patient Health Records & Digital Health

Tags and Keywords

Diabetes

Health

Patients

Medical

Research

Free

About

This dataset provides detailed health information for diabetes research, covering 1,879 patients. Each patient has a unique ID from 6000 to 7878, and the doctor in charge is recorded only as "Confidential" to preserve privacy. The data spans demographic details, lifestyle factors, medical history, clinical measurements, medication usage, symptoms, quality of life scores, environmental exposures, and health behaviours. It is a synthetic dataset, originally generated for educational purposes, which makes it suitable for data science and machine learning projects such as exploring factors related to diabetes, developing predictive models, and running statistical analyses. The dataset is an original creation, owned by Mr. Rabie El Kharoua, and has not been previously shared.

Columns

The dataset includes the following 46 columns:
  • PatientID: A unique identifier assigned to each patient, ranging from 6000 to 7878.
  • Age: Patient's age in years, ranging from 20 to 90.
  • Gender: Patient's gender, where 0 denotes Male and 1 denotes Female.
  • Ethnicity: Patient's ethnicity, coded as: 0 (Caucasian), 1 (African American), 2 (Asian), 3 (Other).
  • SocioeconomicStatus: Patient's socioeconomic status, coded as: 0 (Low), 1 (Middle), 2 (High).
  • EducationLevel: Patient's education level, coded as: 0 (None), 1 (High School), 2 (Bachelor's), 3 (Higher).
  • BMI: Body Mass Index of patients, ranging from 15 to 40.
  • Smoking: Smoking status, where 0 indicates No and 1 indicates Yes.
  • AlcoholConsumption: Weekly alcohol consumption in units, ranging from 0 to 20.
  • PhysicalActivity: Weekly physical activity in hours, ranging from 0 to 10.
  • DietQuality: Diet quality score, ranging from 0 to 10.
  • SleepQuality: Sleep quality score, ranging from 4 to 10.
  • FamilyHistoryDiabetes: Family history of diabetes, where 0 indicates No and 1 indicates Yes.
  • GestationalDiabetes: History of gestational diabetes, where 0 indicates No and 1 indicates Yes.
  • PolycysticOvarySyndrome: Presence of polycystic ovary syndrome, where 0 indicates No and 1 indicates Yes.
  • PreviousPreDiabetes: History of previous pre-diabetes, where 0 indicates No and 1 indicates Yes.
  • Hypertension: Presence of hypertension, where 0 indicates No and 1 indicates Yes.
  • SystolicBP: Systolic blood pressure, ranging from 90 to 180 mmHg.
  • DiastolicBP: Diastolic blood pressure, ranging from 60 to 120 mmHg.
  • FastingBloodSugar: Fasting blood sugar levels, ranging from 70 to 200 mg/dL.
  • HbA1c: Hemoglobin A1c levels, ranging from 4.0% to 10.0%.
  • SerumCreatinine: Serum creatinine levels, ranging from 0.5 to 5.0 mg/dL.
  • BUNLevels: Blood Urea Nitrogen levels, ranging from 5 to 50 mg/dL.
  • CholesterolTotal: Total cholesterol levels, ranging from 150 to 300 mg/dL.
  • CholesterolLDL: Low-density lipoprotein cholesterol levels, ranging from 50 to 200 mg/dL.
  • CholesterolHDL: High-density lipoprotein cholesterol levels, ranging from 20 to 100 mg/dL.
  • CholesterolTriglycerides: Triglyceride levels, ranging from 50 to 400 mg/dL.
  • AntihypertensiveMedications: Use of antihypertensive medications, where 0 indicates No and 1 indicates Yes.
  • Statins: Use of statins, where 0 indicates No and 1 indicates Yes.
  • AntidiabeticMedications: Use of antidiabetic medications, where 0 indicates No and 1 indicates Yes.
  • FrequentUrination: Presence of frequent urination, where 0 indicates No and 1 indicates Yes.
  • ExcessiveThirst: Presence of excessive thirst, where 0 indicates No and 1 indicates Yes.
  • UnexplainedWeightLoss: Presence of unexplained weight loss, where 0 indicates No and 1 indicates Yes.
  • FatigueLevels: Fatigue levels, ranging from 0 to 10.
  • BlurredVision: Presence of blurred vision, where 0 indicates No and 1 indicates Yes.
  • SlowHealingSores: Presence of slow-healing sores, where 0 indicates No and 1 indicates Yes.
  • TinglingHandsFeet: Presence of tingling in hands or feet, where 0 indicates No and 1 indicates Yes.
  • QualityOfLifeScore: Quality of life score, ranging from 0 to 100.
  • HeavyMetalsExposure: Exposure to heavy metals, where 0 indicates No and 1 indicates Yes.
  • OccupationalExposureChemicals: Occupational exposure to harmful chemicals, where 0 indicates No and 1 indicates Yes.
  • WaterQuality: Quality of water, where 0 indicates Good and 1 indicates Poor.
  • MedicalCheckupsFrequency: Frequency of medical check-ups per year, ranging from 0 to 4.
  • MedicationAdherence: Medication adherence score, ranging from 0 to 10.
  • HealthLiteracy: Health literacy score, ranging from 0 to 10.
  • Diagnosis: Diagnosis status for Diabetes (Target Variable), where 0 indicates No and 1 indicates Yes.
  • DoctorInCharge: Associated doctor for the patient, noted as Confidential.
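
Several columns store integer codes rather than labels. As a minimal sketch, the code-to-label mappings listed above could be applied when reading the CSV as follows. The `CODEBOOK` dictionary, the `decode_row` helper, and the two-row `SAMPLE_CSV` are hypothetical names and illustrative values, not part of the dataset; the sketch also assumes the codes are stored as plain numbers in the file.

```python
import csv
import io

# Code-to-label mappings copied from the column descriptions above.
CODEBOOK = {
    "Gender": {0: "Male", 1: "Female"},
    "Ethnicity": {0: "Caucasian", 1: "African American", 2: "Asian", 3: "Other"},
    "SocioeconomicStatus": {0: "Low", 1: "Middle", 2: "High"},
    "EducationLevel": {0: "None", 1: "High School", 2: "Bachelor's", 3: "Higher"},
}


def decode_row(row):
    """Return a copy of a CSV row with coded columns replaced by their labels."""
    decoded = dict(row)
    for column, labels in CODEBOOK.items():
        if column in decoded:
            # Assumption: codes are numeric strings such as "0" or "1.0".
            decoded[column] = labels[int(float(decoded[column]))]
    return decoded


# Hypothetical sample with a subset of the columns, for illustration only.
SAMPLE_CSV = """PatientID,Age,Gender,Ethnicity,SocioeconomicStatus,EducationLevel,Diagnosis
6000,44,0,1,2,1,1
6001,51,1,0,1,2,0
"""

decoded_rows = [decode_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
for r in decoded_rows:
    print(r["PatientID"], r["Gender"], r["Ethnicity"], r["EducationLevel"])
```

To decode the real file, the same helper could be applied to `csv.DictReader(open("diabetes_data.csv", newline=""))`.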

Distribution

This dataset is provided as a single CSV file named diabetes_data.csv (760.54 kB). It contains 1,879 records, one per patient, and 46 columns. Patient IDs are sequential, ranging from 6000 to 7878. The data is structured in a tabular format.
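
The figures stated above are mutually consistent (7878 - 6000 + 1 = 1,879) and can be checked after download. The following is a rough sketch of such a check; `check_structure` is a hypothetical helper, and the three-row sample stands in for the real file so the example runs on its own.

```python
import csv
import io


def check_structure(csv_text, expected_columns, first_id, last_id):
    """Return a list of problems found; an empty list means all checks passed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    column_count = len(reader.fieldnames or [])
    problems = []
    if column_count != expected_columns:
        problems.append(f"expected {expected_columns} columns, found {column_count}")
    # Sequential IDs imply exactly last_id - first_id + 1 rows, in order.
    expected_ids = list(range(first_id, last_id + 1))
    actual_ids = [int(row["PatientID"]) for row in rows]
    if actual_ids != expected_ids:
        problems.append("PatientID values are not the expected sequential range")
    return problems


# Hypothetical miniature file, for illustration only.
sample = "PatientID,Age,Diagnosis\n6000,44,1\n6001,51,0\n6002,37,0\n"
print(check_structure(sample, expected_columns=3, first_id=6000, last_id=6002))
```

For the downloaded file, the call would presumably be `check_structure(open("diabetes_data.csv").read(), 46, 6000, 7878)`, assuming the file matches the description given here.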

Usage

This dataset is suited to a range of applications, including:
  • Exploring factors associated with diabetes and its progression.
  • Developing predictive models to forecast diabetes diagnosis or risk.
  • Conducting statistical analyses to identify significant correlations and trends in health data.
  • Serving as a valuable resource for data science and machine learning projects focused on health outcomes.
  • Gaining insights into demographic, lifestyle, medical, and environmental variables related to diabetes.
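
As a starting point for the exploratory uses listed above, one simple approach is to compare the mean of a clinical measurement between the two values of the Diagnosis target. The sketch below is only illustrative: `mean_by_diagnosis` is a hypothetical helper and the four sample rows are made-up values, not records from the dataset.

```python
from collections import defaultdict
from statistics import mean


def mean_by_diagnosis(rows, feature):
    """Return the mean of a numeric feature for each Diagnosis value (0 or 1)."""
    groups = defaultdict(list)
    for row in rows:
        groups[int(row["Diagnosis"])].append(float(row[feature]))
    return {label: mean(values) for label, values in groups.items()}


# Made-up rows shaped like csv.DictReader output, for illustration only.
sample_rows = [
    {"HbA1c": "5.0", "Diagnosis": "0"},
    {"HbA1c": "6.0", "Diagnosis": "0"},
    {"HbA1c": "8.0", "Diagnosis": "1"},
    {"HbA1c": "9.0", "Diagnosis": "1"},
]

group_means = mean_by_diagnosis(sample_rows, "HbA1c")
print(group_means)
```

The same function accepts any numeric column name from the list of columns, for example FastingBloodSugar or BMI.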

Coverage

The dataset covers patients with ages ranging from 20 to 90 years. Demographic details include gender (Male/Female), ethnicity (Caucasian, African American, Asian, Other), socioeconomic status (Low, Middle, High), and education level (None, High School, Bachelor's, Higher). The data does not specify any particular geographic region or time range.

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

This dataset is primarily intended for:
  • Researchers studying diabetes and related health conditions.
  • Data scientists interested in medical data analysis and predictive modelling.
  • Machine learning engineers developing algorithms for health diagnostics or risk assessment.
  • Students and educators for academic projects and learning purposes in data science and healthcare analytics.

Dataset Name Suggestions

  • Diabetes Patient Health Records
  • Healthcare Diabetes Factors Data
  • Patient Diabetes Analytics Dataset
  • Synthetic Diabetes Outcome Data
  • Diabetes Research Patient Profiles

Attributes

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 03/08/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
