Synthetic Gastrointestinal Disease Patient Records Dataset
Patient Health Records & Digital Health
£19.99
About
The Synthetic Gastrointestinal Disease Dataset has been generated to support research, model development, and education related to gastrointestinal (GI) health. This comprehensive dataset captures a wide range of patient features, lifestyle factors, test results, symptoms, and clinical diagnoses to simulate real-world diagnostic complexity.
Dataset Features
- Age: Age of the patient in years.
- Gender: Biological sex of the patient (M/F).
- BMI: Body Mass Index.
- Body_Weight: Patient's weight in kilograms.
- Obesity_Status: Categorized as Normal, Overweight, or Obese based on BMI.
- Ethnicity: Ethnic background (e.g., White, Hispanic, Asian).
- Family_History: Indicates presence of family history of GI conditions (Yes/No).
- Genetic_Markers: Count of relevant genetic risk markers detected.
- Microbiome_Index: Numerical score representing gut microbiota diversity or imbalance.
- Autoimmune_Disorders: Presence of autoimmune conditions (Yes/No).
- H_Pylori_Status: Helicobacter pylori infection status (Yes/No).
- Fecal_Calprotectin: Inflammatory marker measured in stool (numeric value).
- Occult_Blood_Test: Result of hidden blood detection in stool (Positive/Negative).
- CRP_ESR: Combined C-Reactive Protein / Erythrocyte Sedimentation Rate value, an inflammation marker.
- Endoscopy_Result / Colonoscopy_Result / Stool_Culture: Clinical test results (e.g., Normal, Abnormal).
- Diet_Type: Type of diet followed (e.g., Vegetarian, Western).
- Food_Intolerance: Reported intolerances (Yes/No).
- Smoking_Status / Alcohol_Use / Physical_Activity: Lifestyle habits.
- Stress_Level: Reported level of psychological stress (Low/Moderate/High). Note: some entries are missing.
- GI Symptoms: Abdominal_Pain, Bloating, Diarrhea, Constipation, Rectal_Bleeding, Appetite_Loss, Weight_Loss.
- Bowel_Habits: Overall pattern (e.g., Normal, Frequent, Irregular).
- Bowel_Movement_Frequency: Number of bowel movements per week.
- Medication Use: NSAID_Use (e.g., ibuprofen), Antibiotic_Use, PPI_Use (proton-pump inhibitors), Medications (Yes/No).
- Disease_Class: Target label giving the primary GI-related presentation recorded for the patient (e.g., Blood in stool, Nausea or vomiting, Abdominal cramps or pain, Unexplained weight loss).
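A quick sanity check on the columns above can be sketched as follows. This is a minimal illustration, not the dataset's own tooling. The three rows are invented for the example, and the BMI cut-offs (25 and 30, the common WHO thresholds) are an assumption, since the listing does not state the rule used to derive Obesity_Status.

```python
import csv
import io
from collections import Counter

# Toy rows invented for illustration only; they are not taken from the dataset.
SAMPLE_CSV = """Age,Gender,BMI,Obesity_Status,Stress_Level,Disease_Class
34,F,22.4,Normal,Low,Abdominal cramps or pain
58,M,31.7,Obese,,Blood in stool
45,F,27.1,Overweight,High,Nausea or vomiting
"""


def bmi_category(bmi: float) -> str:
    """Map BMI to Normal/Overweight/Obese.

    Assumption: WHO-style cut-offs at 25 and 30. The listing names only
    three categories, so values below 25 are all treated as Normal here.
    """
    if bmi >= 30:
        return "Obese"
    if bmi >= 25:
        return "Overweight"
    return "Normal"


def summarize(rows):
    """Count missing Stress_Level values, BMI/Obesity_Status mismatches,
    and the number of rows per Disease_Class."""
    missing_stress = 0
    bmi_mismatches = 0
    class_counts = Counter()
    for row in rows:
        if not row["Stress_Level"].strip():
            missing_stress += 1
        if bmi_category(float(row["BMI"])) != row["Obesity_Status"]:
            bmi_mismatches += 1
        class_counts[row["Disease_Class"]] += 1
    return {
        "missing_stress": missing_stress,
        "bmi_mismatches": bmi_mismatches,
        "class_counts": dict(class_counts),
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = summarize(rows)
print(report)
```

To run this against the real file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle. A non-zero mismatch count would suggest the dataset uses different BMI thresholds than the ones assumed here.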
Usage
This dataset is ideal for:
- Disease Classification: Predict GI disease categories using symptoms and clinical test results.
- Feature Importance Analysis: Understand contributing factors in diagnosis.
- Pattern Mining: Detect associations among lifestyle, symptoms, and microbiome/genetic indicators.
- Model Training: Useful for supervised learning (e.g., random forest, XGBoost) or unsupervised clustering.
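Before training any of the models named above, it can help to compute a majority-class baseline: the accuracy obtained by always predicting the most frequent label. The sketch below uses only the standard library. The label lists are invented for illustration, and `majority_baseline` is a hypothetical helper name, not part of any library.

```python
from collections import Counter


def majority_baseline(train_labels, test_labels):
    """Predict the most frequent training label for every test row.

    Returns the predicted label and its accuracy on the test labels.
    """
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for label in test_labels if label == most_common_label)
    return most_common_label, hits / len(test_labels)


# Toy labels invented for illustration; real labels would come from Disease_Class.
train = [
    "Blood in stool",
    "Nausea or vomiting",
    "Blood in stool",
    "Abdominal cramps or pain",
]
test = [
    "Blood in stool",
    "Nausea or vomiting",
    "Blood in stool",
    "Unexplained weight loss",
]

label, accuracy = majority_baseline(train, test)
print(label, accuracy)
```

A classifier such as a random forest is only informative if it beats this number on held-out rows, which matters most when the Disease_Class values are unevenly distributed.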
Coverage
The data combines symptoms, lifestyle factors, inflammation markers, test outcomes, and genetic markers, which makes it suitable for both biological and behavioral models of disease. As a synthetic dataset, it is designed to reflect realistic distributions of obesity, diet, and ethnicity found in contemporary populations.
License
CC0 (Public Domain)
Who Can Use It
- Medical Researchers and GI Specialists: For testing diagnostic hypotheses and exploring symptom clusters.
- Data Scientists and ML Engineers: For building diagnostic classifiers or recommender systems.
- Educators and Students: For practical exercises in predictive modeling and health analytics.