Chronic Illness Patient Journey Data
About
This dataset originates from Flaredown, an application that helps patients with chronic autoimmune and invisible illnesses manage their symptoms by identifying triggers and evaluating how well their treatments work. Each day, users of the app record their symptom severity, their treatments and doses, and any environmental factors they encounter, such as foods, stress, or allergens. The data is highly patient-centred: users create and track their own conditions, symptoms, and treatments. Users can also add new items to the database, such as a specific kind of abdominal pain, a treatment, a tag, or a food entry, so the records reflect what is most relevant to each individual.
Columns
- user_id: A unique identifier assigned to each user.
- age: The age of the user in years. The mean age is 35.1 years, but some values at the extremes of the range are implausible and should be treated as data quality issues.
- sex: The gender of the user, primarily categorised as female (81%), male (7%), or other (12%).
- country: The country where the user resides, encompassing 164 unique locations, with the United States being the most common (59%).
- checkin_date: The specific date on which the data was tracked by the user.
- trackable_id: A unique identifier for each distinct "trackable" item, which could be a symptom, treatment, condition, or environmental factor.
- trackable_type: Specifies the classification of the trackable item, such as 'Symptom' (46%), 'Weather' (17%), 'Treatment', 'Condition', 'Food', or 'Tag'.
- trackable_name: The descriptive name of the trackable, which can be a pre-set option or a custom entry added by the user (e.g., 'humidity', 'pressure', 'Abdominal Pain'). There are over 117,000 unique names recorded.
- trackable_value: Represents the recorded value for a trackable. For symptoms and conditions, this is typically a severity rating on a scale of 0 (not active/no symptom) to 4 (extremely active/severe symptom). For treatments, it may be a string describing the dose (e.g., “3 x 5mg”).
The values of trackable_type are defined as follows:
- Condition: Refers to an illness or diagnosis, like Rheumatoid Arthritis, also rated on a 0-4 scale for activity. The Harvey Bradshaw Index (HBI) is specifically used for Crohn's disease severity, where a score of 3 or less indicates probable remission and 8-9 or higher signifies severe disease.
- Symptom: A self-explanatory bodily manifestation, rated on a 0-4 severity scale.
- Treatment: Any intervention or medication a patient uses to alleviate symptoms, optionally accompanied by a dose description.
- Tag: A textual representation of an environmental factor not occurring daily, for example, "ate dairy" or "rainy day".
- Food: Food items are derived from the USDA food database, supplemented by user-added entries.
- Weather: Automatically retrieved for the user's postal code, including parameters such as description, precipitation intensity, humidity, pressure, and daily minimum/maximum temperatures.
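Because trackable_value holds a severity rating for some trackable types and free text (such as a dose) for others, it is usually safer to read it as text and convert only the rows that are rated. The following is a minimal sketch in Python with pandas, not an official loader. The sample rows, the function names `load_checkins` and `severity_ratings`, and the exact spelling of the values are assumptions made for illustration; with the real data, the path to export.csv would be passed in place of `SAMPLE`.

```python
import io
import pandas as pd

# Hypothetical sample rows in the column layout described above.
# With the real dataset, pass the path to export.csv instead.
SAMPLE = io.StringIO(
    "user_id,age,sex,country,checkin_date,trackable_id,"
    "trackable_type,trackable_name,trackable_value\n"
    "u1,34,female,US,2017-03-01,1,Symptom,Abdominal Pain,3\n"
    "u1,34,female,US,2017-03-01,2,Treatment,Prednisone,3 x 5mg\n"
    "u1,34,female,US,2017-03-01,3,Weather,humidity,61\n"
    "u2,,male,GB,2017-03-02,4,Condition,Rheumatoid Arthritis,2\n"
    "u2,,male,GB,2017-03-02,5,Tag,ate dairy,\n"
)


def load_checkins(source):
    """Read check-in rows, keeping trackable_value as text.

    The value is a 0-4 rating for symptoms and conditions but may be
    a dose string for treatments, so it is not parsed as a number here.
    """
    return pd.read_csv(
        source,
        dtype={"user_id": str, "trackable_id": str, "trackable_value": str},
        parse_dates=["checkin_date"],
    )


def severity_ratings(df):
    """Return Symptom and Condition rows with a numeric 0-4 severity column."""
    rated = df[df["trackable_type"].isin(["Symptom", "Condition"])].copy()
    # Anything that is not a number becomes NaN and is dropped by the range filter.
    rated["severity"] = pd.to_numeric(rated["trackable_value"], errors="coerce")
    return rated[rated["severity"].between(0, 4)]


checkins = load_checkins(SAMPLE)
ratings = severity_ratings(checkins)
print(ratings[["user_id", "trackable_name", "severity"]])
```

Treatment, Tag, Food, and Weather rows stay in `checkins` with their original text values, so dose strings such as “3 x 5mg” are not lost by the numeric conversion.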
Distribution
The dataset is provided in CSV format, with the primary file, export.csv, having a size of 686.17 MB. It contains approximately 7.98 million valid records across its key columns. While most columns are complete, some, like age, sex, country, and trackable_value, have a small percentage of missing data.
Usage
This dataset is ideal for various analytical and machine learning applications. It can be utilised to:
- Determine how specific treatments impact symptoms, identifying positive, negative, or neutral effects.
- Uncover strong correlations between symptoms and treatments.
- Identify more precise subsets within existing diagnoses to better represent symptoms and predict effective treatments.
- Develop models to reliably predict individual or condition-specific flare triggers.
- Build recommendation systems for treatments, akin to Netflix-style suggestions, based on user similarities.
- Quantify a patient's level of disease activity using their reported symptoms.
- Predict which symptom intervention would yield the greatest improvement for a given illness.
- Infer a patient's condition based on their reported symptoms.
- Detect novel interactions between different treatments.
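As a starting point for the first use case in the list above, one simple descriptive check is to compare a symptom's mean severity on days when a user logged a given treatment against days when they did not. The sketch below uses Python with pandas under stated assumptions: the rows are hypothetical, the function name `severity_by_treatment_day` is invented for illustration, and the comparison is purely descriptive. It does not account for confounding, delayed effects, or dose, so it should not be read as evidence of a causal effect.

```python
import numpy as np
import pandas as pd

# Hypothetical check-ins for one user; real rows would come from export.csv.
rows = [
    ("u1", "2017-03-01", "Symptom", "Abdominal Pain", "3"),
    ("u1", "2017-03-02", "Symptom", "Abdominal Pain", "1"),
    ("u1", "2017-03-02", "Treatment", "Prednisone", "3 x 5mg"),
    ("u1", "2017-03-03", "Symptom", "Abdominal Pain", "4"),
    ("u1", "2017-03-04", "Symptom", "Abdominal Pain", "2"),
    ("u1", "2017-03-04", "Treatment", "Prednisone", "3 x 5mg"),
]
columns = ["user_id", "checkin_date", "trackable_type",
           "trackable_name", "trackable_value"]
df = pd.DataFrame(rows, columns=columns)


def severity_by_treatment_day(df, symptom, treatment):
    """Mean severity of `symptom` on user-days with and without `treatment`."""
    is_symptom = (df["trackable_type"] == "Symptom") & (df["trackable_name"] == symptom)
    sym = df[is_symptom].copy()
    sym["severity"] = pd.to_numeric(sym["trackable_value"], errors="coerce")

    # One row per (user, day) on which the treatment was logged.
    is_treatment = (df["trackable_type"] == "Treatment") & (df["trackable_name"] == treatment)
    treated_days = df.loc[is_treatment, ["user_id", "checkin_date"]].drop_duplicates()

    # A left join keeps every symptom row; the indicator column records
    # whether a matching treatment day exists.
    merged = sym.merge(treated_days, on=["user_id", "checkin_date"],
                       how="left", indicator=True)
    merged["day"] = np.where(merged["_merge"] == "both",
                             "with_treatment", "without_treatment")
    return merged.groupby("day")["severity"].mean()


result = severity_by_treatment_day(df, "Abdominal Pain", "Prednisone")
print(result)
```

Grouping by user before averaging, or comparing each user only against their own untreated days, would be a natural refinement, since users differ widely in how they rate severity.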
Coverage
The data spans a considerable period from 18 May 2012 to 6 December 2019. It includes users from 164 different countries, with a significant concentration in the United States. Demographically, the user base has a mean age of 35.1 years and is predominantly female (81%), with male and other genders making up the remainder.
License
CC BY-NC-SA 4.0
Who Can Use It
- Medical Researchers: To analyse patient-reported outcomes, understand chronic illness progression, and evaluate treatment effectiveness.
- Data Scientists & Machine Learning Engineers: For building predictive models for disease flares, treatment recommendation systems, and diagnostic tools.
- Healthcare Technology Developers: To enhance existing patient management platforms or create new digital health solutions.
- Public Health Analysts: To gain insights into chronic disease management at a population level and identify widespread environmental triggers.
- Pharmaceutical Industry: To inform drug development, post-market surveillance, and patient support programmes.
Dataset Name Suggestions
- Chronic Illness Patient Journey Data
- Flaredown Disease Management Records
- Patient-Centric Health Tracking Dataset
- Symptoms, Treatments & Triggers Data
- Real-World Chronic Condition Data
Attributes
Original Data Source: Chronic Illness Patient Journey Data