Druggie Patient Review Dataset
Health Information Systems & Technology
Free
About
This tabular dataset, collected over several years, is intended to help pharmaceutical agencies track the effectiveness and sales of their drugs. It contains key data points such as drug names, patient reviews, each drug's popularity, and its specific use cases. Its primary purpose is to support prediction of a drug's base score in various scenarios, for drug effectiveness analysis and market understanding.
Columns
- patient_id: A unique identifier for each patient.
- name_of_drug: The specific name of the pharmaceutical drug.
- use_case_for_drug: The disease or condition that the drug is intended to treat.
- review_by_patient: Detailed feedback or comments provided by the patient about the drug.
- effectiveness_rating: A numerical rating indicating how effective the drug is, typically on a scale from 1 to 10.
- drug_approved_by_UIC: The date on which the drug received approval from UIC.
- number_of_times_prescribed: The frequency or count of how many times the drug has been prescribed.
- base_score: A generated score that serves as the target variable for predictions within the dataset.
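As a rough illustration of working with these columns, the sketch below checks a CSV header against the column names listed above and converts the numeric fields. The sample rows, the date format shown in them, and the `load_rows` helper are hypothetical and not taken from the dataset; the approval date is left as a plain string because its format is not stated here.

```python
import csv
import io

# Column names as listed above.
EXPECTED_COLUMNS = [
    "patient_id",
    "name_of_drug",
    "use_case_for_drug",
    "review_by_patient",
    "effectiveness_rating",
    "drug_approved_by_UIC",
    "number_of_times_prescribed",
    "base_score",
]

# Hypothetical two-row sample; every value is invented for illustration only.
SAMPLE_CSV = """patient_id,name_of_drug,use_case_for_drug,review_by_patient,effectiveness_rating,drug_approved_by_UIC,number_of_times_prescribed,base_score
1,Levonorgestrel,Birth Control,"Worked well, few side effects.",9,2012-05-27,12,7.5
2,Etonogestrel,Birth Control,"Did not suit me.",2,2015-01-03,4,3.1
"""


def load_rows(handle):
    """Read rows, check the header, and convert the numeric columns."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        # Assumption: ratings and scores parse as floats, counts as integers.
        row["effectiveness_rating"] = float(row["effectiveness_rating"])
        row["number_of_times_prescribed"] = int(row["number_of_times_prescribed"])
        row["base_score"] = float(row["base_score"])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
```

To read the real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved.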
Distribution
The dataset is presented in a tabular format, typically provided as a CSV file. It comprises approximately 32,000 records.
Key distributions within the dataset include:
- Drug Names: Notable drugs like Levonorgestrel and Etonogestrel each account for 2% of the data, with 96% categorised as 'Other' (representing 30,813 entries).
- Use Cases: Birth Control is a use case for 18% of the data, Depression for 6%, and 76% are 'Other' (representing 24,579 entries).
- Effectiveness Rating: Ratings between 9.55 and 10.00 are the most frequent, with over 10,000 occurrences. Ratings between 1.00 and 1.45 also show significant counts, exceeding 4,200.
- Number of Times Prescribed: Most records (over 24,000) fall in the lowest range, between 0 and 38.55 prescriptions, with far fewer records at higher prescription counts.
- Base Score: The target variable has unique values up to 232,289.00.
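The summaries above can be reproduced with a few lines of code. The sketch below shows one way to compute top-category shares with an 'Other' bucket and equal-width histogram edges. The quoted rating ranges (1.00 to 1.45 and 9.55 to 10.00) are consistent with 20 equal-width bins over 1 to 10, since (10 - 1) / 20 = 0.45; the bin count of 20 is an inference from those figures, not something the listing states. The helper names and the toy column are hypothetical.

```python
from collections import Counter


def top_shares(values, k=2):
    """Percentage share of the k most common values, with the rest as 'Other'."""
    counts = Counter(values)
    total = len(values)
    shares = {name: 100 * n / total for name, n in counts.most_common(k)}
    shares["Other"] = 100 - sum(shares.values())
    return shares


def equal_width_edges(lo, hi, bins=20):
    """Edges of `bins` equal-width bins spanning [lo, hi], rounded to 2 places."""
    width = (hi - lo) / bins
    return [round(lo + i * width, 2) for i in range(bins + 1)]


# Hypothetical toy column, not taken from the dataset.
drugs = ["A"] * 5 + ["B"] * 3 + ["C", "D"]
shares = top_shares(drugs)

# Assumed 20 bins over the 1-to-10 rating scale.
edges = equal_width_edges(1, 10)
print(shares)
print(edges[:2], edges[-2:])
```

With the rating scale as input, the first and last bins come out as 1.00 to 1.45 and 9.55 to 10.00, matching the ranges quoted above.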
Usage
This dataset is ideally suited for:
- Predicting the base score of a specific drug to assess its overall performance and impact.
- Analysing drug effectiveness and patient feedback.
- Tracking drug popularity and market penetration for pharmaceutical agencies.
- Developing machine learning models for drug performance prediction.
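Before building a full model, it can help to have a simple baseline to compare against. The sketch below is one possible baseline, not a method prescribed by the dataset: it predicts `base_score` as the mean score of the same drug in the training rows and falls back to the global mean for drugs it has not seen. The function names and toy rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_group_means(rows, key="name_of_drug", target="base_score"):
    """Return (per-group mean of target, global mean of target)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[target])
    group_means = {k: mean(v) for k, v in groups.items()}
    global_mean = mean(row[target] for row in rows)
    return group_means, global_mean


def predict(row, group_means, global_mean, key="name_of_drug"):
    """Predict the group's mean, or the global mean for an unseen group."""
    return group_means.get(row[key], global_mean)


def mean_absolute_error(rows, group_means, global_mean, target="base_score"):
    """Average absolute gap between predictions and the true target."""
    return mean(abs(predict(r, group_means, global_mean) - r[target]) for r in rows)


# Hypothetical toy rows; the numbers are invented for illustration only.
train = [
    {"name_of_drug": "A", "base_score": 8.0},
    {"name_of_drug": "A", "base_score": 6.0},
    {"name_of_drug": "B", "base_score": 3.0},
]
held_out = [
    {"name_of_drug": "A", "base_score": 7.5},
    {"name_of_drug": "C", "base_score": 5.0},  # drug not seen in training
]

group_means, global_mean = fit_group_means(train)
mae = mean_absolute_error(held_out, group_means, global_mean)
print(f"baseline MAE: {mae:.3f}")
```

Any model trained on the review text or the other columns should beat this error on held-out rows to be worth its added complexity.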
Coverage
The dataset offers a global geographic scope. The time range for the data collection spans from 24th February 2008 to 12th December 2017.
License
CC0
Who Can Use It
This dataset is particularly beneficial for:
- Pharmaceutical agencies for internal drug performance tracking and strategic planning.
- Data scientists and machine learning engineers building predictive models related to drug effectiveness and patient outcomes.
- Market analysts in the healthcare sector to understand drug popularity and use case trends.
- Researchers interested in patient review analysis and drug efficacy studies.
Dataset Name Suggestions
- Pharmaceutical Drug Effectiveness Dataset
- Patient Drug Review and Performance Data
- Drug Efficacy and Prescription Trends
- Healthcare Drug Outcomes Data
- Druggie Patient Review Dataset
Attributes
Original Data Source: 💊🩺Druggie⚕️💊