Machine Learning Cannabis Strain Dataset
Data Science and Analytics
Free
About
This dataset collects cannabis strain information extracted directly from Leafly's website. It is provided to fill a gap in the literature, where properly described and structured data of this kind is scarce. The data is relevant to machine learning applications such as improving cannabis strain classification, testing traditional classifiers, and exploring whether modern computer vision approaches are beneficial.
Columns
- name: The name of the cannabis strain.
- img_url: The URL for the image of the strain.
- type: The type of cannabis, such as sativa, indica, or hybrid.
- thc_level: The percentage of THC in the strain.
- most_common_terpene: The most prevalent terpene found in the strain.
- description: A textual description of the strain.
- relaxed: The percentage indicating the relaxing effect of the strain.
- happy: The percentage indicating the happy effect of the strain.
- euphoric: The percentage indicating the euphoric effect of the strain.
- uplifted: The percentage indicating the uplifting effect of the strain.
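As a rough illustration of the column list above, the rows could be read into typed records along the following lines. This is only a sketch: the column names come from the list, but the cell formats are assumptions (percentages written as "18%" or "18", blank cells for missing values), and `Strain`, `parse_percent`, `blank_to_none` and `read_strains` are hypothetical names, not part of the dataset.

```python
import csv
from dataclasses import dataclass
from typing import List, Optional, TextIO

# Column names as given in the list above; value formats are assumed.
EXPECTED_COLUMNS = [
    "name", "img_url", "type", "thc_level", "most_common_terpene",
    "description", "relaxed", "happy", "euphoric", "uplifted",
]


def parse_percent(raw: Optional[str]) -> Optional[float]:
    """Turn '18%' or '18' into 18.0; return None for blank or unparseable cells."""
    text = (raw or "").strip().rstrip("%").strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def blank_to_none(raw: Optional[str]) -> Optional[str]:
    """Treat empty or whitespace-only cells as missing."""
    text = (raw or "").strip()
    return text or None


@dataclass
class Strain:
    name: str
    img_url: Optional[str]
    type: Optional[str]
    thc_level: Optional[float]
    most_common_terpene: Optional[str]
    description: Optional[str]
    relaxed: Optional[float]
    happy: Optional[float]
    euphoric: Optional[float]
    uplifted: Optional[float]


def read_strains(handle: TextIO) -> List[Strain]:
    """Read strain rows from an open CSV file, checking the header first."""
    reader = csv.DictReader(handle)
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return [
        Strain(
            name=(row["name"] or "").strip(),
            img_url=blank_to_none(row["img_url"]),
            type=blank_to_none(row["type"]),
            thc_level=parse_percent(row["thc_level"]),
            most_common_terpene=blank_to_none(row["most_common_terpene"]),
            description=blank_to_none(row["description"]),
            relaxed=parse_percent(row["relaxed"]),
            happy=parse_percent(row["happy"]),
            euphoric=parse_percent(row["euphoric"]),
            uplifted=parse_percent(row["uplifted"]),
        )
        for row in reader
    ]
```

With the CSV file in the working directory, it would be opened with `open("leafly_strain_data.csv", newline="", encoding="utf-8")` and passed to `read_strains`; the UTF-8 encoding is also an assumption.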
Distribution
The data is available in both .json and .csv formats. The CSV file, leafly_strain_data.csv, is 2.78 MB in size and contains meta-information for 4,762 unique cannabis strains across 10 columns.
Usage
This data is suited to machine learning projects, particularly for developing and testing algorithms related to cannabis strain classification. It can be used to analyse the relationships between strain type, THC levels, terpenes, and their reported effects. Where present, the image URLs also support computer vision tasks.
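A first look at the relationship between strain type and THC level might be sketched as below. The functions work on rows shaped like those produced by `csv.DictReader` (dictionaries of strings). The names `to_float`, `missing_share` and `mean_thc_by_type` are hypothetical, and the cell formats (percentages as "18%" or "18", blank cells for missing values) are assumptions rather than documented facts about the files.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Mapping, Optional, Sequence

Row = Mapping[str, Optional[str]]


def to_float(raw: Optional[str]) -> Optional[float]:
    """Parse '18%' or '18' as 18.0; blank or unparseable cells become None."""
    text = (raw or "").strip().rstrip("%").strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def missing_share(rows: Sequence[Row], column: str) -> float:
    """Fraction of rows whose cell in `column` is blank (0.0 for no rows)."""
    if not rows:
        return 0.0
    blank = sum(1 for row in rows if not (row.get(column) or "").strip())
    return blank / len(rows)


def mean_thc_by_type(rows: Sequence[Row]) -> Dict[str, float]:
    """Average thc_level per strain type, skipping rows with missing values."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for row in rows:
        # Lower-casing the type label is an assumed normalisation step.
        strain_type = (row.get("type") or "").strip().lower()
        thc = to_float(row.get("thc_level"))
        if strain_type and thc is not None:
            grouped[strain_type].append(thc)
    return {strain_type: mean(values) for strain_type, values in grouped.items()}
```

Checking `missing_share(rows, "thc_level")` before modelling gives a sense of how many rows a THC-based feature would actually cover; rows without a THC value are dropped from the averages rather than imputed.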
Coverage
The data is extracted from the public-facing website Leafly. There is no specific geographic or demographic scope mentioned, as it pertains to cannabis strain information available online. The data has an expected quarterly update frequency. Please note that some fields have missing values; for instance, thc_level is missing for 43% of entries and img_url is missing for 98% of entries.
License
CC0: Public Domain
Who Can Use It
- Data Scientists and Machine Learning Engineers can use this data to build classification models, perform natural language processing on the descriptions, and apply computer vision techniques to the images.
- Researchers in computer science or botany can use this data to explore patterns and relationships within cannabis strains, contributing to academic literature.
- Cannabis Industry Analysts can analyse the data to understand trends in strain characteristics, popular effects, and terpene profiles.
Dataset Name Suggestions
- Leafly Cannabis Strain Meta-Data
- Cannabis Strain Profiles from Leafly
- Leafly Strain Effects and Terpene Data
- Machine Learning Cannabis Strain Dataset
Attributes
Original Data Source: Machine Learning Cannabis Strain Dataset
