Opendatabay APP

Machine Learning Cannabis Strain Dataset

Data Science and Analytics

Tags and Keywords

Cannabis

Leafly

Strains

Terpenes

Classification

Free

About

This dataset contains cannabis strain information extracted directly from Leafly's website. It is intended to fill a gap in the literature, where properly described and structured cannabis data is scarce. The data supports machine learning applications such as improving cannabis strain classification, testing traditional classifiers, and exploring whether modern computer vision approaches are beneficial.

Columns

  • name: The name of the cannabis strain.
  • img_url: The URL for the image of the strain.
  • type: The type of cannabis, such as sativa, indica, or hybrid.
  • thc_level: The percentage of THC in the strain.
  • most_common_terpene: The most prevalent terpene found in the strain.
  • description: A textual description of the strain.
  • relaxed: The percentage indicating the relaxing effect of the strain.
  • happy: The percentage indicating the happy effect of the strain.
  • euphoric: The percentage indicating the euphoric effect of the strain.
  • uplifted: The percentage indicating the uplifting effect of the strain.
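The column list above can be mirrored as a typed record when loading the file. The following is a minimal sketch, not the publisher's loader: `Strain`, `parse_percent`, `clean_text`, and `strain_from_row` are hypothetical names, the sample row is made up, and the assumption that percentage cells may arrive either as bare numbers or with a trailing `%` sign is not confirmed by the listing.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


def parse_percent(raw: Optional[str]) -> Optional[float]:
    """Turn '18%' or '18' into 18.0; return None for blank or unparseable cells.

    Assumption: percentages may carry a trailing '%' sign.
    """
    if raw is None:
        return None
    text = raw.strip().rstrip("%").strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def clean_text(raw: Optional[str]) -> Optional[str]:
    """Strip whitespace and map empty cells to None."""
    text = (raw or "").strip()
    return text or None


@dataclass
class Strain:
    name: str
    img_url: Optional[str]
    type: Optional[str]
    thc_level: Optional[float]
    most_common_terpene: Optional[str]
    description: Optional[str]
    relaxed: Optional[float]
    happy: Optional[float]
    euphoric: Optional[float]
    uplifted: Optional[float]


def strain_from_row(row: dict) -> Strain:
    """Build a Strain from one csv.DictReader row keyed by the listed column names."""
    return Strain(
        name=clean_text(row.get("name")) or "",
        img_url=clean_text(row.get("img_url")),
        type=clean_text(row.get("type")),
        thc_level=parse_percent(row.get("thc_level")),
        most_common_terpene=clean_text(row.get("most_common_terpene")),
        description=clean_text(row.get("description")),
        relaxed=parse_percent(row.get("relaxed")),
        happy=parse_percent(row.get("happy")),
        euphoric=parse_percent(row.get("euphoric")),
        uplifted=parse_percent(row.get("uplifted")),
    )


# Tiny in-memory sample with made-up values, for illustration only.
SAMPLE_CSV = (
    "name,img_url,type,thc_level,most_common_terpene,description,"
    "relaxed,happy,euphoric,uplifted\n"
    "Example Strain,,Hybrid,18%,Myrcene,An illustrative row.,60%,55%,40%,35%\n"
)
strains = [strain_from_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

Mapping blank cells to `None` rather than `0` keeps missing values distinguishable from genuine zero percentages.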

Distribution

The data is available in both .json and .csv formats. The CSV file, leafly_strain_data.csv, is 2.78 MB in size and contains meta-information for 4,762 unique cannabis strains across 10 columns.
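The stated shape can be checked after download. The sketch below uses only the standard library; `summarize` and `EXPECTED_COLUMNS` are hypothetical names, and the column names are taken from the Columns section rather than from the file itself, so the header in the actual CSV may differ.

```python
import csv
import io

# Column names as given in the listing (assumed to match the CSV header).
EXPECTED_COLUMNS = [
    "name", "img_url", "type", "thc_level", "most_common_terpene",
    "description", "relaxed", "happy", "euphoric", "uplifted",
]


def summarize(handle):
    """Count rows, unique strain names, and expected columns absent from the header."""
    reader = csv.DictReader(handle)
    columns = list(reader.fieldnames or [])
    names = [(row.get("name") or "").strip() for row in reader]
    return {
        "rows": len(names),
        "unique_names": len(set(names)),
        "columns": columns,
        "missing_columns": [c for c in EXPECTED_COLUMNS if c not in columns],
    }


# In-memory demonstration with made-up rows.
demo = summarize(io.StringIO("name,type\nA,Hybrid\nA,Indica\nB,Sativa\n"))
```

For the real file, the same function could be called on `open("leafly_strain_data.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption. If the file matches the listing, the row count should be 4,762 and `missing_columns` should be empty.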

Usage

This data suits machine learning projects, particularly developing and testing algorithms for cannabis strain classification. It can be used to analyse the relationships between strain type, THC levels, terpenes, and their reported effects. The image URLs can also support computer vision tasks, although img_url is populated for only about 2% of entries.
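As one illustration of relating strain type to THC level, the sketch below averages `thc_level` per `type` while skipping rows where either value is missing. `to_float` and `mean_thc_by_type` are hypothetical helpers, the sample rows are made up, and the handling of a trailing `%` sign is an assumption about how the CSV stores percentages.

```python
from collections import defaultdict
from statistics import mean


def to_float(raw):
    """Parse '18%' or '18' to a float; return None for blank or invalid cells."""
    text = (raw or "").strip().rstrip("%").strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def mean_thc_by_type(rows):
    """Return {strain type (lowercased): mean THC %} over rows with both fields present."""
    groups = defaultdict(list)
    for row in rows:
        thc = to_float(row.get("thc_level"))
        strain_type = (row.get("type") or "").strip().lower()
        if thc is None or not strain_type:
            continue  # skip rows with a missing type or THC level
        groups[strain_type].append(thc)
    return {t: mean(values) for t, values in groups.items()}


# Made-up rows in the shape produced by csv.DictReader.
sample_rows = [
    {"type": "Hybrid", "thc_level": "20%"},
    {"type": "hybrid", "thc_level": "10"},
    {"type": "Indica", "thc_level": ""},
    {"type": "Sativa", "thc_level": "15%"},
]
thc_means = mean_thc_by_type(sample_rows)
```

Rows with a missing THC level are dropped rather than imputed, so each average describes only the strains that report a value.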

Coverage

The data is extracted from Leafly's public-facing website. No specific geographic or demographic scope is stated, since the data covers cannabis strain information available online. Updates are expected quarterly. Some fields have missing values: thc_level is missing for 43% of entries and img_url is missing for 98% of entries.
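The missing-value rates quoted above can be recomputed per column. This is a minimal sketch assuming that missing cells appear as empty or whitespace-only strings in the CSV, which the listing does not confirm; `missing_fraction` is a hypothetical helper and the sample rows are made up.

```python
def missing_fraction(rows, columns):
    """Return {column: fraction of rows whose cell is empty, blank, or absent}."""
    counts = {column: 0 for column in columns}
    total = 0
    for row in rows:
        total += 1
        for column in columns:
            if not (row.get(column) or "").strip():
                counts[column] += 1
    if total == 0:
        return {column: 0.0 for column in columns}
    return {column: counts[column] / total for column in columns}


# Made-up rows in the shape produced by csv.DictReader.
sample_rows = [
    {"thc_level": "18%", "img_url": ""},
    {"thc_level": "", "img_url": ""},
    {"thc_level": "20%", "img_url": None},
    {"thc_level": " ", "img_url": "https://example.com/a.png"},
]
fractions = missing_fraction(sample_rows, ["thc_level", "img_url"])
```

If the file uses a placeholder such as `NaN` or `N/A` for missing cells instead, the emptiness check would need to treat those tokens as missing as well.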

License

CC0: Public Domain

Who Can Use It

  • Data Scientists and Machine Learning Engineers can use this data to build classification models, perform natural language processing on the descriptions, and apply computer vision techniques to the images.
  • Researchers in computer science or botany can use this data to explore patterns and relationships within cannabis strains, contributing to academic literature.
  • Cannabis Industry Analysts can analyse the data to understand trends in strain characteristics, popular effects, and terpene profiles.

Dataset Name Suggestions

  • Leafly Cannabis Strain Meta-Data
  • Cannabis Strain Profiles from Leafly
  • Leafly Strain Effects and Terpene Data
  • Machine Learning Cannabis Strain Dataset

Attributes

Listing Stats

VIEWS

42

DOWNLOADS

0

LISTED

17/09/2025

REGION

GLOBAL

UDQS QUALITY (Universal Data Quality Score)

5 / 5

VERSION

1.0
