Simulated Mushroom Edibility Prediction Data

Synthetic Data Generation

Tags and Keywords

Mushroom

Edibility

Classification

Fungi

Simulation

Simulated Mushroom Edibility Prediction Data Dataset on Opendatabay data marketplace

Free

About

This resource contains a simulation designed specifically for binary classification problems involving fungi. It includes 61,069 hypothetical records, generated to categorise simulated mushrooms as either definitely edible or definitely poisonous/not recommended. The data was created using a Python module that applies randomization to both nominal and metrical variables, expanding upon features derived from a smaller, primary mushroom dataset. This simulation is useful for training and testing machine learning models focused on predicting edibility.
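The randomization idea described above can be illustrated with a minimal sketch. It is not the module that produced this dataset: the species entry, the feature values, the `simulate` helper, and the choice of a uniform draw for metrical variables are all assumptions made for illustration.

```python
import random

# Hypothetical primary record: nominal features list their allowed values,
# metrical features give a (min, max) range. All names and values here are
# illustrative and are not taken from the actual primary dataset.
PRIMARY_SPECIES = {
    "name": "Example Mushroom",
    "class": "p",
    "nominal": {"cap-shape": ["x", "f"], "habitat": ["d", "g"]},
    "metrical": {"cap-diameter": (10.0, 20.0), "stem-height": (15.0, 20.0)},
}


def simulate(species, n, rng):
    """Generate n simulated records from one primary species entry."""
    records = []
    for _ in range(n):
        row = {"name": species["name"], "class": species["class"]}
        # Nominal variables: pick one of the allowed values at random.
        for feature, values in species["nominal"].items():
            row[feature] = rng.choice(values)
        # Metrical variables: draw uniformly from the given range
        # (an assumption; the real generator may use another distribution).
        for feature, (low, high) in species["metrical"].items():
            row[feature] = round(rng.uniform(low, high), 2)
        records.append(row)
    return records


rows = simulate(PRIMARY_SPECIES, 353, random.Random(0))
print(len(rows))  # 353
```

Passing a seeded `random.Random` instance keeps the sketch reproducible without touching the global random state.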

Columns

The dataset features 23 distinct attributes detailing the physical characteristics of the simulated fungi. These columns cover essential morphological traits used in classification:
  • family: The taxonomic family of the mushroom.
  • name: The specific name of the fungus.
  • class: The target variable, indicating edibility (edible or poisonous).
  • cap-diameter, cap-shape, cap-surface, cap-color: Metrics and descriptions related to the mushroom's cap structure.
  • does-bruise-or-bleed: Indicates whether the mushroom bruises or bleeds.
  • gill-attachment, gill-spacing, gill-color: Variables describing the gills.
  • stem-height, stem-width, stem-root, stem-surface, stem-color: Detailed attributes of the stem.
  • veil-type, veil-color, has-ring, ring-type, spore-print-color, habitat, season: Other key features often used in fungal identification.
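A quick way to confirm that a downloaded file matches the 23 columns listed above is to compare its header against the expected names. The sketch below assumes the header uses the names exactly as listed here (compared case-insensitively); `check_header` is a hypothetical helper, not part of any published tooling for this dataset.

```python
# Column names as listed in this description (23 in total).
EXPECTED_COLUMNS = [
    "family", "name", "class",
    "cap-diameter", "cap-shape", "cap-surface", "cap-color",
    "does-bruise-or-bleed",
    "gill-attachment", "gill-spacing", "gill-color",
    "stem-height", "stem-width", "stem-root", "stem-surface", "stem-color",
    "veil-type", "veil-color", "has-ring", "ring-type",
    "spore-print-color", "habitat", "season",
]


def check_header(header):
    """Return (missing, extra) column names, compared case-insensitively."""
    seen = {name.strip().lower() for name in header}
    expected = set(EXPECTED_COLUMNS)
    missing = sorted(expected - seen)
    extra = sorted(seen - expected)
    return missing, extra


# Example: a header that lacks the "season" column.
missing, extra = check_header(EXPECTED_COLUMNS[:-1])
print(missing, extra)  # ['season'] []
```

In practice the header would come from the first row of the CSV file; the delimiter used by the actual files is not stated here, so it should be checked before parsing.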

Distribution

This collection consists of 61,069 records of simulated mushrooms, generated from 173 species with 353 simulated mushrooms per species (173 × 353 = 61,069). The files are provided in CSV format in two versions: one ordered by species (secondary_data_generated.csv) and one randomly shuffled (secondary_data_shuffled.csv). The dataset is static, and no updates are expected.
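The record count and the relationship between the two file versions can be checked with a short sketch. The placeholder records, the seed, and the use of `random.Random.shuffle` are assumptions; how the shuffled file was actually produced is not stated.

```python
import random

N_SPECIES = 173
PER_SPECIES = 353

# Total number of simulated records implied by the description.
total = N_SPECIES * PER_SPECIES
print(total)  # 61069

# Placeholder for the species-ordered file: one species index per record.
ordered = [s for s in range(N_SPECIES) for _ in range(PER_SPECIES)]

# A reproducible shuffle of the same records (the seed is arbitrary).
shuffled = ordered[:]
random.Random(42).shuffle(shuffled)

# Both versions contain the same records, only in a different order.
print(sorted(shuffled) == ordered)  # True
```

When a model is evaluated with a simple head/tail split, the shuffled version is the safer choice, because a split of the species-ordered file would place entire species in only one partition.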

Usage

Ideal applications for this data include:
  • Developing and evaluating machine learning algorithms for binary classification, particularly for toxicity prediction.
  • Studying the outcomes of using randomized nominal and metrical variables in data science projects.
  • Educational training on data generation techniques and the process of expanding primary datasets.
  • Exploratory data analysis focused on correlations between fungal morphology and edibility.
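As a starting point for the binary classification use case, the sketch below trains a one-feature baseline (often called OneR) in plain Python. The toy rows, the invented link between has-ring and class, and the helper names are all assumptions for illustration; they say nothing about real mushrooms or about this dataset's actual contents.

```python
import random
from collections import Counter, defaultdict


def train_one_rule(rows, feature, target="class"):
    """Map each value of `feature` to its most common class in `rows`."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[feature]][row[target]] += 1
    rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    # Fallback for feature values never seen during training.
    default = Counter(row[target] for row in rows).most_common(1)[0][0]
    return rule, default


def accuracy(rows, feature, rule, default, target="class"):
    """Fraction of rows whose class is predicted correctly by the rule."""
    hits = sum(rule.get(row[feature], default) == row[target] for row in rows)
    return hits / len(rows)


# Toy stand-in data with an invented, noisy relation between the feature
# and the label (every tenth row has its label flipped).
rows = []
for i in range(200):
    ring = "t" if i % 2 else "f"
    label = "p" if ring == "t" else "e"
    if i % 10 == 0:
        label = "p" if label == "e" else "e"
    rows.append({"has-ring": ring, "class": label})

random.Random(0).shuffle(rows)
train, test = rows[:150], rows[150:]

rule, default = train_one_rule(train, "has-ring")
print(rule, accuracy(test, "has-ring", rule, default))
```

A baseline like this gives a reference score that a more capable model trained on all 23 attributes should beat; on real data the rows would be read from one of the CSV files instead of being generated inline.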

Coverage

The scope of this dataset is purely hypothetical, focusing on simulated characteristics derived from 173 real mushroom species. Since the data is artificially generated via randomization, it does not possess real-world geographical or temporal restrictions. It encompasses the structural variety necessary for a robust classification exercise.

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

  • Machine Learning Practitioners: For training robust classification models without needing large, verified real-world samples.
  • Students and Educators: For practical demonstrations of data simulation, feature randomization, and classification project workflow.
  • Researchers: To explore how structural attributes correlate with edibility in a controlled, hypothetical environment.

Dataset Name Suggestions

  • Simulated Mushroom Edibility Prediction Data
  • Fungal Classification Data (Secondary Simulation)
  • Hypothetical Fungi Edibility Schema

Listing Stats

  • Views: 4
  • Downloads: 2
  • Listed: 26/11/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
