Opendatabay APP

Large-Scale Fungal Morphological and Toxicity Archive

Synthetic Biology & Genetic Engineering

Tags and Keywords

Mushrooms

Classification

Biology

Toxicity

Prediction

Free

About

Deciding whether a mushroom is toxic or safe to eat is a practical problem that links biological study with predictive analytics. This collection records fungal characteristics at a scale suited to large binary classification tasks. Morphological details such as cap shape, gill attachment, and stem surface give a structured basis for identifying patterns that separate poisonous specimens from edible ones. The resource is particularly useful for researchers who want to test machine learning algorithms on tabular data that retains real-world artefacts, including missing values and uncleaned variables.

Columns

  • id: A unique numerical identifier assigned to each specimen in the archive.
  • class: The target variable indicating the mushroom's toxicity, categorised as poisonous (p) or edible (e).
  • cap-diameter: The physical width of the mushroom's cap.
  • cap-shape: The structural form of the cap, including bell (b), conical (c), convex (x), flat (f), knobbed (k), sunken (s), and oval (o).
  • cap-surface: The texture of the cap, described as fibrous (f), grooves (g), scaly (y), smooth (s), or silky (l).
  • cap-color: The pigmentation of the cap, featuring codes for brown (n), buff (b), cinnamon (c), grey (g), green (r), pink (p), purple (u), red (e), white (w), yellow (y), and black (k).
  • does-bruise-or-bleed: A boolean indicator (true/false) showing if the mushroom changes colour or exudes liquid when handled.
  • gill-attachment: The method by which the gills connect to the stem, such as attached (a), descending (d), free (f), or notched (n).
  • gill-spacing: The density of the gill structure, categorised as close (c), crowded (w), or distant (d).
  • gill-color: The specific colour of the gills using standardised letter codes.
  • stem-root: The base structure of the stem, ranging from bulbous (b) and club (c) to rooted (r) and fibrous (f).
  • stem-surface: The external texture of the mushroom stem.
  • stem-color: The observed colour of the stem.
  • veil-type: The classification of the mushroom's veil, such as partial (p) or universal (u).
  • veil-color: The pigmentation of the veil.
  • has-ring: A boolean value indicating the presence of a ring on the stem.
  • ring-type: The specific shape of the ring, including types like pendant (p), evanescent (e), and flaring (f).
  • spore-print-color: The colour of the powder produced by the mushroom's spores.
  • habitat: The natural environment where the specimen was found, such as woods (d), grasses (g), or urban areas (u).
  • season: The time of year of the observation, covering spring (s), summer (u), autumn (a), and winter (w).
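
The letter codes listed above can be turned into readable labels with a small lookup table. The following is a minimal sketch using only the Python standard library. The CODES table copies a subset of the codes from the column descriptions; the decode_row helper and the two-row SAMPLE are hypothetical illustrations, and treating an empty cell as a missing value is an assumption about how the file encodes gaps.

```python
import csv
import io

# Code tables copied from the column descriptions; only a subset is shown.
CODES = {
    "class": {"p": "poisonous", "e": "edible"},
    "cap-shape": {
        "b": "bell", "c": "conical", "x": "convex", "f": "flat",
        "k": "knobbed", "s": "sunken", "o": "oval",
    },
    "gill-spacing": {"c": "close", "w": "crowded", "d": "distant"},
    "season": {"s": "spring", "u": "summer", "a": "autumn", "w": "winter"},
}


def decode_row(row):
    """Return a copy of row with known letter codes replaced by labels.

    Empty strings become None (assumed to mean "missing"); values with no
    entry in CODES are kept unchanged.
    """
    decoded = {}
    for column, value in row.items():
        if value == "":
            decoded[column] = None
        else:
            decoded[column] = CODES.get(column, {}).get(value, value)
    return decoded


# Hypothetical two-row sample standing in for poisonous_mushrooms.csv.
SAMPLE = "id,class,cap-shape,gill-spacing,season\n0,e,f,c,a\n1,p,x,,w\n"

rows = [decode_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for row in rows:
    print(row)
```

Keeping unknown codes unchanged, rather than raising an error, is a deliberate choice here: the listing describes the data as uncleaned, so values outside the documented code tables may occur.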

Distribution

The data is delivered as a single CSV file, poisonous_mushrooms.csv, with a total size of 168.27 MB. It contains approximately 3.12 million valid records across 22 columns, 20 of which are described in the Columns section above. Core variables such as class and id are 100% valid, whereas other fields, including cap-surface and gill-spacing, contain missing entries because the source competition data was left uncleaned. This is a static archive, and no future updates are expected.
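
Because the file is large and some columns are incomplete, it can help to measure the share of missing cells per column without loading everything into memory. The sketch below streams rows with the standard csv module. The missing_rates function and the four-row SAMPLE are hypothetical, and counting empty cells as missing is an assumption about the file's encoding.

```python
import csv
import io
from collections import Counter


def missing_rates(lines):
    """Stream CSV lines and return {column: fraction of empty cells}."""
    reader = csv.DictReader(lines)
    missing = Counter()
    total = 0
    for row in reader:
        total += 1
        for column, value in row.items():
            if column is None:
                # Extra cells beyond the header are ignored.
                continue
            if value is None or value.strip() == "":
                missing[column] += 1
    if total == 0:
        return {}
    return {column: missing[column] / total for column in reader.fieldnames}


# Hypothetical sample; the real file would be opened with
# open("poisonous_mushrooms.csv", newline="") and passed in the same way.
SAMPLE = (
    "id,class,cap-surface,gill-spacing\n"
    "0,e,s,c\n"
    "1,p,,c\n"
    "2,p,,\n"
    "3,e,y,c\n"
)

rates = missing_rates(io.StringIO(SAMPLE))
for column, rate in rates.items():
    print(f"{column}: {rate:.0%} missing")
```

Reading row by row keeps memory use flat regardless of file size, at the cost of a single full pass over the data.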

Usage

This resource is well suited to developing and benchmarking binary prediction models. It also supports feature engineering tasks in which analysts determine which physical traits correlate most strongly with toxicity. In addition, researchers can use these records to study how specimens are distributed across habitats and seasons, or to practise data cleaning techniques on a large-scale tabular dataset.
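
As a starting point for benchmarking, a single-feature baseline shows how much one trait alone says about the class: for each value of a categorical column, predict the class seen most often with that value. This is a minimal sketch of that idea, not a recommended model. The function names are hypothetical, and the seven toy rows are made up for illustration and say nothing about real toxicity patterns.

```python
from collections import Counter, defaultdict


def fit_one_feature(rows, feature, target="class"):
    """Learn the majority class for each value of one categorical feature."""
    counts = defaultdict(Counter)
    overall = Counter()
    for row in rows:
        counts[row[feature]][row[target]] += 1
        overall[row[target]] += 1
    # Fallback for feature values never seen during fitting.
    default = overall.most_common(1)[0][0]
    table = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    return table, default


def predict(table, default, value):
    """Return the learned class for value, or the overall majority class."""
    return table.get(value, default)


def accuracy(rows, feature, table, default, target="class"):
    """Fraction of rows whose predicted class matches the recorded class."""
    hits = sum(predict(table, default, r[feature]) == r[target] for r in rows)
    return hits / len(rows)


# Made-up toy rows, for illustration only.
TOY = [
    {"class": "p", "gill-attachment": "a"},
    {"class": "p", "gill-attachment": "a"},
    {"class": "e", "gill-attachment": "a"},
    {"class": "e", "gill-attachment": "f"},
    {"class": "e", "gill-attachment": "f"},
    {"class": "p", "gill-attachment": "d"},
    {"class": "p", "gill-attachment": "d"},
]

table, default = fit_one_feature(TOY, "gill-attachment")
score = accuracy(TOY, "gill-attachment", table, default)
print(table, default, round(score, 3))
```

Running the same procedure for each column in turn gives a rough ranking of individual traits. On real data the score should be computed on held-out rows rather than on the rows used for fitting, as is done here for brevity.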

Coverage

The scope of this collection is biological, focusing on the morphological diversity of mushrooms. Temporally, observations span all four seasons (spring, summer, autumn, and winter). All records describe fungal specimens and their physical characteristics, and the sources mention no specific geographic restriction.

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

Data scientists can use these records to train classification models and explore the behaviour of large-scale tabular data. Biologists and mycology students can use the descriptive features to look for common markers of poisonous fungi. Participants in machine learning competitions can use the collection as a primary source for testing new algorithms against a dataset of more than 3 million entries.

Dataset Name Suggestions

  • Binary Prediction Dataset for Poisonous Mushrooms
  • Large-Scale Fungal Morphological and Toxicity Archive
  • 3-Million Record Mushroom Classification Registry
  • Biological Attributes and Edibility Metrics of Fungi
  • Mushroom Feature and Poisonous Class Database

Attributes

Listing Stats

  • Views: 1
  • Downloads: 1
  • Listed: 31/12/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
