Large-Scale Fungal Morphological and Toxicity Archive
About
Determining whether a mushroom is toxic or safe to eat is a critical task that connects biological study with predictive analytics. The records in this collection describe fungal characteristics at a scale suited to large binary classification: roughly 3.12 million specimens. By documenting morphological details such as cap shape, gill attachment, and stem surface, the data offers a quantitative foundation for identifying patterns that separate poisonous specimens from edible ones. The resource is particularly useful for researchers who want to test machine learning algorithms on tabular data that retains real-world artifacts and uncleaned variables.
Columns
- id: A unique numerical identifier assigned to each specimen in the archive.
- class: The target variable indicating the mushroom's toxicity, categorised as poisonous (p) or edible (e).
- cap-diameter: The physical width of the mushroom's cap.
- cap-shape: The structural form of the cap, including bell (b), conical (c), convex (x), flat (f), knobbed (k), sunken (s), and oval (o).
- cap-surface: The texture of the cap, described as fibrous (f), grooved (g), scaly (y), smooth (s), or silky (l).
- cap-color: The pigmentation of the cap, featuring codes for brown (n), buff (b), cinnamon (c), grey (g), green (r), pink (p), purple (u), red (e), white (w), yellow (y), and black (k).
- does-bruise-or-bleed: A boolean indicator (true/false) showing if the mushroom changes colour or exudes liquid when handled.
- gill-attachment: The method by which the gills connect to the stem, such as attached (a), descending (d), free (f), or notched (n).
- gill-spacing: The density of the gill structure, categorised as close (c), crowded (w), or distant (d).
- gill-color: The specific colour of the gills using standardised letter codes.
- stem-root: The base structure of the stem, ranging from bulbous (b) and club (c) to rooted (r) and fibrous (f).
- stem-surface: The external texture of the mushroom stem.
- stem-color: The observed colour of the stem.
- veil-type: The classification of the mushroom's veil, such as partial (p) or universal (u).
- veil-color: The pigmentation of the veil.
- has-ring: A boolean value indicating the presence of a ring on the stem.
- ring-type: The specific shape of the ring, including types like pendant (p), evanescent (e), and flaring (f).
- spore-print-color: The colour of the powder produced by the mushroom's spores.
- habitat: The natural environment where the specimen was found, such as woods (d), grasses (g), or urban areas (u).
- season: The time of year of the observation, covering spring (s), summer (u), autumn (a), and winter (w).
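The letter codes listed above can be expanded into readable labels before analysis. The following is a minimal sketch using only the Python standard library. The names `CODE_LABELS` and `decode_row` are hypothetical, and only a few columns are mapped here for illustration; codes not listed in the descriptions are passed through unchanged rather than guessed.

```python
# Code-to-label maps copied from the column descriptions above.
# Only a subset of columns is included in this sketch.
CODE_LABELS = {
    "class": {"p": "poisonous", "e": "edible"},
    "cap-shape": {
        "b": "bell", "c": "conical", "x": "convex", "f": "flat",
        "k": "knobbed", "s": "sunken", "o": "oval",
    },
    "gill-spacing": {"c": "close", "w": "crowded", "d": "distant"},
    "season": {"s": "spring", "u": "summer", "a": "autumn", "w": "winter"},
}


def decode_row(row):
    """Return a copy of row with known codes replaced by readable labels.

    Unmapped columns, empty values, and unlisted codes pass through unchanged.
    """
    decoded = {}
    for column, value in row.items():
        labels = CODE_LABELS.get(column, {})
        decoded[column] = labels.get(value, value)
    return decoded


# A made-up record for illustration; it is not taken from the archive.
sample = {"id": "0", "class": "p", "cap-shape": "x", "gill-spacing": "", "season": "a"}
decoded = decode_row(sample)
print(decoded)
```

Leaving unknown codes untouched keeps uncleaned values visible, so they can be inspected later instead of being silently relabelled.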
Distribution
The data is delivered in a CSV file titled poisonous_mushrooms.csv with a total size of 168.27 MB. It contains approximately 3.12 million valid records structured across 22 columns, of which 20 are described in the Columns section. The core variables class and id show 100% validity, while other fields such as cap-surface and gill-spacing contain missing entries, reflecting the uncleaned nature of the competition artifacts. This resource is a static archive with no future updates expected.
Usage
This resource is ideal for developing and benchmarking binary prediction models within the field of machine learning. It is well-suited for feature engineering tasks where analysts must determine which physical traits most strongly correlate with toxicity. Additionally, researchers can use these records to study the distribution of fungal species across different habitats and seasons, or to practice data cleaning techniques on a large-scale tabular dataset.
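A common first data cleaning step is to measure how much of each column is empty. The sketch below streams rows with the standard `csv` module, so the full file never has to fit in memory. The function name `missing_rates` is hypothetical, the in-memory sample is made up for illustration, and the sketch assumes missing entries appear as empty cells.

```python
import csv
import io
from collections import Counter


def missing_rates(lines):
    """Stream CSV rows and return {column: fraction of empty cells}."""
    reader = csv.DictReader(lines)
    missing = Counter()
    total = 0
    for row in reader:
        total += 1
        for column, value in row.items():
            if column is None:
                continue  # surplus cells beyond the header are ignored
            if value is None or value.strip() == "":
                missing[column] += 1
    if total == 0:
        return {}
    return {column: missing[column] / total for column in reader.fieldnames}


# Tiny stand-in for the real file; these values are invented.
sample = io.StringIO(
    "id,class,cap-surface,gill-spacing\n"
    "0,e,s,c\n"
    "1,p,,c\n"
    "2,e,y,\n"
    "3,p,,\n"
)
rates = missing_rates(sample)
for column, rate in rates.items():
    print(f"{column}: {rate:.0%} missing")
```

To run the same audit on the archive itself, pass an open file handle instead of the sample, for example `open("poisonous_mushrooms.csv", newline="")`.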
Coverage
The scope of this collection is biological, focusing on the morphological diversity of mushrooms. Temporally, observations span all four seasons (spring, summer, autumn, and winter). The records cover fungal specimens only, offering a broad view of various species and their physical characteristics; the sources state no geographic restriction.
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
Data scientists can leverage these records to train high-accuracy classification models and explore the nuances of large-scale tabular data. Biologists and mycology students may utilise the descriptive features to identify common markers of poisonous fungi. Furthermore, participants in machine learning competitions can find this a valuable primary source for testing new algorithms against a dataset with over 3 million entries.
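One simple way to look for markers of poisonous fungi is to compute, for each value of a categorical feature, the share of records labelled poisonous. The sketch below shows the idea with the standard library. The function name `poisonous_share` and the sample rows are hypothetical, and the result is a descriptive association within the data, not a rule for identifying mushrooms.

```python
import csv
import io
from collections import defaultdict


def poisonous_share(lines, feature):
    """Return {feature value: (row count, share of rows with class == 'p')}."""
    counts = defaultdict(lambda: [0, 0])  # value -> [rows, poisonous rows]
    for row in csv.DictReader(lines):
        value = row.get(feature) or "(missing)"  # group empty cells together
        counts[value][0] += 1
        if row.get("class") == "p":
            counts[value][1] += 1
    return {value: (n, p / n) for value, (n, p) in counts.items()}


# Invented rows for illustration only.
sample = io.StringIO(
    "id,class,habitat\n"
    "0,p,d\n"
    "1,e,d\n"
    "2,p,g\n"
    "3,p,g\n"
    "4,e,\n"
)
shares = poisonous_share(sample, "habitat")
for value, (n, share) in sorted(shares.items()):
    print(value, n, f"{share:.2f}")
```

Reporting the row count alongside each share makes it easier to discount values that occur only a handful of times.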
Dataset Name Suggestions
- Binary Prediction Dataset for Poisonous Mushrooms
- Large-Scale Fungal Morphological and Toxicity Archive
- 3-Million Record Mushroom Classification Registry
- Biological Attributes and Edibility Metrics of Fungi
- Mushroom Feature and Poisonous Class Database
Attributes
Original Data Source: Large-Scale Fungal Morphological and Toxicity Archive