BirdCLEF Unified Training Metadata 2021-2023
Data Science and Analytics
Free
About
This unified training metadata aggregates labels from the Kaggle BirdCLEF 2021, 2022, and 2023 competitions into a single resource for avian bioacoustics research. It contains cleaned training labels and file paths for developing machine learning models that identify bird species by sound. To reduce redundancy, 6,686 duplicates that overlapped between competitions were excluded, and ambiguous samples found in more than one class folder were removed.
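The cleaning described above can be illustrated with a minimal pandas sketch. This is not the procedure used to build the dataset, which the listing does not document. It assumes that each competition year is available as a DataFrame with `primary_label` and `filename` columns, that a duplicate is a repeated `filename`, and that an ambiguous sample is a `filename` listed under more than one `primary_label`. The function name `merge_and_clean` is hypothetical.

```python
import pandas as pd


def merge_and_clean(frames):
    """Concatenate per-year metadata, drop ambiguous files, then drop duplicates.

    Assumption: `filename` identifies a recording across competition years.
    """
    combined = pd.concat(frames, ignore_index=True)

    # Count how many distinct primary labels each filename appears under.
    labels_per_file = combined.groupby("filename")["primary_label"].transform("nunique")

    # Keep only files that map to exactly one primary label.
    unambiguous = combined[labels_per_file == 1]

    # Collapse recordings repeated across years to a single row.
    return unambiguous.drop_duplicates(subset="filename", keep="first").reset_index(drop=True)
```

Removing ambiguous files before deduplicating matters here: deduplicating first would keep one arbitrary label for a file that sits in two class folders.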
Columns
- primary_label: The code representing the primary bird species identified in the recording (e.g., 'houspa').
- secondary_labels: A list of codes for any background bird species audible in the recording.
- type: The classification of the sound, such as 'song' or 'call'.
- filename: The unique identifier for the audio file (e.g., XC316684.ogg).
- filepath: The relative location path to the audio file within the directory structure.
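As a rough sketch of how the five columns above might be loaded, the snippet below reads the CSV with pandas and converts `secondary_labels` into Python lists. It assumes that the list is stored as a stringified Python list (for example `"['amerob']"`), which the listing does not confirm. The helper names `parse_label_list` and `load_metadata` are hypothetical.

```python
import ast

import pandas as pd

EXPECTED_COLUMNS = ["primary_label", "secondary_labels", "type", "filename", "filepath"]


def parse_label_list(value):
    """Convert a stringified list such as "['amerob']" into a Python list."""
    if isinstance(value, list):
        return value
    if not isinstance(value, str) or not value.strip():
        return []  # missing or empty cell
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return [value]  # plain text rather than a list literal
    if isinstance(parsed, (list, tuple)):
        return list(parsed)
    return [str(parsed)]


def load_metadata(source):
    """Read the metadata CSV from a path or file-like object and check its columns."""
    df = pd.read_csv(source)
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    df["secondary_labels"] = df["secondary_labels"].apply(parse_label_list)
    return df
```

A call such as `load_metadata("train_21_22_23.csv")` would apply this to the distributed file, assuming it sits in the working directory.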
Distribution
- Format: CSV (train_21_22_23.csv)
- Size: 9.64 MB
- Rows: Approximately 87,900 records
- Structure: 5 columns
Usage
Ideal for training and testing audio classification models, specifically in the domain of ornithology and wildlife monitoring. The dataset supports applications in:
- Bioacoustics research
- Automated bird identification systems
- Educational tools for biology
- Environmental monitoring analysis
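For the training and testing use case, one possible starting point is a split that keeps the species balance similar in both parts. The sketch below is pandas-only and assumes a DataFrame with a `primary_label` column. The function name `stratified_split`, the 20% validation fraction, and the seed are illustrative choices, not part of the dataset.

```python
import pandas as pd


def stratified_split(df, label_col="primary_label", val_fraction=0.2, seed=0):
    """Split rows into train and validation sets, sampling within each label."""
    # A fresh unique index lets the sampled rows be dropped by index afterwards.
    df = df.reset_index(drop=True)

    # Sample the same fraction from every label group.
    val = df.groupby(label_col).sample(frac=val_fraction, random_state=seed)

    # Everything that was not sampled stays in the training set.
    train = df.drop(val.index)
    return train, val
```

Species with very few recordings may contribute no rows to the validation set under this scheme, so they remain available for training only.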
Coverage
- Geographic/Taxonomic Scope: Covers 768 unique primary bird species labels.
- Time Range: Aggregates data from the 2021, 2022, and 2023 competition cycles.
- Data Notes: 'houspa' is the most common primary label, at about 1% of records. In the 'type' field, 31% of records are songs and 24% are calls.
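The coverage figures above can be checked against a loaded copy of the metadata with a short summary like the following. It is a sketch that assumes a non-empty DataFrame with `primary_label` and `type` columns. The function name `summarize_labels` is hypothetical.

```python
import pandas as pd


def summarize_labels(df):
    """Return the species count, the most common label, and the type shares."""
    # value_counts sorts in descending order, so the first entry is the top label.
    primary_share = df["primary_label"].value_counts(normalize=True)

    # Keep missing values so the shares reflect all rows, not just labeled ones.
    type_share = df["type"].value_counts(normalize=True, dropna=False)

    return {
        "n_species": int(df["primary_label"].nunique()),
        "top_label": primary_share.index[0],
        "top_label_share": float(primary_share.iloc[0]),
        "type_share": type_share,
    }
```

On the full file, `n_species` would be expected to match the 768 labels listed above if the columns are as described.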
License
CC BY-NC-SA 4.0
Who Can Use It
- Data Scientists and Machine Learning Engineers
- Ornithologists and Biologists
- Conservationists
- Audio Signal Processing Researchers
- Educators in Life Sciences
Dataset Name Suggestions
- BirdCLEF Unified Training Metadata 2021-2023
- Consolidated Avian Bioacoustics Labels
- Kaggle BirdCLEF 3-Year Aggregate Metadata
- Cleaned Bird Sound Classification Index
Attributes
Original Data Source: BirdCLEF Unified Training Metadata 2021-2023