Four Emotion Speech Recognition Dataset
Education & Learning Analytics
Free
About
This is a labeled audio dataset of short texts read aloud by multiple speakers, each expressing one of four emotions: euphoria, joy, sadness, and surprise. Each audio clip captures the tone, intonation, and nuances of human speech. The data is pre-organized and labeled by the expressed emotion, which makes it useful for academic research, for developing automated emotion detection systems, and for improving conversational analytics technologies.
Columns
The supporting data file, typically in CSV format, includes specific details about the audio samples and speakers:
- set_id: Provides a unique identifier and link to the corresponding set of audio files. There are 20 unique values in this column.
- text: The specific text that was spoken in the audio set. There are 16 unique phrases recorded.
- gender: The gender of the person recording the audio. The majority are FEMALE (55%), with the remainder being MALE (45%).
- age: The age of the person. Ages range from 18 to 42, with a mean age of 25.3.
- country: The country of origin for the speaker. This column has 8 unique countries listed, with 'KE' being the most frequently represented.
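The column descriptions above can be checked against the metadata file with a few lines of standard-library Python. This is a minimal sketch, not an official loader: the helper names `load_metadata` and `summarize_metadata` are hypothetical, and the code assumes the CSV header uses exactly the five column names listed above and that `age` holds whole numbers.

```python
import csv
from collections import Counter
from statistics import mean

# Column names as listed in this description (assumed to match the CSV header).
EXPECTED_COLUMNS = ["set_id", "text", "gender", "age", "country"]


def load_metadata(path):
    """Read the metadata CSV into a list of dicts, one per record."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def summarize_metadata(rows):
    """Compute the per-column statistics quoted in the Columns section."""
    rows = list(rows)
    if not rows:
        raise ValueError("no metadata rows")
    missing = [c for c in EXPECTED_COLUMNS if c not in rows[0]]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    ages = [int(r["age"]) for r in rows]  # assumes integer ages
    genders = Counter(r["gender"] for r in rows)
    countries = Counter(r["country"] for r in rows)
    return {
        "records": len(rows),
        "unique_set_ids": len({r["set_id"] for r in rows}),
        "unique_texts": len({r["text"] for r in rows}),
        "gender_share": {g: n / len(rows) for g, n in genders.items()},
        "age_min": min(ages),
        "age_max": max(ages),
        "age_mean": mean(ages),
        "top_country": countries.most_common(1)[0][0],
    }
```

Calling `summarize_metadata(load_metadata("speech_emotions.csv"))` on the downloaded file should make it easy to compare the record count, unique values, gender split, and age range with the figures given above.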
Distribution
The dataset is structured with audio files contained in folders corresponding to individual speakers, along with a supplementary metadata CSV file (speech_emotions.csv). The metadata file is small, approximately 2.35 kB, and contains a total of 20 records. All five columns within the metadata file are 100% valid with no missing values.
Usage
This data is suitable for numerous advanced applications, including:
- Training machine learning models for automatic emotion detection.
- Developing applications for emotional speech synthesis.
- Enhancing functionality in voice assistants and conversational AI tools.
- Implementing sentiment analysis in customer service platforms.
- Aiding research in mental health analysis and computational paralinguistics.
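As a starting point for the first use case above, labeled training examples could be gathered from the speaker folders and split so that no speaker appears in both training and test data. The sketch below rests on assumptions that this description does not confirm: that each speaker folder is named after its `set_id`, and that each audio file is named after the emotion it expresses (for example `joy.wav`). The function names are hypothetical, and the layout should be adjusted to match the actual download.

```python
from pathlib import Path

# The four emotion labels named in this description.
EMOTIONS = ("euphoria", "joy", "sadness", "surprise")
# Audio extensions to accept (an assumption; the listing does not state the format).
AUDIO_SUFFIXES = {".wav", ".mp3", ".flac", ".ogg"}


def collect_labeled_clips(root, set_ids):
    """Return (set_id, emotion, path) triples for audio files found under root.

    Assumes one folder per set_id and files named after their emotion label.
    Folders that do not exist and files that do not match are skipped.
    """
    clips = []
    for set_id in set_ids:
        folder = Path(root) / set_id
        if not folder.is_dir():
            continue
        for path in sorted(folder.iterdir()):
            label = path.stem.lower()
            if path.suffix.lower() in AUDIO_SUFFIXES and label in EMOTIONS:
                clips.append((set_id, label, path))
    return clips


def split_by_speaker(clips, held_out_ids):
    """Split clips into (train, test), holding out whole speakers for testing."""
    held = set(held_out_ids)
    train = [c for c in clips if c[0] not in held]
    test = [c for c in clips if c[0] in held]
    return train, test
```

Holding out entire speakers, rather than random clips, is a common choice for small emotion corpora, since a model evaluated on voices it has already heard tends to look more accurate than it is.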
Coverage
The dataset covers English-language recordings. Speakers range in age from 18 to 42, include both male and female voices, and come from 8 distinct countries. The emotions covered are limited to euphoria, joy, sadness, and surprise.
License
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Who Can Use It
- AI/ML Developers: For developing algorithms that accurately recognise emotions from audio signals.
- Researchers in Neuroscience and Linguistics: Studying how emotional content is conveyed through human speech features.
- Companies in Customer Service and Voice Technology: Enhancing user experience by understanding and responding appropriately to a customer’s emotional state.
Dataset Name Suggestions
- Emotional Speech Audio Corpus
- Four Emotion Speech Recognition Dataset
- Labeled Emotional Speech Data
- English Voice Emotion Recogniser Set
Attributes
Original Data Source: Four Emotion Speech Recognition Dataset
