Music Emotion Dimensions
About
This resource contains sentiment information for 90,001 individual songs. The emotional data for each song is derived from social tags collected from Last.fm and scored against the Warriner et al. affective norms database. Sentiment is modelled across three dimensions: valence, which quantifies the pleasantness of the stimulus; arousal, which measures the intensity of the emotion evoked; and dominance, which reflects the degree of control exerted by the stimulus. The dataset serves as a proof of concept for ongoing work in affective distant hearing research.
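To make the tag-scoring idea concrete, the sketch below averages a song's social tags against a Warriner-style valence/arousal/dominance lexicon. It illustrates the general approach rather than the authors' exact pipeline; the lexicon values and example tags are hypothetical placeholders.

```python
# Minimal sketch of the tag-scoring idea: average a song's Last.fm tags
# against a Warriner-style valence/arousal/dominance (VAD) lexicon.
# Illustration only, not the authors' exact pipeline; the lexicon entries
# and example tags below are hypothetical placeholders.
from statistics import mean

# word -> (valence, arousal, dominance); in practice loaded from the
# Warriner et al. norms rather than hard-coded.
vad_lexicon = {
    "happy": (8.0, 6.0, 7.0),
    "mellow": (6.5, 3.0, 5.5),
    "dark": (3.5, 4.5, 4.5),
}

def score_song(tags):
    """Average VAD values over the tags that appear in the lexicon."""
    matched = [vad_lexicon[t] for t in tags if t in vad_lexicon]
    if not matched:
        return None  # no emotion-bearing tags found for this song
    valence, arousal, dominance = (mean(dim) for dim in zip(*matched))
    return {
        "valence_tags": valence,
        "arousal_tags": arousal,
        "dominance_tags": dominance,
        "number_of_emotion_tags": len(matched),
    }

print(score_song(["happy", "mellow", "synthpop"]))
```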
Columns
The dataset features 11 columns, including identifiers and sentiment metrics (a loading sketch follows the list):
- lastfm_url: The dedicated Last.fm page address for the specific song.
- track: The title of the song (approximately 79,392 unique values).
- artist: The name of the performing artist (approximately 26,012 unique values).
- seeds: The initial keyword or keywords used to scrape and gather data for the song.
- number_of_emotion_tags: A count of the distinct words that factored into calculating the song's final emotion score.
- valence_tags: The calculated score for the pleasantness dimension (Mean: 5.45).
- arousal_tags: The calculated score for the intensity dimension (Mean: 4.32).
- dominance_tags: The calculated score for the control dimension (Mean: 5.25).
- mbid: The MusicBrainz Identifier for the song. Note that approximately 32% of values are currently missing.
- spotify_id: The Spotify Identifier for the song. Note that approximately 32% of values are currently missing.
- genre: The genre of the song, inferred by comparing Last.fm social tags against a static list of music genres. Approximately 7% of values are missing.
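A minimal pandas sketch for loading the file and checking it against the description above; it assumes muse_v3.csv has been downloaded to the working directory.

```python
# Load the CSV and sanity-check it against the documented shape,
# column list, and missing-value rates.
import pandas as pd

df = pd.read_csv("muse_v3.csv")

print(df.shape)             # expected: (90001, 11)
print(df.columns.tolist())  # the 11 columns listed above

# Fraction of missing values per column; mbid and spotify_id should be
# around 0.32 and genre around 0.07 according to the description.
print(df.isna().mean().round(2))

# Summary statistics for the three sentiment dimensions.
print(df[["valence_tags", "arousal_tags", "dominance_tags"]].describe())
```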
Distribution
The dataset is distributed as a single tabular file named muse_v3.csv, with a size of 18.2 MB. It contains sentiment scores for 90,001 unique songs across the 11 columns described above. A known issue is a fair number of duplicate records, which arise when different songs share the same initial seed tag due to insufficient data on Last.fm. This does not imply the songs genuinely have identical sentiment, but it does highlight an area where data collection could be refined.
Usage
This data is ideally suited for academic research in computational humanities and affective computing. Potential use cases include:
- Developing models for Affective Distant Hearing.
- Experimenting with text analysis techniques applied to crowdsourced sentiment data.
- Inferring and analysing musical genre based on social tagging patterns.
- Enhancing data records by linking the provided MusicBrainz and Spotify Identifiers to external databases to retrieve additional metadata (see the sketch after this list).
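As a starting point for the enrichment use case, the sketch below looks up a single mbid against the public MusicBrainz ws/2 JSON web service; error handling and rate limiting beyond the basics are omitted, and the fields shown are only a sample of what the service returns.

```python
# Sketch: enrich one record via the MusicBrainz web service using its mbid.
# MusicBrainz asks clients to send a descriptive User-Agent and to stay at
# roughly one request per second; a real pipeline must respect that.
import requests

def fetch_recording(mbid: str) -> dict:
    url = f"https://musicbrainz.org/ws/2/recording/{mbid}"
    resp = requests.get(
        url,
        params={"fmt": "json", "inc": "artists"},
        headers={"User-Agent": "muse-enrichment-example/0.1 (research use)"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

# Hypothetical usage on one non-missing mbid from the dataset:
# info = fetch_recording(df.loc[df["mbid"].notna(), "mbid"].iloc[0])
# print(info["title"], info.get("length"))
```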
Coverage
The scope covers over 90,000 songs and includes data derived from the social tagging behaviour of Last.fm users. The sentiment estimation is applied across a large number of unique tracks and artists. While specific geographical or time ranges are not explicitly detailed, the data focuses on capturing emotion conveyed via text analysis of crowdsourced information.
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
This resource is valuable for researchers, academics, and data scientists. Specifically, users involved in musicology, digital humanities, psychology of music, and machine learning model development focused on emotion detection will find this data relevant. It can be used for exploratory data analysis of large-scale cultural trends and sentiment distribution across musical tracks.
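As one example of such exploratory analysis, the short sketch below computes mean sentiment per genre, again assuming the muse_v3.csv file is available locally.

```python
# Exploratory sketch: mean sentiment per genre, sorted by valence.
import pandas as pd

df = pd.read_csv("muse_v3.csv")

genre_sentiment = (
    df.dropna(subset=["genre"])
      .groupby("genre")[["valence_tags", "arousal_tags", "dominance_tags"]]
      .mean()
      .sort_values("valence_tags", ascending=False)
)
print(genre_sentiment.head(10))  # ten genres with the highest mean valence
```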
Dataset Name Suggestions
- MuSe Dataset
- Musical Sentiment Tags
- Affective Song Metrics
- Music Emotion Dimensions
- Last.fm Song Sentiment
Attributes
Original Data Source: Music Emotion Dimensions
