Song Information and Lyrics Dataset
Free
About
This dataset provides a collection of song information and lyrics for study and development. It supports applications such as music analysis, natural language processing, sentiment analysis, and recommendation systems. By combining song details with lyrics, it lets academics, developers, and music enthusiasts examine the relationship between song features, lyrical content, and listener preferences.
Columns
The music dataset contains 799 songs, each described by the following columns:
- Name: The title of the song. There are 659 unique song titles out of 799 valid entries.
- Lyrics: The lyrics of the song. There are 615 unique lyric entries out of 799 valid entries; the placeholder "Lyrics not found" accounts for 3% of entries. Lyrics were obtained from publicly accessible services such as Spotify and SoundCloud and converted from audio to text using speech recognition algorithms. Because of the limitations of these sources and algorithms, some lyrics may be inaccurate or missing.
- Singer/Artist: The name of the singer or artist who performed the song. There are 436 unique artists out of 799 valid entries, with Eminem being the most frequent.
- Movie/Album: The movie or album associated with the song (if applicable). There are 590 unique album names out of 799 valid entries.
- Genre: The genre or genres to which the song belongs.
- Rating/Popularity: The popularity score of the song from Spotify. Scores range from 0 to 100, with a mean of 68.8 and a standard deviation of 22.5 across 799 valid entries.
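The per-column figures above (unique counts, the share of "Lyrics not found" placeholders, mean and standard deviation of the rating) can be recomputed with a short script. The following is a minimal sketch using only the Python standard library. The header names (`Name`, `Lyrics`, `Singer`, `Movie`, `Genre`, `Rating`) and the inline sample rows are assumptions for illustration and are not taken from the actual file; the source also does not say whether the reported standard deviation is the sample or population form, so the sample form is used here.

```python
import csv
import io
import statistics

# Tiny inline stand-in for the real CSV. Header names and rows are
# hypothetical; adjust them to match the downloaded file.
SAMPLE_CSV = """Name,Lyrics,Singer,Movie,Genre,Rating
Song A,la la la,Artist X,Album 1,Pop,80
Song B,Lyrics not found,Artist X,Album 2,Rock,60
Song A,la la la,Artist Y,Album 1,Pop,70
Song C,na na na,Artist Z,Album 3,"Pop, Rock",90
"""

PLACEHOLDER = "Lyrics not found"


def summarise(rows):
    """Compute the kind of per-column statistics quoted in the description."""
    names = [r["Name"] for r in rows]
    lyrics = [r["Lyrics"] for r in rows]
    ratings = [float(r["Rating"]) for r in rows]
    missing = sum(1 for text in lyrics if text.strip() == PLACEHOLDER)
    return {
        "rows": len(rows),
        "unique_names": len(set(names)),
        "unique_lyrics": len(set(lyrics)),
        "missing_lyrics_share": missing / len(rows),
        "rating_mean": statistics.mean(ratings),
        "rating_std": statistics.stdev(ratings),  # sample standard deviation
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = summarise(rows)
print(summary)
```

With the real file, replacing `io.StringIO(SAMPLE_CSV)` with an opened file handle should be enough, provided the header names match.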
Distribution
The dataset is provided as a single data file in CSV format, 5.53 MB in size. It comprises 799 records (rows) and the six columns listed above.
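A quick shape check after download can confirm that the file matches the stated record count. The sketch below is one possible way to do this with the standard `csv` module; the function names, the file name `songs.csv` mentioned in the comment, and the demo data are hypothetical. Lyrics usually contain line breaks, so the example counts parsed records rather than physical lines.

```python
import csv
import io


def read_songs(handle):
    """Return (header, rows) from an open CSV text handle."""
    reader = csv.reader(handle)
    header = next(reader)
    rows = [row for row in reader if row]  # ignore completely blank lines
    return header, rows


def shape_report(header, rows, expected_rows):
    """Compare the parsed shape with the expected record count."""
    # 1-based positions of records whose field count differs from the header.
    ragged = [i for i, row in enumerate(rows, start=1) if len(row) != len(header)]
    return {
        "columns": len(header),
        "rows": len(rows),
        "row_count_ok": len(rows) == expected_rows,
        "ragged_records": ragged,
    }


# Stand-in data with a multi-line quoted field. For the real file, something
# like open("songs.csv", newline="", encoding="utf-8") would be used instead,
# with expected_rows=799.
demo = io.StringIO('Name,Lyrics,Genre\nSong A,"line one\nline two",Pop\nSong B,hello,Rock\n')
header, rows = read_songs(demo)
report = shape_report(header, rows, expected_rows=2)
print(report)
```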
Usage
This music dataset offers several applications for research and development:
- Music Analysis: Researchers can gain insights into the features and patterns of various music genres by analysing the connections between song elements such as genre, vocalist, and rating.
- Natural Language Processing (NLP): NLP researchers may utilise the lyrics to develop language models, sentiment analysis algorithms, topic modelling approaches, and other text-based music studies.
- Recommendation Systems: Developers can create recommendation systems that suggest music based on user preferences, lyrical sentiment, or genre similarities.
- Music Generation Models: The dataset may be used to train machine learning models that generate new lyrics or assist with music composition.
- Music Sentiment Analysis: Researchers can analyse the sentiments conveyed in song lyrics to gain insights into the emotional components of music and its influence on listeners.
- Movie Soundtracks Analysis: Researchers can investigate the association between song attributes and their use in movie soundtracks.
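As an illustration of the recommendation use case, the sketch below ranks songs by how similar their lyrics are, using bag-of-words counts and cosine similarity. This is a deliberately simple baseline chosen for the example, not a method prescribed by the dataset; the song titles and lyrics are invented, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(songs, title, top_n=2):
    """Return the top_n other titles whose lyrics are closest to `title`."""
    vectors = {name: bag_of_words(lyrics) for name, lyrics in songs.items()}
    query = vectors[title]
    scored = [(cosine(query, vec), name) for name, vec in vectors.items() if name != title]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


# Invented examples standing in for the Name and Lyrics columns.
songs = {
    "Rainy Day": "rain falls down on the quiet town",
    "Storm Song": "rain and thunder fall on the town tonight",
    "Dance Floor": "move your feet and dance all night",
}
print(recommend(songs, "Rainy Day", top_n=1))  # prints ['Storm Song']
```

On real lyrics, rows holding the "Lyrics not found" placeholder would need to be filtered out first, since they would otherwise all look identical to one another. Weighting schemes such as TF-IDF, or filtering by genre, are natural next steps.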
Coverage
The dataset is designed to provide a wide variety of songs from diverse genres, performers, and films. It includes popular songs from numerous eras and locations, covering a broad spectrum of musical styles.
License
CC0: Public Domain
Who Can Use It
This dataset is intended for academics, developers, and music enthusiasts. Academics and researchers can use it for various analytical studies, including music analysis, NLP, and sentiment analysis. Developers can leverage it for building recommendation systems or training machine learning models for music generation. Music fans can also explore and analyse the relationship between listener preferences and lyrical content.
Dataset Name Suggestions
- Music Song and Lyric Collection
- Song Information and Lyrics Dataset
- Music Research Dataset with Lyrics
- Song Metadata and Lyric Repository
- Audio-to-Text Song Dataset
Attributes
Original Data Source: Song Information and Lyrics Dataset