Hot-100 Song Lyrics and Audio Features
Data Science and Analytics
Free
About
This dataset is designed for music analytics, recommendation systems, and cultural studies [1]. It brings together information from Billboard Hot-100 charts, Genius for song lyrics, and Spotify for audio features [1]. The dataset provides detailed information for popular songs spanning the years 2000 to 2023 [1].
Columns
The main CSV file contains the following columns [2]:
- ranking: The rank of the song for a given year.
- song: The title of the song.
- band_singer: The name of the singer or band performing the song.
- songurl: A URL specific to the song.
- titletext: Additional title text.
- url: A general URL.
- year: The year of the chart.
- lyrics: The lyrics of the song.
- uri: A Uniform Resource Identifier.
- danceability: A Spotify-derived feature indicating how suitable a track is for dancing, based on musical elements like tempo, rhythm stability, beat strength, and overall regularity [2, 3].
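As a minimal sketch of how these columns might be read and type-checked, the following uses only the Python standard library. The sample rows, the example URLs and URIs, and the helper name load_rows are hypothetical placeholders invented for illustration; they are not taken from the dataset. The sketch also assumes that ranking and year are stored as integers and danceability as a decimal number.

```python
import csv
import io

# Column names as listed in the Columns section.
EXPECTED_COLUMNS = [
    "ranking", "song", "band_singer", "songurl", "titletext",
    "url", "year", "lyrics", "uri", "danceability",
]

# Placeholder rows invented for illustration; NOT real dataset content.
SAMPLE_CSV = """ranking,song,band_singer,songurl,titletext,url,year,lyrics,uri,danceability
1,Example Song A,Example Artist,/song-a,Example Song A,https://example.com/a,2000,la la la,example:uri:a,0.71
2,Example Song B,Another Artist,/song-b,Example Song B,https://example.com/b,2001,na na hey,example:uri:b,0.48
"""


def load_rows(handle):
    """Read CSV rows, verify the header, and convert the numeric columns."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for raw in reader:
        row = dict(raw)
        # Assumption: these columns hold plain integers / decimals.
        row["ranking"] = int(row["ranking"])
        row["year"] = int(row["year"])
        row["danceability"] = float(row["danceability"])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), rows[0]["song"], rows[0]["danceability"])
```

To read an actual file, the same function could be given a handle from open(path, newline="", encoding="utf-8"), where path points to the downloaded CSV.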
Distribution
The dataset is provided as a main CSV file [2]. The total row count is not specified. Song rankings range from 1 to 100 [2], the years covered run from 2000 to 2023 [4], and danceability scores range from approximately 0.19 to 0.96 [3]. Song titles, artists, and URLs each contain many unique values, and a notable portion of artists fall into an 'Other' category beyond top acts such as Drake and Rihanna [4].
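The stated ranges can serve as a quick sanity check after loading the data. The sketch below is one possible way to do this; the records and the helper names out_of_range and danceability_span are hypothetical, and the third record is deliberately invalid to show what a failure looks like.

```python
# Ranges stated in the Distribution section.
RANKING_RANGE = (1, 100)
YEAR_RANGE = (2000, 2023)

# Placeholder records invented for illustration; NOT real dataset rows.
sample = [
    {"ranking": 1, "year": 2000, "danceability": 0.71},
    {"ranking": 100, "year": 2023, "danceability": 0.19},
    {"ranking": 101, "year": 1999, "danceability": 0.96},  # deliberately invalid
]


def out_of_range(rows):
    """Return (row index, column) pairs whose values fall outside the stated ranges."""
    problems = []
    for index, row in enumerate(rows):
        if not RANKING_RANGE[0] <= row["ranking"] <= RANKING_RANGE[1]:
            problems.append((index, "ranking"))
        if not YEAR_RANGE[0] <= row["year"] <= YEAR_RANGE[1]:
            problems.append((index, "year"))
    return problems


def danceability_span(rows):
    """Return the (minimum, maximum) danceability observed in the rows."""
    values = [row["danceability"] for row in rows]
    return min(values), max(values)


print(out_of_range(sample))
print(danceability_span(sample))
```

Because the danceability range is described only as approximate, the sketch reports the observed span rather than enforcing fixed bounds.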
Usage
This dataset is ideal for various applications and use cases [5]:
- Music analytics: For understanding trends in popular music [1].
- Recommendation systems: To build models for suggesting songs [1].
- Cultural studies: For research into music and its societal impact [1].
- Data visualisation: To create visual insights from music data [6].
- Exploratory data analysis: For discovering patterns and insights [6].
- Natural Language Processing (NLP): For analysing lyrical content [6].
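Two of these use cases, trend analysis and basic lyric analysis, can be sketched with the standard library alone. The records below are invented placeholders, and the helper names mean_danceability_by_year and top_words are hypothetical; the tokenisation is a deliberately simple regular expression, not a full NLP pipeline.

```python
import re
from collections import Counter, defaultdict
from statistics import mean

# Placeholder records invented for illustration; NOT real dataset rows.
sample = [
    {"year": 2000, "danceability": 0.5, "lyrics": "la la la hey"},
    {"year": 2000, "danceability": 0.75, "lyrics": "hey you"},
    {"year": 2001, "danceability": 0.25, "lyrics": "Na na hey"},
]


def mean_danceability_by_year(rows):
    """Group rows by chart year and average their danceability scores."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["year"]].append(row["danceability"])
    return {year: mean(values) for year, values in sorted(by_year.items())}


def top_words(rows, n=3):
    """Count lowercase word tokens across all lyrics and return the n most common."""
    counts = Counter()
    for row in rows:
        counts.update(re.findall(r"[a-z']+", row["lyrics"].lower()))
    return counts.most_common(n)


print(mean_danceability_by_year(sample))
print(top_words(sample))
```

On the real data, the per-year averages could feed a line chart for the data visualisation use case, and the word counts would likely need stop-word filtering before they say anything useful.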
Coverage
The dataset covers songs from the Billboard Hot-100 charts [1] and has global regional coverage [7]. The data spans the years 2000 to 2023 [1, 4].
License
CC0
Who Can Use It
This dataset is suitable for a variety of users [5]:
- Data scientists: For machine learning projects such as building recommender systems or predictive models.
- Data analysts: To perform detailed analysis of music industry trends and song characteristics.
- Researchers: For academic studies in fields like musicology, digital humanities, or social sciences.
- Developers: For creating applications that utilise rich music metadata.
Dataset Name Suggestions
- Billboard Top Hits 2000-2023
- Hot-100 Song Lyrics and Audio Features
- Music Chart Data with Spotify Integration
- Popular Songs 2000-2023 Analysis Dataset
Attributes
Original Data Source: