Opendatabay APP

Hot-100 Song Lyrics and Audio Features

Data Science and Analytics

Tags and Keywords

Music

Data

Exploratory

Nlp

Recommender

Trusted By
Trusted by company1Trusted by company2Trusted by company3
Hot-100 Song Lyrics and Audio Features Dataset on Opendatabay data marketplace

"No reviews yet"

Free

About

This dataset is designed for music analytics, recommendation systems, and cultural studies [1]. It brings together information from Billboard Hot-100 charts, Genius for song lyrics, and Spotify for audio features [1]. The dataset provides detailed information for popular songs spanning the years 2000 to 2023 [1].

Columns

The main CSV file contains the following columns [2]:
  • ranking: The rank of the song for a given year.
  • song: The title of the song.
  • band_singer: The name of the singer or band performing the song.
  • songurl: A URL specific to the song.
  • titletext: Additional title text.
  • url: A general URL.
  • year: The year of the chart.
  • lyrics: The lyrics of the song.
  • uri: A Uniform Resource Identifier.
  • danceability: A Spotify-derived feature indicating how suitable a track is for dancing, based on musical elements like tempo, rhythm stability, beat strength, and overall regularity [2, 3].

Distribution

The dataset is provided as a main CSV file [2]. While specific total row counts are not available, the data covers various distributions. For example, song rankings range from 1 to 100 [2], and years covered are from 2000 to 2023 [4]. Danceability scores range from approximately 0.19 to 0.96 [3]. Unique values are present for song titles, artists, and URLs, with a notable portion of artists falling into an 'Other' category beyond top acts like Drake and Rihanna [4].

Usage

This dataset is ideal for various applications and use cases [5]:
  • Music analytics: For understanding trends in popular music [1].
  • Recommendation systems: To build models for suggesting songs [1].
  • Cultural studies: For research into music and its societal impact [1].
  • Data visualisation: To create visual insights from music data [6].
  • Exploratory data analysis: For discovering patterns and insights [6].
  • Natural Language Processing (NLP): For analysing lyrical content [6].

Coverage

The dataset covers songs from the Billboard Hot-100 charts [1] and has a global region coverage [7]. The time range for the data is from 2000 to 2023 [1, 4].

License

CCO

Who Can Use It

This dataset is suitable for a variety of users [5]:
  • Data scientists: For machine learning projects such as building recommender systems or predictive models.
  • Data analysts: To perform detailed analysis of music industry trends and song characteristics.
  • Researchers: For academic studies in fields like musicology, digital humanities, or social sciences.
  • Developers: For creating applications that utilise rich music metadata.

Dataset Name Suggestions

  • Billboard Top Hits 2000-2023
  • Hot-100 Song Lyrics and Audio Features
  • Music Chart Data with Spotify Integration
  • Popular Songs 2000-2023 Analysis Dataset

Attributes

Original Data Source: Original Data Source:

Listing Stats

VIEWS

0

DOWNLOADS

0

LISTED

16/06/2025

REGION

GLOBAL

Universal Data Quality Score Logo UDQSQUALITY

5 / 5

VERSION

1.0

Free