Opendatabay APP

Music Popularity Features Dataset

News & Media Articles

Tags and Keywords

Music

Popularity

Song

Prediction

Regression

Free

About

This dataset is designed for predicting song popularity from a set of musical attributes. People have a deep connection with songs and music, which can improve mood, reduce pain and anxiety, and enable emotional expression, and research highlights music's widespread benefits for physical and mental health. The dataset supports studies that aim to understand songs and their popularity by analysing specific audio parameters. The core task, predicting song popularity, is a straightforward yet challenging regression problem, notable for strong multicollinearity among its features.
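
One simple way to look for the multicollinearity mentioned above is to compute pairwise Pearson correlations between feature columns and flag the strongly correlated pairs. The sketch below uses only the Python standard library. The function names, the 0.7 threshold, and the toy values are assumptions made for illustration; none of them come from the dataset or its listing.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant column; report 0
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, r) for every feature pair with |r| >= threshold.

    `columns` maps a column name to its list of numeric values.
    The 0.7 default is an arbitrary illustrative cut-off.
    """
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 3)))
    return pairs


# Made-up values for illustration only, not taken from the dataset.
toy = {
    "energy": [0.2, 0.4, 0.6, 0.8],
    "loudness": [-20.0, -14.0, -9.0, -4.0],
    "tempo": [120.0, 90.0, 140.0, 100.0],
}
print(correlated_pairs(toy))
```

Pairs flagged this way are candidates for dropping one feature, combining features, or using a regularised model. Which remedy fits depends on the modelling goal.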

Columns

  • song_name: The name of the song.
  • song_popularity: A numerical value representing the song's popularity, ranging from 0 to 100.
  • song_duration_ms: The duration of the song in milliseconds.
  • acousticness: A confidence measure from 0.0 to 1.0 indicating whether the track is acoustic.
  • danceability: A measure from 0.0 to 1.0 describing how suitable a track is for dancing based on musical elements like tempo, rhythm stability, beat strength, and overall regularity.
  • energy: A perceptual measure of intensity and activity, from 0.0 to 1.0.
  • instrumentalness: Predicts whether a track contains no vocals. Values closer to 1.0 indicate a greater likelihood of the track being instrumental.
  • key: The key the track is in, represented as integers (e.g., 0 for C, 1 for C#, etc.).
  • liveness: Detects the presence of an audience in the recording. Values above 0.8 indicate a strong likelihood the track was performed live.
  • loudness: The overall loudness of a track in decibels (dB), typically ranging from -60 to 0 dB.
  • audio_mode: Indicates the modality (major or minor) of a track, with 0 typically representing minor and 1 representing major.
  • speechiness: Detects the presence of spoken words in a track. Values above 0.66 indicate spoken word, values between 0.33 and 0.66 indicate tracks that contain both music and speech, and values below 0.33 indicate music and other non-speech tracks.
  • tempo: The overall estimated tempo of a track in beats per minute (BPM).
  • time_signature: An estimated overall time signature of a track.
  • audio_valence: A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track.
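
Several of the columns above are encoded values that are easier to read once decoded. The helpers below are a minimal sketch of that decoding. The thresholds for speechiness and liveness come from the column descriptions. The full twelve-entry pitch-class table is an assumption that extends the "0 for C, 1 for C#" pattern in the usual chromatic order, and all function names are hypothetical.

```python
# Assumed chromatic pitch-class order; the listing only confirms 0 = C and 1 = C#.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_name(key):
    """Map the integer `key` column to a pitch-class name."""
    if 0 <= key < len(PITCH_CLASSES):
        return PITCH_CLASSES[key]
    return "unknown"


def mode_name(audio_mode):
    """Map `audio_mode` to its modality (1 = major, 0 = minor)."""
    return "major" if audio_mode == 1 else "minor"


def speechiness_category(value):
    """Bucket `speechiness` using the 0.33 and 0.66 thresholds described above."""
    if value > 0.66:
        return "spoken word"
    if value >= 0.33:
        return "music and speech"
    return "music"


def is_likely_live(liveness):
    """Values above 0.8 indicate a strong likelihood of a live recording."""
    return liveness > 0.8


def format_duration(song_duration_ms):
    """Convert `song_duration_ms` to a minutes:seconds string."""
    minutes, seconds = divmod(song_duration_ms // 1000, 60)
    return f"{minutes}:{seconds:02d}"
```

For example, a row with key 1, audio_mode 0 and a song_duration_ms of 215000 would decode to "C#", "minor" and "3:35".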

Distribution

The dataset is typically provided as a CSV file, with a sample file named song_data.csv. It contains 15 columns and 18,800 records (rows). The file size is 2.22 MB. All columns are fully populated, with no missing or mismatched values.
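
The shape and completeness figures above are easy to check after download. The following is a sketch using the standard csv module. The function names are hypothetical, and it assumes the file is UTF-8 encoded with a single header row, which the listing does not state.

```python
import csv


def summarize_rows(lines):
    """Count columns, data rows, and empty cells in CSV text.

    `lines` is any iterable of text lines, such as an open file.
    The first line is treated as the header.
    """
    reader = csv.reader(lines)
    header = next(reader)
    n_rows = 0
    n_missing = 0
    for row in reader:
        if not row:
            continue  # skip completely blank lines
        n_rows += 1
        # Count empty cells, plus cells absent from rows shorter than the header.
        n_missing += sum(1 for cell in row if cell.strip() == "")
        n_missing += max(0, len(header) - len(row))
    return {"columns": len(header), "rows": n_rows, "missing": n_missing}


def summarize_file(path):
    """Open a CSV file (assumed UTF-8) and summarise it."""
    with open(path, newline="", encoding="utf-8") as handle:
        return summarize_rows(handle)
```

If the listing is accurate, calling summarize_file("song_data.csv") should report 15 columns, 18,800 rows, and no missing cells. Any other result suggests a different file version.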

Usage

This dataset is ideal for developing and evaluating regression models. It can be used to predict song popularity by analysing factors such as energy, acousticness, instrumentalness, liveness, and danceability. Users can clean the dataset if necessary, build various regression models, and then evaluate and compare their performance using metrics like R-squared (R²) and Root Mean Squared Error (RMSE).
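
The two metrics named above can be computed by hand, which makes their definitions explicit: R² = 1 − SS_res / SS_tot, and RMSE is the square root of the mean squared error. The sketch below pairs them with a one-feature least-squares line as a stand-in baseline. It is not the modelling approach prescribed by the dataset's authors, the function names are hypothetical, and the numbers are made up for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Requires at least two distinct x values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        return 0.0  # undefined when the target is constant; report 0
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Made-up training and held-out values, not taken from the dataset.
energy_train, popularity_train = [0.2, 0.4, 0.6, 0.8], [30, 45, 50, 70]
energy_test, popularity_test = [0.3, 0.7], [40, 60]

slope, intercept = fit_line(energy_train, popularity_train)
predictions = [slope * x + intercept for x in energy_test]
print("R2:", r_squared(popularity_test, predictions))
print("RMSE:", rmse(popularity_test, predictions))
```

Because song_popularity ranges from 0 to 100, RMSE reads directly as an error in popularity points, while R² allows comparison across models regardless of scale. Both should be computed on held-out rows rather than on the training data.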

Coverage

The dataset's geographic location, time range, and demographic scope are not detailed in the available information; the listing notes only that the dataset originates from Kaggle. All columns are complete, with 100% valid data across all 18,800 records.

License

CC0: Public Domain

Who Can Use It

This dataset is suitable for beginners in data science and machine learning. It is particularly relevant for those interested in regression problems, including linear regression. Users can apply it to understand the factors contributing to song popularity and build predictive models.

Dataset Name Suggestions

  • Song Popularity Prediction Data
  • Music Popularity Features Dataset
  • Audio Characteristics for Popularity
  • Predicting Song Hits Data

Attributes

Original Data Source: Music Popularity Features Dataset

Listing Stats

VIEWS

2

DOWNLOADS

0

LISTED

22/07/2025

REGION

GLOBAL

UDQS QUALITY (Universal Data Quality Score)

5 / 5

VERSION

1.0