Spotify Song Characteristics Dataset

Product Reviews & Feedback

Tags and Keywords

Spotify

Music

Genre

Audio

Songs

Free

About

This dataset is a collection of Spotify song data describing the audio features of tracks across a broad range of music genres. It is curated for analysis by genre and covers Trap, Techno, Techhouse, Trance, Psytrance, Dark Trap, DnB (drum and bass), Hardstyle, Underground Rap, Trap Metal, Emo, Rap, RnB, Pop, and Hiphop. The dataset is intended to help users understand musical characteristics and how they are distributed among different genres.

Columns

  • danceability: Indicates how suitable a track is for dancing, based on musical elements like tempo, rhythm stability, and beat strength. Values range from 0.07 to 0.99, with a mean of 0.64, for 42.3k valid records.
  • energy: A perceptual measure of intensity and activity in a track, with values between 0.0 and 1.0. Valid for 42.3k records, it has a mean of 0.76 and a standard deviation of 0.18.
  • key: The estimated overall key of the track in pitch-class notation (0 = C, 1 = C♯/D♭, and so on up to 11 = B). Values range from 0 to 11, with a mean of 5.37 for 42.3k valid records.
  • loudness: The overall loudness of a track in decibels (dB), ranging from -33.4 dB to 3.15 dB. The mean loudness is -6.47 dB for 42.3k valid records.
  • mode: Denotes the modality of a track, where 1 represents major and 0 represents minor. Valid for 42.3k records, the mean is 0.55, meaning roughly 55% of tracks are in a major key.
  • speechiness: Detects the presence of spoken words in a track, with values from 0.02 to 0.95. The mean is 0.14 for 42.3k valid records.
  • acousticness: A confidence measure (0.0 to 1.0) of whether the track is acoustic. Observed values range from 0.0 to 0.99, with a mean of 0.10, for 42.3k valid records.
  • instrumentalness: Predicts whether a track contains no vocals, with values from 0.0 to 0.99. The mean is 0.28 for 42.3k valid records.
  • liveness: Detects the presence of an audience in the recording, with values from 0.01 to 0.99. The mean is 0.21 for 42.3k valid records.
  • valence: A measure (0.0 to 1.0) describing the musical positiveness conveyed by a track. The mean is 0.36, with values from 0.02 to 0.99 for 42.3k valid records.
  • tempo: The overall estimated tempo of a track in beats per minute (BPM), ranging from 58 to 220. The mean tempo is 147 BPM for 42.3k valid records.
  • type: The type of the object, which is consistently 'audio_features' for all 42.3k records.
  • id: The Spotify ID for the track, with 35,877 unique values among 42.3k records.
  • uri: The Spotify URI for the track, with 35,877 unique values among 42.3k records.
  • track_href: A link to the Web API endpoint providing full details of the track, with 35,877 unique values among 42.3k records.
  • analysis_url: A link to the Web API endpoint providing the audio analysis of the track, with 35,877 unique values among 42.3k records.
  • duration_ms: The duration of the track in milliseconds, ranging from 25.6k to 913k ms. The mean duration is 251k ms for 42.3k valid records.
  • time_signature: An estimated overall time signature of a track, expressed as the number of beats per bar. The value is mostly 4 (4/4 time) for 42.3k valid records.
  • genre: The genre of the song, with 15 unique values. Underground Rap and Dark Trap are among the most common. Valid for 42.3k records.
  • song_name: The name of the song. There are 21.5k valid records and 20.8k missing records, with 'Forever' being a frequent entry.
  • Unnamed: 0: An unnamed column, likely an index. There are 20.8k valid records and 21.5k missing records.
  • title: The title of the track. There are 20.8k valid records and 21.5k missing records, with 'Euphoric Hardstyle' being a common title.
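
The per-column figures above (valid count, range, mean, number of unique values) can be reproduced with a few lines of standard-library Python. The following is a minimal sketch, not the method used to produce the listing: the helper names `summarize_numeric` and `count_unique` are hypothetical, and the three-row sample is made up so the snippet runs without the real file.

```python
import csv
import io
import statistics


def summarize_numeric(rows, column):
    """Return valid count, min, max, and mean of the non-empty values in `column`."""
    values = [float(r[column]) for r in rows if (r.get(column) or "").strip()]
    if not values:
        return {"valid": 0, "min": None, "max": None, "mean": None}
    return {
        "valid": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
    }


def count_unique(rows, column):
    """Count distinct non-empty values in `column` (e.g. the Spotify track id)."""
    return len({r[column] for r in rows if (r.get(column) or "").strip()})


# Made-up sample standing in for genres_v2.csv (assumption: same column names).
SAMPLE_CSV = """danceability,tempo,id,genre
0.80,150.0,a1,Trap
0.60,128.0,b2,Techno
0.70,140.0,a1,Dark Trap
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(summarize_numeric(rows, "tempo"))
print(count_unique(rows, "id"))
```

To run it against the real data, replace the `io.StringIO(SAMPLE_CSV)` argument with an open handle to the downloaded CSV file. A unique-id count lower than the row count, as reported for id, uri, track_href, and analysis_url, means some tracks appear in more than one row.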

Distribution

The dataset is provided as a single CSV file named genres_v2.csv. It is 13.6 MB in size and comprises 22 columns. The audio feature and identification columns are populated for approximately 42,300 records (42.3k). Three columns, song_name, Unnamed: 0, and title, have a large number of missing values: each is populated for only about half of the records.
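
Before using the sparsely populated columns, it helps to count how many rows leave each one empty. The sketch below shows one way to do that with the standard library. The function name `missing_counts` is hypothetical, and the sample rows are invented for illustration; only the column names come from the listing.

```python
import csv
import io


def missing_counts(rows, columns):
    """Count rows in which each of the given columns is empty or absent."""
    counts = {c: 0 for c in columns}
    for row in rows:
        for c in columns:
            if not (row.get(c) or "").strip():
                counts[c] += 1
    return counts


# Invented sample rows (assumption: real file uses these column names).
SAMPLE_CSV = """genre,song_name,Unnamed: 0,title
Trap,Forever,,
Hardstyle,,0,Euphoric Hardstyle
Pop,Some Song,,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
columns = ["genre", "song_name", "Unnamed: 0", "title"]
print(missing_counts(rows, columns))
# → {'genre': 0, 'song_name': 1, 'Unnamed: 0': 2, 'title': 2}
```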

Usage

This dataset is highly suitable for various applications, including:
  • Music genre classification: Developing machine learning models to classify songs into their respective genres based on audio features.
  • Audio feature analysis: Exploring the distribution and characteristics of musical attributes across different genres.
  • Music recommendation systems: Building systems that recommend songs based on a user's preferred audio characteristics or genres.
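  • Trend analysis: Identifying prevalent musical characteristics within specific genres. Because none of the listed columns records a release date, tracking changes over time would require joining external release data (for example, via the Spotify track id).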
  • Academic research: Supporting studies on musicology, digital signal processing, and computational music analysis.
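
As an illustration of the genre classification use case, the sketch below implements a nearest-centroid baseline: it averages a few audio features per genre, then assigns a new track to the genre whose average is closest. This is a deliberately simple stand-in, not a method prescribed by the dataset. The function names and the sample values are hypothetical. Only features already on a 0 to 1 scale are used, since columns such as tempo or loudness would need rescaling before distances are meaningful.

```python
import math
from collections import defaultdict

# Assumption: three features that share the 0.0 to 1.0 scale.
FEATURES = ["danceability", "energy", "valence"]


def genre_centroids(tracks, features=FEATURES):
    """Compute the mean feature vector of each genre."""
    sums = defaultdict(lambda: [0.0] * len(features))
    counts = defaultdict(int)
    for t in tracks:
        g = t["genre"]
        for i, f in enumerate(features):
            sums[g][i] += float(t[f])
        counts[g] += 1
    return {g: [s / counts[g] for s in sums[g]] for g in sums}


def predict_genre(track, centroids, features=FEATURES):
    """Return the genre whose centroid is nearest in Euclidean distance."""
    vec = [float(track[f]) for f in features]
    return min(centroids, key=lambda g: math.dist(vec, centroids[g]))


# Made-up training rows; real rows would come from genres_v2.csv.
tracks = [
    {"genre": "Techno", "danceability": 0.70, "energy": 0.90, "valence": 0.20},
    {"genre": "Techno", "danceability": 0.74, "energy": 0.86, "valence": 0.24},
    {"genre": "Pop", "danceability": 0.60, "energy": 0.60, "valence": 0.70},
    {"genre": "Pop", "danceability": 0.64, "energy": 0.64, "valence": 0.66},
]

centroids = genre_centroids(tracks)
new_track = {"danceability": 0.71, "energy": 0.85, "valence": 0.25}
print(predict_genre(new_track, centroids))
```

A baseline like this mainly serves as a reference point. With 15 genres, several of them closely related (for example Trap, Dark Trap, and Underground Rap), a stronger model and a held-out test split would be needed for a meaningful accuracy estimate.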

Coverage

The dataset focuses on music genres available on Spotify. Specific geographic, time range, or demographic scopes are not explicitly detailed within the provided information. The dataset is expected to be updated annually.

License

CC0: Public Domain

Who Can Use It

  • Data scientists and machine learning engineers for building predictive models and understanding complex data patterns in music.
  • Music analysts and researchers interested in the technical aspects and characteristics of different music genres.
  • Developers creating music-related applications, such as recommendation engines or music discovery platforms.
  • Audio engineers seeking to understand the quantitative properties of musical tracks.
  • Students and educators for learning and teaching about data analysis and music information retrieval.

Dataset Name Suggestions

  • Spotify Audio Features by Genre
  • Multi-Genre Spotify Track Data
  • Spotify Song Characteristics Dataset
  • Global Spotify Genre Analysis
  • Spotify Music Genre Audio Analytics

Attributes

Listing Stats

VIEWS

12

DOWNLOADS

1

LISTED

14/07/2025

REGION

GLOBAL

UDQS QUALITY (Universal Data Quality Score)

5 / 5

VERSION

1.0
