Historical Spotify Song Metrics
Free
About
This dataset provides detailed Spotify track information and is well suited to practising PySpark, SQL, or machine learning. Each row represents a unique track, with basic variables such as title, artist, and release year. It also includes musical attributes such as tempo, danceability, and key, whose values are generated algorithmically by Spotify from technical parameters.
Columns
- id: (string) A unique identifier for the track. There are 169,909 distinct track identifiers.
- name: (string) The name of the track. There are 132,939 unique track names.
- artists: (string) The artists associated with the track. This column contains 33,375 unique artist entries.
- duration_ms: (float) The length of the track in milliseconds. Values range from 5,108 ms to 5.40 million ms (roughly 5 seconds to 90 minutes), with an average of 231,000 ms (about 3.85 minutes).
- release_date: (date) The specific release date of the track. There are 10,882 distinct release dates.
- year: (integer) The release year of the track. Years span from 1921 to 2020, with an average year of 1980.
- acousticness: (float) A measure of how acoustic the track is, ranging from 0 to 1. The average acousticness is 0.49.
- danceability: (float) A metric indicating how suitable a track is for dancing, from 0 to 0.99. The average danceability is 0.54.
- energy: (float) A perceptual measure of intensity and activity, from 0 to 1. The average energy level is 0.49.
- instrumentalness: (float) Predicts whether a track contains no vocals, ranging from 0 to 1. The average instrumentalness is 0.16.
- liveness: (float) Detects the presence of an audience in the recording, from 0 to 1. The average liveness is 0.21.
- loudness: (float) The overall loudness of a track in decibels (dB), with values from -60 dB to 3.85 dB. The average loudness is -11.4 dB.
- speechiness: (float) Detects the presence of spoken words in a track, from 0 to 0.97. The average speechiness is 0.09.
- tempo: (float) The overall estimated tempo of a track in beats per minute (BPM). Values range from 0 to 244 BPM, with an average of 117 BPM.
- valence: (float) A measure from 0 to 1 describing the musical positivity conveyed by a track. The average valence is 0.53.
- mode: (integer) Indicates the modality of a track: 1 for major, 0 for minor.
- key: (integer) The estimated overall key of the track, encoded as a pitch class from 0 to 11 (0 = C, 1 = C♯/D♭, and so on up to 11 = B). The average key value is 5.2.
- popularity: (integer) A popularity score for the track, ranging from 0 to 100. The average popularity is 31.6.
- explicit: (integer) Indicates the presence of explicit content, represented by 0 (not explicit) or 1 (explicit).
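As a minimal sketch of how the column types listed above could be applied when reading the CSV with Python's standard library: `COLUMN_TYPES`, `parse_row`, and `key_name` are hypothetical names introduced for illustration, and the sample row is made up rather than taken from the dataset. The key and mode decoding assumes Spotify's usual convention (pitch class 0 = C, mode 1 = major). `release_date` is kept as a string because the description does not specify the date format.

```python
import csv
import io

# Hypothetical mapping from column name to Python type, following the column list.
COLUMN_TYPES = {
    "id": str, "name": str, "artists": str, "release_date": str,
    "duration_ms": float, "year": int,
    "acousticness": float, "danceability": float, "energy": float,
    "instrumentalness": float, "liveness": float, "loudness": float,
    "speechiness": float, "tempo": float, "valence": float,
    "mode": int, "key": int, "popularity": int, "explicit": int,
}

# Assumed pitch-class names for key values 0..11.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def parse_row(row):
    """Convert one csv.DictReader row (all strings) into typed values."""
    return {col: COLUMN_TYPES[col](value) for col, value in row.items()}


def key_name(key, mode):
    """Render key and mode as text, assuming mode 1 = major and 0 = minor."""
    return f"{PITCH_CLASSES[key]} {'major' if mode == 1 else 'minor'}"


# Made-up sample with the 19 column names; not real data from the file.
SAMPLE = (
    "id,name,artists,duration_ms,release_date,year,acousticness,danceability,"
    "energy,instrumentalness,liveness,loudness,speechiness,tempo,valence,"
    "mode,key,popularity,explicit\n"
    "abc123,Example Track,Example Artist,231000,1980-01-01,1980,0.49,0.54,"
    "0.49,0.16,0.21,-11.4,0.09,117.0,0.53,1,5,32,0\n"
)

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
print(key_name(rows[0]["key"], rows[0]["mode"]))
```

To read the actual file, the `io.StringIO(SAMPLE)` argument could be replaced with a file handle opened on spotify-data.csv.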
Distribution
The dataset is provided in CSV format (spotify-data.csv) and is approximately 27.21 MB in size. All 19 columns have roughly 170,000 valid records (169,909 distinct track identifiers), meaning no column has missing values.
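The completeness claim above could be checked with a sketch along these lines. `completeness_report` is a hypothetical helper, and the three-column sample is a toy input with one deliberately empty cell, not an excerpt of the dataset.

```python
import csv
import io
from collections import Counter


def completeness_report(lines):
    """Return (row_count, column_names, missing_counts) for CSV input."""
    reader = csv.DictReader(lines)
    missing = Counter()
    row_count = 0
    for row in reader:
        row_count += 1
        for col in reader.fieldnames:
            value = row.get(col)
            # Treat absent or blank cells as missing.
            if value is None or not value.strip():
                missing[col] += 1
    return row_count, reader.fieldnames, dict(missing)


# Toy input: the second row has an empty "name" cell.
sample = io.StringIO("id,name,tempo\na1,Song A,120.0\na2,,98.5\n")
row_count, columns, missing = completeness_report(sample)
print(row_count, len(columns), missing)
```

Run against the real file (for example `open("spotify-data.csv", newline="", encoding="utf-8")`, where the UTF-8 encoding is an assumption), the report would be expected to show 19 columns and an empty `missing` dictionary if the description holds.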
Usage
This dataset is ideal for:
- Developing and refining PySpark, SQL, or machine learning capabilities.
- Analysing trends in music characteristics over time.
- Building recommendation systems based on audio features.
- Exploring the relationship between musical elements and track popularity.
- Conducting research into audio processing and musicology.
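As one illustration of the SQL practice and trend analysis uses listed above, the sketch below averages an audio feature and popularity per decade using Python's built-in `sqlite3` module. The `tracks` table name, the reduced set of columns, and the five inserted rows are all hypothetical; a real analysis would load the full CSV first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tracks (id TEXT, year INTEGER, energy REAL, popularity INTEGER)"
)

# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?, ?)",
    [
        ("t1", 1925, 0.20, 0),
        ("t2", 1928, 0.30, 2),
        ("t3", 1985, 0.60, 40),
        ("t4", 2015, 0.70, 60),
        ("t5", 2019, 0.80, 70),
    ],
)

# Integer division groups years into decades (1925 -> 1920).
query = """
SELECT (year / 10) * 10 AS decade,
       COUNT(*)         AS n_tracks,
       AVG(energy)      AS avg_energy,
       AVG(popularity)  AS avg_popularity
FROM tracks
GROUP BY decade
ORDER BY decade
"""
trend = conn.execute(query).fetchall()
for decade, n_tracks, avg_energy, avg_popularity in trend:
    print(decade, n_tracks, round(avg_energy, 2), round(avg_popularity, 1))
```

The same query shape should carry over to Spark SQL with minor changes, although integer division syntax differs between engines and would need checking there.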
Coverage
The dataset spans tracks released from 1921 to 2020, providing a broad historical scope of Spotify music data. Specific geographic or demographic coverage details are not available.
License
CC0: Public Domain
Who Can Use It
- Data scientists and analysts seeking real-world datasets for practice.
- Machine learning engineers interested in building models for music classification or recommendation.
- Students and educators working on academic projects or teaching data science courses.
- Music researchers exploring audio features and their evolution.
Dataset Name Suggestions
- Spotify Track Audio Features (1921-2020)
- Historical Spotify Song Metrics
- PySpark Music Analysis Dataset
- Global Spotify Track Characteristics
- Comprehensive Music Feature Dataset
Attributes
Original Data Source: Historical Spotify Song Metrics