Opendatabay APP

YouTube Content Performance Data

Social Media and Posts

Tags and Keywords

Youtube

Metadata

Data

Science

Video

Channels

Free

About

This collection profiles the metadata for videos sourced from approximately 60 distinct Data Science YouTube channels. It offers essential metrics and descriptive attributes for each video, serving as a foundational resource for analysing digital content performance and trends within the data science education sphere. The dataset is suitable for large-scale statistical analysis and the development of machine learning models focused on content classification and metric prediction.

Columns

The dataset includes 21 attributes detailing video and channel characteristics:
  • channelId: The unique identification code assigned to the YouTube channel.
  • channelTitle: The publicly displayed name of the YouTube channel (e.g., Packt Video, ExcelIsFun).
  • videoId: The unique identifier for the specific video.
  • publishedAt: The publication date and time of the video, formatted as yyyy-mm-ddThh:mm:ssZ.
  • videoTitle: The exact title used for the video.
  • videoDescription: The descriptive text provided by the creator for the video.
  • videoCategoryId: The internal numerical code identifying the video's category.
  • videoCategoryLabel: The human-readable label for the video's category (e.g., Science & Technology, Education).
  • duration: The video duration in ISO 8601 format.
  • durationSec: The video duration converted into seconds.
  • dimension: Describes the video's dimensions (typically '2d').
  • definition: The quality of the video, generally listed as 'hd' or 'sd'.
  • caption: A boolean field indicating if the video includes captions.
  • viewCount: The total number of times the video has been viewed.
  • likeCount: The total number of likes received.
  • dislikeCount: The total number of dislikes received.
  • favoriteCount: The count of times the video has been marked as a favourite (currently zero for nearly all records).
  • commentCount: The total number of public comments posted on the video.
  • licensedContent: Indicates if the content is licensed.
  • publishedAtSQL: The publication date formatted for SQL querying.
  • thumbnail_maxres: The URL link to the maximum resolution thumbnail image.
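
The relationship between duration, durationSec and publishedAt can be illustrated with a short sketch. It assumes duration values follow the common YouTube style of ISO 8601 (for example PT1H2M3S, with optional days but no weeks, months or fractional seconds) and that publishedAt matches the yyyy-mm-ddThh:mm:ssZ pattern exactly. The function names are hypothetical and are not part of the dataset or of any library.

```python
import re
from datetime import datetime, timezone

# Assumed subset of ISO 8601 durations: optional days, hours, minutes, seconds.
_DURATION_RE = re.compile(
    r"P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?"
)


def iso8601_duration_to_seconds(value: str) -> int:
    """Convert a duration such as 'PT1H2M3S' to whole seconds."""
    match = _DURATION_RE.fullmatch(value)
    if match is None or value in ("P", "PT"):
        raise ValueError(f"unsupported duration: {value!r}")
    parts = {name: int(text) if text else 0
             for name, text in match.groupdict().items()}
    return (parts["days"] * 86400 + parts["hours"] * 3600
            + parts["minutes"] * 60 + parts["seconds"])


def parse_published_at(value: str) -> datetime:
    """Parse a 'yyyy-mm-ddThh:mm:ssZ' timestamp as a UTC-aware datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)
```

A value computed this way can be compared against the durationSec column as a consistency check. If the file contains timestamps with fractional seconds, the format string would need adjusting.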

Distribution

The data is provided as a single file, data-science-youtube-channel-videos-metadata.csv, of approximately 59.08 MB. It has 21 columns and over 44,300 records of video metadata. Updates are expected monthly.
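
As a rough sketch of reading the file with the Python standard library, the example below streams rows and converts the count columns to integers. It assumes the CSV has a header row using the column names listed under Columns and that missing counts appear as empty cells; neither point is confirmed by this listing. The helper names and the sample rows are made up for illustration.

```python
import csv
import io
from typing import Iterator, Optional

# Columns expected to hold whole numbers (names taken from the column list).
COUNT_COLUMNS = ("durationSec", "viewCount", "likeCount",
                 "dislikeCount", "favoriteCount", "commentCount")


def to_int(value: Optional[str]) -> Optional[int]:
    """Return an int, or None when the cell is empty or not numeric."""
    text = (value or "").strip()
    if not text:
        return None
    try:
        return int(float(text))
    except (ValueError, OverflowError):
        return None


def read_video_rows(lines) -> Iterator[dict]:
    """Yield one dict per CSV record with count columns converted to int."""
    for row in csv.DictReader(lines):
        for column in COUNT_COLUMNS:
            if column in row:
                row[column] = to_int(row[column])
        yield row


# Made-up sample standing in for the real file.
SAMPLE_CSV = (
    "channelTitle,videoId,viewCount,likeCount,durationSec\n"
    "Example Channel,abc123,1000,50,600\n"
    "Example Channel,def456,250,,90\n"
)

sample_rows = list(read_video_rows(io.StringIO(SAMPLE_CSV)))
```

For the real file, the same function can be given an open file handle, for example open("data-science-youtube-channel-videos-metadata.csv", newline="", encoding="utf-8"); the UTF-8 encoding is an assumption.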

Usage

This dataset is ideal for several applications, including:
  • Sentiment Analysis: Applying natural language processing techniques to video titles and descriptions. Comment text is not among the listed columns; only commentCount is available.
  • Video Categorisation: Training models to categorise YouTube videos based on metrics and textual content.
  • Machine Learning Development: Using algorithms like Recurrent Neural Networks (RNNs) to generate realistic YouTube descriptions.
  • Statistical Performance Benchmarking: Analysing popularity metrics such as total views, likes, and comments to rank channels or videos.
  • Exploratory Data Analysis (EDA): Investigating correlations between video length, definition, publication date, and viewer engagement.
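
The benchmarking and EDA uses above can be sketched in a few lines. The example ranks channels by total views and computes a Pearson correlation between two numeric columns. It assumes rows are dicts keyed by the column names with counts already converted to integers (or None when missing); the function names and sample values are invented for illustration and do not come from the dataset.

```python
from collections import defaultdict
from math import sqrt
from typing import Iterable, List, Sequence, Tuple


def rank_channels_by_views(rows: Iterable[dict]) -> List[Tuple[str, int]]:
    """Sum viewCount per channelTitle and sort from most to least viewed."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["channelTitle"]] += row.get("viewCount") or 0
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Invented rows, not real records from the dataset.
SAMPLE_ROWS = [
    {"channelTitle": "Channel A", "viewCount": 1000, "durationSec": 600},
    {"channelTitle": "Channel A", "viewCount": 500, "durationSec": 300},
    {"channelTitle": "Channel B", "viewCount": 300, "durationSec": 120},
    {"channelTitle": "Channel B", "viewCount": None, "durationSec": 45},
]

ranking = rank_channels_by_views(SAMPLE_ROWS)

# Correlate duration with views, skipping rows where either value is missing.
pairs = [(r["durationSec"], r["viewCount"]) for r in SAMPLE_ROWS
         if r["durationSec"] is not None and r["viewCount"] is not None]
duration_view_corr = pearson([d for d, _ in pairs], [v for _, v in pairs])
```

View counts on video platforms are typically heavily skewed, so a rank-based measure or a log transform may be more informative than a raw Pearson coefficient on the full dataset.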

Coverage

The data covers videos published between 23 July 2006 and 28 August 2020. It is limited to video metadata captured from roughly 60 specific Data Science YouTube channels. The primary content categories represented are Science & Technology and Education.

License

CC BY-NC-SA 4.0

Who Can Use It

  • Data Scientists and Researchers: To perform statistical analysis on digital education content and audience engagement patterns.
  • Machine Learning Engineers: To train models for prediction (e.g., predicting view count) or text generation.
  • Content Strategists: To understand which video formats (duration, category, quality) drive the highest engagement metrics on YouTube.

Dataset Name Suggestions

  1. Data Science YouTube Channel Metrics
  2. YouTube Content Performance Data
  3. Video Metadata for Data Analysts
  4. DS Channel Engagement Catalogue

Attributes

Original Data Source: YouTube Content Performance Data

Listing Stats

VIEWS

4

DOWNLOADS

0

LISTED

25/11/2025

REGION

GLOBAL

UDQS QUALITY

5 / 5

VERSION

1.0
