YouTube Video Title Analysis Dataset
E-commerce & Online Transactions
About
This dataset collects video titles from 5-Minute Crafts, a popular YouTube channel owned by TheSoul Publishing. As of October 2021, the channel was the 9th most-subscribed and one of the most-viewed channels on the platform [1]. Although known for its DIY-style content, 5-Minute Crafts has been criticized for unusual or potentially risky 'life hacks' and for heavy use of clickbait [1]. Its videos nonetheless consistently draw a high volume of views [1]. The dataset pairs each video's title with meta-features such as total views, video duration, and the sentiment associated with the title [1]. It is intended for exploring the relationship between the words used in titles and the views they attract, identifying the title features that most affect viewership, and examining correlations among title meta-features, total views, duration, and sentiment [1].
Columns
- video_id: A unique identifier for each video [2].
- title: The textual title of the video [2].
- active_since_days: The number of days the video has been active [2].
- duration_seconds: The length of the video in seconds [2].
- total_views: The overall count of views for the video [2].
- num_chars: The total number of characters present in the video title [2].
- num_words: The total count of words within the video title [2].
- num_punctuation: The number of punctuation marks in the title [2].
- num_words_uppercase: The count of words written entirely in uppercase within the title [2].
- num_words_lowercase: The count of words written entirely in lowercase within the title [2].
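The title-derived columns can be approximated directly from a title string. The following is a minimal sketch, not the dataset author's method: it assumes whitespace tokenization and ASCII punctuation (Python's `string.punctuation`), and the helper name `title_meta_features` is hypothetical. The source does not document the exact counting rules, so values may differ from the published columns.

```python
import string


def title_meta_features(title: str) -> dict:
    """Approximate the title meta-feature columns for a single title.

    Assumptions (not confirmed by the source): words are split on
    whitespace, and punctuation means the ASCII characters in
    string.punctuation.
    """
    words = title.split()
    return {
        "num_chars": len(title),
        "num_words": len(words),
        "num_punctuation": sum(ch in string.punctuation for ch in title),
        # str.isupper()/islower() ignore digits and punctuation, so a
        # purely numeric token such as "25" counts as neither.
        "num_words_uppercase": sum(w.isupper() for w in words),
        "num_words_lowercase": sum(w.islower() for w in words),
    }


# Made-up example title, for illustration only.
print(title_meta_features("25 COOL hacks for your HOME!"))
```

Under these assumptions a token with attached punctuation, such as "HOME!", still counts as an uppercase word.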
Distribution
The dataset comprises 4,978 unique video records from the 5-Minute Crafts YouTube channel, with 4,965 unique video titles [2].
- Video Duration: Durations range from about 1 second to 1,460 seconds (about 24 minutes), with the largest concentration between 1,022.30 and 1,168.20 seconds (roughly 17 to 19.5 minutes) [3].
- Total Views: View counts range from 4,034 to about 283 million, with the largest concentration in the lowest range, between 4,034 and roughly 28.3 million views [4, 5].
- Title Characters: Titles contain between 11 and 100 characters, most commonly 38 to 46 (the 37.70 to 46.60 range) [5, 6].
- Title Words: Titles have between 3 and 20 words, most commonly 7 or 8 (the 6.40 to 8.10 range) [6, 7].
- Punctuation: Titles contain between 0 and 6 punctuation marks, most commonly none (the 0 to 0.60 range) [7].
- Uppercase Words: Titles contain between 0 and 18 uppercase words, most commonly 6 or 7 (the 5.40 to 7.20 range) [7, 8].
- Lowercase Words: Titles contain between 0 and 12 lowercase words, most commonly 0 or 1 (the 0 to 1.20 range) [8].
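The fractional bounds quoted in the list above are consistent with each column's minimum-to-maximum range being split into ten equal-width histogram bins. That binning is an inference from the numbers, not something the source states. The sketch below reproduces the bin edges under that assumption; `equal_width_bins` is a hypothetical helper.

```python
def equal_width_bins(lo, hi, n_bins=10):
    """Return (start, end) pairs for n_bins equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / n_bins
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n_bins)]


# Minimum and maximum values as reported in the list above.
reported_ranges = {
    "duration_seconds": (1, 1460),
    "num_chars": (11, 100),
    "num_words": (3, 20),
    "num_words_uppercase": (0, 18),
}

for column, (lo, hi) in reported_ranges.items():
    for start, end in equal_width_bins(lo, hi):
        print(f"{column}: {start:.2f} to {end:.2f}")
```

For example, the fourth bin for num_chars spans 37.70 to 46.60 and the eighth bin for duration_seconds spans 1022.30 to 1168.20, matching the ranges reported above.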
Usage
This dataset is well-suited for various analytical and modelling tasks, including:
- Investigating the correlation between specific words used in titles and the total views generated [1].
- Identifying which features of a video title are most impactful in driving views [1].
- Exploring the relationships between title meta-features (like character or word count), total views, video duration, and sentiment [1].
- Developing predictive models for video performance based on title characteristics.
- Performing natural language processing (NLP) tasks on video titles [1].
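As a starting point for the correlation tasks listed above, the following sketch computes a Pearson correlation between one title meta-feature and log-scaled views. It is illustrative only: the rows are made-up placeholders rather than values from the dataset, and taking the logarithm of `total_views` is an assumption motivated by the wide reported view range, not a step prescribed by the source.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical rows using the dataset's column names; the numbers are
# invented for illustration and carry no meaning.
rows = [
    {"num_words": 5, "total_views": 120_000},
    {"num_words": 7, "total_views": 950_000},
    {"num_words": 8, "total_views": 2_400_000},
    {"num_words": 12, "total_views": 310_000},
]

num_words = [row["num_words"] for row in rows]
log_views = [math.log10(row["total_views"]) for row in rows]
print(f"r(num_words, log10 views) = {pearson(num_words, log_views):.3f}")
```

With the real data, the same function can be applied to any of the numeric columns. A rank-based measure such as Spearman correlation may be a more robust choice if the view counts are heavily skewed.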
Coverage
The dataset focuses on videos from the 5-Minute Crafts YouTube channel [2].
- Geographic Scope: The data is globally relevant, reflecting the channel's international reach [9].
- Time Range: The dataset includes an 'active since days' column for each video, indicating its age, though specific calendar dates for data collection are not provided [1, 2].
License
CC0
Who Can Use It
This dataset is ideal for:
- Data Scientists and Analysts: For developing and testing models related to content engagement and virality.
- Content Creators and Marketers: To gain insights into effective title strategies and audience engagement on YouTube.
- Researchers: Studying online media trends, clickbait phenomena, and the dynamics of popular DIY content.
- AI/ML Developers: For training and validating NLP models on large-scale text data related to video titles [1].
Dataset Name Suggestions
- 5-Minute Crafts YouTube Performance Data
- YouTube Video Title Analysis Dataset
- Clickbait and Views Dataset
- DIY Content Engagement Metrics
- Online Video Analytics Data
Attributes
Original Data Source: 5-Minute Crafts: Video Clickbait Titles?