Entertainment Audience Behaviour Data
Free
About
This dataset covers movies and television shows and provides insights into their performance. It is designed to support machine learning models that predict the success rates of streaming movies and TV shows. The data captures information related to users and content, which is valuable for understanding audience preferences and viewing behaviours across various content types.
Columns
- id: A unique identifier assigned to each user within the dataset.
- original_title: Contains the names of the movies and TV shows. Approximately 25% of entries for this column are missing, with "Murder Mystery 2" being the most frequently occurring title, representing 5% of the valid entries.
- original_language: Indicates the language in which a movie or TV show was initially produced and released. English is the most common language, making up 75% of entries, followed by Japanese at 10%. There are 5 unique languages represented.
- release_date: Specifies the date when a movie or TV show was made available to the public. This column has about 25% missing values, and the dates range from 25th May 2019 to 6th April 2023.
- popularity: Reflects the level of public interest and engagement associated with a particular movie or TV show. Values range from approximately 30.4 to about 10,200, with a mean of 934.
- vote_average: Represents the average rating given to a movie or TV show by viewers or critics. Ratings span from 4.8 to 10, with an average score of 7.54.
- vote_count: The total number of votes or ratings a movie or TV show has received from its audience or critics. Counts vary from 3 to about 8,700, with an average of about 1,040.
- media_type: Identifies the type of content, categorised as either a 'movie' or a 'tv' show. Movies account for 75% of the entries, while TV shows make up 25%.
- adult: A boolean attribute indicating whether the user is an adult. All 16,100 records in this dataset are marked as 'false', suggesting the dataset focuses on non-adult user interactions or content classification.
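The percentages quoted in the column descriptions above can be re-checked once the file is loaded. The following is a minimal sketch using only the Python standard library. The helper names (`load_rows`, `missing_share`, `value_shares`) are hypothetical, and the four inline rows are a made-up stand-in for the real file: their titles (other than "Murder Mystery 2"), language codes, date format, and values are assumptions, not actual records.

```python
import csv
import io
from collections import Counter


def load_rows(handle):
    """Read CSV rows as dicts keyed by column name."""
    return list(csv.DictReader(handle))


def missing_share(rows, column):
    """Fraction of rows whose value for `column` is empty or absent."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if not (r.get(column) or "").strip())
    return missing / len(rows)


def value_shares(rows, column):
    """Share of each distinct non-missing value in `column`."""
    values = [r[column].strip() for r in rows if (r.get(column) or "").strip()]
    total = len(values)
    return {value: n / total for value, n in Counter(values).items()}


# Hypothetical sample: NOT real records. The language codes ("en", "ja"),
# the ISO date format, and the "False" spelling are assumptions.
SAMPLE = """id,original_title,original_language,release_date,popularity,vote_average,vote_count,media_type,adult
1,Murder Mystery 2,en,2023-03-28,1200.5,6.5,900,movie,False
2,,en,,80.0,8.1,150,tv,False
3,Example Title A,ja,2022-01-15,45.2,7.9,30,movie,False
4,Example Title B,en,2021-07-02,300.0,7.0,410,movie,False
"""

rows = load_rows(io.StringIO(SAMPLE))
title_missing = missing_share(rows, "original_title")
media_shares = value_shares(rows, "media_type")
print(title_missing)
print(media_shares)
```

To run the same checks on the actual file, pass an open file handle to `load_rows` instead of the in-memory sample, for example `open("trending.csv", newline="", encoding="utf-8")`; the encoding here is an assumption.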
Distribution
The dataset is provided as a trending.csv file, with a size of 1.12 MB. It consists of 10 columns and contains 16,100 individual records. The data is structured to allow for detailed analysis of entertainment content.
Usage
This dataset is ideally suited for developing and testing machine learning models aimed at predicting the success rate of streaming movies and TV shows. It can also be utilised for in-depth analysis to understand audience preferences and viewing behaviours across diverse content categories. Researchers, data scientists, and content strategists can leverage this data to gain insights into factors influencing entertainment success.
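As a rough illustration of the modelling use case, the sketch below fits a tiny logistic regression with plain-Python gradient descent. The dataset description does not define "success", so the label used here (vote_average at or above 7.5) is purely an assumption, as are the choice of features, the function names, and the four illustrative rows, which are not real records.

```python
import math


def to_features(row):
    """Log-scale the two heavy-tailed numeric columns."""
    return [math.log1p(float(row["popularity"])),
            math.log1p(float(row["vote_count"]))]


def to_label(row, threshold=7.5):
    """Hypothetical success label: 1 if vote_average meets the threshold."""
    return 1 if float(row["vote_average"]) >= threshold else 0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, x):
    """Predicted probability of the positive class for one feature vector."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def log_loss(weights, bias, X, y):
    """Mean binary cross-entropy over the given examples."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict(weights, bias, xi)
        total -= yi * math.log(p + eps) + (1 - yi) * math.log(1 - p + eps)
    return total / len(X)


def fit_logistic(X, y, lr=0.01, epochs=200):
    """Full-batch gradient descent on the logistic loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict(weights, bias, xi) - yi
            for j, v in enumerate(xi):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical rows for illustration only; not taken from trending.csv.
sample_rows = [
    {"popularity": "1200.5", "vote_average": "6.5", "vote_count": "900"},
    {"popularity": "80.0", "vote_average": "8.1", "vote_count": "150"},
    {"popularity": "45.2", "vote_average": "7.9", "vote_count": "30"},
    {"popularity": "300.0", "vote_average": "7.0", "vote_count": "410"},
]

X = [to_features(r) for r in sample_rows]
y = [to_label(r) for r in sample_rows]

initial_loss = log_loss([0.0, 0.0], 0.0, X, y)
weights, bias = fit_logistic(X, y)
final_loss = log_loss(weights, bias, X, y)
probabilities = [predict(weights, bias, xi) for xi in X]
print(initial_loss, final_loss)
```

On the real data, a held-out test split and a different definition of success (for example one based on popularity) would likely be needed; the small learning rate is chosen only so that the loss decreases steadily on log-scaled features.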
Coverage
The dataset's coverage spans content released between 25th May 2019 and 6th April 2023, based on the release_date attribute. It is described as a finite dataset, meaning it does not receive ongoing updates. While it contains information related to users, specific geographic or broad demographic scopes beyond the 'adult' flag (which uniformly indicates 'false') are not detailed.
License
CC0: Public Domain
Who Can Use It
- Machine Learning Engineers: For building and refining models to predict content success.
- Data Analysts: To explore trends in audience engagement and content performance.
- Content Strategists: To inform decision-making regarding content acquisition, production, and marketing based on audience preferences.
- Academic Researchers: For studies on media consumption, audience behaviour, and entertainment industry dynamics.
Dataset Name Suggestions
- Streaming Entertainment Performance Insights
- Movie & TV Show Audience Analytics
- Content Success Predictor Dataset
- Entertainment Audience Behaviour Data
Attributes
Original Data Source: Entertainment Audience Behaviour Data