IMDb Film Insights 25k
Free
About
This dataset contains about 25,000 entries of IMDb movie data scraped from the IMDb.com website. It covers movie titles, run times, ratings, genres, and cast details, which makes it a useful resource for analysing movie characteristics and audience reception.
Columns
- movie title: The title of the film as listed on IMDb. (23,922 unique values)
- Total Run Time: The duration of the movie. (35% are 'not-released' or similar, 1,556 unique values)
- Movie Rating: The average user rating given to the movie. (91 unique values, 7% are 'no-rating')
- User Rating: The total count of users who rated the movie. (1,684 unique values, 7% are '0')
- Genres: The categories or types of the movie, often listed as an array. (746 unique values, 'Drama' is most common)
- Overview: A short synopsis or description of the movie. (24,000 unique values, 1% missing)
- Plot Kyeword: Keywords associated with the movie's plot. (21,500 unique values, 7% are empty)
- Director: The name of the movie's director. (11,604 unique values)
- Top 5 Casts: The names of the top five actors in the movie. (24,211 unique values)
- Writer: The name of the movie's writer. (15,600 unique values)
- year: The release year of the movie. (250 unique values, 3% missing, up to 2022)
- path: The IMDb movie URL path. (23,922 unique values)
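Several columns above mix real values with placeholder strings, so a loading step usually has to normalise them before any numeric work. The following is a minimal sketch using only the Python standard library. It assumes the column headers match the names listed above, that the placeholders are spelled exactly 'not-released' and 'no-rating', and that ratings are plain decimals; the sample rows are invented stand-ins for the real file.

```python
import csv
import io

# Placeholder strings reported in the column list; exact spellings are assumptions.
PLACEHOLDERS = {"", "not-released", "no-rating"}

def clean_value(raw):
    """Return None for placeholder or empty cells, else the stripped text."""
    text = raw.strip()
    return None if text.lower() in PLACEHOLDERS else text

def to_float(raw):
    """Parse a rating such as '7.4'; return None if it is missing or not numeric."""
    text = clean_value(raw)
    if text is None:
        return None
    try:
        return float(text)
    except ValueError:
        return None

def read_movies(handle):
    """Yield one dict per row with placeholders replaced by None."""
    for row in csv.DictReader(handle):
        yield {
            "title": clean_value(row["movie title"]),
            "run_time": clean_value(row["Total Run Time"]),
            "rating": to_float(row["Movie Rating"]),
            "year": clean_value(row["year"]),
        }

# Tiny inline sample standing in for the real file (invented rows).
SAMPLE = """movie title,Total Run Time,Movie Rating,year
Example Film,2 hours,7.4,2019
Upcoming Film,not-released,no-rating,
"""

movies = list(read_movies(io.StringIO(SAMPLE)))
```

Mapping placeholders to None rather than 0 keeps unreleased or unrated films from skewing averages.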
Distribution
The dataset is provided as a CSV file of approximately 12.52 MB. It contains approximately 24,400 records, each with 12 columns.
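A quick shape check after download can confirm that a copy of the file matches these figures. The sketch below counts records and columns with the standard library; the helper name `summarize` and the file name mentioned in the comment are hypothetical, and the in-memory sample is an invented stand-in.

```python
import csv
import io

EXPECTED_COLUMNS = 12  # column count stated in the listing

def summarize(handle):
    """Return (record_count, column_names) for an open CSV file handle."""
    reader = csv.reader(handle)
    header = next(reader)
    record_count = sum(1 for _ in reader)
    return record_count, header

# Demonstration on an in-memory stand-in. For the real data, pass
# open("imdb_25k.csv", newline="", encoding="utf-8") instead, where
# "imdb_25k.csv" is a hypothetical file name.
demo = io.StringIO("movie title,year\nExample Film,2019\nAnother Film,2020\n")
count, columns = summarize(demo)
matches_listing = len(columns) == EXPECTED_COLUMNS
```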
Usage
This dataset is well-suited for a variety of applications, including:
- Academic research into film trends, audience behaviour, and cinematic attributes.
- Developing Deep Learning models for movie analysis or content generation.
- Building Recommender Systems to suggest films based on user preferences or movie characteristics.
- Natural language processing (for example with NLTK) of movie overviews, plot keywords, and titles.
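As an illustration of the recommender use case, the sketch below ranks films by genre overlap using Jaccard similarity. This is one simple content-based approach, not a method prescribed by the dataset. It assumes the Genres column has already been parsed into a set of strings per film; the function names, titles, and genre sets are all invented for the example.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(title, catalog, top_n=3):
    """Rank other titles by genre overlap with `title`, highest first."""
    target = catalog[title]
    scored = [
        (jaccard(target, genres), other)
        for other, genres in catalog.items()
        if other != title
    ]
    # Sort by descending score, then by title for a stable order on ties.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Drop films that share no genre with the target.
    return [other for score, other in scored[:top_n] if score > 0]

# Invented titles and genres, for illustration only.
CATALOG = {
    "Film A": {"Drama", "Crime"},
    "Film B": {"Drama", "Crime", "Thriller"},
    "Film C": {"Comedy"},
    "Film D": {"Drama"},
}

suggestions = recommend("Film A", CATALOG)
```

The same scoring works on plot keywords or cast names once they are split into sets, and the scores can be combined with ratings to break ties.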
Coverage
The dataset covers movies with release years up to 2022. It does not itemise geographic or demographic details, but the data originates from IMDb.com, which generally lists films and associated user information from around the world.
License
CC BY-NC-SA 4.0
Who Can Use It
- Researchers studying film industry trends, audience sentiment, and narrative structures.
- Data Scientists and Machine Learning Engineers for training models in recommendation engines or content classification.
- Students undertaking projects related to data analysis, web scraping, or natural language processing.
- Film Enthusiasts interested in exploring movie details and statistics.
Dataset Name Suggestions
- IMDb Film Insights 25k
- Movie Database from IMDb
- Global Film Ratings & Details
- 25,000 IMDb Movie Entries
- Cinematic Data Collection
Attributes
Original Data Source: IMDb Film Insights 25k