Two Decades of IMDb's Most Popular Films
About
This dataset collects the 100 most popular films for each year from 2003 to 2022, as determined by IMDb. It covers 2000 movies in total, with 13 features each, and was gathered by web scraping IMDb directly. It is well suited to exploratory data analysis of film popularity, trends, and characteristics over two decades.
Columns
- Title: The official name of the film.
- Rating: The film's average rating based on IMDb user votes.
- Year: The year in which the film was originally released.
- Month: The month of the film's release.
- Certificate: The film's certification (e.g., R, PG-13).
- Runtime: The duration of the film, measured in minutes.
- Directors: The individual or individuals credited with directing the film.
- Stars: The main actors featuring in the film.
- Genre: The categorisation of the film based on its stylistic and narrative conventions (e.g., Action, Adventure, Sci-Fi).
- Filming_location: The primary geographical location(s) where the film was shot.
- Budget: The estimated financial investment made in producing the film.
- Income: The total revenue generated by the film.
- Country_of_origin: The country or countries responsible for the film's production.
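As a quick sanity check before analysis, the column list above can be compared against the header of the downloaded file. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses exactly the column names listed above; the helper name `missing_columns` is hypothetical and not part of the dataset.

```python
import csv

# Column names as listed in this description. That the file header matches
# them exactly (spelling and capitalisation) is an assumption.
EXPECTED_COLUMNS = [
    "Title", "Rating", "Year", "Month", "Certificate", "Runtime",
    "Directors", "Stars", "Genre", "Filming_location", "Budget",
    "Income", "Country_of_origin",
]


def missing_columns(csv_lines):
    """Return the expected column names that are absent from the CSV header.

    `csv_lines` is any iterable of text lines, such as an open file object.
    """
    reader = csv.DictReader(csv_lines)
    header = reader.fieldnames or []  # None when the input is empty
    return [name for name in EXPECTED_COLUMNS if name not in header]
```

To use it, open the file with `open("movies.csv", newline="", encoding="utf-8")` and pass the file object to `missing_columns`. An empty list means every listed column is present. The UTF-8 encoding is also an assumption.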
Distribution
The dataset is structured as a tabular file named movies.csv, with a size of 397.29 kB. It contains 2000 individual film records and includes all 13 specified columns for each entry.
Usage
This dataset is ideal for a variety of applications, including:
- Exploratory data analysis to uncover trends in film popularity and characteristics.
- Data analytics projects focusing on the entertainment industry.
- Data visualisation tasks to illustrate film performance and attributes.
- Data cleaning exercises, offering diverse data types for practice.
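As one illustration of the exploratory analysis mentioned above, the sketch below averages the Rating column per release year. It is only a starting point under stated assumptions: rows are dictionaries as produced by `csv.DictReader`, Rating is stored as text that parses as a number, and rows where it does not parse are skipped. The function name `mean_rating_by_year` is hypothetical.

```python
from collections import defaultdict


def mean_rating_by_year(rows):
    """Return {year: mean rating} for rows that have a numeric Rating.

    `rows` is an iterable of dicts keyed by column name. Year values are
    kept as they appear in the data (text when read with csv.DictReader).
    """
    totals = defaultdict(lambda: [0.0, 0])  # year -> [sum of ratings, count]
    for row in rows:
        try:
            rating = float(row["Rating"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric rating
        entry = totals[row.get("Year")]
        entry[0] += rating
        entry[1] += 1
    return {year: total / count for year, (total, count) in totals.items()}
```

Skipping unparsable ratings keeps the sketch short. For a data cleaning exercise, it may be more useful to count and inspect those rows instead of dropping them silently.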
Coverage
The dataset covers films released from 2003 to 2022, specifically the 100 most popular movies for each of those years. Filming locations include the USA and Canada, among others, and countries of origin include the United States and the United Kingdom. Data on film certifications and user ratings gives a view into content classification and audience reception.
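The stated coverage of 100 films for each year from 2003 to 2022 can be checked against the file. The following is a minimal sketch, assuming rows are dictionaries keyed by column name and that the Year column holds a plain four-digit year as text. The function name `year_count_problems` and its parameters are hypothetical.

```python
from collections import Counter


def year_count_problems(rows, first_year=2003, last_year=2022, expected=100):
    """Return {year: actual_count} for years whose row count is not `expected`.

    Rows whose Year falls outside the range, or is not a plain year,
    are ignored by this check.
    """
    counts = Counter((row.get("Year") or "").strip() for row in rows)
    problems = {}
    for year in range(first_year, last_year + 1):
        actual = counts.get(str(year), 0)
        if actual != expected:
            problems[year] = actual
    return problems
```

An empty result means every year in the range has exactly the expected number of records. With the defaults, that corresponds to the 2000 records described for this dataset.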
License
CC0: Public Domain
Who Can Use It
This dataset is particularly suitable for:
- Data analysts and data scientists interested in movie industry trends.
- Researchers studying film characteristics, audience preferences, or box office performance.
- Students and educators using real-world data for learning and teaching data science principles, including exploratory analysis, visualisation, and cleaning.
Dataset Name Suggestions
- IMDb's Top Films (2003-2022)
- Popular Movies Dataset (IMDb 2003-2022)
- Two Decades of IMDb's Most Popular Films
- Global Film Ratings and Details (2003-2022)
Attributes
Original Data Source: Two Decades of IMDbs Most Popular Films