Global Film Metrics Dataset
Free
About
This dataset holds detailed information on 4,803 films. It is designed for in-depth analysis of the film industry, including financial performance, genre trends, and audience engagement, and it provides insight into movie characteristics, production details, and critical reception.
Columns
The dataset comprises 24 distinct columns, each detailing a specific aspect of a movie:
- index: An integer serving as the dataframe's index.
- budget: An integer representing the production budget of the movie.
- genres: An object containing the genres associated with the movie.
- homepage: An object storing the official homepage URL for the movie, if available.
- id: A unique integer identifier for each movie.
- keywords: An object listing keywords related to the movie's theme or content.
- original_language: An object indicating the movie's original language.
- original_title: An object providing the movie's original title.
- overview: An object containing a brief summary or synopsis of the movie.
- popularity: A float value representing the movie's popularity score.
- production_companies: An object detailing the companies involved in the movie's production.
- production_countries: An object specifying the countries where the movie was produced.
- release_date: An object indicating the movie's release date.
- revenue: An integer representing the revenue generated by the movie.
- runtime: A float value indicating the duration of the movie in minutes.
- spoken_languages: An object listing the languages spoken within the movie.
- status: An object describing the movie's current status (e.g., Released, Rumored).
- tagline: An object containing the movie's tagline or slogan.
- title: An object providing the movie's main title.
- vote_average: A float value representing the average rating given to the movie by users.
- vote_count: An integer indicating the total number of votes received by the movie.
- cast: An object detailing the cast members of the movie.
- crew: An object detailing the crew members involved in making the movie.
- director: An object specifying the director of the movie.
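The column list above can be turned into a quick schema check before any analysis. The following is a minimal sketch, not part of the dataset itself: the helper names `read_header` and `check_columns` are hypothetical, and it assumes the first row of the CSV file holds the column names exactly as listed.

```python
import csv
import io

# The 24 column names as documented in the list above.
EXPECTED_COLUMNS = [
    "index", "budget", "genres", "homepage", "id", "keywords",
    "original_language", "original_title", "overview", "popularity",
    "production_companies", "production_countries", "release_date",
    "revenue", "runtime", "spoken_languages", "status", "tagline",
    "title", "vote_average", "vote_count", "cast", "crew", "director",
]


def read_header(text_stream):
    """Return the first CSV row (assumed to be the header) as a list."""
    return next(csv.reader(text_stream))


def check_columns(header):
    """Return (missing, unexpected) names relative to the documented schema."""
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    unexpected = [name for name in header if name not in EXPECTED_COLUMNS]
    return missing, unexpected


# Demonstration on an in-memory header line built from the documented names.
sample = io.StringIO(",".join(EXPECTED_COLUMNS) + "\n")
missing, unexpected = check_columns(read_header(sample))
print("missing:", missing, "unexpected:", unexpected)
```

To run the same check against the real file, pass an open file handle (opened with `newline=""`) to `read_header` instead of the in-memory sample.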
Distribution
The dataset is provided as a CSV (Comma Separated Values) file named movie_dataset.csv, with a size of 23.43 MB. It consists of 4,803 individual movie records, each detailed across 24 columns, offering a structured representation of movie-related data.
Usage
This dataset is ideal for various analytical applications, including:
- Film Industry Analysis: Studying trends in movie budgets, revenues, and profitability.
- Genre Analysis: Investigating the popularity and financial performance of different film genres.
- Audience Engagement Studies: Analysing popularity scores, vote averages, and vote counts to understand audience reception.
- Data Visualisation: Creating visual representations of movie data to uncover patterns and insights.
- Exploratory Data Analysis: Conducting initial investigations to discover relationships and anomalies within film data.
- Predictive Modelling: Developing models to forecast movie success based on various attributes.
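As an illustration of the budget, revenue, and profitability use case listed above, the sketch below derives profit and return on investment from the `budget` and `revenue` columns. The helper names (`load_rows`, `to_int`, `profit_rows`) are hypothetical. Two assumptions are made that the dataset description does not confirm: the file is UTF-8 encoded, and a zero or blank budget or revenue means the figure is unknown, so such rows are skipped.

```python
import csv
import io


def load_rows(path):
    """Read the CSV into a list of dicts (UTF-8 encoding is an assumption)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def to_int(value):
    """Parse a numeric string to int; return None when it cannot be parsed."""
    try:
        return int(float(value))
    except (TypeError, ValueError, OverflowError):
        return None


def profit_rows(rows):
    """Compute profit and ROI per film, skipping rows without usable figures.

    Assumption: a budget or revenue of 0 (or blank) means "unknown".
    """
    results = []
    for row in rows:
        budget = to_int(row.get("budget"))
        revenue = to_int(row.get("revenue"))
        if not budget or not revenue:
            continue
        profit = revenue - budget
        results.append({
            "title": row.get("title", ""),
            "profit": profit,
            "roi": profit / budget,  # profit per unit of budget
        })
    return results


# Made-up rows for illustration only; these are not values from the dataset.
demo = io.StringIO("title,budget,revenue\nFilm A,100,250\nFilm B,0,50\n")
for item in profit_rows(csv.DictReader(demo)):
    print(item)
```

With the real file, `profit_rows(load_rows("movie_dataset.csv"))` would yield one entry per film with usable figures, which can then be sorted or aggregated as needed.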
Coverage
Release dates range from September 1916 to February 2017. The films were produced in various countries, with a majority (62%) originating from the United States of America. English is the most frequent original language (94%) and also the most common spoken language (66%). The dataset provides no demographic breakdowns, as it describes the films' attributes only.
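Coverage figures such as the release date range and the share of the most frequent language can be recomputed from the rows themselves. The sketch below is one way to do so; the helper names `release_range` and `top_share` are hypothetical, and the `YYYY-MM-DD` date format is an assumption about how `release_date` is stored. Rows with blank or unparseable values are ignored, so shares are computed over non-blank values only.

```python
from collections import Counter
from datetime import datetime


def release_range(rows, fmt="%Y-%m-%d"):
    """Return (earliest, latest) release dates, or None if none parse.

    Assumption: release_date is stored as YYYY-MM-DD; pass another fmt if not.
    """
    dates = []
    for row in rows:
        raw = (row.get("release_date") or "").strip()
        try:
            dates.append(datetime.strptime(raw, fmt).date())
        except ValueError:
            continue  # blank or differently formatted value
    return (min(dates), max(dates)) if dates else None


def top_share(rows, column):
    """Return (most common value, its share among non-blank values)."""
    counts = Counter((row.get(column) or "").strip() for row in rows)
    counts.pop("", None)  # ignore blank cells
    if not counts:
        return None
    value, count = counts.most_common(1)[0]
    return value, count / sum(counts.values())
```

Here `rows` is a list of dicts as produced by `csv.DictReader`; for example, `top_share(rows, "original_language")` returns the dominant language value and its share of the non-blank entries.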
License
CC0: Public Domain
Who Can Use It
This dataset is suitable for:
- Data Scientists and Analysts: For performing statistical analysis, building predictive models, and extracting actionable insights from movie data.
- Film Researchers and Academics: For studying film history, industry trends, and cultural impact.
- Students: As a practical resource for learning data analysis, programming, and data visualisation techniques.
- Film Enthusiasts: To explore detailed information about their favourite movies and discover new connections.
- Business Intelligence Professionals: To understand market trends and competitive landscapes within the entertainment sector.
Dataset Name Suggestions
- Global Film Metrics Dataset
- Cinema Financials and Characteristics
- Historical Movie Data Collection
- Film Industry Analysis Data
- Movie Statistics Compendium
Attributes
Original Data Source: Global Film Metrics Dataset