Decades of American Film Plots
Free
About
This dataset describes American films released from the 1970s through the 2020s. It contains approximately 18,000 movie titles, each with an image link and a plot summary. The data was compiled with the Wikipedia API, and films were selected by strict criteria on category membership and page structure.
Columns
The data structure includes three key fields:
- title: The name of the movie (contains 100% valid values in sampled decade files).
- image: A link to the associated movie image.
- plot: The narrative summary, taken from the 'Plot' section of the film's Wikipedia page (a 'Synopsis' or 'Summary' section does not qualify).
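As a minimal sketch of reading this three-column layout, the snippet below uses only Python's standard `csv` module. The helper name `load_records`, the in-memory sample, and its two rows are invented for illustration; the real file names and contents are not specified here, so treat them as assumptions.

```python
import csv
import io

# The three columns documented above.
EXPECTED_COLUMNS = ("title", "image", "plot")


def load_records(handle):
    """Read rows from an open CSV handle and check the documented columns.

    `load_records` is a hypothetical helper, not part of the dataset.
    """
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Keep only rows that have both a non-empty title and a non-empty plot.
    return [
        row
        for row in reader
        if (row.get("title") or "").strip() and (row.get("plot") or "").strip()
    ]


# Tiny in-memory stand-in for a decade file (invented rows, not real data).
SAMPLE_CSV = io.StringIO(
    "title,image,plot\n"
    'Example Film A,https://example.org/a.jpg,"A detective, retired, takes one last case."\n'
    "Example Film B,https://example.org/b.jpg,\n"
)

records = load_records(SAMPLE_CSV)
print(len(records), records[0]["title"])
```

To read an actual file, pass a handle opened with `open(path, newline="", encoding="utf-8")` in place of the in-memory sample.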
Distribution
The dataset is delivered in CSV format with three columns: title, image, and plot. It comprises nearly 18,000 records in total. Individual decade files vary in size; the 1970s sample file, for instance, contains 1,770 records.
Usage
The collection is well suited to text-based analytical tasks. Typical applications include natural language processing (NLP) and machine learning projects, notably the development of movie recommender systems.
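One way a plot-based recommender could work is to compare plot summaries with TF-IDF weights and cosine similarity. The sketch below is a standard-library illustration of that general technique, not a method prescribed by the dataset; the function names and the three short plots are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(plots):
    """Build one sparse TF-IDF vector (term -> weight) per plot."""
    docs = [Counter(tokenize(p)) for p in plots]
    n = len(docs)
    # Document frequency: each plot counts a term at most once.
    df = Counter(term for doc in docs for term in doc)
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(index, vectors):
    """Return the index of the plot most similar to the one at `index`."""
    scores = [
        (cosine(vectors[index], v), i)
        for i, v in enumerate(vectors)
        if i != index
    ]
    return max(scores)[1]


# Invented plots standing in for the dataset's plot column.
plots = [
    "A space crew fights an alien creature aboard their ship.",
    "A small-town sheriff hunts a shark terrorizing the beach.",
    "Colonists on a distant ship battle an alien hive in space.",
]
vectors = tfidf_vectors(plots)
print(most_similar(0, vectors))
```

At the full scale of roughly 18,000 plots, a vectorized library would likely be a better fit than this pairwise loop; the pure-Python version is kept only to make the steps visible.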
Coverage
The scope is limited to American movies released between the 1970s and the 2020s. To be included, a film needed a Wikipedia page listed in that decade's category for American films. The page also needed a 'Plot' section and an image that was not a Wikipedia placeholder. Because of these inclusion criteria, the data is a curated selection and does not contain every movie released during this period. The expected update frequency is annual.
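The three inclusion rules above can be expressed as a single predicate. The sketch below is illustrative only: the page dictionary shape, the `image_is_placeholder` flag, and the category name pattern (`"1970s American films"`) are assumptions, and the original compilation code may have checked these conditions differently.

```python
def meets_criteria(page, decade):
    """Return True if a page passes the three documented inclusion rules.

    `page` is a hypothetical dict with keys "categories", "sections",
    "image", and "image_is_placeholder"; `decade` is a label such as "1970s".
    """
    # Rule 1: the page appears in the decade's category for American films.
    in_category = f"{decade} American films" in page.get("categories", [])
    # Rule 2: the page has a section titled exactly 'Plot'.
    has_plot = "Plot" in page.get("sections", [])
    # Rule 3: the page has an image that is not a placeholder.
    has_real_image = bool(page.get("image")) and not page.get(
        "image_is_placeholder", False
    )
    return in_category and has_plot and has_real_image


# Invented pages for illustration.
accepted = {
    "categories": ["1970s American films"],
    "sections": ["Plot", "Cast"],
    "image": "https://example.org/poster.jpg",
    "image_is_placeholder": False,
}
synopsis_only = dict(accepted, sections=["Synopsis", "Cast"])
placeholder_image = dict(accepted, image_is_placeholder=True)

print(meets_criteria(accepted, "1970s"))
print(meets_criteria(synopsis_only, "1970s"))
print(meets_criteria(placeholder_image, "1970s"))
```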
License
CC BY-SA 3.0
Who Can Use It
- Data scientists and machine learning engineers looking to train NLP models on narrative text or build efficient recommender systems.
- Academics studying cinematic trends or narrative structures across different decades.
- Developers seeking structured, high-quality movie metadata for application development.
Dataset Name Suggestions
- Wikipedia American Movies 1970-2020
- Decades of American Film Plots
- WIKI Film Metadata Collection
- 18k American Movie Details
Attributes
Original Data Source: Decades of American Film Plots
