Global LEGO Product Data
Product Reviews & Feedback
Free
About
This dataset compiles detailed information about LEGO sets released from 1949 to 2023, letting users trace how these building sets have developed over time. Each record gives the set number, set name, theme, number of pieces, release year, and a link to an image of the set. The data is curated from the Rebrickable website and contains more than 21,500 records, making it useful to both LEGO enthusiasts and data analysts.
Columns
- set_number: A numerical identifier for each unique LEGO set product. There are 21,497 unique values.
- set_name: The name given to each individual LEGO set. This column contains 18,392 unique set names.
- year_released: The specific year when the LEGO set was first made available by the company.
- number_of_parts: The total count of individual pieces included within each LEGO set.
- image_url: A direct link to a visual representation or image of the corresponding LEGO set. There are 21,497 unique URLs.
- theme_name: The assigned theme category for each set, such as Star Wars, Ninjago, Batman, Marvel, or Technic. This column has 383 unique theme names.
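The columns above can be read into typed records with Python's standard library. The following is a minimal sketch, not a definitive loader: it assumes the CSV header uses exactly the column names listed above, that missing values appear as empty fields, and that numeric fields may be written either as "1999" or "1999.0". The sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class LegoSet:
    set_number: Optional[str]
    set_name: Optional[str]
    year_released: Optional[int]
    number_of_parts: Optional[int]
    image_url: Optional[str]
    theme_name: Optional[str]


def _text(value):
    """Return stripped text, or None for an empty or absent field."""
    value = (value or "").strip()
    return value or None


def _int(value):
    """Parse an integer field; tolerate blanks and values such as '1999.0'."""
    text = _text(value)
    return int(float(text)) if text is not None else None


def read_sets(handle):
    """Yield one LegoSet per CSV row from an open text file handle."""
    for row in csv.DictReader(handle):
        yield LegoSet(
            set_number=_text(row.get("set_number")),
            set_name=_text(row.get("set_name")),
            year_released=_int(row.get("year_released")),
            number_of_parts=_int(row.get("number_of_parts")),
            image_url=_text(row.get("image_url")),
            theme_name=_text(row.get("theme_name")),
        )


# Made-up sample rows (assumption: header matches the column list above).
SAMPLE_CSV = (
    "set_number,set_name,year_released,number_of_parts,image_url,theme_name\n"
    "0000-1,Example Set,1999.0,120,https://example.invalid/0000-1.jpg,Example Theme\n"
    ",,,,,Example Theme\n"
)

sets = list(read_sets(io.StringIO(SAMPLE_CSV)))
missing_set_numbers = sum(1 for s in sets if s.set_number is None)
```

To read the real file instead of the sample, pass an open handle, for example `open("lego_sets_and_themes.csv", newline="", encoding="utf-8")`; the encoding is an assumption.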
Distribution
The dataset is provided in a tabular format, typically a CSV file (lego_sets_and_themes.csv), and comprises more than 21,500 records across 6 columns. The file size is 2.26 MB. Most columns are fully populated; a small number of entries (7 records) are missing for set_number, set_name, year_released, number_of_parts, and image_url. The theme_name column has no missing values. Release years range from 1949 to 2023, with a mean release year around 2010. The number of parts per set varies widely, with a mean of 162 and a maximum of 11,695.
Usage
This dataset is ideal for various analytical applications and research purposes. Users can explore how the popularity of different themes has changed over the years, identify observable patterns in the release of LEGO sets over time, or determine if certain years are associated with a higher number of releases. It allows for studying the longevity of themes, understanding which themes have remained relevant across multiple decades, and analysing the distribution of set sizes based on the number of pieces. Additionally, it can be used to examine theme diversity and identify dominant themes or relatively even distributions among various themes.
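Two of the analyses mentioned above, release counts per year and theme longevity, can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the input is a list of (year_released, theme_name) pairs already extracted from the file, and the rows shown are made up rather than taken from the dataset. Rows with a missing year are skipped.

```python
from collections import Counter, defaultdict


def releases_per_year(rows):
    """Count how many sets were released in each year."""
    return Counter(year for year, _theme in rows if year is not None)


def theme_lifespans(rows):
    """Map each theme to (first_year, last_year, span_in_years)."""
    years_by_theme = defaultdict(set)
    for year, theme in rows:
        if year is not None:
            years_by_theme[theme].add(year)
    return {
        theme: (min(years), max(years), max(years) - min(years) + 1)
        for theme, years in years_by_theme.items()
    }


# Made-up (year_released, theme_name) pairs, for illustration only.
rows = [
    (1999, "Theme A"),
    (1999, "Theme B"),
    (2005, "Theme A"),
    (None, "Theme B"),  # missing year: ignored by both functions
    (2023, "Theme A"),
]

per_year = releases_per_year(rows)
busiest_year, busiest_count = per_year.most_common(1)[0]
lifespans = theme_lifespans(rows)
```

The span counts both endpoints, so a theme seen only in a single year has a span of 1. Note that the span measures first to last appearance and does not show whether a theme was released continuously in between.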
Coverage
The dataset spans a significant time range from 1949 to 2023, documenting the evolution of LEGO sets over many decades. It encompasses a wide array of themes, including popular categories like Star Wars, Ninjago, Batman, Marvel, and historical landmarks. Information on geographic or specific demographic scopes is not explicitly detailed within the dataset.
License
CC0: Public Domain
Who Can Use It
This dataset is well-suited for a diverse range of users, including:
- Avid LEGO enthusiasts looking to dive deeper into the history and evolution of their favourite toys.
- Data scientists and analysts interested in conducting time-series analysis, trend identification, or categorical data exploration.
- Researchers studying product development, market trends, or the impact of popular culture on toy manufacturing.
- Educators using real-world data for teaching data visualisation, statistical analysis, or database management.
- Individuals simply curious about the development of iconic building blocks over nearly three-quarters of a century.
Dataset Name Suggestions
- LEGO Set History (1949-2023)
- Rebrickable LEGO Sets Archive
- Global LEGO Product Data
- LEGO Theme Evolution Dataset
Attributes
Original Data Source: Global LEGO Product Data