Rebrickable Price and Rating Metrics
About
This resource provides deep historical context for Lego products, allowing for detailed analysis of price changes, product popularity, design elements (parts, colours, materials), and thematic trends across decades. It is an essential tool for market researchers, historians of popular culture, and predictive modelling enthusiasts interested in toy industry metrics.
Columns
The dataset contains 14 distinct columns offering detailed product information:
- year: The calendar year in which the set was created.
- Theme name: The specific theme category the set belongs to.
- Sets Name: The official name of the Lego set.
- Sets URL: The image URL associated with the set's package.
- Part category: The classification of the component part.
- Part name: The specific name identifying each part within the sets.
- Part material: The material used to manufacture the parts.
- Part color: The colour of the specific component part.
- RGB: The corresponding RGB colour value.
- Is Transparent?: A boolean indicator specifying if the part is transparent.
- Part URL: The image URL for each individual part.
- Set Price: The averaged price recorded for each set.
- Number of reviews: The averaged count of customer reviews for each set.
- Star rating: The averaged star rating assigned to the set on the Lego site.
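Because every row describes one part within a set, the set-level fields repeat across many rows. The sketch below shows one possible way to read only the set-level columns in chunks with pandas and collapse them to one row per set. It assumes the CSV header matches the column names exactly as listed above, which has not been verified against the file; the in-memory sample and its values are made up for illustration.

```python
import io
import pandas as pd

# Column names as spelled in the listing above; the real CSV header may differ.
SET_COLUMNS = ["year", "Theme name", "Sets Name", "Set Price",
               "Number of reviews", "Star rating"]

def load_set_level(source, chunksize=100_000):
    """Read only the set-level columns in chunks and keep one row per set."""
    pieces = []
    for chunk in pd.read_csv(source, usecols=SET_COLUMNS, chunksize=chunksize):
        # Drop the per-part repetition inside each chunk to keep memory low.
        pieces.append(chunk.drop_duplicates())
    # A set can span two chunks, so de-duplicate once more after concatenation.
    return pd.concat(pieces, ignore_index=True).drop_duplicates(ignore_index=True)

# Tiny in-memory stand-in for Output.csv (hypothetical values).
sample = io.StringIO(
    "year,Theme name,Sets Name,Part name,Set Price,Number of reviews,Star rating\n"
    "1999,Space,Demo Set A,Brick 2 x 4,19.99,12,4.5\n"
    "1999,Space,Demo Set A,Plate 1 x 2,19.99,12,4.5\n"
    "2005,Town,Demo Set B,Brick 1 x 1,,,\n"
)
sets = load_set_level(sample, chunksize=2)
print(sets)
```

For the real file, the same function could be called with the path to Output.csv in place of the sample buffer. De-duplicating on all selected columns is a design choice made here, not something the source prescribes.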
Distribution
The primary file, Output.csv, is provided in CSV format and is 1.12 GB in size. It includes 14 columns that describe each set down to the individual part level. The total record count is not stated, but the volume is substantial and suits BigQuery applications. The data is structured for direct use in analytical environments. This dataset is not expected to receive future updates.
Usage
Ideal applications include trend analysis, machine learning model development, and historical research. Users can perform market segmentation based on price and star ratings, investigate correlations between set attributes (theme, parts, colour) and user satisfaction, and run classification, regression, or clustering models.
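As an illustration of the price-based segmentation mentioned above, the following sketch buckets sets by price with pandas and summarises the star rating per bucket. The bin edges, the segment labels, and the function name segment_by_price are hypothetical choices made for this example, and the demo values are invented. Rows missing either a price or a rating are dropped before grouping.

```python
import pandas as pd

def segment_by_price(sets,
                     bins=(0, 20, 50, 100, float("inf")),
                     labels=("budget", "mid", "premium", "flagship")):
    """Bucket sets by price and summarise star ratings per bucket."""
    # Keep only rows where both the price and the rating are present.
    rated = sets.dropna(subset=["Set Price", "Star rating"]).copy()
    # Intervals are right-closed: (0, 20], (20, 50], (50, 100], (100, inf].
    rated["segment"] = pd.cut(rated["Set Price"], bins=list(bins),
                              labels=list(labels))
    # observed=True omits segments that contain no sets.
    return (rated.groupby("segment", observed=True)["Star rating"]
                 .agg(["count", "mean"]))

# Hypothetical set-level rows, one per set.
demo = pd.DataFrame({
    "Sets Name": ["A", "B", "C", "D", "E"],
    "Set Price": [9.99, 15.0, 60.0, None, 120.0],
    "Star rating": [4.0, 5.0, 3.0, 4.5, None],
})
summary = segment_by_price(demo)
print(summary)
```

The same grouped frame could serve as a starting point for the regression or clustering work described above, once the segment boundaries are tuned to the actual price distribution.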
Coverage
The data covers Lego sets released between 1955 and 2023, the full time range available in the combined Rebrickable and pricing databases. Geographic scope is not stated explicitly; it is inferred to be global, based on the product release data.
Notes: The averaged price, star rating, and number of reviews fields have a very high rate of missing values, which should be accounted for during analysis.
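Before modelling, it may help to quantify how sparse the price and review fields are. The sketch below computes the share of missing values per column with pandas. The column names follow the listing above and are assumed to match the CSV header; the helper name missing_share and the demo values are hypothetical.

```python
import pandas as pd

# Column names as spelled in the listing above; the real header may differ.
METRIC_COLUMNS = ["Set Price", "Number of reviews", "Star rating"]

def missing_share(frame, columns=METRIC_COLUMNS):
    """Return the fraction of missing values in each of the given columns."""
    return frame[columns].isna().mean()

# Hypothetical rows standing in for the real data.
demo = pd.DataFrame({
    "Set Price": [19.99, None, None, 49.99],
    "Number of reviews": [12, None, None, None],
    "Star rating": [4.5, None, None, None],
})
rates = missing_share(demo)
print(rates)
```

A high share for a column suggests restricting the analysis to the rows where it is present, or treating any imputed values with caution.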
License
CC0: Public Domain
Who Can Use It
- Data Scientists: For developing predictive models related to product success (e.g., predicting set popularity based on components).
- E-commerce Analysts: To benchmark historical pricing strategies and understand customer valuation metrics (ratings and reviews).
- Hobbyists/Collectors: To explore the history and catalogue of Lego sets and compare pricing across decades.
Dataset Name Suggestions
- Lego Sets Historical Pricing and Popularity
- Rebrickable Price and Rating Metrics
- Decades of Lego Sets Data (1955-2023)
Attributes
Original Data Source: Rebrickable Price and Rating Metrics
