Halloween Candy Popularity Scores
Data Science and Analytics
About
This dataset presents the results of a 2017 experiment conducted by Walt Hickey for FiveThirtyEight to determine the most popular Halloween candy among online voters. It compiles approximately 269,000 votes collected from over 8,300 different IP addresses. Each candy is assigned a win percentage calculated from head-to-head matchups, and the dataset includes several attributes, such as whether a candy contains chocolate, to help identify the common traits of popular sweets.
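As a rough illustration of how a win percentage can be derived from head-to-head votes, the sketch below counts each candy's wins and divides by the number of matchups it appeared in. The function name win_percentages, the (winner, loser) input shape, and the sample votes are assumptions made for this example; the exact calculation used in the original experiment is not documented here.

```python
from collections import Counter


def win_percentages(matchups):
    """Return {candy: percentage of its matchups won} from (winner, loser) pairs."""
    wins = Counter()
    appearances = Counter()
    for winner, loser in matchups:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    # Counter returns 0 for candies that never won a matchup.
    return {name: 100.0 * wins[name] / appearances[name] for name in appearances}


# Hypothetical votes for illustration only; these are not rows from the dataset.
sample_votes = [
    ("Candy A", "Candy B"),
    ("Candy A", "Candy C"),
    ("Candy B", "Candy C"),
    ("Candy C", "Candy A"),
]

scores = win_percentages(sample_votes)
for name, pct in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {pct:.1f}%")
```

The winpercent column in the published file already holds the final value for each candy, so a calculation like this is only needed when working from raw vote records.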
Columns
The dataset contains 13 columns detailing the candy and its characteristics:
- competitorname: The specific name of the candy (85 unique values).
- chocolate: Binary indicator (1 if chocolate is present, 0 otherwise).
- fruity: Binary indicator (1 if the candy is fruity, 0 otherwise).
- caramel: Binary indicator (1 if the candy contains caramel, 0 otherwise).
- peanutyalmondy: Binary indicator (1 if the candy contains peanuts or almonds, 0 otherwise).
- nougat: Binary indicator (1 if the candy contains nougat, 0 otherwise).
- crispedricewafer: Binary indicator (1 if the candy contains crisped rice or wafer, 0 otherwise).
- hard: Binary indicator (1 if the candy is hard, 0 otherwise).
- bar: Binary indicator (1 if the candy is shaped like a bar, 0 otherwise).
- pluribus: Binary indicator (1 if the candy comes in multiple pieces, 0 otherwise).
- sugarpercent: A value representing the sugar content percentile (ranging from 0.01 to 0.99).
- pricepercent: A value representing the relative price percentile (ranging from 0.01 to 0.98).
- winpercent: The calculated percentage of matchups the candy won, indicating voter preference (ranging from 22.4% to 84.2%).
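The column descriptions above can be turned into a small loader that checks the header and value ranges while reading the file. This is a minimal sketch using only the Python standard library: the function name load_candy_rows is hypothetical, the header order is assumed to match the list above, and the two sample rows are made up so the example runs without the real file.

```python
import csv
import io

EXPECTED_COLUMNS = [
    "competitorname", "chocolate", "fruity", "caramel", "peanutyalmondy",
    "nougat", "crispedricewafer", "hard", "bar", "pluribus",
    "sugarpercent", "pricepercent", "winpercent",
]
BINARY_COLUMNS = EXPECTED_COLUMNS[1:10]  # chocolate .. pluribus


def load_candy_rows(handle):
    """Parse candy rows from a text handle, validating header and value ranges."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    for raw in reader:
        row = {"competitorname": raw["competitorname"]}
        for col in BINARY_COLUMNS:
            if raw[col] not in ("0", "1"):
                raise ValueError(f"{col} must be 0 or 1, got {raw[col]!r}")
            row[col] = int(raw[col])
        for col in ("sugarpercent", "pricepercent"):
            value = float(raw[col])
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{col} must lie in [0, 1], got {value}")
            row[col] = value
        win = float(raw["winpercent"])
        if not 0.0 <= win <= 100.0:
            raise ValueError(f"winpercent must lie in [0, 100], got {win}")
        row["winpercent"] = win
        rows.append(row)
    return rows


# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE_CSV = (
    ",".join(EXPECTED_COLUMNS) + "\n"
    "Candy A,1,0,1,0,0,0,0,1,0,0.50,0.60,70.0\n"
    "Candy B,0,1,0,0,0,0,1,0,1,0.90,0.10,40.0\n"
)

rows = load_candy_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), "rows loaded")

# For the real file (assuming it is in the working directory):
# with open("candy-data.csv", newline="") as f:
#     rows = load_candy_rows(f)
```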
Distribution
The data is provided as a single CSV file named candy-data.csv. The file contains 13 columns and 85 records; all records are valid and none have missing values. The expected update frequency for this dataset is listed as "Never".
Usage
This data is ideal for various analytical applications, including:
- Statistical analysis to identify which candy attributes correlate strongly with high win percentages.
- Market research focusing on consumer preference drivers for confectionery items.
- Data segmentation projects aimed at grouping candies based on shared features, cost, or popularity.
- Predictive modelling to forecast the appeal of similar new products.
- Answering specific questions such as determining the top three candies to distribute during Halloween.
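Two of the uses listed above, finding the top three candies and checking which attributes track win percentage, can be sketched with the standard library alone. The helper names (pearson, top_candies, attribute_correlations) and the four sample rows are assumptions for illustration; with the real data, each row would carry all 13 columns.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, or None if undefined."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant column has no defined correlation
    return cov / sqrt(var_x * var_y)


def top_candies(rows, n=3):
    """Names of the n candies with the highest winpercent."""
    ranked = sorted(rows, key=lambda row: row["winpercent"], reverse=True)
    return [row["competitorname"] for row in ranked[:n]]


def attribute_correlations(rows, attributes):
    """Correlation of each attribute column with winpercent."""
    wins = [row["winpercent"] for row in rows]
    return {attr: pearson([row[attr] for row in rows], wins) for attr in attributes}


# Made-up rows for illustration only; they are not taken from the dataset.
sample_rows = [
    {"competitorname": "Candy A", "chocolate": 1, "fruity": 0, "winpercent": 80.0},
    {"competitorname": "Candy B", "chocolate": 1, "fruity": 0, "winpercent": 70.0},
    {"competitorname": "Candy C", "chocolate": 0, "fruity": 1, "winpercent": 40.0},
    {"competitorname": "Candy D", "chocolate": 0, "fruity": 1, "winpercent": 30.0},
]

top_three = top_candies(sample_rows)
correlations = attribute_correlations(sample_rows, ["chocolate", "fruity"])
print(top_three)
print(correlations)
```

With only 85 candies, correlations like these describe the surveyed set rather than establishing which attribute causes a candy to win.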
Coverage
The data was collected in 2017 through an online experiment. Votes came from 8,371 unique IP addresses. The scope is limited to the 85 competitor candy types included in the original survey.
License
CC0: Public Domain
Who Can Use It
- Data Scientists: For practicing feature engineering and regression analysis on discrete preference data.
- Researchers/Academics: Studying crowd-sourced opinion and statistical correlation between product features and popularity.
- Confectionery Manufacturers: Gaining insights into which ingredients (e.g., chocolate, caramel) drive consumer preference.
Dataset Name Suggestions
- Halloween Candy Popularity Scores
- FiveThirtyEight Candy Preference Ranking
- Candy Attribute and Win Rate Data
Attributes
Original Data Source: Halloween Candy Popularity Scores