Walmart Sales Forecasting Dataset
E-commerce & Online Transactions
Free
About
This dataset provides historical weekly sales data for 45 Walmart stores, intended to support sales and demand forecasting. Walmart, a major US retailer, wants to reduce problems such as unforeseen demand and stock-outs by applying machine learning. The data includes factors that affect sales, among them economic indicators such as the Consumer Price Index (CPI) and the unemployment rate. It also captures the effects of promotional markdown events that precede four major holidays: Super Bowl, Labour Day, Thanksgiving, and Christmas. Weeks containing these holidays are weighted five times higher than non-holiday weeks when sales predictions are evaluated. A particular challenge is modelling the impact of markdowns during holiday weeks when the historical data is incomplete.
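The five-fold holiday weighting can be illustrated with a weighted mean absolute error. This is a minimal sketch: the listing does not name the evaluation metric, so the choice of absolute error, the function name `weighted_mae`, and the sample figures are all assumptions made for illustration.

```python
def weighted_mae(actual, predicted, holiday_flags, holiday_weight=5.0):
    """Mean absolute error in which holiday weeks count `holiday_weight` times.

    Assumption: absolute error is used here only as an example; the source
    states the 5x weighting but not the underlying error measure.
    """
    if not actual:
        raise ValueError("inputs must not be empty")
    if not (len(actual) == len(predicted) == len(holiday_flags)):
        raise ValueError("all inputs must have the same length")
    # Holiday_Flag == 1 marks a holiday week, 0 a non-holiday week.
    weights = [holiday_weight if flag == 1 else 1.0 for flag in holiday_flags]
    total_error = sum(w * abs(a - p) for w, a, p in zip(weights, actual, predicted))
    return total_error / sum(weights)


# Hypothetical figures for three weeks; the middle one is a holiday week.
actual = [100.0, 200.0, 150.0]
predicted = [110.0, 180.0, 150.0]
flags = [0, 1, 0]
score = weighted_mae(actual, predicted, flags)
print(round(score, 4))  # (1*10 + 5*20 + 1*0) / (1 + 5 + 1) = 15.7143
```

With this weighting, a miss of 20 in the holiday week contributes as much as five such misses in ordinary weeks, so a model tuned against this score is pushed to get holiday weeks right.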
Columns
- Store: Identifies the specific Walmart store number.
- Date: Represents the week for which sales data is recorded.
- Weekly_Sales: Indicates the sales volume for a given store during that week.
- Holiday_Flag: A binary flag where '1' denotes a special holiday week and '0' signifies a non-holiday week.
- Temperature: The recorded temperature on the day of sale.
- Fuel_Price: The cost of fuel prevalent in the region.
- CPI: The prevailing consumer price index.
- Unemployment: The prevailing unemployment rate.
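As a rough sketch of how these columns might be read into typed values, the example below parses a small in-memory CSV with the Python standard library. The two sample rows are invented for illustration, the helper name `parse_row` is hypothetical, and the day-month-year date format is an assumption rather than something the listing states.

```python
import csv
import io
from datetime import datetime

# Two made-up rows in the column layout listed above; values are illustrative only.
SAMPLE_CSV = """Store,Date,Weekly_Sales,Holiday_Flag,Temperature,Fuel_Price,CPI,Unemployment
1,05-02-2010,1500000.00,0,42.5,2.60,211.0,8.1
1,12-02-2010,1600000.00,1,38.0,2.55,211.2,8.1
"""


def parse_row(row):
    """Convert one CSV row (all strings) into typed Python values."""
    return {
        "Store": int(row["Store"]),
        # Assumption: dates are written day-month-year.
        "Date": datetime.strptime(row["Date"], "%d-%m-%Y").date(),
        "Weekly_Sales": float(row["Weekly_Sales"]),
        "Holiday_Flag": int(row["Holiday_Flag"]),
        "Temperature": float(row["Temperature"]),
        "Fuel_Price": float(row["Fuel_Price"]),
        "CPI": float(row["CPI"]),
        "Unemployment": float(row["Unemployment"]),
    }


records = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
holiday_weeks = [r["Date"] for r in records if r["Holiday_Flag"] == 1]
print(len(records), holiday_weeks)
```

To read the real file, the `io.StringIO(SAMPLE_CSV)` argument would be replaced with an open file handle for the downloaded CSV.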
Distribution
This dataset is provided as a CSV file of approximately 363.73 KB. It contains 8 columns and 6,435 records, and every column has a valid value in all 6,435 records. The record count equals 45 stores multiplied by 143 weeks.
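The stated shape can be checked after loading. The sketch below builds a synthetic grid of placeholder rows (not the real data) and confirms that 45 stores over 143 weeks give 6,435 records with all 8 columns filled. The function name `summarise` and the placeholder values are assumptions made for this example.

```python
EXPECTED_COLUMNS = [
    "Store", "Date", "Weekly_Sales", "Holiday_Flag",
    "Temperature", "Fuel_Price", "CPI", "Unemployment",
]


def summarise(rows):
    """Report row count, distinct stores and weeks, and completeness."""
    stores = {r["Store"] for r in rows}
    weeks = {r["Date"] for r in rows}
    # A value counts as missing if the key is absent, None, or an empty string.
    complete = all(
        r.get(col) not in (None, "") for r in rows for col in EXPECTED_COLUMNS
    )
    return {
        "rows": len(rows),
        "stores": len(stores),
        "weeks": len(weeks),
        "complete": complete,
        # True when there is exactly one row per store per week.
        "balanced": len(rows) == len(stores) * len(weeks),
    }


# Synthetic stand-in for the real file: placeholder values only.
rows = [
    {"Store": s, "Date": f"week-{w:03d}", "Weekly_Sales": 0.0, "Holiday_Flag": 0,
     "Temperature": 0.0, "Fuel_Price": 0.0, "CPI": 0.0, "Unemployment": 0.0}
    for s in range(1, 46)
    for w in range(143)
]
summary = summarise(rows)
print(summary)
```

Running the same function on the parsed CSV rows would flag a missing store-week combination (`balanced` is false) or an empty cell (`complete` is false).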
Usage
This dataset is ideal for:
- Developing and testing regression models to accurately predict sales, utilising both single and multiple features.
- Evaluating the performance of various machine learning models using metrics like R² and RMSE.
- Understanding the influence of economic indicators and holiday promotions on retail sales.
- Addressing business challenges related to inventory management and avoiding stock-outs due to unpredictable demand.
- Modelling the complex effects of promotional markdowns, particularly during peak holiday seasons.
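The single-feature regression and the two metrics named above can be sketched with the standard library alone. This is an illustrative example, not a recommended model: the function names are hypothetical, and the four data points (an unemployment rate against weekly sales in millions) are invented.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Invented values: unemployment rate (%) and weekly sales (millions).
unemployment = [7.5, 8.0, 8.5, 9.0]
sales = [1.60, 1.52, 1.49, 1.40]

slope, intercept = fit_line(unemployment, sales)
fitted = [slope * u + intercept for u in unemployment]
print(slope, intercept, r2_score(sales, fitted), rmse(sales, fitted))
```

For multiple features, the same metrics apply unchanged; only the fitting step would be replaced by a multivariate regression from a library of choice. Scores computed on the data used for fitting are optimistic, so a held-out set of later weeks gives a fairer picture for forecasting.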
Coverage
The dataset covers 45 Walmart stores located in various regions of the United States. It spans 143 unique weeks, with 05-02-2010 noted as the earliest date. The historical data may be incomplete in places, which matters most when modelling holiday markdown effects.
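The temporal scope can be made concrete with a little date arithmetic. The sketch below rests on two assumptions that the listing does not confirm: that 05-02-2010 is written day-month-year (5 February 2010), and that the 143 weeks are consecutive with no gaps.

```python
from datetime import date, timedelta

# Assumption: 05-02-2010 is day-month-year, i.e. 5 February 2010.
first_week = date(2010, 2, 5)

# Assumption: the 143 weekly records are consecutive, seven days apart.
weeks = [first_week + timedelta(weeks=i) for i in range(143)]

last_week = weeks[-1]
span_days = (last_week - first_week).days

print(first_week.isoformat(), last_week.isoformat(), span_days)
# Under these assumptions: 2010-02-05 2012-10-26 994
```

If the real file shows a different final date, one of the two assumptions does not hold, and the date column should be inspected before any time-based train/test split.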
License
CC0: Public Domain
Who Can Use It
This dataset is suitable for:
- Data Scientists and Machine Learning Engineers: For building, training, and evaluating advanced regression models for sales forecasting.
- Business Analysts: To gain insights into sales trends, the impact of external factors, and to inform strategic business decisions.
- Retail Operations Managers: To enhance demand planning, optimise inventory levels, and improve operational efficiency.
- Students and Researchers: As a practical case study for understanding time-series analysis, regression problems, and the application of machine learning in retail.
Dataset Name Suggestions
- Walmart Store Sales Prediction
- Walmart Sales Forecasting Dataset
- Walmart Retail Demand Analysis
- Walmart Historical Sales Data
Attributes
Original Data Source: Walmart Sales Forecasting Dataset