Opendatabay APP

Walmart Sales Forecasting Dataset

E-commerce & Online Transactions

Tags and Keywords

Walmart, Sales, Prediction, Retail, Regression

Free

About

This dataset provides historical sales data for 45 Walmart stores and is intended for sales and demand prediction. Walmart, a major US retailer, seeks to reduce problems such as unforeseen demand and stock-outs by applying machine learning. The data includes factors that affect sales, among them economic indicators such as the Consumer Price Index (CPI) and the unemployment rate. It also captures the effect of promotional markdown events that precede four major holidays: the Super Bowl, Labour Day, Thanksgiving, and Christmas. Weeks containing these holidays are weighted five times higher than non-holiday weeks when sales predictions are evaluated. A particular challenge is modelling the impact of markdowns during holiday weeks when the historical data is incomplete.
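
The five-fold holiday weighting can be illustrated with a weighted error metric. The listing does not name the metric, so the sketch below assumes a weighted mean absolute error; the function name `weighted_mae` and the sample numbers are hypothetical and not taken from the dataset.

```python
def weighted_mae(actual, predicted, holiday_flags, holiday_weight=5.0):
    """Mean absolute error in which holiday weeks count holiday_weight times.

    Assumption: the evaluation metric is a weighted MAE. The listing only
    states that holiday weeks are weighted five times higher.
    """
    if not (len(actual) == len(predicted) == len(holiday_flags)):
        raise ValueError("all inputs must have the same length")
    if len(actual) == 0:
        raise ValueError("inputs must be non-empty")
    # Weight is holiday_weight where Holiday_Flag is 1, otherwise 1.
    weights = [holiday_weight if flag == 1 else 1.0 for flag in holiday_flags]
    weighted_errors = sum(
        w * abs(a - p) for w, a, p in zip(weights, actual, predicted)
    )
    return weighted_errors / sum(weights)


# Made-up values for illustration only.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 250.0]
flags = [0, 0, 1]  # the third week is a holiday week
score = weighted_mae(actual, predicted, flags)
print(round(score, 2))
```

With this weighting, an error in a holiday week moves the score as much as the same error repeated in five ordinary weeks, so a model that misses holiday peaks is penalised heavily.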

Columns

  • Store: Identifies the specific Walmart store number.
  • Date: Represents the week for which sales data is recorded.
  • Weekly_Sales: Indicates the sales volume for a given store during that week.
  • Holiday_Flag: A binary flag where '1' denotes a special holiday week and '0' signifies a non-holiday week.
  • Temperature: The recorded temperature on the day of sale.
  • Fuel_Price: The cost of fuel prevalent in the region.
  • CPI: The prevailing consumer price index.
  • Unemployment: The prevailing unemployment rate.
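
The columns above map naturally onto typed records. The following is a minimal sketch using only the Python standard library; it assumes the CSV header uses exactly the column names listed, and that dates are written day-first (`%d-%m-%Y`), which the listing does not confirm. The helper name `load_sales_rows` and the sample rows are hypothetical.

```python
import csv
import io
from datetime import datetime


def load_sales_rows(text_stream, date_format="%d-%m-%Y"):
    """Parse CSV text into a list of dicts with typed values.

    Assumption: dates are day-first (DD-MM-YYYY); pass another
    date_format if the file turns out to differ.
    """
    rows = []
    for raw in csv.DictReader(text_stream):
        rows.append({
            "Store": int(raw["Store"]),
            "Date": datetime.strptime(raw["Date"], date_format).date(),
            "Weekly_Sales": float(raw["Weekly_Sales"]),
            "Holiday_Flag": int(raw["Holiday_Flag"]),
            "Temperature": float(raw["Temperature"]),
            "Fuel_Price": float(raw["Fuel_Price"]),
            "CPI": float(raw["CPI"]),
            "Unemployment": float(raw["Unemployment"]),
        })
    return rows


# Made-up sample rows for illustration only (not taken from the dataset).
sample = io.StringIO(
    "Store,Date,Weekly_Sales,Holiday_Flag,Temperature,Fuel_Price,CPI,Unemployment\n"
    "1,05-02-2010,1000.50,0,40.0,2.50,210.0,8.0\n"
    "1,12-02-2010,1200.25,1,38.5,2.55,210.1,8.0\n"
)
rows = load_sales_rows(sample)
print(rows[0]["Date"], rows[0]["Weekly_Sales"])
```

For the downloaded file, the same function can be given an open file handle, for example `open(path, newline="")`, where `path` is wherever the CSV was saved.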

Distribution

The dataset is provided as a CSV file of approximately 363.73 KB. It contains 8 columns, each with 6,435 valid records, which equals 45 stores × 143 weeks. The record count is the same for every column listed above.

Usage

This dataset is ideal for:
  • Developing and testing regression models to accurately predict sales, utilising both single and multiple features.
  • Evaluating the performance of various machine learning models using metrics like R² and RMSE.
  • Understanding the influence of economic indicators and holiday promotions on retail sales.
  • Addressing business challenges related to inventory management and avoiding stock-outs due to unpredictable demand.
  • Modelling the complex effects of promotional markdowns, particularly during peak holiday seasons.
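
As a rough illustration of the first two uses, the sketch below fits an ordinary least squares model with one feature and then with two, and compares R² and RMSE. It uses NumPy and synthetic data: the value ranges, coefficients, and helper names (`fit_linear`, `predict_linear`, `r_squared`, `rmse`) are assumptions for demonstration and are not derived from the dataset.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef


def predict_linear(coef, X):
    """Apply coefficients returned by fit_linear to new feature values."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    return coef[0] + X @ coef[1:]


def rmse(y_true, y_pred):
    """Root mean squared error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic stand-ins for CPI, Unemployment and Weekly_Sales (made-up values).
rng = np.random.default_rng(0)
n = 200
cpi = rng.uniform(120.0, 230.0, n)
unemployment = rng.uniform(4.0, 14.0, n)
weekly_sales = 1000.0 + 2.0 * cpi - 50.0 * unemployment + rng.normal(0.0, 25.0, n)

single = cpi
multiple = np.column_stack([cpi, unemployment])
for name, X in [("single feature", single), ("multiple features", multiple)]:
    coef = fit_linear(X, weekly_sales)
    pred = predict_linear(coef, X)
    print(f"{name}: R2={r_squared(weekly_sales, pred):.3f} "
          f"RMSE={rmse(weekly_sales, pred):.1f}")
```

The scores here are computed on the same data used for fitting. For weekly sales data, holding out the most recent weeks as a test set is usually a more honest check than a random split.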

Coverage

The dataset covers 45 Walmart stores located in different regions of the United States. It spans 143 unique weeks, with the earliest date noted as 05-02-2010. The historical data may be incomplete in places, which matters most when modelling holiday markdown effects.
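
The coverage figures can be checked against a loaded copy of the data. The sketch below assumes each record is a dict with `Store` and `Date` keys (a hypothetical in-memory layout) and reports the number of stores and weeks, the earliest date, and whether every store-week pair occurs exactly once. The sample rows are made up.

```python
from datetime import date


def summarise_coverage(rows):
    """Count stores and weeks and check each store-week pair occurs exactly once."""
    if not rows:
        raise ValueError("rows must be non-empty")
    stores = {row["Store"] for row in rows}
    weeks = {row["Date"] for row in rows}
    pairs = {(row["Store"], row["Date"]) for row in rows}
    expected = len(stores) * len(weeks)
    return {
        "n_rows": len(rows),
        "n_stores": len(stores),
        "n_weeks": len(weeks),
        "earliest": min(weeks),
        # True only if there are no duplicate and no missing store-week pairs.
        "complete_grid": len(rows) == len(pairs) == expected,
    }


# Tiny made-up example: 2 stores x 2 weeks.
rows = [
    {"Store": 1, "Date": date(2010, 2, 5)},
    {"Store": 1, "Date": date(2010, 2, 12)},
    {"Store": 2, "Date": date(2010, 2, 5)},
    {"Store": 2, "Date": date(2010, 2, 12)},
]
summary = summarise_coverage(rows)
print(summary)
```

Going by the figures in this listing, the full dataset would be expected to report 45 stores, 143 weeks, and 6,435 rows.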

License

CC0: Public Domain

Who Can Use It

This dataset is suitable for:
  • Data Scientists and Machine Learning Engineers: For building, training, and evaluating advanced regression models for sales forecasting.
  • Business Analysts: To gain insights into sales trends, the impact of external factors, and to inform strategic business decisions.
  • Retail Operations Managers: To enhance demand planning, optimise inventory levels, and improve operational efficiency.
  • Students and Researchers: As a practical case study for understanding time-series analysis, regression problems, and the application of machine learning in retail.

Dataset Name Suggestions

  • Walmart Store Sales Prediction
  • Walmart Sales Forecasting Dataset
  • Walmart Retail Demand Analysis
  • Walmart Historical Sales Data

Attributes

Original Data Source: Walmart Sales Forecasting Dataset

Listing Stats

  • Views: 1
  • Downloads: 1
  • Listed: 14/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
