Groceries Market Basket Analysis Data
Data Science and Analytics
Free
About
This dataset is designed for Market Basket Analysis (MBA), providing a collection of groceries transaction data. It has been adapted from an initial groceries dataset and split into two distinct CSV files to facilitate MBA implementation. One file, groceries data.csv, is suitable for Exploratory Data Analysis (EDA) and pre-processing before being fed into the Apriori algorithm. The second file, basket.csv, contains pre-processed data, requiring only NaN replacement and encoding via a TransactionEncoder before direct input into the Apriori algorithm.
Columns
- Member_number: A unique identifier for each member [1].
- Date: The specific date of the transaction [2].
- itemDescription: The name of the purchased item [3].
- year: The year in which the transaction occurred [3].
- month: The month in which the transaction occurred [4].
- day: The day of the month on which the transaction occurred [4].
- day_of_week: The day of the week on which the transaction occurred [5].
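The pre-processing described above can be sketched in plain Python using the columns just listed. This is a minimal illustration, not the dataset author's pipeline: the sample rows are invented, the date strings are placeholders, and treating each (Member_number, Date) pair as one basket is an assumption. The `one_hot` helper mimics the kind of boolean table a TransactionEncoder produces, and `itemset_support` is a brute-force support count rather than the Apriori algorithm itself, which additionally prunes infrequent candidates.

```python
from collections import defaultdict
from itertools import combinations

# Illustrative rows only (not taken from the dataset):
# (Member_number, Date, itemDescription)
rows = [
    (1001, "01-01-2014", "whole milk"),
    (1001, "01-01-2014", "rolls/buns"),
    (1002, "01-01-2014", "whole milk"),
    (1002, "01-01-2014", "soda"),
    (1002, "01-01-2014", "rolls/buns"),
    (1001, "05-01-2014", "soda"),
]


def build_baskets(rows):
    """Group items by (member, date); each group is assumed to be one basket."""
    baskets = defaultdict(set)
    for member, date, item in rows:
        baskets[(member, date)].add(item)
    return list(baskets.values())


def one_hot(baskets):
    """Encode baskets as boolean rows, one column per distinct item."""
    items = sorted({item for basket in baskets for item in basket})
    matrix = [[item in basket for item in items] for basket in baskets]
    return items, matrix


def itemset_support(baskets, size):
    """Fraction of baskets that contain each itemset of the given size."""
    counts = defaultdict(int)
    for basket in baskets:
        for combo in combinations(sorted(basket), size):
            counts[combo] += 1
    return {combo: n / len(baskets) for combo, n in counts.items()}


baskets = build_baskets(rows)
items, matrix = one_hot(baskets)
pair_support = itemset_support(baskets, 2)

for pair, support in sorted(pair_support.items()):
    print(pair, round(support, 3))
```

On the full data, a library implementation of Apriori would normally replace the brute-force count, since enumerating every combination becomes expensive as baskets grow.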
Distribution
The dataset is provided in CSV format [1, 6]. The groceries data.csv file has a size of 1.57 MB [1]. Both groceries data.csv and basket.csv contain 7 columns [1]. The dataset includes 38,800 records across all columns [2-5, 7, 8]. There are two main data files: groceries data.csv for initial EDA and pre-processing, and basket.csv, which is pre-processed for direct use with the Apriori algorithm [6].
Usage
This dataset is ideally suited for Market Basket Analysis (MBA) [6]. It can be used to perform Exploratory Data Analysis (EDA) and to pre-process transaction data for input into the Apriori algorithm [6]. The pre-processed basket.csv file allows for direct encoding and application of the Apriori algorithm [6].
Coverage
The dataset covers transactions from 1st January 2014 to 30th December 2015 [7]. The available years are 2014 and 2015 [3]. No specific geographic or demographic scope is detailed in the available information.
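The stated date range, and the calendar columns derived from Date, can be checked with a short sketch like the one below. The day-month-year format string and the Monday=0 numbering for day_of_week are assumptions, since the listing does not specify either, and the sample dates are illustrative rather than read from the file.

```python
from datetime import datetime

# Assumption: Date values are day-month-year strings; adjust to the actual file.
DATE_FORMAT = "%d-%m-%Y"


def date_parts(date_text, fmt=DATE_FORMAT):
    """Derive year, month, day and day_of_week (Monday=0, an assumption)."""
    d = datetime.strptime(date_text, fmt)
    return {
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "day_of_week": d.weekday(),
    }


def coverage(date_texts, fmt=DATE_FORMAT):
    """Return the earliest and latest dates found in the input strings."""
    parsed = [datetime.strptime(text, fmt).date() for text in date_texts]
    return min(parsed), max(parsed)


# Illustrative values only; in practice these would come from the Date column.
sample_dates = ["01-01-2014", "30-12-2015", "15-06-2014"]
first, last = coverage(sample_dates)
parts = date_parts("01-01-2014")

print(first.isoformat(), last.isoformat())
print(parts)
```

Comparing the derived values against the existing year, month, day and day_of_week columns is a quick consistency check during EDA.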
License
CC0: Public Domain
Who Can Use It
This dataset is intended for:
- Data analysts looking to perform market basket analysis [6].
- Machine learning practitioners implementing association rule mining algorithms like Apriori [6].
- Researchers in retail, marketing, or consumer behaviour studies [1].
- Students learning about data pre-processing, EDA, and market basket analysis [6].
Dataset Name Suggestions
- Groceries Market Basket Analysis Data
- Retail Transaction Data for MBA
- Apriori Groceries Transaction Dataset
- Consumer Purchase Habits (Groceries)
Attributes
Original Data Source: Groceries Market Basket Analysis Data