Market Basket Analysis Sample
E-commerce & Online Transactions
Free
About
This dataset, titled "Retail Sales and Customer Demographics," is a synthetic dataset that simulates a retail environment. It is intended for people who want to practise exploratory data analysis (EDA). It provides a snapshot of a fictional retail landscape, capturing the core attributes of retail transactions and customer interactions [1, 2]. Users can explore sales patterns and customer profiles to better understand customer purchasing behaviour [1]. Although synthetic, it mirrors real-world retail scenarios, covers topics from demographic trends to product preferences, and supports generating hypotheses for further analysis. Its aim is to help users uncover the kind of actionable insights that retailers could use to improve their strategies and customer experiences [3, 4].
Columns
The dataset contains 9 columns, each providing distinct details about retail transactions and customer characteristics [2, 5]:
- Transaction ID: An integer (int64) identifying each unique transaction, ranging from 1 to 1000 [2, 5].
- Date: A datetime (datetime64) field indicating the transaction date, spanning from 1st January 2023 to 1st January 2024 [2, 5, 6].
- Customer ID: An object (object) representing unique customer identifiers, with 1000 distinct values [2, 6].
- Gender: An object (object) indicating the customer's gender, with 51% female and 49% male records [2, 6, 7].
- Age: An integer (int64) representing the customer's age, ranging from 18 to 64 years [2, 7].
- Product Category: An object (object) detailing the category of the purchased product, including Clothing (35%), Electronics (34%), and Other (31%) [2, 7].
- Quantity: An integer (int64) indicating the number of units purchased in a transaction, with values ranging from 1 to 4 [2, 8].
- Price per Unit: An integer (int64) representing the price of a single unit of the product, ranging from 25 to 500 [2, 8].
- Total Amount: An integer (int64) indicating the total amount spent per transaction, with values ranging from 25 to 2000 [2, 8, 9].
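The column descriptions above can be turned into a simple sanity check. The following is a minimal sketch using only the Python standard library, not an official loader for this dataset. The sample rows, the ISO date format (YYYY-MM-DD), and the Customer ID values are hypothetical. The rule that Total Amount equals Quantity times Price per Unit is also an assumption; it is consistent with the documented ranges (1 × 25 = 25 and 4 × 500 = 2000) but is not stated in the description.

```python
import csv
import io
from datetime import date

# The nine column names as documented in the listing.
EXPECTED_COLUMNS = [
    "Transaction ID", "Date", "Customer ID", "Gender", "Age",
    "Product Category", "Quantity", "Price per Unit", "Total Amount",
]

# Hypothetical sample rows for illustration only; these values are made up
# and the date / Customer ID formats are assumptions.
SAMPLE_CSV = """Transaction ID,Date,Customer ID,Gender,Age,Product Category,Quantity,Price per Unit,Total Amount
1,2023-01-05,C001,Male,34,Clothing,3,50,150
2,2023-06-17,C002,Female,26,Electronics,2,500,1000
3,2023-11-30,C003,Female,64,Other,1,25,25
"""


def parse_row(row):
    """Convert one csv.DictReader row (all strings) into typed values."""
    return {
        "Transaction ID": int(row["Transaction ID"]),
        "Date": date.fromisoformat(row["Date"]),  # assumes ISO dates
        "Customer ID": row["Customer ID"],
        "Gender": row["Gender"],
        "Age": int(row["Age"]),
        "Product Category": row["Product Category"],
        "Quantity": int(row["Quantity"]),
        "Price per Unit": int(row["Price per Unit"]),
        "Total Amount": int(row["Total Amount"]),
    }


def load_records(handle):
    """Read typed records from an open CSV handle, checking the header first."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    return [parse_row(row) for row in reader]


def check_record(rec):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    if not date(2023, 1, 1) <= rec["Date"] <= date(2024, 1, 1):
        problems.append("Date outside documented range")
    if not 18 <= rec["Age"] <= 64:
        problems.append("Age outside 18-64")
    if not 1 <= rec["Quantity"] <= 4:
        problems.append("Quantity outside 1-4")
    if not 25 <= rec["Price per Unit"] <= 500:
        problems.append("Price per Unit outside 25-500")
    # Assumed relationship, not stated in the dataset description.
    if rec["Total Amount"] != rec["Quantity"] * rec["Price per Unit"]:
        problems.append("Total Amount != Quantity * Price per Unit")
    return problems


records = load_records(io.StringIO(SAMPLE_CSV))
problems_by_id = {rec["Transaction ID"]: check_record(rec) for rec in records}
```

To run the same checks on the real file, `io.StringIO(SAMPLE_CSV)` could be replaced with `open("retail_sales_dataset.csv", newline="")`, provided the file's header and date format match the assumptions above.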
Distribution
The dataset is provided as a CSV file (retail_sales_dataset.csv), with a file size of 51.67 kB [10, 11]. It is structured with 9 columns and contains 1000 records or rows, with all data points being valid and no missing values across any attribute [2, 5-9].
Usage
This dataset is ideal for various analytical applications and use cases, particularly for those engaged in exploratory data analysis (EDA) [1]. It supports:
- Uncovering patterns: Investigating how customer age and gender influence purchasing behaviour, or discerning patterns in sales across different time periods [3].
- Product analysis: Identifying which product categories hold the highest appeal among customers and understanding relationships between age, spending, and product preferences [4].
- Strategic insights: Analysing how customers adapt their shopping habits during seasonal trends and gleaning insights from the distribution of product prices within each category [4].
- Behavioural studies: Examining distinct purchasing behaviours based on the number of items bought per transaction [4].
- Skill development: Practising data visualisation, statistical analysis, and correlation examination to refine analytical skills and contribute to the retail industry's narrative [4, 11].
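As a starting point for the analyses listed above, the sketch below totals spending by product category, by month, and by age band using only the Python standard library. The records are hypothetical stand-ins for parsed rows of the dataset, and the helper names `total_by` and `age_band`, as well as the ten-year band width, are illustrative choices rather than anything defined by the dataset.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records using the documented column names; values are made up.
RECORDS = [
    {"Date": date(2023, 1, 5), "Gender": "Male", "Age": 34,
     "Product Category": "Clothing", "Quantity": 3, "Total Amount": 150},
    {"Date": date(2023, 1, 20), "Gender": "Female", "Age": 26,
     "Product Category": "Electronics", "Quantity": 2, "Total Amount": 1000},
    {"Date": date(2023, 6, 17), "Gender": "Female", "Age": 38,
     "Product Category": "Clothing", "Quantity": 1, "Total Amount": 25},
    {"Date": date(2023, 6, 30), "Gender": "Male", "Age": 64,
     "Product Category": "Other", "Quantity": 4, "Total Amount": 120},
]


def total_by(records, key_func):
    """Sum Total Amount over records, grouped by key_func(record)."""
    totals = defaultdict(int)
    for rec in records:
        totals[key_func(rec)] += rec["Total Amount"]
    return dict(totals)


def age_band(age, width=10):
    """Map an age to a label such as '30-39' (the band width is arbitrary)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Spending by product category, by calendar month, and by age band.
by_category = total_by(RECORDS, lambda r: r["Product Category"])
by_month = total_by(RECORDS, lambda r: r["Date"].strftime("%Y-%m"))
by_age_band = total_by(RECORDS, lambda r: age_band(r["Age"]))

print(by_category)
print(by_month)
print(by_age_band)
```

Passing a different key function, for example `lambda r: (r["Gender"], r["Product Category"])`, gives a two-way breakdown without changing the helper.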
Coverage
Because the dataset simulates a general, fictional retail environment, no geographic scope is specified [2]. The transaction dates range from 1st January 2023 to 1st January 2024 [6]. Demographically, the dataset records customer gender (Female and Male) and customer age, which ranges from 18 to 64 years [7].
License
CC0: Public Domain
Who Can Use It
This dataset is intended for:
- Aspiring Data Analysts: Individuals eager to sharpen their data analysis skills through practical application in a familiar retail context [1, 3].
- Business Strategists: Retailers or business professionals looking to formulate hypotheses and uncover actionable insights to enhance their marketing and customer engagement strategies [3, 4].
- Students and Researchers: Those conducting studies on customer behaviour, sales trends, and demographic influences within a retail setting [1, 3].
- Data Storytellers: Individuals aiming to extract meaningful insights from data and present them in a compelling narrative [4].
Dataset Name Suggestions
- Retail Transaction Insights Dataset
- Customer Shopping Behaviour Data
- Fictional Store Sales & Demographics
- Market Basket Analysis Sample
- Consumer Purchase Pattern Data
Attributes
Original Data Source: Market Basket Analysis Sample