Opendatabay APP

Retail Transaction & User Persona Data

Retail & Consumer Behavior

Tags and Keywords

  • Retail
  • Sales
  • Demographics
  • Behaviour
  • Forecasting

Free

About

Gain deep insights into retail consumer behaviour and purchasing power with this detailed record of 550,000 transactions made during Black Friday sales. The data enables the analysis of spending patterns across various demographics, linking customer attributes such as age, gender, marital status, and occupation to specific product categories and purchase amounts. This resource is useful for understanding high-demand periods and optimising inventory or marketing strategies based on customer personas.

Columns

  • User_ID: Unique identifier for the customer (implied index).
  • Gender: Demographic indicator containing values for Male (M) and Female (F).
  • Age: Categorical bins representing the age group of the customer (e.g., 26-35, 36-45).
  • Occupation: Masked integer code representing the customer's profession.
  • City_Category: Anonymised classification of the city (A, B, C) where the transaction occurred.
  • Stay_In_Current_City_Years: Count of years the customer has resided in their current city.
  • Marital_Status: Binary indicator representing the customer's status (Single or Married).
  • Product_Category_1: The primary category of the purchased item (masked numeric).
  • Product_Category_2: The secondary category of the purchased item (masked numeric).
  • Product_Category_3: The tertiary category of the purchased item (masked numeric).
  • Purchase: The monetary value of the transaction.
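The column list above can be read as a record type. The following is a minimal sketch, assuming the CSV header uses exactly the column names listed and that the masked codes parse as numbers. The `Transaction` class, the helper names, and the two sample rows are hypothetical illustrations and are not taken from the dataset.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterator, Optional, TextIO


@dataclass(frozen=True)
class Transaction:
    """One row of the listed schema; the field types are assumptions."""
    user_id: str
    gender: str                      # "M" or "F"
    age: str                         # categorical bin such as "26-35"
    occupation: int                  # masked integer code
    city_category: str               # "A", "B" or "C"
    stay_in_current_city_years: str  # kept as text; encoding is not documented
    marital_status: str              # kept as text; encoding is not documented
    product_category_1: int
    product_category_2: Optional[int]
    product_category_3: Optional[int]
    purchase: float


def _optional_code(value: str) -> Optional[int]:
    """Parse a masked numeric code, tolerating '8', '8.0' and empty cells."""
    value = value.strip()
    return int(float(value)) if value else None


def read_transactions(handle: TextIO) -> Iterator[Transaction]:
    """Yield typed records from an open CSV handle with a header row."""
    for row in csv.DictReader(handle):
        yield Transaction(
            user_id=row["User_ID"],
            gender=row["Gender"],
            age=row["Age"],
            occupation=int(float(row["Occupation"])),
            city_category=row["City_Category"],
            stay_in_current_city_years=row["Stay_In_Current_City_Years"],
            marital_status=row["Marital_Status"],
            product_category_1=int(float(row["Product_Category_1"])),
            product_category_2=_optional_code(row["Product_Category_2"]),
            product_category_3=_optional_code(row["Product_Category_3"]),
            purchase=float(row["Purchase"]),
        )


# Made-up rows for illustration only; they do not come from the dataset.
SAMPLE_CSV = """\
User_ID,Gender,Age,Occupation,City_Category,Stay_In_Current_City_Years,Marital_Status,Product_Category_1,Product_Category_2,Product_Category_3,Purchase
U1,F,26-35,4,B,1,Single,5,8,14,7500
U2,M,36-45,7,A,3,Married,1,,,12000
"""

records = list(read_transactions(io.StringIO(SAMPLE_CSV)))
print(records[0])
```

With the downloaded file, the same function could be called on `open("Black_Friday_tranformed.csv", newline="")` in place of the in-memory sample.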

Distribution

The data is provided as a CSV file (Black_Friday_tranformed.csv) with a total size of approximately 23.29 MB. It contains 550,000 valid records (100% validity, with no missing values reported for the primary index). Purchase values have a mean of approximately 9,260 and a standard deviation of approximately 5,020.
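The summary figures above could be recomputed along the following lines. This is a sketch only, assuming a header column named `Purchase`; the function name `purchase_summary` and the three sample rows are hypothetical. The listing does not say whether its standard deviation is the population or the sample form; the population form is used here, and at this row count the two would differ only negligibly.

```python
import csv
import io
import statistics
from typing import Dict, TextIO


def purchase_summary(handle: TextIO) -> Dict[str, float]:
    """Return row count, mean and population standard deviation of Purchase."""
    purchases = [float(row["Purchase"]) for row in csv.DictReader(handle)]
    if not purchases:
        raise ValueError("no rows to summarise")
    return {
        "count": len(purchases),
        "mean": statistics.fmean(purchases),
        "std": statistics.pstdev(purchases),
    }


# Made-up values for illustration only; they do not come from the dataset.
SAMPLE_CSV = """\
User_ID,Purchase
U1,5000
U2,9000
U3,13000
"""

summary = purchase_summary(io.StringIO(SAMPLE_CSV))
print(summary)

# With the real file (path assumed to be the listed filename):
# with open("Black_Friday_tranformed.csv", newline="") as f:
#     print(purchase_summary(f))
```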

Usage

  • Predictive Modelling: Develop regression models to forecast purchase amounts based on demographic variables.
  • Customer Segmentation: Cluster consumers into personas based on spending habits, age, and occupation.
  • Product Association Analysis: Investigate correlations between different product categories (Category 1, 2, and 3).
  • Demographic Profiling: Analyse the impact of city category and residency duration on spending behaviour.
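Before fitting a regression model for the predictive modelling use case, a simple reference point helps: predict each purchase as the mean purchase of its demographic group. The sketch below is one such group-mean baseline, not a regression model and not a method prescribed by the listing. The function names and the sample rows are hypothetical, and the column names are assumed to match the header.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Sequence, Tuple

Group = Tuple[str, ...]


def fit_group_means(
    rows: Iterable[Mapping[str, str]], keys: Sequence[str]
) -> Tuple[Dict[Group, float], float]:
    """Return (mean Purchase per group, overall mean Purchase)."""
    sums: Dict[Group, float] = defaultdict(float)
    counts: Dict[Group, int] = defaultdict(int)
    for row in rows:
        group = tuple(row[k] for k in keys)
        sums[group] += float(row["Purchase"])
        counts[group] += 1
    total_rows = sum(counts.values())
    if total_rows == 0:
        raise ValueError("no rows to fit on")
    means = {group: sums[group] / counts[group] for group in sums}
    return means, sum(sums.values()) / total_rows


def predict(
    means: Mapping[Group, float],
    fallback: float,
    row: Mapping[str, str],
    keys: Sequence[str],
) -> float:
    """Predict the group mean, or the overall mean for an unseen group."""
    return means.get(tuple(row[k] for k in keys), fallback)


# Made-up rows for illustration only; they do not come from the dataset.
SAMPLE_CSV = """\
Gender,Age,Purchase
M,26-35,8000
M,26-35,10000
F,26-35,6000
F,36-45,12000
"""

KEYS = ("Gender", "Age")
group_means, overall_mean = fit_group_means(
    csv.DictReader(io.StringIO(SAMPLE_CSV)), KEYS
)
seen = predict(group_means, overall_mean, {"Gender": "M", "Age": "26-35"}, KEYS)
unseen = predict(group_means, overall_mean, {"Gender": "M", "Age": "36-45"}, KEYS)
print(seen, unseen)
```

Any regression model trained on the same columns should beat this baseline on held-out rows; if it does not, the extra features are probably not adding signal.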

Coverage

The data represents a diverse consumer base but shows specific concentrations:
  • Demographics: 75% of records are for male customers and 25% for female customers. The largest age group is 26-35 (40%), followed by 36-45.
  • Marital Status: Single customers make up 59% of the records, while 41% are married.
  • Geography: City Category B is the most common (42%), followed by C (31%).
  • Residency: The largest segment of users (35%) has lived in their current city for 1 year.
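Percentages like those above are the share of rows taking each value of a column. A minimal sketch of that calculation follows, assuming the column names match the header; the function name `category_shares` and the sample rows are hypothetical, and the sample deliberately does not reproduce the dataset's real proportions.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable, Mapping


def category_shares(
    rows: Iterable[Mapping[str, str]], column: str
) -> Dict[str, float]:
    """Return each value's share of rows in percent, most common first."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: 100.0 * n / total for value, n in counts.most_common()}


# Made-up rows for illustration only; they do not come from the dataset.
SAMPLE_CSV = """\
Gender,City_Category
M,B
M,B
M,C
F,A
F,B
"""

sample_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
gender_shares = category_shares(sample_rows, "Gender")
city_shares = category_shares(sample_rows, "City_Category")
print(gender_shares, city_shares)
```

The same call with `"Age"`, `"Marital_Status"` or `"Stay_In_Current_City_Years"` would cover the remaining bullets.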

License

CC0: Public Domain

Who Can Use It

  • Retail Analysts: To optimise stock levels and pricing strategies for peak sales periods.
  • Data Scientists: For training regression and classification models on clean, structured tabular data.
  • Marketing Managers: To tailor advertising campaigns toward high-value demographic segments.
  • Business Intelligence Professionals: To visualise sales performance and customer retention metrics.

Dataset Name Suggestions

  • Black Friday Retail Intelligence
  • Consumer Demographics and Sales Performance
  • Black Friday Customer Spend Analysis
  • Retail Transaction & User Persona Data

Attributes

Listing Stats

  • Views: 7
  • Downloads: 3
  • Listed: 02/12/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
