Online Retail Customer Segmentation Data Set
E-commerce & Online Transactions
Free
About
This transnational data set contains all transactions that occurred between 1 December 2010 and 9 December 2011 for a UK-registered, non-store online retailer that specialises in unique gifts for all occasions. The data is profiled for use in customer segmentation studies and offers insight into wholesale purchasing patterns, as many of the customers are documented as wholesalers.
Columns
- InvoiceNo: A nominal, 6-digit number that uniquely identifies each transaction. If the code begins with the letter 'C', it signifies a cancellation.
- StockCode: A nominal, 5-digit number that uniquely identifies each distinct product or item.
- Description: The nominal name given to the product or item.
- Quantity: The numeric value representing the quantity of each product item included in the transaction.
- InvoiceDate: The numeric field recording the date and time when the transaction was generated.
- UnitPrice: The numeric price of the product per unit, denominated in sterling.
- CustomerID: A nominal, 5-digit number that uniquely identifies each customer.
- Country: The nominal name of the country where the customer resides.
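As a rough illustration of how these columns might be read, the sketch below parses a few records with Python's standard `csv` module, flags cancellations by the invoice prefix (checked case-insensitively), and derives a per-line total from Quantity and UnitPrice. The sample rows, the date format, the negative quantity on the cancelled line, and the names `SAMPLE_CSV` and `parse_rows` are all assumptions made for the example; they are not taken from the data set.

```python
import csv
import io
from decimal import Decimal

# Hypothetical rows, invented for illustration only (not real records).
# The header follows the column names listed above.
SAMPLE_CSV = """InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
100001,10001,EXAMPLE ITEM A,6,2011-01-05 10:15,2.50,20001,United Kingdom
C100002,10001,EXAMPLE ITEM A,-2,2011-01-06 09:00,2.50,20001,United Kingdom
100003,10002,EXAMPLE ITEM B,1,2011-01-07 14:30,4.25,,France
"""


def parse_rows(text):
    """Yield one dict per record with typed fields and a cancellation flag."""
    for row in csv.DictReader(io.StringIO(text)):
        quantity = int(row["Quantity"])
        unit_price = Decimal(row["UnitPrice"])  # Decimal avoids float rounding on prices
        yield {
            "invoice_no": row["InvoiceNo"],
            # A leading 'C' (either case) marks a cancellation.
            "is_cancellation": row["InvoiceNo"].upper().startswith("C"),
            # An empty CustomerID is treated as missing.
            "customer_id": row["CustomerID"] or None,
            "country": row["Country"],
            "line_total": quantity * unit_price,
        }


rows = list(parse_rows(SAMPLE_CSV))
for record in rows:
    print(record["invoice_no"], record["is_cancellation"], record["line_total"])
```

Keeping InvoiceNo, StockCode, and CustomerID as strings reflects their nominal role; they are identifiers rather than quantities to compute with.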
Distribution
This dataset holds transactional data across 8 distinct columns. The data file, typically delivered in CSV format, is 45.04 MB in size and includes approximately 542,000 records.
Usage
This data is ideally suited for models requiring customer segmentation. It is often used for clustering analysis to identify distinct groups of buyers. Potential use cases include retail analytics, market basket analysis, studying sales seasonality, and calculating customer lifetime value.
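One common way to prepare features for segmentation is a recency, frequency, monetary (RFM) summary per customer. The source does not prescribe this method; the sketch below is only one possible starting point, assuming records have already been reduced to (customer ID, invoice number, invoice date, line total) tuples. The sample records, the date format, the snapshot date, and the function name `rfm` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical records: (customer_id, invoice_no, invoice_date, line_total).
# Invented for illustration; not taken from the data set.
RECORDS = [
    ("20001", "100001", "2011-01-05 10:15", 15.00),
    ("20001", "100004", "2011-03-01 12:00", 30.00),
    ("20002", "100003", "2011-02-10 09:30", 8.50),
    (None, "100005", "2011-02-11 09:30", 4.25),  # no customer ID: skipped
]


def rfm(records, snapshot, date_format="%Y-%m-%d %H:%M"):
    """Return {customer_id: {recency_days, frequency, monetary}}."""
    invoices = defaultdict(set)    # distinct invoices per customer
    monetary = defaultdict(float)  # summed line totals per customer
    last_seen = {}                 # most recent purchase per customer
    for customer_id, invoice_no, date_text, total in records:
        if customer_id is None:
            continue  # records without a customer cannot be attributed
        when = datetime.strptime(date_text, date_format)
        invoices[customer_id].add(invoice_no)
        monetary[customer_id] += total
        if customer_id not in last_seen or when > last_seen[customer_id]:
            last_seen[customer_id] = when
    return {
        cid: {
            "recency_days": (snapshot - last_seen[cid]).days,
            "frequency": len(invoices[cid]),
            "monetary": monetary[cid],
        }
        for cid in invoices
    }


# Assumed snapshot: the day after the last date covered by the data.
SNAPSHOT = datetime(2011, 12, 10)
table = rfm(RECORDS, SNAPSHOT)
for cid, features in sorted(table.items()):
    print(cid, features)
```

The resulting three features per customer could then be scaled and passed to a clustering algorithm of the analyst's choice. Whether cancellations should be netted out of the monetary value beforehand is a modelling decision left open here.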
Coverage
The transactional activity spans from 1 December 2010 through to 9 December 2011. Geographically, the data is transnational, originating from a UK-registered retailer. Thirty-eight unique countries are represented in the customer base, although the vast majority of records (91%) correspond to customers residing in the United Kingdom. Note that roughly 25% of the records are missing a Customer ID value.
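The coverage figures above (share of records per country, share of records with no Customer ID) can be rechecked on a loaded copy of the data with a short summary like the following sketch. The sample pairs and the function name `coverage_summary` are hypothetical, and the sketch assumes a missing Customer ID appears as an empty string.

```python
from collections import Counter

# Hypothetical (country, customer_id) pairs, invented for illustration.
RECORDS = [
    ("United Kingdom", "20001"),
    ("United Kingdom", ""),  # missing Customer ID
    ("United Kingdom", "20003"),
    ("France", "20002"),
]


def coverage_summary(records):
    """Summarise country shares and the fraction of records lacking a customer ID."""
    total = len(records)
    if total == 0:
        raise ValueError("no records to summarise")
    by_country = Counter(country for country, _ in records)
    missing = sum(1 for _, customer_id in records if not customer_id.strip())
    return {
        "n_countries": len(by_country),
        "country_share": {c: n / total for c, n in by_country.items()},
        "missing_customer_share": missing / total,
    }


summary = coverage_summary(RECORDS)
print(summary)
```

Because records without a Customer ID cannot be attributed to a buyer, a segmentation study would typically need to decide up front whether to drop them or analyse them separately.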
License
CC0: Public Domain
Who Can Use It
The intended users include data scientists focusing on market basket and retail analytics, academics studying e-commerce dynamics, and machine learning practitioners developing customer relationship management strategies. Financial analysts can use the unit price and quantity data to model revenue volatility.
Dataset Name Suggestions
- UK E-commerce Transactions Data 2010-2011
- Online Retail Customer Segmentation Data Set
- Transnational Gift Sales Records
Attributes
Original Data Source: Online Retail Customer Segmentation Data Set
