Opendatabay APP

Online Retail Customer Segmentation Data Set

E-commerce & Online Transactions

Tags and Keywords

  • Retail
  • E-commerce
  • Transactions
  • Customer
  • Clustering


Free

About

This transnational data set contains all transactions that occurred between 1 December 2010 and 9 December 2011 for a non-store online retailer registered in the UK. The company specialises in selling unique gifts suitable for all occasions. Many of its customers are documented as wholesalers, so the data also gives insight into wholesale purchasing patterns, and it is profiled for use in customer segmentation studies.

Columns

  • InvoiceNo: A nominal, 6-digit number that uniquely identifies each transaction. If the code begins with the letter 'c', it signifies a cancellation event.
  • StockCode: A nominal, 5-digit number that uniquely identifies each distinct product or item.
  • Description: The nominal name given to the product or item.
  • Quantity: The numeric value representing the quantity of each product item included in the transaction.
  • InvoiceDate: The numeric field recording the date and time when the transaction was generated.
  • UnitPrice: The numeric price of the product per unit, denominated in sterling.
  • CustomerID: A nominal, 5-digit number that uniquely identifies each customer.
  • Country: The nominal name of the country where the customer resides.
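The column descriptions above can be turned into a small typed loader. The following is a minimal sketch using only the Python standard library. The sample rows, the `DATE_FORMAT` string, and the helper names `parse_row` and `load_transactions` are assumptions made for illustration; the listing does not state the timestamp format, so check it against the actual file.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows that follow the column descriptions above.
# They are illustrative and are not taken from the real file.
SAMPLE_CSV = """InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
536365,85123,WHITE HEART HOLDER,6,2010-12-01 08:26:00,2.55,17850,United Kingdom
C536379,22139,RETRO TEA SET,-1,2010-12-01 09:41:00,4.25,14527,United Kingdom
536544,21773,CAKE TIN SET,2,2010-12-01 14:32:00,2.51,,United Kingdom
"""

# Assumption: the listing does not give the timestamp format; adjust as needed.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_row(raw):
    """Convert one CSV row (a dict of strings) into typed values."""
    invoice_no = raw["InvoiceNo"].strip()
    return {
        "invoice_no": invoice_no,
        "stock_code": raw["StockCode"].strip(),
        "description": raw["Description"].strip(),
        "quantity": int(raw["Quantity"]),
        "invoice_date": datetime.strptime(raw["InvoiceDate"], DATE_FORMAT),
        "unit_price": float(raw["UnitPrice"]),
        # CustomerID is nominal, so keep it as text; an empty field means missing.
        "customer_id": raw["CustomerID"].strip() or None,
        "country": raw["Country"].strip(),
        # Case-insensitive check, since the letter case in the file is not guaranteed.
        "is_cancellation": invoice_no.lower().startswith("c"),
    }


def load_transactions(handle):
    """Read every row from an open CSV file handle."""
    return [parse_row(row) for row in csv.DictReader(handle)]


rows = load_transactions(io.StringIO(SAMPLE_CSV))
```

To read the downloaded file instead of the in-memory sample, pass an open file handle (for example `open(path, newline="")`) to `load_transactions`.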

Distribution

The dataset has 8 columns and approximately 542,000 records. The data file, typically delivered in CSV format, is about 45.04 MB in size.

Usage

The data is well suited to customer segmentation and is often used in clustering analysis to identify distinct groups of buyers. Other potential use cases include retail analytics, market basket analysis, studying sales seasonality, and calculating customer lifetime value.
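One common way to prepare data like this for clustering is to build recency, frequency, and monetary (RFM) features per customer. The listing does not prescribe this method; the sketch below is one possible approach. The sample transactions, the `rfm_table` helper, and the snapshot date are assumptions for illustration. Rows without a CustomerID and cancelled invoices are skipped here, which is a design choice rather than a requirement.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical, already-parsed transactions:
# (customer_id, invoice_no, invoice_date, quantity, unit_price)
transactions = [
    ("17850", "536365", datetime(2010, 12, 1, 8, 26), 6, 2.50),
    ("17850", "536365", datetime(2010, 12, 1, 8, 26), 2, 4.00),
    ("17850", "540001", datetime(2011, 1, 10, 9, 0), 4, 2.50),
    ("13047", "536367", datetime(2010, 12, 1, 8, 34), 10, 1.50),
    (None, "536544", datetime(2010, 12, 1, 14, 32), 1, 2.00),      # no CustomerID
    ("13047", "C536400", datetime(2010, 12, 2, 10, 0), -2, 1.50),  # cancellation
]


def rfm_table(rows, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_seen = {}
    invoices = defaultdict(set)
    spend = defaultdict(float)
    for customer_id, invoice_no, when, quantity, unit_price in rows:
        # Skip rows that cannot be attributed to a customer, and cancellations.
        if customer_id is None or invoice_no.lower().startswith("c"):
            continue
        last_seen[customer_id] = max(last_seen.get(customer_id, when), when)
        invoices[customer_id].add(invoice_no)        # frequency = distinct invoices
        spend[customer_id] += quantity * unit_price  # monetary = total spend
    return {
        cid: (
            (snapshot - last_seen[cid]).days,
            len(invoices[cid]),
            round(spend[cid], 2),
        )
        for cid in last_seen
    }


# Assumption: measure recency from the day after the last covered date.
snapshot_date = datetime(2011, 12, 10)
rfm = rfm_table(transactions, snapshot_date)
```

The resulting three-value tuples can then be scaled and passed to a clustering algorithm of your choice.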

Coverage

The transactional activity spans 1 December 2010 to 9 December 2011. Geographically, the data is transnational and originates from a UK-registered retailer. The customer base covers 38 unique countries, although the vast majority of records (91%) correspond to customers residing in the United Kingdom. Note that roughly 25% of the records are missing a CustomerID value.
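The coverage figures above (country share and missing CustomerID share) can be checked against a downloaded copy with a short summary like the one below. This is a sketch under assumptions: the `records` sample and the `coverage_summary` helper are hypothetical, and the sample values are not the real proportions.

```python
from collections import Counter

# Hypothetical (customer_id, country) pairs; None marks a missing CustomerID.
records = [
    ("17850", "United Kingdom"),
    ("13047", "United Kingdom"),
    (None, "United Kingdom"),
    ("12583", "France"),
]


def coverage_summary(pairs):
    """Summarise country shares and the share of rows without a CustomerID."""
    total = len(pairs)
    if total == 0:
        raise ValueError("no records to summarise")
    by_country = Counter(country for _, country in pairs)
    missing = sum(1 for customer_id, _ in pairs if not customer_id)
    return {
        "unique_countries": len(by_country),
        "country_share": {c: n / total for c, n in by_country.items()},
        "missing_customer_share": missing / total,
    }


summary = coverage_summary(records)
```

Because a quarter of the real records reportedly lack a CustomerID, it is worth deciding early whether customer-level analyses should drop those rows or handle them separately.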

License

CC0: Public Domain

Who Can Use It

The intended users include data scientists focusing on market basket and retail analytics, academics studying e-commerce dynamics, and machine learning practitioners developing customer relationship management strategies. Financial analysts can use the unit price and quantity data to model revenue volatility.

Dataset Name Suggestions

  • UK E-commerce Transactions Data 2010-2011
  • Online Retail Customer Segmentation Data Set
  • Transnational Gift Sales Records

Attributes

Listing Stats

  • Views: 8
  • Downloads: 1
  • Listed: 29/11/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
