Customer Purchase History UK
E-commerce & Online Transactions
Free
About
This dataset contains transactional data from a UK-based online retailer with no physical store [1]. It records product details, quantities, prices, invoice dates, and customer information for each purchase [1, 2]. The data supports analysis of online retail operations and consumer behaviour patterns.
Columns
- InvoiceNo: A 6-digit number uniquely identifying each transaction. A code prefixed with the letter 'C' indicates a cancelled transaction. This is a nominal field [1]. There are 25,900 unique invoice numbers [2].
- StockCode: A 5-digit number serving as a unique identifier for each distinct product or item. This is a nominal field [1]. There are 4,070 unique product codes [3].
- Description: The name of the product or item sold. This is a nominal field [1]. There are 4,224 unique product descriptions, with some missing values [3].
- Quantity: The number of units of each product included in a transaction. This is a numeric field [1]. Quantities range widely, including negative values, with a mean of 9.55 [4].
- InvoiceDate: The date and time when each transaction was generated. The source description classes this as a numeric field [1], although in practice it is a date-time value. The data spans from 1st December 2010 to 9th December 2011 [5].
- UnitPrice: The price per unit of the product in sterling. This is a numeric field [2]. The mean unit price is 4.61 [6].
- CustomerID: A 5-digit number uniquely assigned to each customer. This is a nominal field [2]. Approximately 25% of customer ID values are missing [7].
- Country: The name of the country where the customer resides. This is a nominal field [2]. There are 38 unique countries represented, with the United Kingdom accounting for 91% of entries [7].
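The column descriptions above can be turned into a small typed loader. The following is a minimal sketch using only the Python standard library. The sample rows, the `%m/%d/%Y %H:%M` date format, and the helper names `parse_row` and `load_transactions` are assumptions made for illustration and are not taken from the dataset documentation; adjust the date format to match the actual file.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows in the column layout described above.
# Values and date format are illustrative assumptions, not real records.
SAMPLE_CSV = """InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
500001,10001,SAMPLE ITEM A,6,12/1/2010 8:26,2.55,17850,United Kingdom
C500002,10002,SAMPLE ITEM B,-1,12/1/2010 9:41,27.50,14527,United Kingdom
500003,10003,,56,12/1/2010 11:52,0.00,,France
"""


def parse_row(row, date_format="%m/%d/%Y %H:%M"):
    """Convert one CSV row (a dict of strings) into typed values."""
    invoice_no = row["InvoiceNo"].strip()
    return {
        "invoice_no": invoice_no,
        # A leading 'c' or 'C' marks a cancelled transaction.
        "is_cancelled": invoice_no.upper().startswith("C"),
        "stock_code": row["StockCode"].strip(),
        # Empty strings stand for missing values in this sketch.
        "description": row["Description"].strip() or None,
        "quantity": int(row["Quantity"]),
        "invoice_date": datetime.strptime(row["InvoiceDate"].strip(), date_format),
        "unit_price": float(row["UnitPrice"]),
        "customer_id": row["CustomerID"].strip() or None,
        "country": row["Country"].strip(),
    }


def load_transactions(handle):
    """Read every row from an open CSV file object."""
    return [parse_row(row) for row in csv.DictReader(handle)]


records = load_transactions(io.StringIO(SAMPLE_CSV))
cancelled = [r for r in records if r["is_cancelled"]]
```

To read the real file, pass an open file object instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where the path and encoding depend on how the file was downloaded.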
Distribution
The dataset is typically provided as a CSV file, with a file size of 48.58 MB [2, 8]. It comprises 8 distinct columns [2]. Most columns, such as InvoiceNo, StockCode, Quantity, InvoiceDate, UnitPrice, and Country, contain approximately 542,000 records [2-7]. The 'Description' column has around 540,000 valid entries, while 'CustomerID' has 407,000 valid entries [3, 7].
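The valid-entry counts quoted above can be checked with a short pass over the file. This is a sketch under the assumption that missing values appear as empty fields in the CSV; the function name `valid_counts` and the sample rows are hypothetical.

```python
import csv
import io

# Hypothetical sample: one row lacks a Description, one lacks a CustomerID.
SAMPLE_CSV = """InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
500001,10001,SAMPLE ITEM A,6,12/1/2010 8:26,2.55,17850,United Kingdom
500002,10002,,2,12/1/2010 9:00,1.00,17850,United Kingdom
500003,10003,SAMPLE ITEM C,1,12/1/2010 9:30,3.00,,United Kingdom
500004,10004,SAMPLE ITEM D,4,12/1/2010 9:45,0.85,13047,France
"""


def valid_counts(handle):
    """Return (total_rows, {column: number of non-empty values})."""
    reader = csv.DictReader(handle)
    columns = reader.fieldnames or []
    counts = {name: 0 for name in columns}
    total = 0
    for row in reader:
        total += 1
        for name in columns:
            value = row.get(name)
            if value is not None and value.strip():
                counts[name] += 1
    return total, counts


total_rows, counts = valid_counts(io.StringIO(SAMPLE_CSV))
# Share of rows with no CustomerID in the sample.
missing_customer_share = 1 - counts["CustomerID"] / total_rows
```

Run against the full file, the same calculation should give a CustomerID missing share close to the roughly 25% reported above (407,000 valid entries out of about 542,000 records).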
Usage
This dataset is ideal for:
- Customer segmentation: Identifying distinct customer groups based on purchasing behaviour using CustomerID.
- Market basket analysis: Discovering product associations and items frequently bought together using InvoiceNo, StockCode, and Quantity.
- Sales trend analysis: Analysing seasonal patterns, peak sales periods, and overall revenue trends over time using InvoiceDate, Quantity, and UnitPrice.
- Geographic sales analysis: Understanding sales performance and customer distribution across different countries.
- Inventory management: Predicting demand for specific products based on historical sales data.
- Churn prediction: Potentially identifying customers at risk of no longer purchasing (though additional features might be needed).
- Fraud detection: Investigating unusual transaction patterns, such as large negative quantities or high-value cancellations.
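Two of the uses listed above, sales trend analysis and customer segmentation, can be sketched with plain Python. The example below computes net revenue per month and a simple recency, frequency, and monetary (RFM) summary per customer. RFM is one common segmentation approach chosen here for illustration, not a method prescribed by the dataset. The sample rows, the function names, and the reference date `AS_OF` are hypothetical, and excluding cancelled invoices from the RFM summary is a design assumption.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows: (InvoiceNo, InvoiceDate, Quantity, UnitPrice, CustomerID)
ROWS = [
    ("500001", datetime(2010, 12, 1, 8, 26), 6, 2.50, "17850"),
    ("500001", datetime(2010, 12, 1, 8, 26), 2, 5.00, "17850"),
    ("500002", datetime(2011, 1, 5, 10, 0), 4, 1.25, "13047"),
    ("C500003", datetime(2011, 1, 6, 9, 0), -2, 1.25, "13047"),
    ("500004", datetime(2011, 1, 20, 14, 30), 10, 3.00, None),
]

# Assumed reference date for recency: the day after the last date in the data.
AS_OF = datetime(2011, 12, 10)


def monthly_revenue(rows):
    """Net revenue (Quantity * UnitPrice) keyed by (year, month)."""
    totals = defaultdict(float)
    for _invoice, when, quantity, price, _customer in rows:
        # Negative quantities reduce the monthly total.
        totals[(when.year, when.month)] += quantity * price
    return dict(totals)


def rfm_summary(rows, as_of=AS_OF):
    """Per-customer recency (days), frequency (invoices), monetary (spend)."""
    invoices = defaultdict(set)
    spend = defaultdict(float)
    last_seen = {}
    for invoice, when, quantity, price, customer in rows:
        # Skip rows without a CustomerID and cancelled invoices.
        if customer is None or invoice.upper().startswith("C"):
            continue
        invoices[customer].add(invoice)
        spend[customer] += quantity * price
        if customer not in last_seen or when > last_seen[customer]:
            last_seen[customer] = when
    return {
        customer: {
            "recency_days": (as_of.date() - last_seen[customer].date()).days,
            "frequency": len(invoices[customer]),
            "monetary": spend[customer],
        }
        for customer in invoices
    }


revenue = monthly_revenue(ROWS)
rfm = rfm_summary(ROWS)
```

Because about a quarter of records have no CustomerID, any per-customer summary of this kind covers only part of total revenue, whereas the monthly totals use every row.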
Coverage
- Geographic: The data primarily covers transactions from the United Kingdom, accounting for 91% of all records. It also includes transactions from 37 other countries [7].
- Time Range: The transactions recorded span from 1st December 2010 to 9th December 2011 [5].
- Demographic: While customer IDs are included, demographic details such as age, gender, or income are not part of this dataset. Approximately 25% of records have no associated CustomerID [7].
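The geographic split described above can be reproduced by counting rows per country. This is a minimal sketch; the function name `country_shares` and the sample values are hypothetical.

```python
from collections import Counter


def country_shares(countries):
    """Return {country: share of rows}, ordered from most to least frequent."""
    counts = Counter(countries)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {country: n / total for country, n in counts.most_common()}


# Hypothetical Country column values: nine UK rows and one French row.
sample = ["United Kingdom"] * 9 + ["France"]
shares = country_shares(sample)
```

On the full dataset, the share for the United Kingdom should come out near the 91% stated above, with the remainder spread over the other 37 countries.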
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
- Data Scientists and Machine Learning Engineers: For developing predictive models related to sales forecasting, customer lifetime value, and recommendation systems.
- Business Intelligence Analysts: For creating dashboards and reports on sales performance, customer trends, and product popularity.
- E-commerce Managers: To inform strategic decisions on pricing, promotions, product bundling, and customer retention campaigns.
- Academic Researchers: For studies in consumer behaviour, retail analytics, and economic modelling.
Dataset Name Suggestions
- UK E-commerce Transactions
- Online Retail Sales Data
- Customer Purchase History UK
- Retail Transaction Log 2010-2011
Attributes
Original Data Source: Customer Purchase History UK