Telecom Customer Churn Prediction Data
Synthetic Data Generation
Free
About
This dataset contains synthetic data designed for the modelling and analysis of telecom customer churn. It provides insights into customer attributes and their churn status, with each row representing an individual customer and columns detailing various characteristics and behaviours. It is ideal for developing predictive models to identify customers at risk of churning.
Columns
The dataset includes 21 columns, offering a range of customer information:
- customerID: A unique identifier for each customer.
- gender: The customer's gender (Male, Female).
- SeniorCitizen: Indicates if the customer is a senior citizen (1: Yes, 0: No).
- Partner: Denotes whether the customer has a partner (Yes, No).
- Dependents: Shows if the customer has dependents (Yes, No).
- tenure: The number of months the customer has remained with the company.
- PhoneService: Specifies if the customer has a phone service (Yes, No).
- MultipleLines: Indicates if the customer has multiple lines (Yes, No, No phone service).
- InternetService: The type of internet service the customer subscribes to (DSL, Fiber optic, No).
- OnlineSecurity: Details if the customer has online security (Yes, No, No internet service).
- OnlineBackup: Shows if the customer uses online backup (Yes, No, No internet service).
- DeviceProtection: Indicates if the customer has device protection (Yes, No, No internet service).
- TechSupport: Specifies if the customer has tech support (Yes, No, No internet service).
- StreamingTV: Notes whether the customer has streaming TV (Yes, No, No internet service).
- StreamingMovies: Details if the customer has streaming movies (Yes, No, No internet service).
- Contract: The customer's contract term (Month-to-month, One year, Two year).
- PaperlessBilling: Shows if the customer uses paperless billing (Yes, No).
- PaymentMethod: The method used for customer payments (Electronic check, Mailed check, Bank transfer, Credit card).
- MonthlyCharges: The amount billed to the customer each month.
- TotalCharges: The overall amount charged to the customer.
- Churn: The target variable, indicating whether the customer churned (Yes, No).
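The column list above can be turned into a lightweight schema check. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses exactly the column names listed above and that categorical values are stored as the literal strings shown; the helper name `check_schema` is hypothetical and not part of any published tooling for this dataset.

```python
import csv
import io

# Column names as listed above (assumed to match the CSV header exactly).
EXPECTED_COLUMNS = [
    "customerID", "gender", "SeniorCitizen", "Partner", "Dependents",
    "tenure", "PhoneService", "MultipleLines", "InternetService",
    "OnlineSecurity", "OnlineBackup", "DeviceProtection", "TechSupport",
    "StreamingTV", "StreamingMovies", "Contract", "PaperlessBilling",
    "PaymentMethod", "MonthlyCharges", "TotalCharges", "Churn",
]

# Allowed values for a few categorical columns, taken from the list above.
ALLOWED_VALUES = {
    "gender": {"Male", "Female"},
    "SeniorCitizen": {"0", "1"},
    "Contract": {"Month-to-month", "One year", "Two year"},
    "Churn": {"Yes", "No"},
}


def check_schema(csv_file):
    """Return a list of human-readable problems found in the CSV (hypothetical helper)."""
    reader = csv.DictReader(csv_file)
    problems = []
    header = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    if missing:
        problems.append(f"missing columns: {missing}")
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for column, allowed in ALLOWED_VALUES.items():
            value = row.get(column)
            # Columns absent from the file are reported once above, not per row.
            if value is not None and value not in allowed:
                problems.append(f"line {line_no}: unexpected {column}={value!r}")
    return problems


# Tiny in-memory demo with a deliberately incomplete header and a bad value.
demo = io.StringIO("customerID,gender\nC-001,Other\n")
print(check_schema(demo))
```

To run the check against the real file, open it with `open("customer_churn_data.csv", newline="")` and pass the file object to `check_schema`. An empty list means no problems were found for the columns covered by `ALLOWED_VALUES`.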
Distribution
This dataset is provided as a single CSV file, customer_churn_data.csv, approximately 874.33 kB in size. It contains 5880 records and 21 columns, all of which are valid and have no missing values. The dataset is expected to be updated quarterly.
Usage
This dataset is well-suited for a variety of analytical tasks, including:
- Developing and evaluating machine learning models for customer churn prediction.
- Performing exploratory data analysis to uncover patterns and drivers of churn.
- Conducting customer segmentation based on service usage and demographic features.
- Understanding the impact of different services and contract types on customer retention.
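As a starting point for the exploratory tasks above, churn rates can be compared across the categories of a single column, such as Contract. The sketch below uses only the Python standard library; the function name `churn_rate_by` is hypothetical, and the sample rows are made up for illustration rather than taken from the dataset.

```python
import csv
import io
from collections import defaultdict


def churn_rate_by(csv_file, column):
    """Return {category: share of rows with Churn == 'Yes'} for one column."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in csv.DictReader(csv_file):
        key = row[column]
        totals[key] += 1
        if row["Churn"] == "Yes":
            churned[key] += 1
    return {key: churned[key] / totals[key] for key in totals}


# Made-up rows for illustration only; these are not taken from the dataset.
sample = io.StringIO(
    "Contract,Churn\n"
    "Month-to-month,Yes\n"
    "Month-to-month,No\n"
    "Two year,No\n"
)
print(churn_rate_by(sample, "Contract"))
# prints {'Month-to-month': 0.5, 'Two year': 0.0}
```

The same function works for any categorical column, for example `churn_rate_by(f, "InternetService")` with `f` opened on customer_churn_data.csv. A per-category rate is a descriptive summary only; it does not by itself show that a service or contract type causes churn.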
Coverage
As synthetic data, this dataset does not correspond to a specific geographic region or time period. It represents a diverse set of fictional telecom customers, including distinctions by gender, senior citizen status, partnership status, and dependents. All 5880 records across the 21 columns are free of missing values.
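The stated record count and the absence of missing values can be verified after download. The sketch below is one possible check using only the Python standard library. It assumes that a missing value would appear as an empty or whitespace-only cell, which is an assumption about the file's encoding of missing data rather than something the listing states; the function name `count_rows_and_blanks` is hypothetical.

```python
import csv
import io


def count_rows_and_blanks(csv_file):
    """Return (row_count, {column: number of blank cells})."""
    reader = csv.DictReader(csv_file)
    blanks = {name: 0 for name in (reader.fieldnames or [])}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in blanks:
            value = row.get(name)
            # Treat absent cells and whitespace-only strings as missing.
            if value is None or value.strip() == "":
                blanks[name] += 1
    return row_count, blanks


# Made-up rows for illustration; the second row has a blank TotalCharges.
demo = io.StringIO("customerID,TotalCharges\nC-001,29.85\nC-002, \n")
print(count_rows_and_blanks(demo))
# prints (2, {'customerID': 0, 'TotalCharges': 1})
```

Run against customer_churn_data.csv, a result matching the listing would be a row count of 5880 and a blank count of zero for each of the 21 columns.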
License
CC0: Public Domain
Who Can Use It
- Data Scientists and Machine Learning Engineers: For building and testing churn prediction models.
- Business Analysts: To gain insights into customer behaviour and inform retention strategies.
- Students and Researchers: As a practical resource for learning data cleaning, categorical data analysis, and predictive analytics in a business context.
- Beginners in Data Science: The clear structure and lack of missing data make it an excellent starting point for data analysis practice.
Dataset Name Suggestions
- Telecom Customer Churn Prediction Data
- Synthetic Churn Analysis Dataset
- Customer Retention Analytics (Telecom)
- Churn Modelling Data for Telecommunications
Attributes
Original Data Source: Telecom Customer Churn Prediction Data