Bank Telemarketing Success Prediction
Finance & Banking Analytics
Free
About
This dataset is designed for predicting the success of a bank's direct marketing campaigns conducted by phone [1]. It contains records from a direct marketing effort by a Portuguese banking institution, in which a single client may have been called more than once [1]. The classification goal is to predict whether a client will subscribe to a term deposit (yes/no) [1]. Term deposits are a significant source of revenue for banks, and although telemarketing remains popular and can be effective because of its human-to-human contact, it often requires substantial investment [2]. The dataset is intended to help identify patterns and inform strategies for more effective direct marketing campaigns, and to let users test classification model performance [3, 4]. It is a modified version of a classic bank marketing dataset: the call 'duration' feature has been deliberately excluded to prevent data leakage, because the duration is not known before a call is made and strongly affects the outcome [5, 6].
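The leakage concern can be made concrete with a small guard. The sketch below, in Python using only the standard library, drops 'duration' from a record in case a pipeline is ever fed the original file that still includes it. The helper name `drop_leaky_columns` and the sample record are illustrative assumptions, not part of the dataset.

```python
# Columns whose values are not known before a call is placed.
# 'duration' is the one named in the dataset description.
LEAKY_COLUMNS = frozenset({"duration"})

def drop_leaky_columns(record, leaky=LEAKY_COLUMNS):
    """Return a copy of one row (a dict) without leakage-prone columns."""
    return {name: value for name, value in record.items() if name not in leaky}

# Hypothetical row in the shape of the original file (values are made up).
original_row = {"age": "41", "contact": "cellular", "duration": "312", "y": "yes"}
clean_row = drop_leaky_columns(original_row)
print(sorted(clean_row))  # column names that remain
```

Returning a new dict rather than mutating the input keeps the original record available for auditing.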
Columns
The original dataset comprises 20 input variables and one output variable; with 'duration' excluded, the 19 inputs listed below remain:
Bank Client Data:
- age: Client's age (numeric) [6].
- job: Type of job (categorical: 'admin.', 'blue-collar', 'entrepreneur', 'housemaid', 'management', 'retired', 'self-employed', 'services', 'student', 'technician', 'unemployed', 'unknown') [6].
- marital: Marital status (categorical: 'divorced', 'married', 'single', 'unknown'; 'divorced' includes widowed) [6].
- education: Client's education level (categorical: 'basic.4y', 'basic.6y', 'basic.9y', 'high.school', 'illiterate', 'professional.course', 'university.degree', 'unknown') [6, 7].
- default: Has credit in default? (categorical: 'no', 'yes', 'unknown') [7].
- housing: Has housing loan? (categorical: 'no', 'yes', 'unknown') [7].
- loan: Has personal loan? (categorical: 'no', 'yes', 'unknown') [7].
Related to Last Contact of Current Campaign:
- contact: Contact communication type (categorical: 'cellular', 'telephone') [7].
- month: Last contact month of year (categorical: 'jan', 'feb', ..., 'dec') [7].
- day_of_week: Last contact day of the week (categorical: 'mon', 'tue', 'wed', 'thu', 'fri') [7].
Other Attributes:
- campaign: Number of contacts performed during this campaign for the client (numeric, includes last contact) [8].
- pdays: Number of days passed since the client was last contacted from a previous campaign (numeric; 999 means not previously contacted) [8].
- previous: Number of contacts performed before this campaign for the client (numeric) [8].
- poutcome: Outcome of the previous marketing campaign (categorical: 'failure', 'nonexistent', 'success') [8].
Social and Economic Context Attributes:
- emp.var.rate: Employment variation rate - quarterly indicator (numeric) [9].
- cons.price.idx: Consumer price index - monthly indicator (numeric) [9].
- cons.conf.idx: Consumer confidence index - monthly indicator (numeric) [9].
- euribor3m: Euribor 3 month rate - daily indicator (numeric) [9].
- nr.employed: Number of employees - quarterly indicator (numeric) [9].
Output Variable (Desired Target):
- y: Has the client subscribed to a term deposit? (binary: 'yes', 'no') [9].
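The column notes above imply a few preprocessing decisions: 999 in pdays is a sentinel rather than a real day count, 'unknown' is a category label rather than an empty cell, and y needs a numeric encoding for most classifiers. A minimal sketch of these steps follows, using Python's standard csv module. The inline sample rows are made up, the comma delimiter is an assumption about the file, and `prepare_row` is a hypothetical helper name.

```python
import csv
import io

# Tiny made-up sample using a subset of the documented column names.
# A comma delimiter is assumed; adjust `delimiter` to match the actual file.
SAMPLE_CSV = """age,job,pdays,previous,poutcome,y
56,housemaid,999,0,nonexistent,no
37,services,6,1,success,yes
40,unknown,999,0,nonexistent,no
"""

PDAYS_NOT_CONTACTED = 999  # sentinel documented for 'pdays'

def prepare_row(row):
    """Convert one raw CSV row (all strings) into typed values."""
    pdays = int(row["pdays"])
    contacted_before = pdays != PDAYS_NOT_CONTACTED
    return {
        "age": int(row["age"]),
        "job": row["job"],  # 'unknown' is kept as its own category
        "previously_contacted": contacted_before,
        "pdays": pdays if contacted_before else None,  # do not treat 999 as a real gap
        "previous": int(row["previous"]),
        "poutcome": row["poutcome"],
        "y": 1 if row["y"] == "yes" else 0,
    }

rows = [prepare_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
for r in rows:
    print(r)
```

Splitting pdays into a flag plus an optional count is one possible design choice; it stops a distance-based or linear model from reading 999 as a very long gap between contacts.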
Distribution
This dataset is provided in CSV format [10]. It contains 41,188 examples (rows); the source file has 20 input features [9, 11], of which 19 remain here once 'duration' is removed. Every column has 41.2k valid entries, indicating no empty cells across the dataset [12-24], although several categorical columns use 'unknown' as a category. For the output variable 'y', approximately 11% of clients subscribed to a term deposit (4,640 positive instances), while 89% did not (36,548 negative instances) [23, 24]. The dataset is a copy of bank-additional-full.csv from the UCI Machine Learning Repository, with the 'duration' feature removed [5].
Usage
This dataset is ideal for:
- Testing and evaluating the performance of classification models [3].
- Exploring optimal strategies to enhance a banking institution's future direct marketing campaigns [3].
- Analyzing patterns within client data and marketing interactions to develop more effective future marketing approaches [4].
- Building predictive models to forecast term deposit subscription success.
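When evaluating classifiers on this data, note that only about 11% of rows are positive, so plain accuracy is a weak yardstick. The sketch below uses the class counts reported in the Distribution section to compute what a trivial always-'no' predictor would score. The function name `precision_recall` is illustrative, and no real model is trained here.

```python
# Class counts as reported in the Distribution section.
N_YES = 4_640
N_NO = 36_548
N_TOTAL = N_YES + N_NO  # 41,188 rows

def precision_recall(tp, fp, fn):
    """Precision and recall for the positive ('yes') class; 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# A trivial model that always predicts 'no' gets every 'no' right
# and every 'yes' wrong.
baseline_accuracy = N_NO / N_TOTAL
baseline_precision, baseline_recall = precision_recall(tp=0, fp=0, fn=N_YES)

print(f"positive rate:     {N_YES / N_TOTAL:.3f}")
print(f"baseline accuracy: {baseline_accuracy:.3f}")
print(f"baseline recall:   {baseline_recall:.3f}")
```

Since the trivial baseline already reaches roughly 89% accuracy while finding no subscribers, metrics that focus on the 'yes' class, such as precision and recall, are likely to be more informative when comparing models.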
Coverage
The dataset focuses on the direct marketing campaigns of a Portuguese banking institution [1]. The data covers a time period from May 2008 to November 2010 [11]. The demographic scope includes bank clients who were contacted via phone calls as part of the marketing efforts [1].
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
- Data scientists and machine learning engineers for developing and benchmarking predictive models.
- Marketing analysts in the financial sector for identifying customer segments and optimising campaign strategies.
- Banking institutions aiming to improve the efficiency and success rates of their telemarketing initiatives.
- Researchers interested in applying data-driven approaches to understand customer behaviour in financial services.
Dataset Name Suggestions
- Bank Telemarketing Success Prediction
- Term Deposit Campaign Outcome
- Portuguese Bank Direct Marketing Data
- Client Subscription Prediction Dataset
Attributes
Original Data Source: Bank Telemarketing Success Prediction