Bank Telemarketing Prediction
Fraud Detection & Risk Management
Free
About
This dataset captures the outcomes of bank marketing campaigns conducted in Portugal, primarily via direct phone calls. The core objective of these campaigns was to encourage bank clients to open a term deposit. The dataset provides various client attributes, details about the last contact of the current campaign, and important social and economic context indicators. The ultimate goal is to predict whether a client will subscribe to a bank term deposit, making it suitable for binary classification tasks. The inclusion of new social and economic features has been shown to significantly enhance prediction accuracy, even when call duration is not considered.
Columns
The dataset contains 20 input attributes and one output attribute:
- Bank Client Data:
  - age: Numeric variable indicating the client's age.
  - job: Categorical variable detailing the type of job (e.g., "admin.", "blue-collar", "management", "retired", "student", "unknown").
  - marital: Categorical variable for marital status ("divorced", which also covers widowed; "married"; "single"; "unknown").
  - education: Categorical variable representing the education level (e.g., "high.school", "university.degree", "illiterate", "unknown").
  - default: Categorical variable indicating if the client has credit in default ("no", "yes", "unknown").
  - housing: Categorical variable indicating if the client has a housing loan ("no", "yes", "unknown").
  - loan: Categorical variable indicating if the client has a personal loan ("no", "yes", "unknown").
- Related to Last Contact of Current Campaign:
  - contact: Categorical variable for the communication type used for the last contact ("cellular", "telephone").
  - month: Categorical variable for the last contact month of the year (e.g., "jan", "feb", "mar", ..., "dec").
  - day_of_week: Categorical variable for the last contact day of the week ("mon", "tue", "wed", "thu", "fri").
  - duration: Numeric variable representing the last contact duration in seconds. Note: This attribute heavily influences the output but is not known before a call is made; it should typically be excluded for realistic predictive models.
- Other Attributes:
  - campaign: Numeric variable indicating the number of contacts performed during this campaign for the client (includes the last contact).
  - pdays: Numeric variable showing the number of days that passed after the client was last contacted from a previous campaign (999 means not previously contacted).
  - previous: Numeric variable for the number of contacts performed before this campaign for the client.
  - poutcome: Categorical variable for the outcome of the previous marketing campaign ("failure", "nonexistent", "success").
- Social and Economic Context Attributes:
  - emp.var.rate: Numeric employment variation rate (quarterly indicator).
  - cons.price.idx: Numeric consumer price index (monthly indicator).
  - cons.conf.idx: Numeric consumer confidence index (monthly indicator).
  - euribor3m: Numeric euribor 3 month rate (daily indicator).
  - nr.employed: Numeric number of employees (quarterly indicator).
- Output Variable (Target):
  - y: Binary variable indicating if the client has subscribed to a term deposit ("yes", "no").
Missing values in several categorical attributes are labelled as "unknown".
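The column notes above imply a few preparation steps before modelling: leaving out duration, treating "unknown" as missing, reading 999 in pdays as "not previously contacted", and encoding y as 0/1. The following is a minimal sketch of those steps using only the Python standard library. The function name clean_row and the choice to represent missing values as None are assumptions made for illustration, not part of the dataset's documentation.

```python
def clean_row(row):
    """Return a cleaned copy of one record, given as a dict of strings."""
    cleaned = {}
    for name, value in row.items():
        if name == "duration":
            # Not known before the call is made, so it is left out of the features.
            continue
        # "unknown" is the dataset's label for a missing categorical value.
        cleaned[name] = None if value == "unknown" else value

    # 999 is the sentinel for "not previously contacted".
    if cleaned.get("pdays") == "999":
        cleaned["pdays"] = None
    elif cleaned.get("pdays") is not None:
        cleaned["pdays"] = int(cleaned["pdays"])

    # Encode the target as 1 ("yes") or 0 ("no").
    if "y" in cleaned:
        cleaned["y"] = 1 if cleaned["y"] == "yes" else 0
    return cleaned


# Illustrative record; the values are made up for the example.
raw = {"age": "41", "job": "unknown", "duration": "180", "pdays": "999", "y": "no"}
cleaned = clean_row(raw)
print(cleaned)
```

Mapping "unknown" to None is only one option. Keeping "unknown" as its own category is also reasonable, since the fact that a value is missing may itself carry information.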
Distribution
The dataset is provided in CSV format as bank-additional-full.csv. It contains 41,188 instances (rows). The file is semicolon-delimited with a header row and can be read with common data import functions, for example in R: d=read.table("bank-additional-full.csv",header=TRUE,sep=";")
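For readers working in Python rather than R, a rough equivalent of the import can be sketched with the standard library's csv module. The helper name read_bank_rows is hypothetical, and the in-memory sample below only stands in for the real file so that the snippet runs on its own; its values are illustrative.

```python
import csv
import io


def read_bank_rows(handle):
    """Parse semicolon-delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(handle, delimiter=";")
    return list(reader)


# Tiny in-memory sample standing in for the real file (illustrative values only).
SAMPLE = (
    "age;job;marital;y\n"
    "56;admin.;married;no\n"
    "37;blue-collar;single;yes\n"
)

rows = read_bank_rows(io.StringIO(SAMPLE))
print(rows[0]["job"])

# With the real file on disk, the call would look like this:
# with open("bank-additional-full.csv", newline="") as fh:
#     rows = read_bank_rows(fh)
```

Every value comes back as a string, so numeric columns such as age still need converting before use.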
Usage
This dataset is ideal for:
- Predicting bank telemarketing success.
- Developing binary classification models to identify clients likely to subscribe to a term deposit.
- Benchmarking different machine learning algorithms.
- Conducting exploratory data analysis to understand factors influencing term deposit subscriptions.
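As a starting point for the exploratory analysis mentioned above, one simple check is the share of "yes" outcomes within each value of a categorical column. The sketch below assumes the rows are already loaded as a list of dicts; the function name subscription_rate_by and the four sample rows are made up for illustration and say nothing about the real distribution.

```python
from collections import defaultdict


def subscription_rate_by(rows, column):
    """Map each value of `column` to the share of rows where y == "yes"."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        key = row[column]
        totals[key] += 1
        if row["y"] == "yes":
            positives[key] += 1
    return {key: positives[key] / totals[key] for key in totals}


# Made-up rows for demonstration only.
sample_rows = [
    {"contact": "cellular", "y": "yes"},
    {"contact": "cellular", "y": "no"},
    {"contact": "telephone", "y": "no"},
    {"contact": "telephone", "y": "no"},
]
rates = subscription_rate_by(sample_rows, "contact")
print(rates)
```

The same function works for any categorical column, such as job, month, or poutcome.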
Coverage
The dataset focuses on Portugal, reflecting bank marketing campaigns within that country. The added social and economic features are national indicators for a country with approximately 10 million inhabitants. The data includes various time-based indicators (daily, monthly, quarterly) to capture relevant socio-economic context.
License
CC BY-NC-SA 4.0
Who Can Use It
- Researchers interested in data-driven approaches to marketing and financial services.
- Data scientists and machine learning engineers building predictive models.
- Banks and financial institutions aiming to optimise their marketing strategies.
- Students learning about classification tasks and real-world data analysis.
Dataset Name Suggestions
- Bank Marketing Campaign Outcomes
- Portugal Term Deposit Subscription Data
- Bank Telemarketing Prediction
- Client Deposit Campaign Analytics
Attributes
Original Data Source: Bank Telemarketing Prediction