Bank Term Deposit Subscription Predictor
Finance & Banking Analytics
Free
About
This dataset records direct marketing campaigns conducted by a Portuguese banking institution, primarily through phone calls [1]. Its main objective is to predict whether a customer will subscribe to a term deposit, based on customer attributes and campaign interactions [1, 2]. The historical records can be analysed for patterns that predict future subscription behaviour, helping businesses understand their target audience and optimise marketing efforts [2, 3].
Columns
The dataset includes various features relating to customers and their engagement with the bank's marketing efforts [4, 5]. It contains both numerical and categorical data for training and testing predictive models [3, 6, 7].
- age: The age of the customer (Numerical) [4-6]. Mean age is 41.2 years [8].
- job: The occupation or employment status of the customer (Categorical) [4-6]. The most common occupations are management and blue-collar, each accounting for 21% of the records [8].
- marital: The marital status of the customer (Categorical) [4-6]. The majority of customers are married (62%) [9].
- education: The education level attained by the customer (Categorical) [4, 5, 10]. Secondary education is the most common at 51%, followed by tertiary at 30% [9].
- default: Indicates whether the customer has credit in default (Categorical, Boolean) [4, 5, 10]. 98% of customers do not have credit in default [9].
- balance: The balance in the customer's account (Numerical) [4, 10, 11]. The mean balance is approximately 1,420 [12].
- housing: Indicates whether the customer has a housing loan (Categorical, Boolean) [4, 10, 11]. 57% of customers have a housing loan [13].
- loan: Indicates whether the customer has a personal loan (Categorical, Boolean) [13]. 85% of customers do not have a personal loan [13].
- contact: The type of communication used to contact customers, such as phone or cellular (Categorical) [4, 10, 11]. Cellular is the most common contact method (64%) [13].
- day: The day of the month when the last contact with customers was made (Numerical) [2, 10, 14]. The mean day is 15.9 [14].
- month: The month of the last contact with customers (Categorical) [14]. May (31%) and July (16%) are the most common months for contact [14].
- duration: The duration in seconds of the last contact with customers during the campaign (Numerical) [2, 10, 14]. The mean duration is 264 seconds [15].
- campaign: The number of contacts performed during this campaign for each customer (Numerical) [2, 15]. The mean number of contacts per customer is 2.79 [16].
- pdays: The number of days since the customer was last contacted in a previous campaign (Numerical) [2, 7, 16]. Many records show -1, indicating no previous contact [17].
- previous: The number of contacts performed before this campaign for each customer (Numerical) [17]. The mean number of previous contacts is 0.54 [17].
- poutcome: The outcome from the previous marketing campaign (Categorical) [2, 7, 17]. The outcome is unknown for 82% of contacts, with failure at 11% [18].
- y: The target variable, indicating whether the customer subscribed to a term deposit (Categorical, Boolean) [18]. 12% of customers subscribed to a term deposit [18].
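The column types above suggest a small clean-up step before modelling. The following is a minimal sketch, not the publisher's own method: the function name normalise_row and the sample row are hypothetical, and it assumes that numerical columns hold whole numbers and that the Boolean columns are stored as the strings "yes" and "no".

```python
from typing import Any, Dict

# Column groups taken from the column list above.
NUMERIC_COLUMNS = ("age", "balance", "day", "duration", "campaign", "pdays", "previous")
# Assumption: the Boolean flags are stored as the strings "yes" and "no".
BOOLEAN_COLUMNS = ("default", "housing", "loan", "y")


def normalise_row(raw: Dict[str, str]) -> Dict[str, Any]:
    """Convert one raw CSV row (all strings) into typed Python values."""
    row: Dict[str, Any] = dict(raw)
    for name in NUMERIC_COLUMNS:
        # Assumption: numerical columns contain integers (may be negative).
        row[name] = int(raw[name])
    for name in BOOLEAN_COLUMNS:
        row[name] = raw[name].strip().lower() == "yes"
    # pdays uses -1 as a sentinel for "no previous contact"; store it as None
    # so it is not mistaken for a real number of days.
    if row["pdays"] == -1:
        row["pdays"] = None
    return row


# Made-up row for illustration only; not taken from the dataset.
example_row = {
    "age": "30", "job": "management", "marital": "married",
    "education": "tertiary", "default": "no", "balance": "-120",
    "housing": "yes", "loan": "no", "contact": "cellular", "day": "15",
    "month": "may", "duration": "264", "campaign": "2", "pdays": "-1",
    "previous": "0", "poutcome": "unknown", "y": "no",
}
typed = normalise_row(example_row)
```

Categorical columns such as job and poutcome are left as strings here; how to encode them depends on the model being trained.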
Distribution
The dataset is provided in CSV format and includes both train.csv and test.csv files [6, 7, 19]. The test.csv file, for which detailed column statistics are provided, contains 4,521 valid records across its 17 columns [8, 9, 12-18, 20]. Specific row counts for the entire dataset, beyond the test data, are not available in the provided sources.
Usage
This dataset is ideal for various applications, including:
- Predictive Modelling: Building models to forecast whether a customer will subscribe to a term deposit [11].
- Customer Segmentation: Grouping customers based on characteristics and behaviour to tailor marketing strategies [21].
- Campaign Optimisation: Analysing the effectiveness of communication types, contact frequency, and previous campaign outcomes to improve future marketing efforts [21].
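As a starting point for the campaign optimisation use case, the subscription rate can be compared across the values of a categorical column such as contact or poutcome. This is a rough sketch under stated assumptions: subscription_rate_by is a hypothetical helper, the sample rows are made up, and the target column y is assumed to hold the strings "yes" and "no".

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def subscription_rate_by(rows: Iterable[Mapping[str, str]], column: str) -> Dict[str, float]:
    """Return the share of rows with y == "yes" for each value of `column`."""
    totals: Dict[str, int] = defaultdict(int)
    subscribed: Dict[str, int] = defaultdict(int)
    for row in rows:
        key = row[column]
        totals[key] += 1
        # Assumption: the target column y is stored as "yes" / "no".
        if row["y"] == "yes":
            subscribed[key] += 1
    return {key: subscribed[key] / totals[key] for key in totals}


# Tiny made-up sample for illustration only; not taken from the dataset.
sample = [
    {"contact": "cellular", "y": "yes"},
    {"contact": "cellular", "y": "no"},
    {"contact": "unknown", "y": "no"},
    {"contact": "unknown", "y": "no"},
]
rates = subscription_rate_by(sample, "contact")
```

With the real files, the rows could come from csv.DictReader in the Python standard library. The delimiter used in train.csv and test.csv is not stated in the sources, so it may need to be set explicitly.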
Coverage
The data originates from direct marketing campaigns conducted by a Portuguese banking institution [1]. The dataset covers various customer demographics, including age, occupation, marital status, and education levels [4, 5]. Specific time ranges for the campaigns are not detailed in the provided sources.
License
CC0 - Public Domain
Who Can Use It
This dataset is particularly valuable for:
- Businesses in similar domains aiming to understand their target audience and optimise marketing [3].
- Machine learning practitioners for building and evaluating predictive models [2, 3].
- Data scientists and analysts for customer segmentation and behavioural insights [21].
- Marketing strategists looking to enhance campaign effectiveness and return on investment [21].
Dataset Name Suggestions
- Bank Term Deposit Subscription Predictor
- Customer Marketing Campaign Success
- Portuguese Bank Deposit Campaign Outcomes
- Direct Marketing Customer Behaviour
Attributes
Original Data Source: Bank Term Deposit Subscription Predictor