Financial Program Selection Model Data
Finance & Banking Analytics
Free
About
The data was created as part of a real-world data science project on investment program type prediction. The primary aim was to help an investment bank select the correct investment program to offer a specific customer. By predicting the program type in advance, the bank avoided the inefficiency of offering both products to the same individual. The project was conducted approximately twelve years ago, when most current state-of-the-art techniques and algorithms were not available. Practitioners are encouraged to apply modern feature engineering, modelling, or AutoML tools and to aim for high predictive performance, ideally above 90% accuracy.
Columns
The dataset consists of 28 columns, including input features grouped by customer activity and a single target variable:
- SE1 (Age): Customer age. Values below 18 denote that the program was opened for a child. The mean age is 43.5, with ages ranging from 1 to 96.
- SE2 (Geographic location): Customer geographic area. The code G0 signifies that no location information is available.
- BA1 – BA7 (Banking Activity): Seven features giving the monetary value of general activities on the customer’s bank account in the last year, such as the total of loan repayments.
- PE1 – PE15 (Investing History): Fifteen binary flag features. These indicate whether the customer had one of 15 popular investment products or programs during the last year.
- IA1 – IA3 (Investment Activity): Three features representing counts for different types of operations executed on investment accounts in the last year.
- InvType (Target Feature): The target feature (C1 or C0) indicating which Investment Program the customer purchased. The data is limited to customers who bought one of the investment programs.
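The column layout above can be checked against the file header before any modelling. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses exactly the names listed above (SE1, SE2, BA1 to BA7, PE1 to PE15, IA1 to IA3, InvType); the column order in the file is not stated here, so names are compared as sets. The helper names `expected_columns` and `check_header` are hypothetical.

```python
import csv
import io


def expected_columns():
    """Build the 28 column names described in the Columns section.

    Assumption: the CSV header uses exactly these names.
    """
    cols = ["SE1", "SE2"]                      # age, geographic location
    cols += [f"BA{i}" for i in range(1, 8)]    # BA1..BA7 banking activity
    cols += [f"PE{i}" for i in range(1, 16)]   # PE1..PE15 investing history flags
    cols += [f"IA{i}" for i in range(1, 4)]    # IA1..IA3 investment activity counts
    cols.append("InvType")                     # target (C1 or C0)
    return cols


def check_header(csv_text):
    """Return (missing, unexpected) column names for the header row of csv_text."""
    header = next(csv.reader(io.StringIO(csv_text)))
    expected = set(expected_columns())
    found = {name.strip() for name in header}
    return sorted(expected - found), sorted(found - expected)
```

With the real file, the text could be read with `open("investing_program_prediction_data.csv", newline="").read()` and passed to `check_header`; two empty lists would indicate that the header matches the description.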
Distribution
The dataset is provided as a single file in CSV format.
- File Name/Size: investing_program_prediction_data.csv (438.31 kB).
- Records: There are 4,734 valid records in the file.
- Features: The dataset contains 28 columns.
- Data Quality: All features across the records demonstrate 100% validity, with no mismatched or missing data noted.
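The record count and data quality claims above can be verified after download. This is a rough sketch, again with the standard library only; the function name `summarize_quality` is hypothetical, and treating a blank or whitespace-only cell as "missing" is an assumption about how missing data would appear in the file.

```python
import csv
import io


def summarize_quality(csv_text, target="InvType", allowed=("C0", "C1")):
    """Count records, empty cells, and rows whose target is not an allowed label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    n_records = 0
    n_empty = 0
    n_bad_target = 0
    for row in reader:
        n_records += 1
        for value in row.values():
            # Short rows yield None; blank strings are treated as missing (assumption).
            if value is None or (isinstance(value, str) and not value.strip()):
                n_empty += 1
        if (row.get(target) or "").strip() not in allowed:
            n_bad_target += 1
    return {"records": n_records, "empty_cells": n_empty, "bad_target": n_bad_target}
```

If the figures above hold, running this on the full file should report 4,734 records with no empty cells and no target values other than C0 or C1.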
Usage
This data is well suited to binary classification tasks, especially within the financial technology sector. Typical applications include:
- Building and evaluating sophisticated predictive models to forecast customer investment choices.
- Benchmarking the performance of modern machine learning algorithms and techniques against the historical context of the original project.
- Using advanced feature engineering, modelling, or AutoML tools to surpass the historical performance metrics and reach over 90% accuracy.
- Reframing the customer-targeting problem so that the right program is selected and offered to each customer.
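The 90% accuracy goal is easier to interpret next to a trivial baseline, since the class balance of InvType is not stated in this listing. The sketch below computes the accuracy of always predicting the most common label, plus a plain accuracy function for comparing any model's predictions. Both helper names are hypothetical, and this is not the method used in the original project.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    _, top_count = Counter(labels).most_common(1)[0]
    return top_count / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

A model is only informative if its held-out accuracy clearly exceeds the majority baseline computed on the same InvType labels.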
Coverage
The data reflects customer activity and characteristics relating to a project conducted approximately 12 years ago. The banking activity (BA) and investment history (PE, IA) features cover actions performed in the year leading up to that point.
- Demographic Scope: Customer ages range from 1 to 96.
- Geographic Scope: The data includes 90 unique geographic locations.
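The demographic and geographic scope can be summarised from the SE1 and SE2 columns. The following sketch assumes rows are dictionaries as produced by `csv.DictReader`, that SE1 parses as a number, and that SE2 holds location codes with G0 meaning "no location information". Whether G0 is counted among the 90 unique locations is not stated here, so the function reports it separately. The name `coverage_summary` is hypothetical.

```python
def coverage_summary(rows):
    """Summarise age range, minors, and location coverage from SE1 and SE2."""
    if not rows:
        raise ValueError("rows must not be empty")
    ages = [float(r["SE1"]) for r in rows]       # assumption: SE1 is numeric text
    locations = {r["SE2"] for r in rows}
    return {
        "min_age": min(ages),
        "max_age": max(ages),
        "minors": sum(1 for a in ages if a < 18),            # programs opened for a child
        "unique_locations": len(locations),                  # includes G0 if present
        "unknown_location": sum(1 for r in rows if r["SE2"] == "G0"),
    }
```

The `minors` and `unknown_location` counts are also natural candidates for derived flag features, since ages below 18 and the G0 code carry a special meaning in this dataset.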
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
This data is valuable for a wide range of users interested in financial modelling:
- Data Scientists: For training and optimising prediction models for financial product recommendations, leveraging current state-of-the-art techniques.
- Academics and Researchers: To assess the substantial performance improvements offered by contemporary methodologies compared to those available a decade ago.
- FinTech Developers: For creating advanced algorithms designed to automate personalised product offerings in investment banking.
Dataset Name Suggestions
- Investing Program Type Prediction
- Bank Customer Investment Classification Data
- Financial Program Selection Model Data
- Historical Investment Choice Predictor
Attributes
Original Data Source: Financial Program Selection Model Data
