Machine Learning Ethiopian Vehicle Insurance
Finance & Banking Analytics
Free
About
This dataset contains vehicle insurance records collected in Ethiopia and is intended for machine learning applications. It covers policy terms, vehicle characteristics, premiums and claims, and supports the development of algorithms and analytical models of car insurance, which is a legal requirement in many regions.
Columns
- SEX: An indicator for the policyholder's sex.
- INSR_BEGIN: The commencement date of the insurance policy.
- INSR_END: The expiry date of the insurance policy.
- EFFECTIVE_YR: The year the insurance policy became effective or the vehicle's effective year.
- INSR_TYPE: The specific classification or category of the insurance policy.
- INSURED_VALUE: The monetary value at which the vehicle was insured.
- PREMIUM: The amount charged for the insurance policy.
- OBJECT_ID: A unique identifier for each insurance record or object.
- PROD_YEAR: The manufacturing year of the insured vehicle.
- SEATS_NUM: The total number of seats available in the vehicle.
- CARRYING_CAPACITY: The load-carrying capability of the vehicle.
- TYPE_VEHICLE: The general classification of the vehicle, such as 'Truck' or 'Pick-up'.
- CCM_TON: Information relating to the engine's cubic capacity or the vehicle's tonnage.
- MAKE: The brand or manufacturer of the vehicle, with 'TOYOTA' being the most common.
- USAGE: The primary declared use of the vehicle, such as 'Own Goods' or 'Private'.
- CLAIM_PAID: The monetary amount disbursed for an insurance claim.
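As a minimal sketch of how rows with these columns might be read, the example below parses a small in-memory CSV using only the Python standard library. The sample values, the ISO date strings, and the `0`/`1` encoding of SEX are invented for illustration and are not taken from the dataset; `to_number` and `read_policies` are hypothetical helper names. Blank cells are mapped to `None` so that missing values stay distinguishable from zero.

```python
import csv
import io

# Hypothetical sample rows using a subset of the listed column names.
# All values are invented; the real files may format dates and codes differently.
SAMPLE_CSV = """SEX,INSR_BEGIN,INSR_END,INSURED_VALUE,PREMIUM,PROD_YEAR,TYPE_VEHICLE,MAKE,USAGE,CLAIM_PAID
0,2013-08-08,2014-08-07,519755.22,7209.14,2007,Pick-up,TOYOTA,Own Goods,
1,2012-07-01,2013-06-30,285451.24,4286.90,2010,Truck,ISUZU,Private,19894.43
"""

NUMERIC_COLUMNS = ("INSURED_VALUE", "PREMIUM", "PROD_YEAR", "CLAIM_PAID")


def to_number(text):
    """Return the cell as a float, or None when the cell is blank."""
    text = text.strip()
    return float(text) if text else None


def read_policies(handle):
    """Read policy rows from a CSV file handle, converting numeric columns."""
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        for name in NUMERIC_COLUMNS:
            row[name] = to_number(row[name])
        rows.append(row)
    return rows


policies = read_policies(io.StringIO(SAMPLE_CSV))
```

For a real file, the same function could be passed an open file handle instead of the `io.StringIO` wrapper.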
Distribution
The dataset is split across two CSV files that share the same column headers but contain different records. It holds approximately 294,000 records. One of the files, 'motor_data11-14lats.csv', is 29.24 MB. Some columns have a large share of missing values, notably 'CARRYING_CAPACITY' (28% missing) and 'CLAIM_PAID' (92% missing).
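Because the two files share headers, their rows can be concatenated before checking how much of each column is missing. The sketch below illustrates that check on two tiny invented row lists that stand in for the files; `missing_share` is a hypothetical helper, and treating both `None` and the empty string as missing is an assumption about how blanks appear after loading.

```python
def missing_share(rows, column):
    """Fraction of rows whose value for `column` is None or an empty string."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) in (None, ""))
    return missing / len(rows)


# Two hypothetical stand-ins for the CSV files, with identical headers.
file_a = [
    {"CARRYING_CAPACITY": "7", "CLAIM_PAID": ""},
    {"CARRYING_CAPACITY": "", "CLAIM_PAID": ""},
]
file_b = [
    {"CARRYING_CAPACITY": "10", "CLAIM_PAID": "1500.0"},
    {"CARRYING_CAPACITY": "3", "CLAIM_PAID": ""},
]

# Same headers, distinct rows: simple concatenation is enough.
combined = file_a + file_b

shares = {
    column: missing_share(combined, column)
    for column in ("CARRYING_CAPACITY", "CLAIM_PAID")
}
```

On the real data, shares close to the 28% and 92% quoted above would be a quick sanity check that both files were loaded.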
Usage
This dataset is well-suited for various machine learning tasks and analytical pursuits, including:
- Developing models for predicting future insurance claim amounts.
- Performing risk assessment based on vehicle characteristics and policy details.
- Analysing factors that influence insurance premium calculations.
- Identifying patterns for potential insurance fraud detection.
- Gaining insights into driver and policyholder behaviour.
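One simple starting point for the risk-assessment tasks above is a loss ratio per group, meaning total claims paid divided by total premium. This is a common actuarial summary used here for illustration, not a method described by the dataset's publisher. The sketch assumes that a missing CLAIM_PAID means no claim was paid, which the source does not confirm; the rows and the `loss_ratio_by` helper are invented.

```python
from collections import defaultdict


def loss_ratio_by(rows, key):
    """Total CLAIM_PAID divided by total PREMIUM for each value of `key`."""
    premium = defaultdict(float)
    claims = defaultdict(float)
    for row in rows:
        group = row[key]
        premium[group] += row["PREMIUM"]
        # Assumption: a missing claim amount is treated as zero paid.
        claims[group] += row["CLAIM_PAID"] or 0.0
    # Skip groups with no premium to avoid dividing by zero.
    return {g: claims[g] / premium[g] for g in premium if premium[g] > 0}


# Invented rows for illustration only.
rows = [
    {"TYPE_VEHICLE": "Truck", "PREMIUM": 4000.0, "CLAIM_PAID": 1000.0},
    {"TYPE_VEHICLE": "Truck", "PREMIUM": 6000.0, "CLAIM_PAID": None},
    {"TYPE_VEHICLE": "Pick-up", "PREMIUM": 5000.0, "CLAIM_PAID": 2500.0},
]

ratios = loss_ratio_by(rows, "TYPE_VEHICLE")
```

The same function can group by other columns such as USAGE or MAKE by changing the `key` argument.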
Coverage
- Geographic: The data was acquired from the Ethiopian Insurance Corporation, indicating a focus on vehicle insurance within Ethiopia.
- Time Range: Insurance policy start dates within the dataset range from 1st July 2011 to 30th June 2014. Policy end dates span from 13th July 2011 to 29th June 2015. The production years of the vehicles included in the data extend from 1950 to 2018.
- Demographic Scope: The dataset includes a 'SEX' column, which can be used to analyse policy characteristics and outcomes across different genders.
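The date and year columns described above lend themselves to two derived features: policy duration in days and vehicle age at policy start. The sketch below assumes INSR_BEGIN and INSR_END have already been parsed into `datetime.date` objects (the on-disk date format is not stated in the source), and the function names and sample values are hypothetical.

```python
from datetime import date

# Range of policy start dates stated in the coverage notes.
COVERAGE_START = date(2011, 7, 1)
COVERAGE_END = date(2014, 6, 30)


def policy_days(begin, end):
    """Number of days between policy start and policy end."""
    return (end - begin).days


def vehicle_age(begin, prod_year):
    """Vehicle age in years at policy start, from PROD_YEAR."""
    return begin.year - prod_year


def starts_in_coverage(begin):
    """True when the start date falls inside the stated start-date range."""
    return COVERAGE_START <= begin <= COVERAGE_END


# Invented example policy.
begin = date(2011, 7, 1)
end = date(2012, 6, 30)

duration = policy_days(begin, end)
age = vehicle_age(begin, 2005)
in_range = starts_in_coverage(begin)
```

A negative age or a start date outside the stated range would flag a row worth inspecting before modelling.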
License
Attribution 4.0 International (CC BY 4.0)
Who Can Use It
This data product is particularly useful for:
- Machine Learning Engineers: For building and testing predictive models for insurance outcomes.
- Data Scientists: For exploratory data analysis, feature engineering, and extracting business intelligence.
- Actuarial Scientists: For refining risk models and optimising premium structures.
- Academic Researchers: For studies on vehicle insurance markets, risk management, and statistical modelling.
- Students: For practical application in data analysis and machine learning coursework.
Dataset Name Suggestions
- Ethiopian Motor Insurance Dataset
- Ethiopian Vehicle Risk Data
- Machine Learning Ethiopian Vehicle Insurance
- Ethiopian Car Insurance Analytics
Attributes
Original Data Source: Machine Learning Ethiopian Vehicle Insurance