Insurance Fraud Claims Dataset
Data Science and Analytics
Free
About
This dataset supports fraud detection on insurance claims. It comprises three interconnected datasets: Employee Data, the master record of the agents and adjusters who handle claims; Vendor Data, the master record of the vendors that assist insurance companies with investigations; and Claims Data, the transaction-level details of claims that customers submit for reimbursement. Its primary purpose is to help identify suspicious patterns and behaviours in insurance claims.
Columns
The 'Employee Data' file, typically in CSV format, includes the following columns:
- AGENT_ID: A unique identifier for each insurance agent.
- AGENT_NAME: The full name of the agent.
- DATE_OF_JOINING: The date when the agent commenced employment with the organisation, ranging from 25th June 1990 to 24th June 2018.
- ADDRESS_LINE1: The first line of the agent's home address.
- ADDRESS_LINE2: The second line of the agent's home address; often missing.
- CITY: The city of the agent's home address, for example Arvada or Washington.
- STATE: The state of the agent's home address; California (CA) and Colorado (CO) occur frequently.
- POSTAL_CODE: The postal code of the agent's home address.
- EMP_ROUTING_NUMBER: The bank routing number associated with the agent.
- EMP_ACCT_NUMBER: The bank account number associated with the agent.
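A minimal sketch of loading the Employee Data file with Python's standard `csv` module. The column names come from the list above; the function name `load_employees`, the sample rows, and the ISO `YYYY-MM-DD` date format are assumptions for illustration, so adjust `date_format` to whatever the actual file uses. Bank and postal fields are kept as strings so that leading zeros survive.

```python
import csv
import io
from datetime import datetime

EXPECTED_COLUMNS = [
    "AGENT_ID", "AGENT_NAME", "DATE_OF_JOINING", "ADDRESS_LINE1",
    "ADDRESS_LINE2", "CITY", "STATE", "POSTAL_CODE",
    "EMP_ROUTING_NUMBER", "EMP_ACCT_NUMBER",
]


def load_employees(fileobj, date_format="%Y-%m-%d"):
    """Read employee rows from an open CSV file object (hypothetical helper)."""
    reader = csv.DictReader(fileobj)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for raw in reader:
        row = dict(raw)  # every value stays a string unless converted below
        # Assumed date format; the real file may use a different one.
        row["DATE_OF_JOINING"] = datetime.strptime(
            raw["DATE_OF_JOINING"], date_format
        ).date()
        # ADDRESS_LINE2 is often missing, so map blanks to None.
        row["ADDRESS_LINE2"] = raw["ADDRESS_LINE2"].strip() or None
        rows.append(row)
    return rows


# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE_CSV = """AGENT_ID,AGENT_NAME,DATE_OF_JOINING,ADDRESS_LINE1,ADDRESS_LINE2,CITY,STATE,POSTAL_CODE,EMP_ROUTING_NUMBER,EMP_ACCT_NUMBER
A1,Jane Doe,1990-06-25,1 Example St,,Arvada,CO,80003,000000001,0000012345
A2,John Roe,2018-06-24,2 Sample Ave,Apt 4,Washington,DC,20001,000000002,0000067890
"""

employees = load_employees(io.StringIO(SAMPLE_CSV))
```

To read a real file, pass an open handle instead of the sample, for example `open(path, newline="")`, where `path` points to your local copy.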
Distribution
The data is distributed as data files, typically in CSV format. The Employee Data file is approximately 129.54 kB in size and contains 1,200 records. Row counts for the Vendor Data and Claims Data files are not provided.
Usage
This dataset is ideally suited for:
- Claim Level Fraud Detection: Identifying fraudulent claims submitted by customers.
- Employee Fraud Detection: Uncovering instances of fraud perpetrated by employees or agents.
- Employee Vendor Collusion: Detecting cases where employees and vendors collaborate in fraudulent schemes.
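As one illustration of an employee-level check, the sketch below flags agents who share the same bank details, using only the EMP_ROUTING_NUMBER and EMP_ACCT_NUMBER columns documented for the Employee Data file. The helper name `find_shared_accounts` and the sample rows are assumptions; this is a simple heuristic, not a method prescribed by the dataset, and a shared account is a signal to review rather than proof of fraud.

```python
from collections import defaultdict


def find_shared_accounts(employees):
    """Group agent IDs by (routing number, account number) and keep
    only the bank accounts used by more than one agent."""
    groups = defaultdict(list)
    for emp in employees:
        key = (emp["EMP_ROUTING_NUMBER"], emp["EMP_ACCT_NUMBER"])
        groups[key].append(emp["AGENT_ID"])
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


# Made-up rows for illustration only; they are not taken from the dataset.
employees = [
    {"AGENT_ID": "A1", "EMP_ROUTING_NUMBER": "000000001", "EMP_ACCT_NUMBER": "0000012345"},
    {"AGENT_ID": "A2", "EMP_ROUTING_NUMBER": "000000002", "EMP_ACCT_NUMBER": "0000067890"},
    {"AGENT_ID": "A3", "EMP_ROUTING_NUMBER": "000000001", "EMP_ACCT_NUMBER": "0000012345"},
]

shared = find_shared_accounts(employees)
for (routing, account), agent_ids in shared.items():
    print(f"routing={routing} account={account} shared by {agent_ids}")
```

The same grouping idea could be extended to compare employee and vendor bank details for collusion checks, but the Vendor Data columns are not listed here, so any such join would depend on column names that must be confirmed against the actual file.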
Coverage
Employee joining dates span from 25th June 1990 to 24th June 2018. The agent address fields contain US cities such as Arvada and Washington and US states such as California and Colorado, which suggests the data covers the United States. No notes are provided on data availability for specific demographic groups.
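The coverage figures above can be checked against a local copy with a short summary like the following sketch. It assumes the rows are already loaded as dictionaries with DATE_OF_JOINING parsed into `datetime.date` values; the function name `summarize_coverage` and the sample rows are hypothetical.

```python
from collections import Counter
from datetime import date


def summarize_coverage(employees):
    """Return the record count, joining-date range, and per-state counts."""
    joining_dates = [e["DATE_OF_JOINING"] for e in employees]
    return {
        "records": len(employees),
        "earliest_joining": min(joining_dates),
        "latest_joining": max(joining_dates),
        "state_counts": Counter(e["STATE"] for e in employees),
    }


# Made-up rows for illustration only; they are not taken from the dataset.
employees = [
    {"AGENT_ID": "A1", "DATE_OF_JOINING": date(1990, 6, 25), "STATE": "CO"},
    {"AGENT_ID": "A2", "DATE_OF_JOINING": date(2005, 3, 14), "STATE": "CA"},
    {"AGENT_ID": "A3", "DATE_OF_JOINING": date(2018, 6, 24), "STATE": "CO"},
]

summary = summarize_coverage(employees)
print(summary["earliest_joining"], summary["latest_joining"])
print(summary["state_counts"].most_common(2))
```

On the full Employee Data file, the record count would be expected to match the 1,200 records stated under Distribution.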
License
CC0: Public Domain
Who Can Use It
This dataset is intended for data scientists, fraud analysts, insurance companies, and researchers who develop and refine fraud detection models for the insurance industry. It supports use cases ranging from identifying suspicious claim behaviours to uncovering internal malfeasance.
Dataset Name Suggestions
- Insurance Fraud Claims Dataset
- Employee and Vendor Insurance Fraud Data
- Claims Fraud Detection Suite
- Insurance Industry Fraud Analytics Data
Attributes
Original Data Source: Insurance Fraud Claims Dataset