Fraudulent Job Posting Detection Dataset

FREE DATASET LIBRARY

Verified Data Provider

£0

Fraud Detection & Risk Management

Tags and Keywords: Text, NLP, Jobs, Binary, Employment

About

This dataset supports the prediction of real versus fake job postings, addressing the growing problem of fraudulent job advertisements online. It contains 18,000 job descriptions, of which approximately 800 (about 4.4%) are identified as fraudulent. Each record combines the text of the job description with meta-information about the job. The dataset can be used to develop machine learning models that classify job descriptions as legitimate or fraudulent, and to identify the words, entities, or phrases that characterise fraudulent postings. It also suits contextual embedding models for finding similar job descriptions, and exploratory data analysis of employment fraud.

Columns

The columns capture both textual and structured meta-information about job postings. No data sample with column headers was provided, so the columns below are inferred from the dataset's stated purpose and the nature of job advertisements; the headers in the actual file may differ:

Job ID: A unique identifier for each individual job posting.
Title: The advertised job title (e.g., 'Marketing Intern', 'Head of Content').
Location: Geographic details of the job, which may include 'Country', 'State', and 'City'.
Department: The specific department within the organisation where the role is situated.
Salary Range: The indicated remuneration for the position, typically an annual salary or hourly wage.
Company Profile: A descriptive overview of the hiring company.
Job Description: The detailed narrative of the role, encompassing responsibilities, qualifications, and benefits.
Requirements/Qualifications: Specific skills, prior experience, and educational background necessary for the role.
Employment Type: The nature of employment (e.g., 'Full-time', 'Part-time', 'Internship').
Experience Level: The required seniority or experience for the position (e.g., 'Entry-level', 'Mid-Senior level').
Education Required: The minimum educational qualification expected from candidates.
Industry: The sector in which the hiring company operates.
Function: The primary professional function of the role (e.g., 'Sales', 'Customer Service', 'Marketing').
Is Fake: A binary flag (e.g., 0 or 1) indicating whether the job posting is genuine or fraudulent, serving as the target variable for classification tasks.
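
As a minimal sketch of how such a file might be read, the snippet below uses Python's standard csv module to join the free-text fields and pull out the label. The header names (title, company_profile, description, requirements, is_fake) and the 0/1 label encoding are assumptions, since the column list above is itself inferred, and the two sample rows are invented purely to show the expected shape of the input.

```python
import csv
import io

# Hypothetical header names: the listing infers its columns and gives no
# sample file, so adjust these to match the real CSV.
TEXT_COLUMNS = ["title", "company_profile", "description", "requirements"]
LABEL_COLUMN = "is_fake"


def load_postings(handle):
    """Return (text, label) pairs from a CSV file-like object."""
    rows = []
    for record in csv.DictReader(handle):
        # Join the free-text fields, skipping any that are missing or empty.
        parts = [(record.get(name) or "").strip() for name in TEXT_COLUMNS]
        text = " ".join(part for part in parts if part)
        # Assumption: the label is stored as "0" or "1", with 1 = fraudulent.
        label = int(record[LABEL_COLUMN])
        rows.append((text, label))
    return rows


# Two invented rows, used only to show the expected shape of the input.
SAMPLE_CSV = """title,company_profile,description,requirements,is_fake
Marketing Intern,A small agency,Support the content team,Good writing skills,0
Data Entry Clerk,,Earn money from home,No experience needed,1
"""

postings = load_postings(io.StringIO(SAMPLE_CSV))
for text, label in postings:
    print(label, text)
```

To read the real file, pass an open file handle instead of the in-memory sample, for example open(path, newline="", encoding="utf-8"), where path is wherever the downloaded CSV was saved.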

Distribution

The dataset comprises 18,000 job descriptions, approximately 800 of which are identified as fraudulent. The data is typically provided as a CSV file, and each record combines textual content with meta-information about the job posting. The file size is not stated, although the number of records makes the dataset a substantial resource for analysis. Sample files are usually added to the platform separately.
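
The counts quoted here imply a strong class imbalance, which matters for any classifier trained on this data. The arithmetic below is a sketch that treats the approximate figure of 800 as exact. It shows the fraud rate, the accuracy a model would reach by labelling every posting as legitimate, and one common "balanced" class-weight heuristic (the same formula scikit-learn uses for class_weight="balanced").

```python
# Counts taken from the listing; the fraudulent figure is approximate.
total_postings = 18_000
fraudulent = 800
legitimate = total_postings - fraudulent

fraud_rate = fraudulent / total_postings

# Accuracy of a trivial model that labels every posting as legitimate.
majority_baseline_accuracy = legitimate / total_postings

# "Balanced" class weights: n_samples / (n_classes * n_samples_in_class).
n_classes = 2
weight_legitimate = total_postings / (n_classes * legitimate)
weight_fraudulent = total_postings / (n_classes * fraudulent)

print(f"fraud rate: {fraud_rate:.2%}")                                # 4.44%
print(f"majority-class accuracy: {majority_baseline_accuracy:.2%}")  # 95.56%
print(f"weight (legitimate): {weight_legitimate:.3f}")               # 0.523
print(f"weight (fraudulent): {weight_fraudulent:.3f}")               # 11.250
```

Because a do-nothing model already scores about 95.6% accuracy, precision and recall on the fraudulent class are likely to be more informative than accuracy alone.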

Usage

This dataset is ideally suited for various applications, including:

Fraud Detection Models: Developing classification models to accurately predict whether a job description is fraudulent or real.
Feature Identification: Pinpointing key characteristics, such as specific words, entities, or phrases, that are indicative of fraudulent job postings.
Semantic Analysis: Running contextual embedding models to identify job descriptions that are semantically similar.
Exploratory Data Analysis (EDA): Performing in-depth analysis to uncover insightful patterns and trends within the job market and fraud landscape.
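
For the feature identification use case, one simple option (not a method the listing prescribes) is to compare how often each word appears in fraudulent versus legitimate postings. The sketch below scores words by the log-ratio of their smoothed relative frequencies; the function names, the add-one smoothing, and the four toy postings are all illustrative assumptions rather than anything taken from the dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def log_ratio_by_word(postings, smoothing=1.0):
    """Score each word by the log-ratio of its smoothed relative frequency
    in fraudulent (label 1) versus legitimate (label 0) postings.
    Positive scores lean fraudulent, negative scores lean legitimate."""
    counts = {0: Counter(), 1: Counter()}
    for text, label in postings:
        counts[label].update(tokenize(text))

    vocabulary = set(counts[0]) | set(counts[1])
    # Additive smoothing keeps words unseen in one class from dividing by zero.
    totals = {
        label: sum(counter.values()) + smoothing * len(vocabulary)
        for label, counter in counts.items()
    }
    scores = {}
    for word in vocabulary:
        rate_fake = (counts[1][word] + smoothing) / totals[1]
        rate_real = (counts[0][word] + smoothing) / totals[0]
        scores[word] = math.log(rate_fake / rate_real)
    return scores


# Invented toy postings (label 1 = fraudulent), for illustration only.
toy_postings = [
    ("Earn money fast from home, no experience needed", 1),
    ("Wire a small fee to earn money from home", 1),
    ("Software engineer with Python experience needed", 0),
    ("Marketing intern to support the content team", 0),
]

scores = log_ratio_by_word(toy_postings)
ranked = sorted(scores, key=scores.get, reverse=True)
print("leans fraudulent:", ranked[:5])
print("leans legitimate:", ranked[-5:])
```

On the real data, the same scoring could be applied to the (text, label) pairs built from the CSV. Raw word counts like these are only a starting point and say nothing about entities or multi-word phrases.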

Coverage

The dataset's geographical scope is global. The time range of the job postings is not stated; the dataset was listed on 05/06/2025. No demographic scope is noted beyond the dataset's relevance to employment data. Of the 18,000 job descriptions, approximately 800 are identified as fake, so both legitimate and fraudulent examples are available.

License

CC0

Who Can Use It

This dataset is particularly useful for:

Data Scientists and Machine Learning Engineers: For building and testing fraud detection and text classification models.
Researchers: To study patterns in online recruitment fraud and develop new detection methodologies.
Job Board Platforms and HR Technology Companies: To implement automated systems for identifying and flagging suspicious job postings, enhancing platform integrity.
Analysts: For performing exploratory data analysis to gain insights into employment trends and fraudulent activities.

Dataset Name Suggestions

Fraudulent Job Posting Detection Dataset
Job Scam Identification Data
Employment Fraud Classification Dataset
Deceptive Job Description Dataset
Real and Fake Job Posting Data

Attributes

Original Data Source: Real / Fake Job Posting Prediction

Listing Stats

VIEWS: 252
DOWNLOADS:
LISTED: 05/06/2025
REGION: GLOBAL
QUALITY: 5 / 5
VERSION: 1.0
