GPT-4o Validated Vehicle Existence Set

Tags and Keywords

Automotive

Validation

Gpt4o

Pricing

Vehicles

GPT-4o Validated Vehicle Existence Set Dataset on Opendatabay data marketplace

Free

About

This dataset records, for each car brand, model, and year combination, whether that combination exists in the real world. The information helps improve the quality of input data used in vehicle price regression models. Validation was performed by a dedicated vehicle expert system built on GPT-4o, guided by a detailed prompt designed to elicit factual responses about vehicle authenticity.

Columns

The dataset includes three distinct columns:
  • id: A unique identifier for each vehicle entry, formed by concatenating the brand, model, and year (e.g., Porsche- Camry Solara SE- 2005). This field contains 28,168 unique values.
  • exists: A boolean field indicating whether the specified brand-model-year combination exists. Approximately 57% of entries are 'true' and 43% are 'false'.
  • comment: Contains additional findings or context regarding the validation, often indicating the closest real-world match when a vehicle does not exist.
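
As an illustration of how the id column could be split back into its parts, here is a minimal Python sketch. It assumes the "- " separator seen in the single example above; the real file may use other variants, and the names VehicleId and parse_vehicle_id are hypothetical.

```python
from typing import NamedTuple


class VehicleId(NamedTuple):
    brand: str
    model: str
    year: int


def parse_vehicle_id(raw: str) -> VehicleId:
    """Split an id such as 'Porsche- Camry Solara SE- 2005' into its parts.

    Assumes the '- ' separator from the listing's example. The brand is taken
    before the first separator and the year after the last one, so a model
    name that itself contains '- ' stays intact.
    """
    brand, first_sep, rest = raw.partition("- ")
    model, last_sep, year = rest.rpartition("- ")
    if not first_sep or not last_sep or not year.strip().isdigit():
        raise ValueError(f"unexpected id format: {raw!r}")
    return VehicleId(brand.strip(), model.strip(), int(year))


parsed = parse_vehicle_id("Porsche- Camry Solara SE- 2005")
print(parsed.brand, "|", parsed.model, "|", parsed.year)
```

Raising on an unexpected format, rather than guessing, makes it easier to spot ids that do not follow the assumed pattern.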

Distribution

The data is distributed as a single CSV file (1.74 MB) that can be imported directly. It consists of 3 columns and approximately 28,200 records. The data is expected to be updated daily.
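
A quick sanity check after download might look like the following sketch, which uses only the Python standard library. The spelling of the boolean values (true/false in any case) is an assumption, the function names are hypothetical, and the two sample rows are made up to stand in for the real file.

```python
import csv
import io
from typing import TextIO

EXPECTED_COLUMNS = ["id", "exists", "comment"]


def to_bool(value: str) -> bool:
    """Interpret a CSV cell as a boolean (assumed spellings: true/false, any case)."""
    text = value.strip().lower()
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError(f"unexpected boolean value: {value!r}")


def summarise(stream: TextIO) -> dict:
    """Check the header and report row count, unique ids, and share of exists=true."""
    reader = csv.DictReader(stream)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    ids = set()
    rows = 0
    true_count = 0
    for row in reader:
        rows += 1
        ids.add(row["id"])
        true_count += to_bool(row["exists"])
    share = true_count / rows if rows else 0.0
    return {"rows": rows, "unique_ids": len(ids), "share_true": share}


# Tiny in-memory sample standing in for the real file (rows are made up).
sample = io.StringIO(
    "id,exists,comment\n"
    "Brand A- Model X- 2001,true,\n"
    "Brand B- Model Y- 2002,false,hypothetical comment\n"
)
stats = summarise(sample)
print(stats)
```

With the downloaded file, the same function could be called on a handle opened with newline="" and an explicit encoding; the reported figures can then be compared with the counts stated in this listing.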

Usage

This dataset is ideally suited for machine learning applications, particularly in the automotive domain. Primary uses include data cleaning, feature engineering for car price regression models, and verifying the integrity of vehicle inventory lists by filtering combinations that do not exist. It is valuable for ensuring the input features of predictive models are based on actual, validated vehicle specifications.
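
The inventory check described above could be sketched as follows in Python. This is an illustration under stated assumptions, not a reference implementation: the id format follows the "Brand- Model- Year" style of the listing's example, make_id and split_inventory are hypothetical names, and the lookup and inventory entries are made up.

```python
from typing import Dict, Iterable, List, Tuple

Vehicle = Tuple[str, str, int]  # (brand, model, year)


def make_id(brand: str, model: str, year: int) -> str:
    """Build an id in the 'Brand- Model- Year' style of the listing's example (assumed)."""
    return f"{brand.strip()}- {model.strip()}- {year}"


def split_inventory(
    inventory: Iterable[Vehicle], exists_by_id: Dict[str, bool]
) -> Dict[str, List[Vehicle]]:
    """Sort vehicles into validated, rejected, and not-covered groups."""
    groups: Dict[str, List[Vehicle]] = {
        "exists": [],
        "does_not_exist": [],
        "not_covered": [],
    }
    for vehicle in inventory:
        verdict = exists_by_id.get(make_id(*vehicle))
        if verdict is None:
            groups["not_covered"].append(vehicle)
        elif verdict:
            groups["exists"].append(vehicle)
        else:
            groups["does_not_exist"].append(vehicle)
    return groups


# Made-up lookup and inventory, for illustration only.
lookup = {"Brand A- Model X- 2001": True, "Brand B- Model Y- 2002": False}
inventory = [
    ("Brand A", "Model X", 2001),
    ("Brand B", "Model Y", 2002),
    ("Brand C", "Model Z", 2003),
]
result = split_inventory(inventory, lookup)
print({name: len(items) for name, items in result.items()})
```

Keeping a separate "not_covered" group avoids silently discarding vehicles that the validation set simply does not mention.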

Coverage

The scope covers a variety of vehicle brands, models, and years that were validated through the GPT-4o system. The accuracy of each existence claim depends on the knowledge of the large language model used. No explicit geographic or demographic scope is associated with this validation set, as it focuses purely on whether a vehicle exists.

License

CC0: Public Domain

Who Can Use It

  • Data Scientists: For training and testing machine learning models focused on the automotive industry.
  • Automotive Analysts: For quality checking and preparing large vehicle datasets before market analysis.
  • Developers: For building software applications that require verified, real-world vehicle configurations.

Dataset Name Suggestions

  • GPT-4o Validated Vehicle Existence Set
  • Automotive Data Integrity Check
  • ML Vehicle Validation Data
  • Car Model Year Reality Check

Attributes

Listing Stats

  • Views: 3
  • Downloads: 0
  • Listed: 31/10/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
