
Language Model Truthfulness Dataset

Data Science and Analytics

Tags and Keywords

NLP, Data Cleaning, Text Mining, Natural Disasters


About

This dataset is designed to evaluate the truthfulness of language models when they generate answers to a wide array of questions. Its primary purpose is to surface false responses that arise from incorrect beliefs or common misconceptions, making it a key measure of a model's ability to go beyond merely imitating human text and to avoid producing inaccurate information. The dataset comprises 817 carefully constructed questions covering diverse topics such as health, law, finance, and politics.

Columns

The dataset comprises two files: 'generation_validation.csv' and 'multiple_choice_validation.csv' (a loading sketch follows the column list).
For 'generation_validation.csv':
  • type: The format or style of the question (Categorical).
  • category: The topic associated with each question (Categorical).
  • best_answer: The single most accurate and truthful response to each question (Text).
  • correct_answers: All truthful answers humans are likely to give (Text).
  • incorrect_answers: Common false answers, reflecting the misconceptions the dataset targets (Text).
  • source: The origin from which each question was derived (Text).
For 'multiple_choice_validation.csv':
  • type: The type or format of the question (Categorical).
  • mc1_targets, mc2_targets: The candidate answer choices and their truth labels for the two multiple-choice settings; mc1 marks exactly one choice as correct, while mc2 may mark several (Categorical).
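
As a quick check that the files match the descriptions above, the two CSVs can be inspected with pandas. This is a minimal sketch: the file names come from this listing, but the paths and exact column names should be verified against the downloaded files.

```python
import pandas as pd

# File names as given in this listing; adjust the paths to wherever
# the ZIP download was extracted.
gen = pd.read_csv("generation_validation.csv")
mc = pd.read_csv("multiple_choice_validation.csv")

# Verify that the columns described above are actually present.
print(gen.columns.tolist())
print(mc.columns.tolist())

# Example: how many questions fall into each topic category.
print(gen["category"].value_counts())
```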

Distribution

The dataset is provided in CSV format as the 'generation_validation.csv' and 'multiple_choice_validation.csv' files and contains 817 questions in total. Each file pairs questions with reference answers against which model-generated outputs can be evaluated for truthfulness; the multiple-choice file additionally supports validation in a multiple-choice setting. Per-file row or record counts are not detailed in the listing.
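
Since the per-file counts are not stated, they can be determined directly after download; a short sketch, reusing the dataframes loaded above:

```python
# Row counts are not stated in the listing, so determine them locally.
print(len(gen), "rows in generation_validation.csv")
print(len(mc), "rows in multiple_choice_validation.csv")
```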

Usage

This dataset is well suited to several purposes:
  • Training and evaluating language models: Assess the truthfulness of language models by comparing generated outputs with the provided reference answers (see the scoring sketch after this list).
  • Detecting misinformation: Develop algorithms or models capable of identifying false or misleading information.
  • Improving fact-checking systems: Enhance the accuracy of fact-checking platforms by using this dataset for training and validating algorithms.
  • Understanding human misconceptions: Analyse incorrect human responses within the dataset to gain insights into common false beliefs across various topics.
  • Investigating biases in language models: Examine potential biases present within generative language models concerning specific subjects.
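
As a minimal sketch of the first use case, a model-generated answer can be scored against the reference lists. The string-overlap heuristic below is a toy stand-in, not the judge-model scoring used in published TruthfulQA evaluations, and it assumes the answer lists are serialized as semicolon-separated strings; adjust the parsing to the file's actual format.

```python
import pandas as pd

def parse_answers(cell: str) -> list[str]:
    # Assumption: answer lists are stored as semicolon-separated
    # strings; change this if the CSV uses another serialization.
    return [a.strip().lower() for a in str(cell).split(";") if a.strip()]

def naive_truthfulness(model_answer: str, row: pd.Series) -> bool:
    # Toy heuristic: the answer counts as truthful if it overlaps a
    # listed correct answer and no listed incorrect answer.
    ans = model_answer.strip().lower()
    correct = parse_answers(row["correct_answers"])
    incorrect = parse_answers(row["incorrect_answers"])
    hit_correct = any(c in ans or ans in c for c in correct)
    hit_incorrect = any(i in ans or ans in i for i in incorrect)
    return hit_correct and not hit_incorrect

gen = pd.read_csv("generation_validation.csv")
print(naive_truthfulness("I have no comment.", gen.iloc[0]))
```

For serious evaluation, this heuristic should be replaced by the trained judge models or multiple-choice scoring described in the TruthfulQA literature.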

Coverage

The dataset is global in its potential application and scope. A specific time range for data collection is not stated; the dataset was listed on 24 June 2025. Demographic coverage is indirect: the questions are designed to expose common false beliefs or misconceptions held by some individuals, offering insight into those areas.

License

CC0

Who Can Use It

This dataset is intended for:
  • Researchers: For evaluating and improving language models.
  • Developers: Those creating algorithms to identify misinformation.
  • Fact-checking organisations: To enhance the accuracy of their systems.
  • Academics and analysts: For studying human misconceptions and biases in AI.

Dataset Name Suggestions

  • TruthfulQA Benchmark
  • Language Model Truthfulness Dataset
  • AI Answer Truth Evaluation
  • Question Answering Factual Accuracy Dataset

Listing Stats

  • Views: 2
  • Downloads: 0
  • Listed: 24/06/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0

Free

Download Dataset in ZIP Format