Human-Crafted Factuality Evaluation Data
Data Science and Analytics
Free
About
FACTS Grounding 1.0 is a benchmark created by Google DeepMind and Google Research to assess how well AI models ground their answers in provided source material. It contains 860 public, human-crafted examples designed to evaluate whether a system's responses are supported solely by the accompanying context. Each example comprises system-level instructions for the model, the user's exact question or request, and a long context document containing the information needed to produce a factual response. The release also includes the evaluation prompts used to judge model responses.
Columns
Each example is structured around the following key fields (a brief loading sketch follows the list):
- system_instruction: Provides general guidelines and constraints to the model, often instructing it to base answers strictly on the context provided.
- user_request: Contains the specific task or question that the AI system is required to answer, such as queries about financial tips.
- context_document: A lengthy source document, potentially including materials like SEC filings for public companies, which holds all the facts necessary to answer the associated question.
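
As a practical illustration, the short Python sketch below reads the public examples from a tabular file and assembles one grounded prompt per example from the three fields above. The file name and the use of pandas are assumptions about how your copy of the data is packaged; only the column names come from the dataset description.

```python
# Minimal sketch of reading the public examples and assembling a grounded
# prompt for a model under test. The file name and tabular layout are
# assumptions; adjust them to match how the examples are packaged locally.
import pandas as pd

# Hypothetical path to the 860 public FACTS Grounding examples.
examples = pd.read_csv("facts_grounding_public_examples.csv")

def build_prompt(row: pd.Series) -> str:
    """Combine the three documented fields into a single grounded prompt."""
    return (
        f"{row['system_instruction']}\n\n"
        f"Context document:\n{row['context_document']}\n\n"
        f"User request:\n{row['user_request']}"
    )

prompts = [build_prompt(row) for _, row in examples.iterrows()]
print(f"Built {len(prompts)} prompts from {len(examples)} examples.")
```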
Distribution
This dataset contains 860 public FACTS Grounding examples, distributed in both tabular and text formats. Exact file sizes and record counts for the initial release are documented in the accompanying technical report.
Usage
This benchmark measures the accuracy and reliability of AI models in scenarios that require deep contextual grounding, verifying that generated answers are derived directly and only from the provided source material. It supports the development of models that adhere strictly to the factual evidence they are given.
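
To make the evaluation workflow concrete, here is a minimal sketch of a grounding evaluation loop. It assumes you supply two callables: `generate`, which queries the model under test, and `judge`, which applies the bundled evaluation prompts (for example, via an LLM judge) and returns whether a response is fully supported by the context document. Both callables and the scoring convention are hypothetical placeholders, not part of the dataset.

```python
# Illustrative grounding evaluation loop. `generate` and `judge` are
# hypothetical placeholders the caller must provide; the field names match
# the columns documented above.
from typing import Callable, Iterable, Mapping

def grounding_score(
    examples: Iterable[Mapping[str, str]],
    generate: Callable[[str], str],
    judge: Callable[[str, str, str], bool],
) -> float:
    """Fraction of responses judged fully grounded in the context document."""
    grounded = 0
    total = 0
    for ex in examples:
        prompt = (
            f"{ex['system_instruction']}\n\n"
            f"Context document:\n{ex['context_document']}\n\n"
            f"User request:\n{ex['user_request']}"
        )
        response = generate(prompt)
        if judge(ex["context_document"], ex["user_request"], response):
            grounded += 1
        total += 1
    return grounded / total if total else 0.0
```

The judging step is where the dataset's included evaluation prompts would come into play; the reliability of that judge largely determines how trustworthy the final score is.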
Coverage
The data originates from projects run by Google DeepMind and Google Research, and the dataset is expected to be updated annually. No specific geographic or demographic restrictions are stated, but the examples often involve complex documents on a range of topics, such as financial regulatory filings.
License
Creative Commons Attribution 4.0 International (CC BY 4.0)
Who Can Use It
- AI Researchers and Engineers: Use the data to evaluate and improve the grounding capabilities of large language models.
- Academics: Study factuality metrics and the development of reliable AI systems.
- Machine Learning Developers: Test the robustness and accuracy of new AI architectures against a difficult, human-curated factuality standard.
Dataset Name Suggestions
- FACTS Grounding AI Benchmark 1.0 Public Examples
- Google Research AI Grounding Set
- Human-Crafted Factuality Evaluation Data
Attributes
Original Data Source: Human-Crafted Factuality Evaluation Data