Cosmos QA: Commonsense Reading Comprehension

Data Science and Analytics

Tags and Keywords

Text

NLP

Mining

Classification

Pre-processing

Free

About

This dataset is a large-scale collection of problems for commonsense-based reading comprehension. It comprises 35.6K multiple-choice questions that require interpreting everyday narratives to infer likely causes or effects, going beyond what the text states explicitly. It is suited to developing and evaluating reading comprehension models, to improving and customising question answering systems for educational or customer service applications, and to studying how people process narratives in order to inform the design of artificial intelligence systems. The test.csv file is intended for evaluating a model's performance on these commonsense tasks.

Columns

context: The narrative context for the question (String).
question: The question posed related to the context (String).
answer0: The first answer option (String).
answer1: The second answer option (String).
answer2: The third answer option (String).
answer3: The fourth answer option (String).
label: The label identifying the correct answer option (String).
id: A unique identifier for the question (String).
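
As an illustration of the column layout above, here is a minimal Python sketch that reads rows into a small record type and checks that all eight columns are present. The `Example` class, the `read_examples` helper, and the sample row are hypothetical and written for this illustration only; they are not taken from the dataset. The label is kept as a string, as documented above.

```python
import csv
import io
from dataclasses import dataclass

ANSWER_COLUMNS = ["answer0", "answer1", "answer2", "answer3"]
EXPECTED_COLUMNS = ["context", "question", *ANSWER_COLUMNS, "label", "id"]


@dataclass
class Example:
    """One multiple-choice problem (hypothetical record type)."""
    id: str
    context: str
    question: str
    answers: list[str]
    label: str  # kept as a string, matching the column description


def read_examples(handle):
    """Read examples from an open CSV handle, checking the header first."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [
        Example(
            id=row["id"],
            context=row["context"],
            question=row["question"],
            answers=[row[c] for c in ANSWER_COLUMNS],
            label=row["label"],
        )
        for row in reader
    ]


# Made-up row for illustration only; not taken from the dataset.
SAMPLE_CSV = (
    "context,question,answer0,answer1,answer2,answer3,label,id\n"
    '"I left my umbrella at home and got soaked.","Why did the narrator get wet?",'
    '"It was raining.","They went swimming.","They spilled water.","None of the above.",0,ex-1\n'
)

examples = read_examples(io.StringIO(SAMPLE_CSV))
print(examples[0].question, examples[0].answers)
```

Reading by column name rather than by position means the sketch does not depend on the order in which the columns appear in the file.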

Distribution

The dataset consists of 35.6K problems, formulated as multiple-choice questions and provided in CSV format. It is split across several files, including train.csv, validation.csv, and test.csv, each containing the core data columns.
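
A minimal sketch of loading the three named files with Python's standard library follows. It assumes the files sit together in one local directory and are UTF-8 encoded; neither is stated in the listing. The `load_splits` and `split_sizes` helpers are hypothetical names.

```python
import csv
from pathlib import Path

# File names as given in the listing; the directory layout is an assumption.
SPLIT_FILES = {
    "train": "train.csv",
    "validation": "validation.csv",
    "test": "test.csv",
}


def load_splits(data_dir):
    """Return a dict mapping split name to a list of row dicts."""
    splits = {}
    for name, filename in SPLIT_FILES.items():
        path = Path(data_dir) / filename
        # UTF-8 is assumed here; adjust if the files use another encoding.
        with path.open(newline="", encoding="utf-8") as handle:
            splits[name] = list(csv.DictReader(handle))
    return splits


def split_sizes(splits):
    """Count the rows in each split."""
    return {name: len(rows) for name, rows in splits.items()}
```

Calling `split_sizes(load_splits("path/to/data"))` on a local copy gives the number of problems per file, which can be checked against the 35.6K total.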

Usage

Developing and evaluating commonsense-based reading comprehension models.
Improving and customising question answering systems, particularly for educational or customer service applications.
Conducting research into how human beings process and understand narratives, to better design artificial intelligence systems.
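
For the model evaluation use case, accuracy over the answer options is a natural starting metric. The sketch below assumes the label column holds the index (0 to 3) of the correct option stored as text; the listing does not confirm this, so check it against the files. The `longest_answer_baseline` function is a hypothetical trivial baseline, and the two rows are made up for illustration.

```python
def accuracy(predicted, gold):
    """Fraction of predictions that match the gold labels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


def longest_answer_baseline(row):
    """Hypothetical baseline: choose the index of the longest answer option."""
    options = [row[f"answer{i}"] for i in range(4)]
    return max(range(4), key=lambda i: len(options[i]))


# Made-up rows for illustration only; not taken from the dataset.
rows = [
    {"answer0": "It was raining outside.", "answer1": "Yes.",
     "answer2": "No.", "answer3": "Maybe.", "label": "0"},
    {"answer0": "No.", "answer1": "They missed the bus this morning.",
     "answer2": "Yes.", "answer3": "Ok.", "label": "2"},
]

predictions = [longest_answer_baseline(row) for row in rows]
# Assumption: the label is an option index stored as text.
gold = [int(row["label"]) for row in rows]
print(f"accuracy: {accuracy(predictions, gold):.2f}")
```

A baseline like this gives a floor against which a trained model's accuracy can be compared.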

Coverage

The dataset draws on a diverse collection of people's everyday narratives. The narrative content does not specify geographic scope, time range, or demographic details, so the data is best treated as reflecting general human experience.

License

CC0

Who Can Use It

Data scientists and machine learning engineers: For building and evaluating natural language processing (NLP) models, especially for question answering and reading comprehension.
Researchers in artificial intelligence: For improving AI systems' ability to understand and reason with commonsense knowledge.
Developers of educational technology: For creating or enhancing automated question-answering tools for learning platforms.
Customer service solution providers: For developing more intelligent chatbots or virtual assistants capable of nuanced understanding.

Dataset Name Suggestions

Cosmos QA: Commonsense Reading Comprehension
Everyday Narratives Question Answering Dataset
AI Commonsense Reasoning Dataset
Narrative Intelligence QA

Attributes

Original Data Source: Cosmos QA (Commonsense QA)

Listing Stats

LISTED: 27/06/2025
REGION: GLOBAL
QUALITY: 5 / 5
VERSION: 1.0