Cosmos QA: Commonsense Reading Comprehension
Data Science and Analytics
Free
About
This dataset is a large-scale collection of problems for commonsense-based reading comprehension. It comprises 35.6K multiple-choice questions that require interpreting everyday narratives to infer likely causes or effects that the text does not state explicitly. The resource is useful for developing and evaluating reading comprehension models. It can also be used to improve and customise question answering systems for educational or customer service applications, and to study how humans process narratives in order to inform the design of artificial intelligence systems. The test.csv file is structured for evaluating a model's performance on these commonsense tasks.
Columns
- context: The narrative context for the question (String).
- question: The question posed related to the context (String).
- answer0: The first answer option (String).
- answer1: The second answer option (String).
- answer2: The third answer option (String).
- answer3: The fourth answer option (String).
- label: The correct answer to the question (String).
- id: A unique identifier for the question (String).
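The column layout above can be read into a small record type. The following is a minimal sketch using only the Python standard library. The names `Example`, `read_examples`, and `EXPECTED_COLUMNS` are hypothetical, and the sample row is an invented placeholder, not data from the dataset. The label is kept as a string, as listed above; the sample assumes it holds the index of the correct option, which the listing does not confirm.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterable, Iterator, List

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "context", "question",
    "answer0", "answer1", "answer2", "answer3",
    "label", "id",
]


@dataclass
class Example:
    """One multiple-choice problem (hypothetical container type)."""
    id: str
    context: str
    question: str
    answers: List[str]  # answer0..answer3, in order
    label: str          # kept as a string, as in the column description


def read_examples(lines: Iterable[str]) -> Iterator[Example]:
    """Yield Example records from CSV text, checking the header first."""
    reader = csv.DictReader(lines)
    header = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for row in reader:
        yield Example(
            id=row["id"],
            context=row["context"],
            question=row["question"],
            answers=[row[f"answer{i}"] for i in range(4)],
            label=row["label"],
        )


# Invented placeholder row for illustration only; not taken from the dataset.
# Assumption: the label is the index (0-3) of the correct option.
SAMPLE_CSV = (
    "context,question,answer0,answer1,answer2,answer3,label,id\n"
    '"(placeholder context)","(placeholder question)",A,B,C,D,2,example-0\n'
)

examples = list(read_examples(io.StringIO(SAMPLE_CSV)))
```

To read a real file, an open file handle can be passed instead of the `io.StringIO` object, for example `open("train.csv", newline="", encoding="utf-8")`.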
Distribution
The dataset consists of 35.6K problems formulated as multiple-choice questions. It is typically provided in CSV file format. The dataset is structured across several files, including train.csv, validation.csv, and test.csv, each containing the core data columns.
Usage
- Developing and evaluating commonsense-based reading comprehension models.
- Improving and customising question answering systems, particularly for educational or customer service applications.
- Conducting research into how human beings process and understand narratives, to better design artificial intelligence systems.
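For the model evaluation use case, a common metric for multiple-choice problems is plain accuracy. The sketch below is illustrative only: the function name `accuracy` is hypothetical, the gold labels and predictions are invented, and it assumes the label column holds the index (0 to 3) of the correct answer option as a string, which the listing does not confirm.

```python
from typing import Sequence


def accuracy(labels: Sequence[str], predictions: Sequence[int]) -> float:
    """Fraction of problems where the predicted option index matches the label.

    Assumption: each label is a string holding the index (0-3) of the
    correct answer option; each prediction is an int in the same range.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on zero examples")
    correct = sum(int(gold) == pred for gold, pred in zip(labels, predictions))
    return correct / len(labels)


# Invented values for illustration; not results from any model or dataset.
gold = ["0", "3", "1", "2"]
pred = [0, 3, 2, 2]
score = accuracy(gold, pred)  # 3 of 4 predictions match
```

Accuracy treats all four options symmetrically, so a model that guesses uniformly at random would be expected to score about 0.25 on four-way questions.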
Coverage
The dataset focuses on diverse collections of people's everyday narratives. While specific geographic, time range, or demographic details are not provided within the narrative content, the data is broadly applicable to general human experiences.
License
CC0
Who Can Use It
- Data scientists and machine learning engineers: For building and evaluating natural language processing (NLP) models, especially for question answering and reading comprehension.
- Researchers in artificial intelligence: Those interested in improving AI systems' ability to understand and reason with commonsense knowledge.
- Developers of educational technology: For creating or enhancing automated question-answering tools for learning platforms.
- Customer service solution providers: For developing more intelligent chatbots or virtual assistants capable of nuanced understanding.
Dataset Name Suggestions
- Cosmos QA: Commonsense Reading Comprehension
- Everyday Narratives Question Answering Dataset
- AI Commonsense Reasoning Dataset
- Narrative Intelligence QA
Attributes
Original Data Source: Cosmos QA (Commonsense QA)