Historical Question-Answering Dataset
Free
About
A collection of multiple-choice questions and answers from the SAT Subject Tests in World History and US History. Each record provides the question text, five answer options (A to E), and the letter of the correct response. The questions cover a variety of topics, time periods, and regions relevant to both subjects, which makes the dataset suitable for training and evaluating language models and other natural language processing applications.
Columns
- id: A unique numerical identifier for each question.
- subject: The SAT subject category, either 'World History' or 'US History'.
- prompt: The full text of the question itself.
- A: The text for answer option A.
- B: The text for answer option B.
- C: The text for answer option C.
- D: The text for answer option D.
- E: The text for answer option E.
- answer: The letter corresponding to the correct answer for the question.
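As an illustration of how the columns above fit together, here is a minimal sketch that parses rows into a small record type using only the Python standard library. The `Question` class, the `read_questions` helper, and the sample rows are hypothetical and not part of the dataset; the sketch also assumes that `id` values parse as integers and that `answer` holds a single letter from A to E.

```python
import csv
import io
from dataclasses import dataclass

# Column names as listed in this dataset description.
EXPECTED_COLUMNS = ["id", "subject", "prompt", "A", "B", "C", "D", "E", "answer"]
OPTION_LETTERS = ("A", "B", "C", "D", "E")


@dataclass
class Question:
    """Hypothetical record type mirroring one CSV row."""
    id: int
    subject: str
    prompt: str
    options: dict  # maps option letter -> option text
    answer: str


def read_questions(handle):
    """Parse rows from an open CSV file handle into Question objects."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    questions = []
    for row in reader:
        # Assumption: the answer column holds one letter, possibly with stray whitespace.
        answer = row["answer"].strip().upper()
        if answer not in OPTION_LETTERS:
            raise ValueError(f"row {row['id']}: unexpected answer {answer!r}")
        questions.append(
            Question(
                id=int(row["id"]),  # assumption: ids are integer-valued
                subject=row["subject"],
                prompt=row["prompt"],
                options={letter: row[letter] for letter in OPTION_LETTERS},
                answer=answer,
            )
        )
    return questions


# Made-up sample rows standing in for the real file contents.
SAMPLE_CSV = """id,subject,prompt,A,B,C,D,E,answer
1,US History,Placeholder question one?,opt a,opt b,opt c,opt d,opt e,C
2,World History,Placeholder question two?,opt a,opt b,opt c,opt d,opt e,A
"""

questions = read_questions(io.StringIO(SAMPLE_CSV))
print(len(questions), questions[0].subject, questions[0].answer)
```

To read the actual file, the same helper could be called on `open("sat_world_and_us_history.csv", newline="", encoding="utf-8")`; the encoding is an assumption and may need adjusting.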
Distribution
The dataset is provided in a single CSV file named sat_world_and_us_history.csv, with a size of approximately 471.12 kB. It contains 1380 records and 9 columns.
Usage
This dataset is ideal for training and evaluating large language models (LLMs) and other machine learning models. Key applications include text classification, question-answering systems, and general natural language processing (NLP) tasks. It can be used for developing educational tools, exam preparation software, or for academic research in text mining and language modelling.
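One way to use the records for model evaluation is to render each row as a multiple-choice prompt and score predicted letters against the answer column. The sketch below is only an illustration under stated assumptions: `format_prompt`, `accuracy`, and the `always_c` baseline are hypothetical names, the rows are made-up placeholders, and `predict` stands in for whatever model call is actually used.

```python
OPTION_LETTERS = ("A", "B", "C", "D", "E")


def format_prompt(row):
    """Render one record (a dict keyed by column name) as a multiple-choice prompt."""
    lines = [row["prompt"]]
    lines += [f"{letter}. {row[letter]}" for letter in OPTION_LETTERS]
    lines.append("Answer:")
    return "\n".join(lines)


def accuracy(rows, predict):
    """Fraction of rows for which predict(prompt_text) returns the correct letter."""
    if not rows:
        return 0.0
    correct = sum(
        1
        for row in rows
        if predict(format_prompt(row)).strip().upper() == row["answer"].strip().upper()
    )
    return correct / len(rows)


# Made-up placeholder rows; real rows would come from the CSV file.
sample_rows = [
    {"id": "1", "subject": "US History", "prompt": "Placeholder question one?",
     "A": "opt a", "B": "opt b", "C": "opt c", "D": "opt d", "E": "opt e",
     "answer": "C"},
    {"id": "2", "subject": "World History", "prompt": "Placeholder question two?",
     "A": "opt a", "B": "opt b", "C": "opt c", "D": "opt d", "E": "opt e",
     "answer": "A"},
]


def always_c(prompt_text):
    """Trivial baseline that ignores the question and always answers C."""
    return "C"


print(format_prompt(sample_rows[0]))
print(accuracy(sample_rows, always_c))  # 0.5 on the two placeholder rows
```

A constant-letter baseline like this gives a floor to compare a real model against; with five options, uniform random guessing would be expected to score about 20%.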
Coverage
The dataset's content covers topics within World History and US History as they appear on the SAT Subject Test. Geographically, it spans various global regions and time periods relevant to these subjects. Demographically, it is targeted towards students and educators involved with standardised testing for university and college admissions.
License
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Who Can Use It
- Data Scientists and ML Engineers can use this dataset to train, fine-tune, and benchmark text classification and question-answering models.
- NLP Researchers can explore language modelling, text mining, and other natural language processing tasks.
- EdTech Developers can build applications for exam preparation, tutoring systems, and educational content generation.
- Academic Institutions can use it for research in education, history, and computational linguistics.
Dataset Name Suggestions
- SAT History Questions & Answers for LLM Training
- US & World History SAT Test Questions
- Text Classification Dataset: SAT History
- Historical Question-Answering Dataset (SAT)
Attributes
Original Data Source: Historical Question-Answering Dataset