Yahoo Answers 10 categories for NLP CSV

Free

About

The Yahoo! Answers topic classification dataset is constructed from the 10 largest main categories. Each class contains 140,000 training samples and 6,000 testing samples, so the dataset holds 1,400,000 training samples and 60,000 testing samples in total. From all the answers and other meta-information, we used only the best answer content and the main category information.
The file classes.txt lists the class name corresponding to each label.
The files train.csv and test.csv contain the training and testing samples, respectively, as comma-separated values. Each file has 4 columns: class index (1 to 10), question title, question content, and best answer. The text fields are enclosed in double quotes ("), and any internal double quote is escaped by doubling it (""). New lines are escaped as a backslash followed by an "n" character, that is "\n".
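
The escaping rules above can be sketched in Python with the standard csv module. This is a minimal illustration, not an official loader: the column names, the read_rows helper, and the two sample rows are made up for the example, and only the four-column layout and the quoting rules come from the description.

```python
import csv
import io

# Hypothetical column names; the listing only describes the column order.
COLUMNS = ("class_index", "question_title", "question_content", "best_answer")


def read_rows(handle):
    """Yield one dict per CSV record, turning literal backslash-n into newlines."""
    # The default csv dialect already treats "" inside a quoted field
    # as a single escaped double quote.
    for record in csv.reader(handle):
        if len(record) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(record)}")
        label, *texts = record
        # Restore real line breaks from the two-character sequence \n.
        texts = [text.replace("\\n", "\n") for text in texts]
        yield dict(zip(COLUMNS, [int(label), *texts]))


# Made-up sample rows in the described format (not taken from the dataset).
sample = (
    r'"5","Why is the sky blue?","I heard it is ""scattering"".","Short answer:\nRayleigh scattering."' "\n"
    r'"2","What is 2+2?","","It is 4."' "\n"
)

rows = list(read_rows(io.StringIO(sample)))
for row in rows:
    print(row["class_index"], repr(row["question_title"]))
```

To read the real files, the same helper could be given open("train.csv", newline="", encoding="utf-8"); the UTF-8 encoding is an assumption, since the listing does not state one.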

Listing Stats

VIEWS: 11

DOWNLOADS: 0

LISTED: 05/06/2025

REGION: GLOBAL

UDQSSQUALITY: 5 / 5

VERSION: 1.0