News Category Dataset

Entertainment & Media Consumption

Tags and Keywords

News

Beginner

Text

NLP

Free

About

This dataset is a follow-up to the News Category Dataset. It contains 45.5k news headlines from 2012 to 2018, obtained from HuffPost. The aim was to give beginners an easy-to-use dataset. The dataset has therefore been cleaned and filtered, and the target feature has been balanced, unlike in the original dataset.
Content

This data contains:

- 45500 rows and 5 columns
- Target column: Category (Business, Politics, Food & Drink, TRAVEL, Parenting, STYLE & BEAUTY, Wellness, World news, Sports, Entertainment)
- Each category class contains 4500 rows
- NaN values appear only in the keywords column
- Apart from that, the original dataset had many third-person statements (like "This statement is irrelevant" says the officials)
- A keyword column has been added, holding the main keywords extracted from each URL (URLs were in the original dataset)

Inspiration

I found the original dataset hard to work with, so I cleaned it and made an easier-to-use version. I hope it helps fellow beginners getting started with NLP!
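The figures listed under Content (row count, class balance, missing values only in the keywords column) can be checked after download. Below is a minimal sketch using only the Python standard library. The listing does not give the file name or the exact column spellings, so the column names `category` and `keywords` and the sample rows are placeholders, not the real data.

```python
import csv
import io
from collections import Counter


def summarize(rows, target="category"):
    """Report row count, per-class counts, and columns with missing values.

    `target` is an assumed column name; adjust it to the real header.
    """
    rows = list(rows)
    class_counts = Counter(row[target] for row in rows)
    # A column counts as "missing" if any row has an empty or absent value.
    columns_with_missing = sorted(
        {col for row in rows for col, val in row.items() if not (val or "").strip()}
    )
    return {
        "n_rows": len(rows),
        "class_counts": dict(class_counts),
        "columns_with_missing": columns_with_missing,
    }


# Made-up stand-in for the real CSV file (placeholder rows only).
sample_csv = io.StringIO(
    "category,headline,keywords\n"
    "POLITICS,Headline one,senate-vote\n"
    "SPORTS,Headline two,\n"
    "POLITICS,Headline three,budget-deal\n"
    "SPORTS,Headline four,world-cup\n"
)
report = summarize(csv.DictReader(sample_csv))
print(report)
```

To run the same check on the downloaded file, pass `csv.DictReader(open(path, newline="", encoding="utf-8"))` instead of the in-memory sample, where `path` is wherever the file was saved.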
Original Data Source: News Category Dataset
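The keywords column is described as being extracted from the URLs in the original dataset. The listing does not say how, so the following is only a sketch of one plausible approach, not the author's method: take the last path segment of the URL, drop a trailing ID suffix, and split on hyphens. The `_us_<id>` / `_n_<id>` suffix pattern and the example URL are assumptions made for illustration.

```python
import re
from urllib.parse import urlparse


def keywords_from_url(url):
    """Split the final URL path segment (the slug) into keyword tokens."""
    path = urlparse(url).path
    slug = path.rstrip("/").rsplit("/", 1)[-1]
    # Drop a file extension, then an assumed trailing ID such as "_us_5abc123def".
    slug = re.sub(r"\.html?$", "", slug)
    slug = re.sub(r"_(?:n|us)_[0-9a-z]+$", "", slug)
    return [word for word in slug.split("-") if word]


# Hypothetical URL, written only to exercise the function.
example = "https://www.huffingtonpost.com/entry/example-news-headline-words_us_5abc123def"
print(keywords_from_url(example))
```

Rows whose URL has no usable slug would yield an empty list under this approach, which is one way missing values could arise in a keywords column.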

Listing Stats

VIEWS

0

DOWNLOADS

0

LISTED

11/06/2025

REGION

GLOBAL

UDQS QUALITY

5 / 5

VERSION

1.0
