Examiner Content Farm Data

News & Media Articles

Tags and Keywords

News, Headlines, Journalism, Clickbait, Examiner

Free

About

This dataset is an archive of crowdsourced journalism from "The Examiner", a prominent pseudo-news website from the 2000s digital content landscape. It contains the headlines of over 3 million articles written by approximately 21,000 authors over six years. The Examiner was not known for its quality but was remarkably prolific: it generated thousands of articles daily and peaked in 2011 with high search rankings, heavy social media sharing, and up to twenty million unique mobile visitors monthly. The collection shows which topics were trending while the site operated, and it serves as the last surviving record of a once-prominent, advertising-funded platform whose original content is no longer online.

Columns

publish_date: The date when the article was published on The Examiner site, formatted as yyyyMMdd.
headline_text: The actual text of the article's headline, presented in English.
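
The two columns can be read with the Python standard library alone. The sketch below is a minimal illustration, not an official loader: it assumes the CSV has a header row naming the columns publish_date and headline_text, and the sample rows are invented stand-ins for the real file.

```python
import csv
import io
from datetime import date, datetime


def read_headlines(fileobj):
    """Yield (publish_date, headline_text) pairs from an open CSV file object.

    Assumes a header row with the column names publish_date and headline_text.
    """
    for row in csv.DictReader(fileobj):
        # publish_date is stored as yyyyMMdd, for example 20100101
        published = datetime.strptime(row["publish_date"], "%Y%m%d").date()
        yield published, row["headline_text"]


# Invented sample rows standing in for the real file
sample = io.StringIO(
    "publish_date,headline_text\n"
    "20100101,an invented headline for illustration\n"
    '20151231,"another invented headline, with a comma"\n'
)
rows = list(read_headlines(sample))
first_date, first_headline = rows[0]
print(first_date.isoformat(), first_headline)
```

For the downloaded file, the same function can be given a file opened with open(path, newline="", encoding="utf-8"), where path is wherever the extracted CSV was saved; the UTF-8 encoding is an assumption.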

Distribution

The dataset is provided in CSV format and contains 3,089,781 unique records. The file size is approximately 202.69 MB. A precise per-year breakdown is not available for all periods, but the records span early 2010 to late 2015, and article counts vary across that range. All records are valid, with no missing or mismatched entries for either the publish date or the headline text.
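
Since no per-year breakdown is published, one can be computed after download, along with a check of the validity claim. This is a rough sketch under the same assumptions as the column descriptions (a header row, dates as eight-digit yyyyMMdd strings); the function name summarise and the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def summarise(fileobj):
    """Return (records per year, number of rows that fail validation)."""
    per_year = Counter()
    invalid = 0
    for row in csv.DictReader(fileobj):
        raw_date = (row.get("publish_date") or "").strip()
        headline = (row.get("headline_text") or "").strip()
        # Require exactly eight digits before attempting to parse the date
        if len(raw_date) != 8 or not raw_date.isdigit() or not headline:
            invalid += 1
            continue
        try:
            year = datetime.strptime(raw_date, "%Y%m%d").year
        except ValueError:
            # Eight digits that do not form a real calendar date
            invalid += 1
            continue
        per_year[year] += 1
    return per_year, invalid


# Invented rows: three valid, one impossible date, one empty headline
sample = io.StringIO(
    "publish_date,headline_text\n"
    "20100105,invented headline one\n"
    "20100320,invented headline two\n"
    "20151130,invented headline three\n"
    "20101301,invented headline with a bad month\n"
    "20120101,\n"
)
per_year, invalid = summarise(sample)
for year in sorted(per_year):
    print(year, per_year[year])
print("invalid rows:", invalid)
```

On the real file, the per-year totals should sum to 3,089,781 and the invalid count should be zero if the listing's figures hold.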

Usage

This dataset is ideal for:

Analysing trends in digital content and journalism over a six-year period.
Studying the evolution and impact of catchy headlines and clickbait strategies.
Researching the characteristics of crowdsourced news and content farms.
Applying Natural Language Processing (NLP) techniques, such as text analysis, topic modelling, or sentiment analysis, to news headlines.
Exploring historical data related to online media consumption and popular topics.
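
As a starting point for the trend analysis described above, the most frequent headline terms can be counted per year. The sketch below is a deliberately simple word count, not topic modelling; the stopword list, the function name top_terms_by_year, and the sample headlines are all invented for illustration.

```python
import re
from collections import Counter, defaultdict

# A tiny illustrative stopword list; a real analysis would use a fuller one
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def top_terms_by_year(records, n=3):
    """Return the n most frequent non-stopword terms for each year.

    records: iterable of (publish_date as a yyyyMMdd string, headline_text).
    """
    counts = defaultdict(Counter)
    for publish_date, headline in records:
        year = publish_date[:4]  # the first four characters are the year
        tokens = re.findall(r"[a-z']+", headline.lower())
        counts[year].update(t for t in tokens if t not in STOPWORDS)
    return {year: counter.most_common(n) for year, counter in counts.items()}


# Invented headlines, not taken from the dataset
sample_records = [
    ("20100101", "local recipes for winter"),
    ("20100215", "winter recipes on a budget"),
    ("20100301", "winter travel tips"),
    ("20110101", "movie review: the sequel"),
    ("20110202", "movie trailers and movie news"),
]
result = top_terms_by_year(sample_records, n=2)
for year, terms in sorted(result.items()):
    print(year, terms)
```

Comparing the top terms from one year to the next gives a quick view of shifting topics before moving on to heavier methods such as topic modelling or sentiment analysis.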

Coverage

The dataset covers content published by The Examiner, an online platform. The articles span a time range from 1st January 2010 to 31st December 2015. No specific geographic or demographic data on authors or readers is provided, but the content originates from a US-based pseudo-news site, which suggests a primary focus on topics relevant to that region. The dataset contains contributions from around 21,000 authors.

License

CC0: Public Domain

Who Can Use It

Researchers and Academics: For studies in media history, digital journalism, content monetisation strategies, and social media trends.
Data Scientists and NLP Practitioners: For developing and testing algorithms related to text classification, topic extraction, and understanding headline virality.
Journalism and Communication Scholars: To examine the quality and characteristics of high-volume, advert-driven online content.
Content Strategists: To gain insights into historical content performance and headline effectiveness.

Dataset Name Suggestions

The Examiner News Headlines Archive
Digital Journalism: Examiner Headlines (2010-2015)
Crowdsourced News Headline Catalog
Examiner Content Farm Data
Clickbait Chronicles: The Examiner

Attributes

Original Data Source: Examiner Content Farm Data

Listing Stats

Listed: 11/08/2025
Region: Global
Quality: 5 / 5
Version: 1.0
