El Espectador Daily Tweets

Telecommunications & Network Data

Tags and Keywords

Tabular

Text

NLP

Free

About

This dataset is a compilation of Spanish-language tweets from the Colombian newspaper El Espectador. It was created in 2019 as a practical exercise in data streaming with Microsoft's Power Automate and Power BI. Its primary purpose is to support text mining and natural language processing tasks; it has been used to build a co-occurrence network of words from tweets in Databricks with PySpark. The tweet texts are kept as published, with emojis and URLs preserved.

Columns

  • TweetText: This column contains the full text of tweets published by the @elespectador account.
  • CreatedAt: This column provides the datetime when each tweet was published.
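
As a minimal sketch of how the two columns above might be read, the following uses only Python's standard library. The sample rows and the timestamp format in them are invented for illustration; the listing does not state the actual CreatedAt format, so the values are kept as plain strings. The helper names read_tweets and column_summary are hypothetical.

```python
import csv
import io


def read_tweets(handle):
    """Read rows with the two documented columns from an open CSV handle."""
    reader = csv.DictReader(handle)
    rows = []
    for row in reader:
        # Keep the text exactly as published (emojis and URLs included).
        rows.append({"TweetText": row["TweetText"], "CreatedAt": row["CreatedAt"]})
    return rows


def column_summary(rows, column):
    """Return (total non-empty values, unique non-empty values) for a column."""
    values = [r[column] for r in rows if r[column]]
    return len(values), len(set(values))


# Invented sample rows, used only so the sketch runs without the real file.
# The timestamp format here is an assumption, not taken from the dataset.
sample = io.StringIO(
    "TweetText,CreatedAt\n"
    '"Noticia de prueba https://example.com",2019-05-01 10:00:00\n'
    '"Otra noticia 📰",2019-05-01 11:30:00\n'
    '"Noticia de prueba https://example.com",2019-05-02 09:15:00\n'
)
rows = read_tweets(sample)
print(column_summary(rows, "TweetText"))
print(column_summary(rows, "CreatedAt"))
```

With the downloaded CSV, the same function could be called on a handle opened with encoding="utf-8" and newline="", assuming the file's header row uses the column names listed above.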

Distribution

The dataset is presented in a tabular format. The listing does not give a total row count. It reports 53,576 unique values for the TweetText column and 44,962 total values for the CreatedAt column, which gives a rough sense of the scale of the compiled tweets.

Usage

This dataset is ideal for text mining and natural language processing (NLP) applications. Specific use cases include:
  • Developing text analysis models.
  • Building co-occurrence networks of words.
  • Training natural language understanding (NLU) and natural language generation (NLG) models.
  • Analysing social media content and trends related to news.
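
The co-occurrence use case can be illustrated with a small sketch. The original network was reportedly built in Databricks with PySpark; the version below swaps in plain Python so it runs anywhere, and it is not the author's method. The tokenization rules (lowercasing, dropping URLs, keeping only runs of letters) and the function names are assumptions made for this example, and the sample tweets are invented.

```python
import re
from collections import Counter
from itertools import combinations

URL_PATTERN = re.compile(r"https?://\S+")
WORD_PATTERN = re.compile(r"[^\W\d_]+")  # runs of Unicode letters


def tokenize(text):
    """Lowercase a tweet, drop URLs, and return its words."""
    without_urls = URL_PATTERN.sub(" ", text.lower())
    return WORD_PATTERN.findall(without_urls)


def cooccurrence_counts(texts):
    """Count unordered word pairs that appear together in the same tweet."""
    counts = Counter()
    for text in texts:
        # Sorting the distinct words gives each pair one canonical order.
        words = sorted(set(tokenize(text)))
        counts.update(combinations(words, 2))
    return counts


# Invented example tweets, not taken from the dataset.
tweets = [
    "Gobierno anuncia reforma https://example.com/a",
    "Reforma del gobierno genera debate",
]
pairs = cooccurrence_counts(tweets)
print(pairs[("gobierno", "reforma")])
```

Each pair key is a tuple of two words in alphabetical order, and its count can serve as an edge weight in a word network. A real analysis would likely also filter Spanish stop words such as "del", which this sketch leaves in.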

Coverage

The dataset's geographic scope is Colombia, as it features tweets from a Colombian newspaper. The data collection began in 2019. The content is exclusively in Spanish.

License

CC BY-NC-SA

Who Can Use It

This dataset is suitable for a variety of users, including:
  • Data scientists and machine learning engineers working on NLP problems.
  • Academic researchers in linguistics, social sciences, or computational journalism.
  • Students learning about text mining and big data processing.
  • Anyone interested in analysing social media discourse from a specific regional news source.

Dataset Name Suggestions

  • Tweets from El Espectador
  • Colombian News Tweets Archive
  • El Espectador Tweet Compilation
  • Spanish Newspaper Social Media Data
  • El Espectador Daily Tweets

Attributes

Original Data Source: Tweets From El Espectador

Listing Stats

VIEWS

1

DOWNLOADS

0

LISTED

17/06/2025

REGION

GLOBAL

UDQS QUALITY (Universal Data Quality Score)

5 / 5

VERSION

1.0

Download Dataset in CSV Format