Opendatabay APP

Lusophone Market News Sentiment Analysis Dataset

Stock & Market Data

Tags and Keywords

Sentiment

Portuguese

Finance

Text

Market

Free

About

This dataset provides a benchmark for financial sentiment analysis in Brazilian Portuguese, helping to tailor language models to the nuances of the Lusophone economic market. It builds upon the established Financial Phrase Bank to offer a set of economic texts manually annotated for semantic orientation. By bridging the gap between English-centric datasets and the linguistic requirements of the Portuguese-speaking financial world, it supports the development of automated tools for market monitoring and investment sentiment tracking.

Columns

  • y: The sentiment label assigned by a consensus of human annotators, categorised as neutral (59% of records), positive (28%), or negative (the remaining share, roughly 13%).
  • text: The raw financial news snippet in its original English, as found in the source dataset.
  • text_pt: The manually validated translation of the news snippet into Brazilian Portuguese, designed for local sentiment analysis tasks.
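
As a rough illustration of how the three columns might be read and the label shares checked, the sketch below parses a tiny in-memory CSV with Python's standard library. The sample rows are invented for the example and are not taken from the dataset; the function names `read_rows` and `label_shares` are likewise hypothetical, and the label spellings are assumed to match the category names listed above.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for the real file: these rows are made up for
# illustration and do not come from the dataset.
SAMPLE_CSV = """y,text,text_pt
neutral,The company will publish its report on Friday.,A empresa publicará seu relatório na sexta-feira.
positive,Operating profit rose compared with last year.,O lucro operacional aumentou em relação ao ano passado.
negative,Sales fell during the third quarter.,As vendas caíram durante o terceiro trimestre.
neutral,The board has five members.,O conselho tem cinco membros.
"""

EXPECTED_COLUMNS = {"y", "text", "text_pt"}


def read_rows(handle):
    """Parse CSV text into a list of dicts after checking the header."""
    reader = csv.DictReader(handle)
    if set(reader.fieldnames or []) != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    return list(reader)


def label_shares(rows):
    """Return each label's share of the rows as a percentage."""
    counts = Counter(row["y"] for row in rows)
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


rows = read_rows(io.StringIO(SAMPLE_CSV))
shares = label_shares(rows)
```

To run the same check on the downloaded file, the in-memory handle could be swapped for `open("financial_phrase_bank_pt_br.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption, not something the listing states.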

Distribution

The data is provided in a CSV file titled financial_phrase_bank_pt_br.csv with a size of approximately 1.37 MB. It contains 4,845 valid records across three columns. The dataset demonstrates high integrity with 100% validity for the sentiment labels and text fields, with no missing or mismatched entries. This is a static archive, and no future updates are expected.
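
The validity claim above can be spot-checked after download. The following is a minimal sketch, assuming the labels are spelled exactly `neutral`, `positive` and `negative` and that a "missing" entry means an empty or whitespace-only field; `validity_report` is a hypothetical helper name and the rows are illustrative, with the third one deliberately broken.

```python
VALID_LABELS = {"neutral", "positive", "negative"}  # assumed label spellings


def validity_report(rows):
    """Count rows with an unknown label or an empty text field."""
    report = {"total": 0, "bad_label": 0, "missing_text": 0}
    for row in rows:
        report["total"] += 1
        if (row.get("y") or "").strip() not in VALID_LABELS:
            report["bad_label"] += 1
        text_en = (row.get("text") or "").strip()
        text_pt = (row.get("text_pt") or "").strip()
        if not text_en or not text_pt:
            report["missing_text"] += 1
    return report


# Illustrative rows only; the last one has a bad label and no translation.
rows = [
    {"y": "neutral", "text": "Sample sentence.", "text_pt": "Frase de exemplo."},
    {"y": "positive", "text": "Profit rose.", "text_pt": "O lucro aumentou."},
    {"y": "unknown", "text": "Sales fell.", "text_pt": ""},
]
report = validity_report(rows)
```

On the real file, a report with `bad_label` and `missing_text` both at zero would be consistent with the 100% validity figure quoted in the listing.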

Usage

This collection is specifically designed for fine-tuning natural language processing (NLP) models on sentiment analysis tasks within the finance domain. It serves as a valuable resource for training multi-class classification algorithms, evaluating machine translation quality for technical economic texts, and developing text-mining pipelines for Portuguese-speaking stock markets.
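
Because the classes are imbalanced, a multi-class training setup may benefit from a split that preserves each label's share. The sketch below shows one way to do a stratified train/test split using only the standard library; it is not a method prescribed by the dataset, `stratified_split` is a hypothetical helper, and the rows are synthetic placeholders arranged to mirror the approximate 59/28/13 label mix.

```python
import random
from collections import Counter, defaultdict


def stratified_split(rows, test_fraction=0.2, seed=0):
    """Split rows into train/test so each label keeps roughly its share."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row["y"]].append(row)

    rng = random.Random(seed)  # fixed seed for a reproducible shuffle
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Synthetic placeholder rows mirroring the approximate label proportions.
rows = (
    [{"y": "neutral", "text_pt": f"frase neutra {i}"} for i in range(59)]
    + [{"y": "positive", "text_pt": f"frase positiva {i}"} for i in range(28)]
    + [{"y": "negative", "text_pt": f"frase negativa {i}"} for i in range(13)]
)
train, test = stratified_split(rows, test_fraction=0.2)
test_counts = Counter(row["y"] for row in test)
```

Splitting per label rather than over the whole file keeps the smallest class (negative) represented in the held-out set, which matters when reporting per-class metrics.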

Coverage

The dataset covers 4,845 manually annotated financial news records. While the original English texts often reflect international or Finnish economic contexts, the translated content makes them usable for sentiment detection in Brazilian Portuguese. The labels are based on a consensus of expert annotators, providing a reliable standard for training machine learning models with frameworks such as PyTorch.

License

CC BY-SA 3.0

Who Can Use It

Data scientists can use these records to benchmark Portuguese sentiment analysis models or practice text classification techniques. Financial analysts can employ the dataset to build tools that monitor news trends for investment signals in the Brazilian market. Additionally, academic researchers in computational linguistics may utilise the parallel English-Portuguese texts to study the semantic nuances of economic discourse.

Dataset Name Suggestions

  • Brazilian Portuguese Financial Sentiment Benchmark
  • Portuguese Financial Phrase Bank Translation
  • Manually Validated Brazilian Economic Sentiment Records
  • Lusophone Market News Sentiment Analysis Dataset
  • Portuguese-English Parallel Financial Sentiment Corpus

Attributes

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 23/12/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
