French Stock Market Forecasting News Data

Stock & Market Data

Tags and Keywords

Business

News

Finance

Investing

French

Free

About

This dataset contains approximately 41,500 French financial news articles scraped from a prominent financial media website. It is intended to support analyses of the relationship between public sentiment and stock market movements. English translations and VADER sentiment scores are included for ease of use. The material shows a correlation between news sentiment and fluctuations in the CAC 40 index, which makes it useful for financial forecasting models.

Columns

The dataset, primarily provided in the FrenchNews.csv file, includes 20 attributes:
  • Numero news: The unique identifier for the news item, which is 100% valid.
  • Numero page: The numerical page reference from where the news was scraped, which is 100% valid.
  • Numero: The news item's specific number on that page, 100% valid.
  • Date: The date of the news publication. The most common date recorded is 26.03.2020.
  • Heure: The time of the news publication.
  • Titre: The original French title of the news article.
  • Contenu: The summary or content included in the headline, with a negligible share of missing values (under 1%).
  • Agency: The news agency source, with common examples being France 24 (32%) and Reuters (31%).
  • URL: The direct link to the news article, which is 100% valid.
  • textURL: Text embedded within the news URL itself, which has approximately 18% missing entries.
  • Nbr image: The count of images associated with the news URL, with a maximum of 18 images recorded.
  • seconds to 2010: The elapsed time in seconds between the news publication and 01/01/2010 00:00.
  • days to 2010: The elapsed time in days between the news publication and 01/01/2010 00:00.
  • dateDT: The publication date and time as a Python datetime value.
  • Title eng: The English translation of the news title, derived using Helsinki-NLP/opus-mt-fr-en.
  • Content eng: The English translation of the news content.
  • textURL eng: The English translation of the text inside the URL.
  • Sentiment Vader Title: The calculated VADER sentiment score for the news title.
  • Sentiment Vader Text: The calculated VADER sentiment score for the main content.
  • Sentiment Vader TextURL: The calculated VADER sentiment score for the text found in the URL.
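
The two elapsed-time columns can be reproduced from Date and Heure. The following is a minimal sketch, assuming Date uses the DD.MM.YYYY form seen in the example above (26.03.2020) and Heure is HH:MM. The time format, the fractional-day convention, and the function name elapsed_since_2010 are assumptions for illustration and are not confirmed by the listing.

```python
from datetime import datetime

# Reference instant used by the "seconds to 2010" and "days to 2010" columns.
REFERENCE = datetime(2010, 1, 1, 0, 0)


def elapsed_since_2010(date_str, time_str="00:00"):
    """Return (seconds, days) elapsed between 01/01/2010 00:00 and a timestamp.

    Assumes date_str is DD.MM.YYYY and time_str is HH:MM (hypothetical formats).
    """
    published = datetime.strptime(f"{date_str} {time_str}", "%d.%m.%Y %H:%M")
    seconds = (published - REFERENCE).total_seconds()
    days = seconds / 86400  # fractional days; the dataset may round differently
    return seconds, days
```

Comparing the output against a few rows of seconds to 2010 and days to 2010 is a quick way to check whether these format assumptions hold for the actual file.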

Distribution

The data is provided primarily in the FrenchNews.csv file, which is 120.12 MB in size and holds approximately 41,500 records. Data quality is high: essential metadata such as Numero news and Date, as well as all sentiment scores, are 100% valid. A post-processed file, FrenchNewsDayConcat.csv, is also included; it is sampled by day for comparison with the CAC 40 index.
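
A per-day view of the article-level file can be approximated by averaging a sentiment column for each calendar day. The sketch below is one possible approach and is not necessarily how FrenchNewsDayConcat.csv was produced. It assumes rows are dicts keyed by column name (for example from csv.DictReader) and that Date is in DD.MM.YYYY form; the function name daily_mean_sentiment is hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_mean_sentiment(rows, score_column="Sentiment Vader Title"):
    """Average one sentiment column per calendar day.

    rows: iterable of dicts keyed by column name, e.g. from csv.DictReader.
    Assumes Date is DD.MM.YYYY; rows with an empty score are skipped.
    """
    by_day = defaultdict(list)
    for row in rows:
        raw = row.get(score_column)
        if raw is None or raw == "":
            continue
        day = datetime.strptime(row["Date"], "%d.%m.%Y").date()
        by_day[day].append(float(raw))
    # Return the daily means in chronological order.
    return {day: mean(scores) for day, scores in sorted(by_day.items())}
```

The same function can be applied to Sentiment Vader Text or Sentiment Vader TextURL by changing score_column.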

Usage

This resource is well suited to Natural Language Processing (NLP) challenges, especially those related to financial forecasting. It supports building models to predict the CAC 40 index, for example the next day's opening price. Users can also compare the sentiment distribution across news titles, text content, and URLs, noting that titles are often more dramatic to attract readership. In addition, the news text can be used to identify the main subjects trending in the financial market over time.
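
For the next-day prediction task, each day's sentiment has to be aligned with the opening price of the following trading day. The sketch below shows one way to build such pairs. It assumes the user supplies CAC 40 opening prices separately, since the column list for FrenchNews.csv does not include index prices. The function name next_day_pairs and the toy values are hypothetical.

```python
from bisect import bisect_right
from datetime import date


def next_day_pairs(daily_sentiment, daily_open):
    """Pair each day's sentiment with the next trading day's opening price.

    daily_sentiment: dict mapping date -> mean sentiment score.
    daily_open: dict mapping date -> CAC 40 opening price (user-supplied).
    Returns a list of (day, sentiment, next_open) tuples.
    """
    trading_days = sorted(daily_open)
    pairs = []
    for day in sorted(daily_sentiment):
        # Index of the first trading day strictly after the news day,
        # which skips weekends and holidays absent from daily_open.
        idx = bisect_right(trading_days, day)
        if idx < len(trading_days):
            pairs.append((day, daily_sentiment[day], daily_open[trading_days[idx]]))
    return pairs


# Toy values for illustration only, not real sentiment or CAC 40 data.
sentiment = {date(2020, 3, 26): 0.10, date(2020, 3, 27): -0.20}  # Thu, Fri
opens = {date(2020, 3, 27): 100.0, date(2020, 3, 30): 98.0}      # Fri, Mon
pairs = next_day_pairs(sentiment, opens)
```

Pairing on the next available trading day rather than the next calendar day keeps Friday and weekend news usable, at the cost of mixing one-day and multi-day horizons.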

Coverage

The material focuses on French financial news from November 2018 to March 2021. This timeframe includes major global events such as the COVID-19 crisis (March 2020) and the announcement of the Pfizer vaccine (November 2020), both of which visibly affected the recorded news sentiment and the CAC 40. The data is suitable for tracking the effects of such events on market opinion.

License

CC0: Public Domain

Who Can Use It

The dataset is intended for data scientists, financial analysts, and NLP researchers interested in the connection between news text and stock market prediction. It is useful for academics and practitioners working on financial modelling and sentiment analysis. The usability rating for this resource is 10.00.

Dataset Name Suggestions

  • French Financial News Sentiment and CAC 40 Prediction
  • French Stock Market Forecasting News Data
  • Financial Media Scraped News (2018-2021)

Listing Stats

  • Views: 1
  • Downloads: 1
  • Listed: 17/12/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
