Gold Commodity News Sentiment Analysis Dataset

Tags and Keywords

News

NLP

Multiclass

Free

About

This dataset targets the commodity market and contains over 10,000 manually annotated news headlines about gold. The headlines were collected from various news sources, span more than 20 years (2000 to 2021), and were evaluated by three subject experts. Each headline is assessed across several dimensions: implied price direction (up, down, or constant), whether it discusses past or future events, and whether it compares assets.

The annotations are intended for training machine learning models that interpret commodity news. The output of such models can serve as an additional input to short-term and long-term price forecasting models, or as the basis for news-based commodity indicators. The dataset also suits research on text analytics and classification, with one caveat: some classes are highly imbalanced, which can make them difficult for machine learning algorithms to learn.

Columns

The dataset includes the following columns:
  • Dates: The date of the news headline.
  • URL: The URL where the news headline was published.
  • News: The actual news headline text.
  • Price Direction Up: Binary indicator (1 = yes, 0 = no) of whether the headline suggests a price increase.
  • Price Direction Constant: Binary indicator (1 = yes, 0 = no) of whether the headline suggests a stable price (no change).
  • Price Direction Down: Binary indicator (1 = yes, 0 = no) of whether the headline suggests a price decrease.
  • Asset Comparison: Binary indicator (1 = yes, 0 = no) of whether the headline compares different assets.
  • Past Information: Binary indicator (1 = yes, 0 = no) of whether the headline refers to past events.
  • Future Information: Binary indicator (1 = yes, 0 = no) of whether the headline refers to future events.
  • Price Sentiment: The overall sentiment toward the gold price implied by the headline, categorised as positive, negative, or other.
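The three price direction columns are separate binary flags, so a common first step is to collapse them into a single label per headline. The sketch below shows one way to do that with the Python standard library. It assumes the CSV header names match the column names listed above, which may not hold for the actual file. The sample rows are invented for illustration and include only the columns the example needs, and the "none" and "mixed" labels are hypothetical names for headlines with no flag or several flags set.

```python
import csv
import io

# Hypothetical sample: header names follow the column list above;
# the headlines and flag values are invented, not taken from the dataset.
SAMPLE_CSV = """News,Price Direction Up,Price Direction Constant,Price Direction Down
Gold climbs on safe-haven demand,1,0,0
Gold slips as dollar firms,0,0,1
Gold versus silver: a look at the ratio,0,0,0
"""

FLAG_TO_LABEL = {
    "Price Direction Up": "up",
    "Price Direction Constant": "constant",
    "Price Direction Down": "down",
}


def direction_label(row):
    """Collapse the three binary direction flags into one label."""
    active = [label for col, label in FLAG_TO_LABEL.items() if int(row[col]) == 1]
    if not active:
        return "none"   # no direction implied (hypothetical label)
    if len(active) > 1:
        return "mixed"  # more than one flag set (hypothetical label)
    return active[0]


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(SAMPLE_CSV)
labels = [direction_label(row) for row in rows]
for row, label in zip(rows, labels):
    print(f"{label:8s} {row['News']}")
```

For the real file, `load_rows` could be replaced by opening the downloaded CSV with `csv.DictReader` directly, after checking the actual header names.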

Distribution

The dataset contains over 10,000 unique news headlines and corresponding metadata. Data files are typically provided in CSV format. Key distribution statistics for some dimensions are as follows:
  • Dates: 3,761 unique values.
  • URL: 10,570 unique values.
  • News: 10,570 unique values.
  • Price Direction Up: 6,158 headlines do not imply up, 4,412 imply up.
  • Price Direction Constant: 10,126 headlines do not imply constant, 444 imply constant.
  • Price Direction Down: 6,658 headlines do not imply down, 3,912 imply down.
  • Asset Comparison: 8,569 headlines do not compare assets, 2,001 compare assets.
  • Past Information: 318 headlines do not discuss past information, 10,252 discuss past information.
  • Future Information: 10,251 headlines do not discuss future information, 319 discuss future information.
  • Price Sentiment: Approximately 42% positive, 36% negative, and 22% other sentiment.
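To make the imbalance concrete, the counts above can be turned into positive-class rates and inverse-frequency class weights. This is a minimal sketch using only the figures listed in this section. The weighting formula, total / (number of classes × class count), is one common heuristic and the same one scikit-learn uses for `class_weight="balanced"`. It is offered as an illustration, not as a method recommended by the dataset authors.

```python
TOTAL = 10_570

# (count of 0, count of 1) for each binary column, as listed above.
COUNTS = {
    "Price Direction Up": (6_158, 4_412),
    "Price Direction Constant": (10_126, 444),
    "Price Direction Down": (6_658, 3_912),
    "Asset Comparison": (8_569, 2_001),
    "Past Information": (318, 10_252),
    "Future Information": (10_251, 319),
}


def summarize(neg, pos):
    """Return (positive rate, weight for class 0, weight for class 1)."""
    total = neg + pos
    positive_rate = pos / total
    # Inverse-frequency ("balanced") weights: total / (n_classes * class_count)
    w_neg = total / (2 * neg)
    w_pos = total / (2 * pos)
    return positive_rate, w_neg, w_pos


summary = {name: summarize(*counts) for name, counts in COUNTS.items()}
for name, (rate, w_neg, w_pos) in summary.items():
    print(f"{name:26s} positive={rate:6.1%}  w0={w_neg:.2f}  w1={w_pos:.2f}")
```

The rarest positive classes (Price Direction Constant and Future Information) get the largest weights, which is where resampling or weighted losses are most likely to matter.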

Usage

This dataset is ideally suited for:
  • Developing machine learning models that understand commodity news for price forecasting.
  • Creating news-based indicators for commodity markets.
  • Evaluating text classification models in the context of news analytics.
  • Research into the impact of news on commodity market volatility.
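As an illustration of the news-based indicator use case, the sketch below aggregates headline-level direction flags into a daily net direction score, (up − down) / number of headlines. Both the indicator definition and the sample records are assumptions made for this example; the dataset itself does not prescribe any particular indicator.

```python
from collections import defaultdict

# Hypothetical records: (date, Price Direction Up, Price Direction Down).
# Values are invented for illustration.
records = [
    ("2020-01-02", 1, 0),
    ("2020-01-02", 1, 0),
    ("2020-01-02", 0, 1),
    ("2020-01-03", 0, 1),
    ("2020-01-03", 0, 0),
]


def daily_net_direction(records):
    """Return {date: (up - down) / headline count} for each date."""
    totals = defaultdict(lambda: [0, 0, 0])  # up, down, headline count
    for date, up, down in records:
        entry = totals[date]
        entry[0] += up
        entry[1] += down
        entry[2] += 1
    return {
        date: (up - down) / count
        for date, (up, down, count) in sorted(totals.items())
    }


indicator = daily_net_direction(records)
for date, score in indicator.items():
    print(f"{date}  net direction = {score:+.2f}")
```

The score ranges from -1 (every headline implies a price decrease) to +1 (every headline implies an increase). Days with few headlines produce noisy values, so smoothing over a window may be worth considering.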

Coverage

Coverage is global. The headlines span more than 20 years, from 2000 to 2021. No demographic notes apply beyond the focus on gold commodity news.

License

CC-BY-NC

Who Can Use It

This dataset is primarily intended for:
  • Researchers and practitioners specialising in news analytics for commodities, who can leverage it for building predictive models.
  • Data scientists and machine learning engineers working on text classification and natural language processing tasks, especially those dealing with imbalanced datasets.
  • Financial analysts and market strategists interested in incorporating news sentiment into their commodity market analysis.

Dataset Name Suggestions

  • Gold News Sentiment Analysis Dataset
  • Commodity Market News Classifier
  • Financial News Headline Sentiment
  • Gold Price Direction News Data
  • Annotated Commodity News for ML

Attributes

Listing Stats

  • Views: 3
  • Downloads: 0
  • Listed: 08/06/2025
  • Region: Global
  • UDQS Quality (Universal Data Quality Score): 5 / 5
  • Version: 1.0
