Opendatabay APP

IBM Article Metadata and Usage

Data Science and Analytics

Tags and Keywords

Interaction

Article

IBM

Recommender

News

Free

About

This dataset combines article information with records of user interactions. It is divided into two files: one containing the full content and metadata for all available articles, and a second tracking user actions related to those articles. Pairing content with interactions makes the dataset suitable for building, training, and validating content-based or collaborative filtering recommender systems.

Columns

The primary file, articles_community.csv, consists of six columns defining the article properties:
  • Sr. no.: A sequential or serial identification number.
  • article_id: The unique identifier assigned to each published item.
  • doc_full_name: The title or designated name of the article.
  • doc_description: A short summary or description of the article's subject matter.
  • doc_body: The main textual content or body of the article.
  • doc_status: The current publishing status of the content on the host website, commonly listed as 'Live'.
Note: Details regarding the column structure of the second file, user-item-interactions.csv, are not specified.
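
As a minimal sketch of how the column layout above could be checked before use, the Python snippet below reads a CSV and confirms the listed article columns are present. It assumes the header names in the file match the names in this listing exactly, which is not confirmed here. The serial-number column is left out of the check because its exact header is uncertain. The function name load_articles and the one-row sample are hypothetical and exist only for illustration.

```python
import csv
import io

# Column names as given in the listing. The serial-number column is omitted
# because its exact header in the real file is an assumption.
EXPECTED_COLUMNS = [
    "article_id",
    "doc_full_name",
    "doc_description",
    "doc_body",
    "doc_status",
]


def load_articles(handle):
    """Read article rows from an open CSV handle and check the header."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return list(reader)


# Made-up single-row sample, not taken from the dataset.
SAMPLE = io.StringIO(
    "Sr. no.,article_id,doc_full_name,doc_description,doc_body,doc_status\n"
    "0,1,Example title,Example summary,Example body text,Live\n"
)
sample_rows = load_articles(SAMPLE)
```

With the downloaded file, the same function could be called on a handle opened with open("articles_community.csv", newline="", encoding="utf-8"); the UTF-8 encoding is likewise an assumption.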

Distribution

The dataset consists of two CSV files: articles_community.csv (approximately 9.28 MB) and user-item-interactions.csv. The articles file contains 1056 valid records. The expected update frequency is listed as never.

Usage

The dataset is suited to a range of analytical and development purposes, including:
  • Developing recommendation algorithms to suggest relevant articles to users.
  • Performing natural language processing (NLP) tasks on technical documentation and news media.
  • Analysing user interaction patterns and metrics related to content consumption.
  • Training predictive models based on large volumes of textual data.
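
To illustrate the first use case, here is a minimal sketch of a content-based recommender in plain Python: it weights terms with TF-IDF and ranks articles by cosine similarity. This is one common approach, not a method prescribed by the dataset. The function names, the smoothed IDF formula, and the toy documents are all assumptions made for illustration. In practice the input strings would come from a text column such as doc_description or doc_body.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (raw TF times smoothed IDF)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    return [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(docs, index, top_n=3):
    """Return indices of the top_n documents most similar to docs[index]."""
    vectors = tfidf_vectors(docs)
    scored = [
        (cosine(vectors[index], vec), i)
        for i, vec in enumerate(vectors)
        if i != index
    ]
    # Highest score first; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:top_n]]


# Toy documents, not taken from the dataset.
TOY_DOCS = [
    "deep learning with neural networks",
    "neural networks for image recognition",
    "cooking pasta at home",
]
```

Calling most_similar(TOY_DOCS, 0, top_n=1) ranks the second toy document first, since it is the only one sharing terms with the first. A collaborative filtering approach would instead rely on user-item-interactions.csv, whose columns are not specified in this listing.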

Coverage

The data focuses exclusively on articles originating from IBM. The topics covered fall under the general categories of News, Literature, and technology-focused content. Specific geographic boundaries, precise time frames, or demographic information regarding the interacting users are not documented within the current metadata.

License

CC0: Public Domain

Who Can Use It

  • Machine Learning Engineers: To create and benchmark recommender systems and predictive models based on user behaviour.
  • Data Scientists: For text mining, NLP exploration, and analysing content distribution.
  • Academic Researchers: To study user engagement dynamics in technical literature ecosystems.

Dataset Name Suggestions

  • IBM User Interaction Logs
  • Technical Content Recommender Data
  • IBM Article Metadata and Usage
  • Literature Engagement Dataset

Attributes

Original Data Source: IBM Article Metadata and Usage

Listing Stats

VIEWS

1

DOWNLOADS

0

LISTED

28/11/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0
