Opendatabay APP

Multiclass Poem Classification Data

Environmental Monitoring

Tags and Keywords

Classification

Literature

NLP

People

Popular

Genre

Poetry

Free

About

This dataset is designed for the classification of poems into distinct genres using Natural Language Processing (NLP) methods. It provides a structured collection of poetic texts, each categorised into one of four primary genres: Affection, Environment, Music, and Death. The primary purpose of this dataset is to facilitate the development and evaluation of machine learning models capable of accurately identifying a poem's underlying thematic classification. It serves as a valuable resource for researchers and developers working on text classification tasks within the domain of literature and AI.

Columns

  • Genre: This column specifies the assigned genre for each poem. Possible values include 'Affection', 'Environment', 'Music', and 'Death'.
  • Poem: This column contains the full text of the poem.
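The two-column layout above can be checked at load time. The following is a minimal sketch using only the Python standard library; it assumes the CSV header uses exactly the names Genre and Poem as described, and the sample rows are made up for illustration rather than taken from the dataset.

```python
import csv
import io

# Genre labels named in the column description above.
ALLOWED_GENRES = {"Affection", "Environment", "Music", "Death"}


def load_poems(csv_file):
    """Read (genre, poem) pairs from an open CSV file with Genre and Poem columns."""
    reader = csv.DictReader(csv_file)
    missing = {"Genre", "Poem"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing expected columns: {sorted(missing)}")
    rows = []
    for record_no, row in enumerate(reader, start=1):
        genre = (row["Genre"] or "").strip()
        poem = (row["Poem"] or "").strip()
        if genre not in ALLOWED_GENRES:
            raise ValueError(f"record {record_no}: unexpected genre {genre!r}")
        if not poem:
            raise ValueError(f"record {record_no}: empty poem text")
        rows.append((genre, poem))
    return rows


# Made-up sample rows for illustration only; not taken from the dataset.
SAMPLE_CSV = """Genre,Poem
Music,"The fiddle hums, the floorboards keep the time"
Death,"A quiet room, a candle burning low"
"""

poems = load_poems(io.StringIO(SAMPLE_CSV))
```

With a real download, the same function could be called on a file opened with `open(path, newline="", encoding="utf-8")`; the path and encoding are assumptions, since the listing does not state a file name.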

Distribution

The data files are typically provided in CSV format. The dataset contains 150 unique poems; an exact row count is not stated. The listing reports the genre distribution as approximately Affection (67%), Environment (17%), and Other (17%). The Other bucket presumably covers the remaining two genres, Music and Death, and because the percentages are rounded they sum to 101%.
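To see how the rounded percentages relate to 150 poems, the sketch below computes per-label shares and the accuracy of always predicting the most common label. The counts of 100, 25, and 25 are hypothetical: they are chosen only because they reproduce the rounded figures quoted here, and the real per-genre counts are not stated in the listing.

```python
from collections import Counter


def genre_shares(labels):
    """Return each label's share of the total as a whole-number percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total) for label, n in counts.items()}


# Hypothetical counts that happen to match the rounded figures for 150 poems.
labels = ["Affection"] * 100 + ["Environment"] * 25 + ["Other"] * 25

shares = genre_shares(labels)

# Accuracy of a trivial model that always predicts the most common label.
majority_label, majority_count = Counter(labels).most_common(1)[0]
majority_baseline = majority_count / len(labels)
```

Under these assumed counts the majority label covers about two thirds of the poems, so a trained classifier would need to exceed roughly that accuracy to be learning anything beyond the class imbalance.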

Usage

This dataset is ideal for training and testing machine learning models focused on text classification, particularly for literary analysis and natural language processing applications. It can be used for:
  • Developing AI models to automatically categorise poetry.
  • Research into stylistic differences across poetic genres.
  • Creating content recommendation systems based on genre.
  • Educational purposes to demonstrate NLP principles.
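As one illustration of the first use case, the sketch below trains a small multinomial Naive Bayes classifier over word counts with add-one smoothing, using only the Python standard library. Naive Bayes is chosen here as a simple baseline, not because the listing prescribes it; the class name, the tokenizer, and the training poems are all made up for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesGenreClassifier:
    """Multinomial Naive Bayes over word counts with add-one (Laplace) smoothing."""

    def fit(self, poems, genres):
        self.word_counts = defaultdict(Counter)  # genre -> word -> count
        self.genre_counts = Counter(genres)      # genre -> number of poems
        self.vocabulary = set()
        for poem, genre in zip(poems, genres):
            tokens = tokenize(poem)
            self.word_counts[genre].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, poem):
        total_poems = sum(self.genre_counts.values())
        vocab_size = len(self.vocabulary)
        # Words never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(poem) if t in self.vocabulary]
        best_genre, best_score = None, -math.inf
        for genre in sorted(self.genre_counts):  # sorted for stable tie-breaking
            counts = self.word_counts[genre]
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.genre_counts[genre] / total_poems)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_genre, best_score = genre, score
        return best_genre


# Made-up training lines for illustration only; not taken from the dataset.
train = [
    ("Affection", "my heart is warm with love for you"),
    ("Affection", "your tender kiss, my dearest love"),
    ("Environment", "the river runs through forest and green hills"),
    ("Environment", "rain falls soft on the forest leaves"),
    ("Music", "the drum and the violin sing a song"),
    ("Music", "a song rises from the old guitar"),
    ("Death", "the grave is cold and silent"),
    ("Death", "he sleeps in the silent grave tonight"),
]
model = NaiveBayesGenreClassifier().fit(
    [poem for _, poem in train], [genre for genre, _ in train]
)

prediction_love = model.predict("a love so warm and tender")
prediction_grave = model.predict("the silent grave")
```

With only 150 poems and about two thirds of them in one genre, any model of this kind should be evaluated on held-out poems and compared against always predicting the majority genre.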

Coverage

The dataset is listed with global coverage and carries no geographical constraints. Time ranges and demographic information for the poems are not provided.

License

CC0

Who Can Use It

This dataset is suitable for a diverse range of users, including:
  • Data Scientists and Machine Learning Engineers: To build, train, and evaluate NLP models for text classification.
  • Academic Researchers: For studies in computational linguistics, literary analysis, and AI applications in humanities.
  • Students: As a practical resource for learning about NLP, machine learning, and data analysis projects.
  • Content Platforms: To enhance content organisation and discovery for poetry collections.

Dataset Name Suggestions

  • Poetry Genre Classification Dataset
  • NLP Poem Genre Classifier
  • Multiclass Poem Classification Data
  • Literary Genre NLP Dataset
  • Poem Theme Classification Dataset

Attributes

Original Data Source: Poem Classification (NLP)

Listing Stats

Views: 1
Downloads: 0
Listed: 08/06/2025
Region: Global
Universal Data Quality Score (UDQS): 5 / 5
Version: 1.0