Opendatabay APP

Tagore's Collected Writings Dataset

Entertainment & Media Consumption

Tags and Keywords

Arts

Entertainment

Text

Literature

Nlp

Languages


Free

About

This dataset provides a large collection of the literary works of Rabindranath Tagore, a pivotal figure in early 20th-century Indian arts. Tagore, born in Calcutta in 1861, was a distinguished Bengali poet, short-story writer, song composer, playwright, essayist, and painter. He revolutionised Bengali literature by introducing new prose and verse forms and adopting colloquial language, moving away from traditional Sanskrit-based models. He was also instrumental in fostering cultural exchange between India and the West, and in 1913 he became the first non-European to win the Nobel Prize in Literature. The dataset was created as a detailed and accessible resource for analytical tasks, such as literary and statistical study, and for generative tasks, such as text generation in his style.

Columns

The dataset is structured across several files, with all_collection.csv serving as a merged file. The CSV files generally include the following columns:
  • name: The title of the individual literary work.
  • collection: The name of the collection the work belongs to.
  • genre: The literary genre of the work (e.g., Drama, Essay, Novel, Poem, Song, Story, Miscellaneous).
  • content: The full text of the literary work.
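The column layout above can be checked while reading a file. The following is a minimal sketch using only the Python standard library. The function name read_works and the in-memory sample are illustrative assumptions, not part of the dataset, and the real files may need a different encoding than the UTF-8 assumed in the usage note.

```python
import csv
import io

# Column names as described in the listing.
EXPECTED_COLUMNS = ("name", "collection", "genre", "content")


def read_works(handle):
    """Yield one dict per literary work from an open CSV file handle.

    Hypothetical helper: raises ValueError if a listed column is absent.
    """
    # Full-text fields can exceed the csv module's default field size limit.
    csv.field_size_limit(10**9)
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    for row in reader:
        yield {c: row[c] for c in EXPECTED_COLUMNS}


# Tiny in-memory stand-in for a real file; the values are made up.
sample = io.StringIO(
    "name,collection,genre,content\n"
    'Sample Title,Sample Collection,Poem,"line one\nline two"\n'
)
works = list(read_works(sample))
```

With the downloaded files, the same function could be called as read_works(open("all_collection.csv", newline="", encoding="utf-8")), assuming UTF-8 encoding. Passing newline="" lets the csv module handle line breaks inside the content field.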

Distribution

The dataset is available in two primary formats: 8 CSV files and 7 TXT files. The CSV files, such as drama.csv and novel.csv, contain individual literary items organised by genre. The all_collection.csv file consolidates the works from all genres and contains 3,438 individual items. The TXT files, such as drama.txt and poem.txt, each contain the aggregated works of a single genre. All content has undergone a basic preprocessing step that removes empty spaces, in-page titles, and page numbers, giving cleaner text for analysis.
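A quick sanity check after loading the merged file is to count items per genre and compare the total with the 3,438 items stated above. This is a minimal sketch: genre_counts is a hypothetical helper, and the three rows are made-up placeholders standing in for rows read from all_collection.csv.

```python
from collections import Counter


def genre_counts(rows):
    """Count works per genre from an iterable of row dicts with a 'genre' key."""
    return Counter(row["genre"].strip() for row in rows)


# Placeholder rows; real rows would come from all_collection.csv.
rows = [
    {"genre": "Poem"},
    {"genre": "Poem"},
    {"genre": "Drama"},
]
counts = genre_counts(rows)
total = sum(counts.values())
# For the real merged file, total would be expected to equal 3438.
```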

Usage

The two formats suit different tasks. The TXT files are well suited to training sequential models, for example natural language processing (NLP) models that generate text in the style of Rabindranath Tagore. The CSV files are better suited to literary analysis, comparative studies of themes, and statistical examination of Tagore's diverse writings. Together they give researchers and enthusiasts a resource for exploring Bengali literature with modern computational techniques.
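One common way to prepare a genre TXT file for a sequential model is to cut it into fixed-length character windows, each paired with the character that follows it. The sketch below shows that idea under stated assumptions: the function names, window length, and toy string are illustrative only, and the listing does not prescribe any particular preprocessing or model.

```python
def build_vocab(text):
    """Map each distinct character to an integer id, in sorted order."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}


def make_windows(text, seq_len, step=1):
    """Return (vocab, inputs, targets) for next-character prediction.

    Each input is seq_len character ids; its target is the id of the
    character that immediately follows the window.
    """
    vocab = build_vocab(text)
    ids = [vocab[ch] for ch in text]
    inputs, targets = [], []
    for start in range(0, len(ids) - seq_len, step):
        inputs.append(ids[start:start + seq_len])
        targets.append(ids[start + seq_len])
    return vocab, inputs, targets


# Toy string standing in for the contents of a file such as poem.txt.
vocab, inputs, targets = make_windows("abcab", seq_len=2)
```

For a real run, the text would be read from one of the TXT files and seq_len set much larger. The resulting id lists can then be fed to whichever sequence model is being trained.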

Coverage

The works included span the creative lifetime of Rabindranath Tagore, who was born in Calcutta, India, on 7 May 1861 and died there on 7 August 1941. He helped introduce Indian culture to the West and Western culture to India. The dataset's content is rooted in Bengali literature and reflects early 20th-century Indian artistic and intellectual movements.

License

CC0

Who Can Use It

  • Machine Learning Enthusiasts: For developing and training text generation models or other sequential models based on literary styles.
  • Literary Experts and Scholars: For in-depth literary analysis, comparative studies, and statistical research on themes present in Tagore's works.
  • Cultural Researchers: To study the historical and cultural context of Bengali and Indian literature.
  • Data Scientists: For projects involving text data analysis, content mining, and pattern recognition in large text corpora.

Dataset Name Suggestions

  • Rabindranath Tagore Literary Works
  • Tagore's Collected Writings Dataset
  • Bengali Literary Corpus: Rabindranath Tagore
  • The Works of Rabindranath Tagore
  • Tagore Text Collection

Attributes

Listing Stats

VIEWS

0

DOWNLOADS

0

LISTED

24/06/2025

REGION

ASIA

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0