Tagore's Collected Writings Dataset
Entertainment & Media Consumption
Free
About
This dataset provides a vast collection of the literary works of Rabindranath Tagore, a pivotal figure in early 20th-century Indian arts. Tagore, born in Calcutta in 1861, was a distinguished Bengali poet, short-story writer, song composer, playwright, essayist, and painter. He revolutionised Bengali literature by introducing new prose and verse forms and adopting colloquial language, moving away from traditional Sanskrit-based models. He was also instrumental in fostering cultural exchange between India and the West, becoming the first non-European Nobel Laureate for Literature in 1913. This dataset was created to offer a detailed and accessible resource for various analytical and generative tasks, enabling deeper exploration of his diverse creations.
Columns
The dataset is structured across several files, with all_collection.csv serving as a merged file. The CSV files generally include the following columns:
- name: The title of the individual literary work.
- collection: The name of the collection the work belongs to.
- genre: The literary genre of the work (e.g., Drama, Essay, Novel, Poem, Song, Story, Miscellaneous).
- content: The full text of the literary work.
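As a minimal sketch of how the columns above might be read and summarised, the following uses only the Python standard library. The local path, the UTF-8 encoding, and the helper names `load_works` and `count_by_genre` are assumptions made for illustration; they are not part of the dataset.

```python
import csv
from collections import Counter
from pathlib import Path

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = {"name", "collection", "genre", "content"}


def load_works(csv_path):
    """Read one of the dataset's CSV files into a list of dicts (hypothetical helper)."""
    # The content column holds full texts, so raise the csv module's
    # default per-field size limit before reading.
    csv.field_size_limit(10**9)
    # UTF-8 is an assumption; adjust if the files use another encoding.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        return list(reader)


def count_by_genre(works):
    """Count how many works fall under each value of the genre column."""
    return Counter(row["genre"] for row in works)


if __name__ == "__main__":
    path = Path("all_collection.csv")  # assumed local copy of the merged file
    if path.exists():
        works = load_works(path)
        print(len(works), "works")
        for genre, count in count_by_genre(works).most_common():
            print(genre, count)
```

The column check fails early if a file departs from the layout described here, which the description allows for by saying the files "generally" share these columns.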
Distribution
The dataset is available in two primary formats: 8 CSV files and 7 TXT files. The CSV files, such as drama.csv and novel.csv, contain individual literary items organised by genre. The all_collection.csv file consolidates all works across genres, accounting for 3438 individual items. The TXT files, like drama.txt and poem.txt, contain aggregated works within their respective genres. All content has undergone a basic preprocessing step to remove empty spaces, in-page titles, and page numbers, ensuring clean data for analysis.
Usage
The TXT files are suited to training sequence models, such as NLP models that generate text in the style of Rabindranath Tagore. The CSV files are suited to literary analysis, comparative studies of themes, and statistical examination of Tagore's writings. Together they give researchers and enthusiasts a resource for exploring Bengali literature with computational techniques.
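To illustrate how one of the TXT files could be prepared for a sequence model, here is a rough character-level sketch: it maps each character to an integer id and slices the text into fixed-length input windows, each paired with the character that follows. The file name, the window length, and the function names are assumptions chosen for the example, and character-level modelling is only one of several possible approaches.

```python
from pathlib import Path


def build_char_vocab(text):
    """Map each distinct character to an integer id, in sorted order."""
    return {ch: idx for idx, ch in enumerate(sorted(set(text)))}


def make_training_pairs(text, vocab, seq_len):
    """Return (input_ids, target_id) pairs for next-character prediction."""
    ids = [vocab[ch] for ch in text]
    pairs = []
    for start in range(len(ids) - seq_len):
        window = ids[start:start + seq_len]   # seq_len consecutive characters
        target = ids[start + seq_len]         # the character that follows them
        pairs.append((window, target))
    return pairs


if __name__ == "__main__":
    path = Path("poem.txt")  # assumed local copy of one genre file
    if path.exists():
        # UTF-8 is an assumption about the file encoding.
        text = path.read_text(encoding="utf-8")
        vocab = build_char_vocab(text)
        pairs = make_training_pairs(text, vocab, seq_len=100)
        print(len(vocab), "distinct characters,", len(pairs), "training pairs")
```

Python strings iterate over Unicode code points, so if the text is in Bengali script, vowel signs and other combining marks are counted as separate symbols. A subword or grapheme-level tokeniser may be a better fit, depending on the model.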
Coverage
The works included span the creative lifetime of Rabindranath Tagore, who was born in Calcutta, India, on 7 May 1861, and passed away in Calcutta on 7 August 1941. His influence extended to introducing Indian culture to the West and vice versa. The dataset's content is rooted in Bengali literature, reflecting early 20th-century Indian artistic and intellectual movements.
License
CC0
Who Can Use It
- Machine Learning Enthusiasts: For developing and training text generation models or other sequential models based on literary styles.
- Literary Experts and Scholars: For in-depth literary analysis, comparative studies, and statistical research on themes present in Tagore's works.
- Cultural Researchers: To study the historical and cultural context of Bengali and Indian literature.
- Data Scientists: For projects involving text data analysis, content mining, and pattern recognition in large text corpora.
Dataset Name Suggestions
- Rabindranath Tagore Literary Works
- Tagore's Collected Writings Dataset
- Bengali Literary Corpus: Rabindranath Tagore
- The Works of Rabindranath Tagore
- Tagore Text Collection
Attributes
Original Data Source: Complete Works of Rabindranath Tagore