Opendatabay APP

British Literary Phrases Dataset

Knowledge Bundles

Tags and Keywords

Text, Literature, Psychology, Classification, NLP


Free

About

This dataset is a curated collection of labelled sentences from renowned British authors, spanning the 14th to the 21st centuries. The text was extracted sentence by sentence using Natural Language Processing (NLP) techniques. Each entry is labelled with the writer's name, the title of the book, and the century in which it was written. A unique index number is assigned to each writer, which facilitates the analysis of sequential phrase patterns. The dataset contains no missing values and includes works by authors such as William Shakespeare, Charles Dickens, Virginia Woolf, Jane Austen, and J. K. Rowling. It is intended as a resource for analytical and model-building tasks.

Columns

The dataset comprises four primary columns:
  • Sentence: A complete sentence extracted from a literary work.
  • Name of writer: The name of the British author who wrote the sentence.
  • Name of Book: The title of the book from which the sentence originates.
  • Century: The century in which the literary work was written.

All columns are combined in a single table, with no missing data.
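As a minimal sketch of how the four columns might be read and checked for completeness, the snippet below uses Python's standard csv module. The header names are assumed to match the column names listed above, the century format ("19th") is a guess, and the two sample rows are illustrative stand-ins rather than rows taken from the dataset; the real file may differ on all three points.

```python
import csv
import io

# Column names assumed from the listing; the real CSV header may differ.
EXPECTED_COLUMNS = ["Sentence", "Name of writer", "Name of Book", "Century"]

# In-memory stand-in for the CSV file. These rows are illustrative only
# and the "19th"/"20th" century format is an assumption.
SAMPLE_CSV = """Sentence,Name of writer,Name of Book,Century
"It was the best of times, it was the worst of times.",Charles Dickens,A Tale of Two Cities,19th
"Mrs. Dalloway said she would buy the flowers herself.",Virginia Woolf,Mrs Dalloway,20th
"""


def load_rows(handle):
    """Read all rows, checking that every expected column exists and is non-empty."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row_no, row in enumerate(reader, start=1):
        for col in EXPECTED_COLUMNS:
            if not (row[col] or "").strip():
                raise ValueError(f"empty {col!r} in data row {row_no}")
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), rows[0]["Name of writer"])
```

To run the same check against a downloaded file, pass a handle opened with `open(path, newline="", encoding="utf-8")` instead of the in-memory sample; the path and encoding would need to match the actual download.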

Distribution

The data files are typically provided in CSV format. A sample file will be made available separately on the platform. The dataset is free of charge. The number of rows is not currently specified.

Usage

This dataset is highly suitable for a variety of applications, including:
  • Developing NLP models capable of identifying the century to which an English phrase belongs.
  • Creating NLP models that can determine which British author an English phrase is similar to.
  • Training NLP models on informal, non-scientific phrases.
  • Predicting, in combination with newspaper data, whether a sentence is literary or non-literary.
  • Building NLP models designed to detect romantic and literary phrases.
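The first use case above, predicting a century from a phrase, can be sketched with a small bag-of-words baseline. The example below is a pure-Python multinomial Naive Bayes classifier with add-one smoothing. It is only an illustration: the training sentences are invented for the example, the "17th" and "21st" labels are placeholders, and a real experiment would train on the dataset's Sentence and Century columns, most likely with an established machine learning library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration (not taken from the dataset).
train_texts = [
    "thou art weary and thy heart is heavy",
    "wherefore dost thou weep",
    "she checked her phone and caught the bus",
    "the phone rang while the television flickered",
]
train_labels = ["17th", "17th", "21st", "21st"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("thou dost speak"))
print(model.predict("the television and the phone"))
```

The same class could be pointed at the writer column instead of the century column to get a rough author-similarity baseline, since only the labels change.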

Coverage

The dataset focuses on famous British writers and covers the 14th to the 21st centuries. Although the content is drawn from British literature, the dataset can be used globally.

License

CC0

Who Can Use It

This dataset is ideal for:
  • Data scientists and machine learning engineers working on text classification, author attribution, or literary style analysis.
  • Researchers in literary studies and the humanities seeking to apply computational methods to analyse historical texts.
  • Developers creating educational tools or applications related to British literature.
  • Anyone interested in Natural Language Processing and building models for text understanding.

Dataset Name Suggestions

  • British Literary Phrases Dataset
  • Historical British Literature NLP
  • UK Author Text Corpus
  • Century-Labelled British Sentences
  • English Literary Phrases (NLP)

Attributes

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 17/06/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
