Elysium Narrative Text Dataset
Entertainment & Media Consumption
Free
About
This dataset features unique dialogue texts and descriptive passages extracted from the critically acclaimed role-playing game, Disco Elysium. It is an ideal resource for a variety of natural language processing (NLP) tasks, enabling in-depth text analysis and the exploration of rich game narratives. The dataset is designed to support the development and evaluation of language models, as well as academic research into interactive storytelling and linguistic patterns within digital media.
Columns
- Textline: This primary column holds 57,463 unique entries, consisting of individual dialogue lines and descriptive texts from the game. Each entry represents a distinct textual segment, offering a rich source for linguistic study and computational analysis.
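A minimal sketch of reading the column, assuming the data arrives as a CSV file whose header row names a `Textline` column. The function names `load_textlines` and `summarize` are illustrative and not part of the dataset; the sketch uses only the Python standard library.

```python
import csv
from typing import Dict, Iterable, List


def load_textlines(rows: Iterable[str], column: str = "Textline") -> List[str]:
    """Return every non-empty value of `column` from CSV text.

    `rows` is any iterable of CSV lines, such as an open file object.
    The header name "Textline" is taken from the dataset description.
    """
    reader = csv.DictReader(rows)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise ValueError(f"expected a {column!r} column, found {reader.fieldnames}")
    # Skip rows where the cell is missing or empty.
    return [row[column] for row in reader if row[column]]


def summarize(lines: List[str]) -> Dict[str, int]:
    """Count total and distinct entries so they can be checked against the listing."""
    return {"total": len(lines), "unique": len(set(lines))}
```

To use it on a downloaded copy, open the file with `newline=""` and an explicit encoding (UTF-8 is an assumption here, not something the listing states), pass the file object to `load_textlines`, and compare the `unique` count from `summarize` with the 57,463 records described on this page.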
Distribution
The dataset is typically provided as a data file, often in CSV format, and comprises 57,463 unique records. The structure consists of a single file containing all the textual data within the Textline column. Specific details regarding file size will be updated separately on the platform, but the number of unique records is fixed at 57,463.
Usage
- Training and fine-tuning Natural Language Processing (NLP) models, including applications such as text generation, sentiment analysis, and topic modelling.
- Conducting linguistic research to analyse narrative structures, character voices, and rhetorical devices employed in game dialogue.
- Developing and testing conversational AI agents or chatbots that can emulate specific literary styles.
- Supporting academic studies on video game writing, interactive fiction, and the unique challenges of parsing game-related text.
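As a starting point for the uses listed above, the following sketch shows two common preparation steps: counting word frequencies for linguistic analysis, and making a reproducible train/validation split before fine-tuning a model. It assumes the Textline entries are already in memory as a list of strings. The helper names, the regex tokeniser, and the 10% validation share are illustrative choices, not something the dataset prescribes.

```python
import random
import re
from collections import Counter
from typing import List, Sequence, Tuple

# Crude tokeniser: runs of ASCII letters and apostrophes (an assumption;
# a real project would likely use a proper tokeniser).
WORD_RE = re.compile(r"[A-Za-z']+")


def word_frequencies(lines: Sequence[str]) -> Counter:
    """Count lowercase word tokens across all lines."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(WORD_RE.findall(line.lower()))
    return counts


def train_validation_split(
    lines: Sequence[str], validation_fraction: float = 0.1, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle a copy of `lines` with a fixed seed and hold out a validation share."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * validation_fraction)
    # The first `cut` items become validation data, the rest training data.
    return shuffled[cut:], shuffled[:cut]
```

Fixing the seed keeps the split stable between runs, which makes model comparisons on this corpus easier to reproduce.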
Coverage
The dataset's content is derived entirely from Disco Elysium, so it has no specific real-world geographic or temporal scope; it covers the narrative and textual elements presented within the game's fictional setting. Because the game is globally recognised, the dataset is relevant to an international audience.
License
CC0
Who Can Use It
- NLP Researchers and Developers: For building and refining models that process and generate text, or for tasks such as text classification and entity recognition.
- Game Scholars and Designers: To investigate narrative methods, dialogue systems, and immersive world-building through textual analysis.
- Linguists and Literary Analysts: For studying distinct literary styles, character speech patterns, and the evolution of language in fictional contexts.
- AI and Machine Learning Practitioners: Individuals seeking a distinct and stylistically rich text corpus for various machine learning applications and experimental projects.
Dataset Name Suggestions
- Disco Elysium Game Dialogues
- ZA/UM Dialogue Corpus
- Elysium Narrative Text Dataset
- RPG Dialogue and Description Data
Attributes
Original Data Source: Disco Elysium Dialogue Texts