English Slang Mapping Data

Data Science and Analytics

Tags and Keywords

Earth

Nature

Nlp

Text

Pre-processing

English Slang Mapping Data Dataset on Opendatabay data marketplace

Free

About

This dataset contains common slang terms and texting abbreviations, each mapped to its full English form or definition. It is a resource for text normalisation, particularly in the pre-processing stage of Natural Language Processing (NLP) pipelines, where informal language needs to be expanded before it can be analysed reliably. All entries have been pre-processed to lowercase, which simplifies case-insensitive matching.

Columns

  • Abbreviation: This column contains various slang and texting abbreviations. It holds 229 unique values, representing a collection of common shorthand terms.
  • Full Form: This column provides the English equivalent or full form definition for each corresponding abbreviation.

Distribution

The dataset is provided as a CSV file named "Slang Text.csv" with two columns: "Abbreviation" and "Full Form". The total number of rows is not specified; the "Abbreviation" column contains 229 unique entries.
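A minimal sketch of how the two columns described above could be read into a lookup table, using only Python's standard library. The sample rows are illustrative and are not copied from the dataset. The helper name load_slang_map, the UTF-8 encoding, and the choice to keep the first occurrence of a repeated abbreviation are all assumptions.

```python
import csv
import io


def load_slang_map(handle):
    """Read an open CSV handle into an abbreviation -> full form dict.

    Assumes the header row uses the column names "Abbreviation" and
    "Full Form" as stated in the listing. If an abbreviation appears
    more than once, the first occurrence is kept (an assumption).
    """
    mapping = {}
    for row in csv.DictReader(handle):
        abbreviation = (row.get("Abbreviation") or "").strip().lower()
        full_form = (row.get("Full Form") or "").strip()
        if abbreviation and abbreviation not in mapping:
            mapping[abbreviation] = full_form
    return mapping


# Illustrative rows only; a real run would open the downloaded file, e.g.
#   with open("Slang Text.csv", newline="", encoding="utf-8") as f:
#       slang_map = load_slang_map(f)
# (the encoding is an assumption).
SAMPLE_CSV = (
    "Abbreviation,Full Form\n"
    "brb,be right back\n"
    "idk,i do not know\n"
    "brb,be right back\n"
)

sample_map = load_slang_map(io.StringIO(SAMPLE_CSV))
print(len(sample_map))  # number of unique abbreviations in the sample
```

Because the row count is not stated, comparing the length of the loaded mapping with the 229 unique abbreviations reported above is a quick sanity check after download.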

Usage

This dataset is ideally suited for applications in data science and analytics, particularly within the domain of Natural Language Processing. Its primary use case is for text pre-processing in an NLP pipeline, helping to normalise informal text data for further analysis. It can be used to build tools that expand abbreviations, improve text understanding, and prepare conversational data for machine learning models.
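The abbreviation expansion step mentioned above might look like the following sketch. The mapping entries are hypothetical stand-ins for rows loaded from the CSV, and the function name expand_slang and the tokenisation rule (runs of letters, digits, and apostrophes) are assumptions rather than part of the dataset.

```python
import re

# Hypothetical entries for illustration; a real mapping would be built
# from the "Abbreviation" and "Full Form" columns of the CSV.
SLANG_MAP = {
    "brb": "be right back",
    "idk": "i do not know",
    "ttyl": "talk to you later",
}

# A token is a run of letters, digits, or apostrophes (an assumed rule).
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9']+")


def expand_slang(text, mapping):
    """Replace each token found in the mapping with its full form.

    Tokens are lowercased before lookup because the dataset stores its
    entries in lowercase. Unknown tokens and punctuation are left as-is.
    """
    def replace(match):
        token = match.group(0)
        return mapping.get(token.lower(), token)

    return TOKEN_PATTERN.sub(replace, text)


print(expand_slang("IDK, brb!", SLANG_MAP))  # i do not know, be right back!
```

Matching whole tokens rather than substrings avoids expanding an abbreviation that happens to occur inside a longer word. Abbreviations containing other punctuation would need a different tokenisation rule.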

Coverage

The dataset's regional coverage is global. Details of its time range and demographic scope are not available.

License

CC-BY-SA

Who Can Use It

This dataset is intended for:
  • Data Scientists: For preparing and cleaning text data for various analytical tasks.
  • NLP Engineers: For developing and enhancing text pre-processing modules in NLP applications.
  • Researchers: For studying informal language, text normalisation, or natural language understanding.
  • Developers: Building applications that interact with user-generated content containing slang and abbreviations.

Dataset Name Suggestions

  • Slang Text Dataset
  • Text Abbreviation Dictionary
  • English Slang Mapping Data
  • NLP Abbreviation Expander
  • Text Normalisation Dataset

Attributes

Original Data Source: Slang Text Dataset

Listing Stats

VIEWS

0

DOWNLOADS

0

LISTED

26/06/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0
