English Slang Mapping Data
Data Science and Analytics
Free
About
This dataset maps common slang terms and texting abbreviations to their full English forms and definitions, making it a resource for text normalisation. It is intended for text pre-processing in Natural Language Processing (NLP) pipelines, where informal language must be expanded before it can be analysed reliably. All entries are stored in lowercase, which simplifies matching once the input text is lowercased as well.
Columns
- Abbreviation: Slang and texting abbreviations. The column holds 229 unique values.
- Full Form: The full English form or definition of the corresponding abbreviation.
Distribution
The dataset is provided as a CSV file named "Slang Text.csv" with two columns, "Abbreviation" and "Full Form". The total number of rows is not specified; the "Abbreviation" column contains 229 unique entries.
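The file layout described above can be read into a lookup table with Python's standard csv module. The following is a minimal sketch, not a reference loader: the function name load_slang_map is hypothetical, and the sample rows are illustrative stand-ins rather than actual entries from the dataset. Only the two column names come from the description above.

```python
import csv
import io


def load_slang_map(file_obj):
    """Read "Abbreviation"/"Full Form" rows into a dict and count the rows.

    Hypothetical helper: the column names follow the dataset description,
    everything else is an assumption made for illustration.
    """
    reader = csv.DictReader(file_obj)
    mapping = {}
    row_count = 0
    for row in reader:
        row_count += 1
        abbreviation = row["Abbreviation"].strip().lower()
        full_form = row["Full Form"].strip()
        # Keep the first full form seen if an abbreviation repeats.
        mapping.setdefault(abbreviation, full_form)
    return mapping, row_count


# Illustrative rows only; these are not taken from the real file.
sample_csv = io.StringIO(
    "Abbreviation,Full Form\n"
    "brb,be right back\n"
    "idk,i do not know\n"
    "brb,be right back\n"
)
slang_map, rows = load_slang_map(sample_csv)
print(rows, len(slang_map))  # 3 rows, 2 unique abbreviations
```

For the real file, the same function could be called on a handle opened with open("Slang Text.csv", newline="", encoding="utf-8"); the encoding is an assumption. Comparing the returned row count with the number of keys shows whether the file has more rows than its 229 unique abbreviations.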
Usage
This dataset suits data science and analytics work in Natural Language Processing. Its primary use is text pre-processing in an NLP pipeline: normalising informal text before further analysis. It can support tools that expand abbreviations, improve text understanding, and prepare conversational data for machine learning models.
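As one way to picture the abbreviation-expansion use case, the sketch below replaces each known token in a message with its full form. The function name expand_slang and the two-entry mapping are hypothetical and stand in for a mapping built from the dataset. The simple token pattern is also an assumption; real pipelines may need a proper tokenizer.

```python
import re


def expand_slang(text, slang_map):
    """Replace tokens found in slang_map with their full forms.

    Hypothetical helper: lookups are lowercased because the dataset
    stores its entries in lowercase. Unknown tokens are left unchanged.
    """
    def replace(match):
        token = match.group(0)
        return slang_map.get(token.lower(), token)

    # Match runs of letters, digits, and apostrophes as candidate tokens.
    return re.sub(r"[A-Za-z0-9']+", replace, text)


# Illustrative mapping only; not actual rows from the dataset.
sample_map = {"brb": "be right back", "idk": "i do not know"}
expanded = expand_slang("BRB, idk what happened", sample_map)
print(expanded)  # be right back, i do not know what happened
```

Matching whole tokens rather than substrings avoids expanding an abbreviation that happens to appear inside a longer word.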
Coverage
Coverage is global. Details of the time range and demographic scope are not available.
License
CC-BY-SA
Who Can Use It
This dataset is intended for:
- Data Scientists: For preparing and cleaning text data for various analytical tasks.
- NLP Engineers: For developing and enhancing text pre-processing modules in NLP applications.
- Researchers: For studying informal language, text normalisation, or natural language understanding.
- Developers: For building applications that interact with user-generated content containing slang and abbreviations.
Dataset Name Suggestions
- Slang Text Dataset
- Text Abbreviation Dictionary
- English Slang Mapping Data
- NLP Abbreviation Expander
- Text Normalisation Dataset
Attributes
Original Data Source: Slang Text Dataset