Mobile Spam Prediction Data
Data Science and Analytics
Free
About
A collection of SMS messages, each labelled as either 'spam' (unwanted) or 'ham' (legitimate). The labelled messages are suitable input for developing systems that predict which class a message belongs to.
Columns
Each entry consists of a classification label followed by the message content, with the two separated by spaces.
- Label: The initial word of each entry, which identifies the message as either 'ham' or 'spam'.
- Message Text: The remaining text, representing the actual SMS content.
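As a minimal sketch of the layout described above, the following Python splits each entry at the first run of whitespace into a label and a message text. The names `parse_entry`, `label_counts`, and `VALID_LABELS` are hypothetical helpers introduced for illustration, and the sample entries are made up rather than taken from the collection.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Labels named in the column description; anything else is treated as invalid.
VALID_LABELS = {"ham", "spam"}


def parse_entry(line: str) -> Optional[Tuple[str, str]]:
    """Split one raw entry into (label, message_text).

    Returns None for blank lines, entries with no message text,
    or entries whose first word is not a known label.
    """
    # sep=None with maxsplit=1 splits on the first run of whitespace only,
    # so spaces inside the message text are preserved.
    parts = line.strip().split(None, 1)
    if len(parts) != 2 or parts[0] not in VALID_LABELS:
        return None
    return parts[0], parts[1]


def label_counts(lines: Iterable[str]) -> Counter:
    """Count how many successfully parsed entries carry each label."""
    parsed = (parse_entry(line) for line in lines)
    return Counter(entry[0] for entry in parsed if entry is not None)


# Made-up sample entries, not taken from the collection.
sample = [
    "ham See you at the station at six",
    "spam WINNER! Claim your free prize now",
    "",
]
print(label_counts(sample))
```

Splitting on any whitespace rather than a literal space is a deliberate choice here: it also handles entries where the label and message are separated by a tab.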
Distribution
The collection is contained in a single file, SMSSpamCollection.csv, which occupies 489.22 kB of storage. It holds 5573 valid records and 5159 unique text values, and it reports no missing or mismatched values.
Usage
This data is well suited to training machine learning models that identify and filter out unwanted messages. In particular, it fits supervised text classification, where a model learns from the labelled examples to predict whether a new message is spam.
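To illustrate the kind of supervised text classifier this data could support, here is a small multinomial Naive Bayes sketch in plain Python with add-one smoothing. It is one possible approach, not a method prescribed by the collection; the class `NaiveBayesSpamFilter`, the `tokenize` helper, and the training messages are all hypothetical and made up for the example.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes over word counts (illustrative sketch)."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.doc_counts: Counter = Counter()
        self.vocab: Set[str] = set()

    def fit(self, examples: List[Tuple[str, str]]) -> None:
        """Accumulate per-label message counts and word counts."""
        for label, text in examples:
            self.doc_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocab.add(token)

    def predict(self, text: str) -> str:
        """Return the label with the highest log posterior score."""
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        scores: Dict[str, float] = {}
        for label in self.doc_counts:
            # Log prior: fraction of training messages with this label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                # Add-one (Laplace) smoothing so unseen words do not zero out the score.
                count = self.word_counts[label][token] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Made-up training messages, not taken from the collection.
train = [
    ("ham", "are we still meeting for lunch today"),
    ("ham", "call me when you get home"),
    ("ham", "see you at the station at six"),
    ("spam", "win a free prize now"),
    ("spam", "free entry claim your cash prize"),
    ("spam", "urgent claim your free reward now"),
]

model = NaiveBayesSpamFilter()
model.fit(train)
print(model.predict("claim your free prize"))
print(model.predict("see you at lunch"))
```

With the full collection, the same `fit` call would take the parsed (label, message) pairs, ideally with a held-out portion reserved for measuring accuracy.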
Coverage
The data covers labelled text messages intended for classification. Updates to this collection are expected annually. No geographic scope or collection time period is specified.
License
CC0: Public Domain
Who Can Use It
- Machine Learning Developers: For constructing robust SMS filtering and prediction systems.
- Data Scientists: To perform linguistic analysis and feature engineering on malicious versus legitimate text communication.
- Researchers: To study the effectiveness of various classification algorithms on real-world mobile data.
Dataset Name Suggestions
- SMS Spam/Ham Collection
- Labelled Text Message Corpus
- Mobile Spam Prediction Data
- SMS Classification Training Set
Attributes
Original Data Source: Mobile Spam Prediction Data
