Mobile Spam Prediction Data

Data Science and Analytics

Tags and Keywords

Spam

SMS

Ham

Messaging

Classification

Free

About

A collection of SMS messages, each labelled as either 'spam' (unwanted) or 'ham' (legitimate). The labelled messages are suitable for developing systems that predict which class a new message belongs to.

Columns

Each entry is a single line in which the classification label comes first and is separated from the message content by whitespace.
  • Label: The initial word of each entry, which identifies the message as either 'ham' or 'spam'.
  • Message Text: The remaining text, representing the actual SMS content.
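
The label-then-text layout described above can be split with a few lines of Python. This is a minimal sketch, not part of the dataset: the function name `parse_entry` is hypothetical, and it assumes the label is the first whitespace-delimited word on each line.

```python
from typing import Tuple

# The two labels named in the column description.
VALID_LABELS = {"ham", "spam"}


def parse_entry(line: str) -> Tuple[str, str]:
    """Split one raw entry into (label, message text).

    Assumption: the label is the first whitespace-delimited word and
    everything after it is the message.
    """
    # split(None, 1) cuts at the first run of whitespace (spaces or tabs)
    # and leaves the spacing inside the message untouched.
    parts = line.strip().split(None, 1)
    if not parts or parts[0] not in VALID_LABELS:
        raise ValueError(f"unrecognised label in entry: {line!r}")
    label = parts[0]
    text = parts[1] if len(parts) > 1 else ""
    return label, text
```

Rejecting unknown labels with a `ValueError` is a design choice for the sketch; a loader could equally skip or log such lines.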

Distribution

The collection is distributed as a single file, SMSSpamCollection.csv (489.22 kB). It contains 5573 valid records, of which 5159 have unique text values, and it reports no missing or mismatched values.
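
The figures above can be checked after download with a short script. The sketch below is illustrative only: `summarize` is a hypothetical helper, it assumes the label-first, whitespace-separated layout from the Columns section, and it treats any line that does not match that layout as malformed.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize(lines: Iterable[str]) -> Dict[str, object]:
    """Count records, unique message texts, labels and malformed lines."""
    records = 0
    malformed = 0
    labels: Counter = Counter()
    texts = set()
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines entirely
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or parts[0] not in ("ham", "spam"):
            malformed += 1
            continue
        records += 1
        labels[parts[0]] += 1
        texts.add(parts[1])
    return {
        "records": records,
        "unique_texts": len(texts),
        "labels": dict(labels),
        "malformed": malformed,
    }


# Possible usage, assuming the file sits in the working directory and is
# UTF-8 encoded (the listing does not state the encoding):
#
#     with open("SMSSpamCollection.csv", encoding="utf-8") as handle:
#         print(summarize(handle))
```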

Usage

This data is well suited to training machine learning models that identify and filter out unwanted messages. In particular, it supports supervised text classification: a model is trained on the labelled messages and then predicts whether a new message is spam or ham.
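
As one concrete instance of supervised text classification, the sketch below implements a small multinomial Naive Bayes filter with add-one smoothing in plain Python. The listing does not prescribe any algorithm; the class name, the tokenizer, and the toy training messages in the usage note are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep runs of letters, digits and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.doc_counts: Counter = Counter()
        self.vocab: Set[str] = set()

    def fit(self, examples: Iterable[Tuple[str, str]]) -> "NaiveBayesSpamFilter":
        """Train on (label, text) pairs."""
        for label, text in examples:
            self.doc_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def predict(self, text: str) -> str:
        """Return the label with the highest log-posterior score."""
        if not self.doc_counts:
            raise ValueError("fit() must be called before predict()")
        total_docs = sum(self.doc_counts.values())
        # Words never seen in training carry no evidence, so drop them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = "", -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

In use, the filter would be fitted on (label, text) pairs parsed from the file, for example `NaiveBayesSpamFilter().fit(pairs).predict("some new message")`. A held-out split of the records would be needed to estimate accuracy; none is reported in the listing.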

Coverage

The data covers labelled text messages intended for classification. The collection is expected to be updated annually. No geographic scope or collection time period is specified.

License

CC0: Public Domain

Who Can Use It

  • Machine Learning Developers: For constructing robust SMS filtering and prediction systems.
  • Data Scientists: To perform linguistic analysis and feature engineering on malicious versus legitimate text communication.
  • Researchers: To study the effectiveness of various classification algorithms on real-world mobile data.

Dataset Name Suggestions

  • SMS Spam/Ham Collection
  • Labelled Text Message Corpus
  • Mobile Spam Prediction Data
  • SMS Classification Training Set

Attributes

Original Data Source: Mobile Spam Prediction Data

Listing Stats

Views: 2
Downloads: 0
Listed: 15/11/2025
Region: Global
UDQS (Universal Data Quality Score): 5 / 5
Version: 1.0