Opendatabay APP

Social Media Usage and Wellbeing Data

Social Media and Posts

Tags and Keywords

  • Productivity
  • Social
  • Habits
  • Digital
  • Wellbeing


Free

About

This dataset is intended for studying how daily digital habits, including social media usage, screen time, and notification exposure, correlate with an individual's productivity, stress, and overall well-being. It contains 30,000 simulated records that mimic real-world behaviour and are designed for machine learning workflows. It is particularly useful for practising data cleaning, preprocessing, feature selection, and multicollinearity analysis: it includes missing values, noise, and outliers, and its perceived and actual productivity scores are strongly correlated.

Columns

  • age: Age of the individual (ranging from 18 to 65 years).
  • gender: Gender identity, categorised as Male, Female, or Other.
  • job_type: The individual's employment sector or status, such as IT, Education, or Student.
  • daily_social_media_time: Average daily hours spent on social media.
  • social_platform_preference: The most frequently used social platform (e.g., Instagram, TikTok, Telegram).
  • number_of_notifications: The daily count of mobile or social notifications received.
  • work_hours_per_day: Average hours worked each day.
  • perceived_productivity_score: A self-rated productivity score on a scale of 0 to 10.
  • actual_productivity_score: A simulated ground-truth productivity score on a scale of 0 to 10.
  • stress_level: The current stress level, rated on a scale of 1 to 10.
  • sleep_hours: Average hours of sleep per night.
  • screen_time_before_sleep: Hours spent on screens before going to sleep.
  • breaks_during_work: The number of breaks taken during work hours.
  • uses_focus_apps: A boolean indicating whether the user employs digital focus applications (True/False).
  • has_digital_wellbeing_enabled: A boolean indicating if Digital Wellbeing features are activated (True/False).
  • coffee_consumption_per_day: The number of coffee cups consumed daily.
  • days_feeling_burnout_per_month: The number of burnout days reported per month.
  • weekly_offline_hours: Total hours spent offline each week, excluding sleep.
  • job_satisfaction_score: Satisfaction with job or life responsibilities, rated on a scale of 0 to 10.

Notes on data quality: Several key columns, including the productivity, sleep, and stress columns, contain missing values, which makes them suitable for imputation practice. Outliers are present in social media usage, coffee intake, and notification counts. The perceived and actual productivity scores are highly correlated, which is useful for testing multicollinearity.
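
The three data quality traits noted above (missing values, outliers, and correlated productivity scores) can be checked with a few small helpers. The following is a minimal sketch using only the Python standard library; pandas would be the more usual choice in practice. It assumes rows are dictionaries of strings, as produced by csv.DictReader, and that a missing value appears as an empty or non-numeric cell. Neither assumption is confirmed by the listing, and the helper names are hypothetical.

```python
import math
import statistics


def to_float(value):
    """Return a float, or None when the cell is empty, non-numeric, or NaN.

    Assumption: missing values are stored as empty or non-numeric cells.
    """
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def missing_counts(rows, columns):
    """Count, per column, the cells that cannot be read as a number."""
    return {col: sum(to_float(row.get(col)) is None for row in rows)
            for col in columns}


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]


def pearson(rows, col_x, col_y):
    """Pearson correlation over the rows where both columns are present."""
    pairs = [(to_float(r.get(col_x)), to_float(r.get(col_y))) for r in rows]
    pairs = [(x, y) for x, y in pairs if x is not None and y is not None]
    mean_x = statistics.fmean([x for x, _ in pairs])
    mean_y = statistics.fmean([y for _, y in pairs])
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)
```

For example, pearson(rows, "perceived_productivity_score", "actual_productivity_score") gives a quick multicollinearity check on the two productivity columns, and iqr_outliers can be applied to the parsed values of number_of_notifications or coffee_consumption_per_day. The 1.5 multiplier is the conventional default for Tukey's rule, not a threshold taken from the dataset.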

Distribution

This dataset is provided as a CSV file named social_media_vs_productivity.csv, with a file size of 5.57 MB. It comprises 30,000 records with 19 columns each. The data simulates realistic behavioural patterns and deliberately includes missing values, noise, and outliers to provide a challenging environment for data preparation and analysis.
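
A quick way to confirm that a downloaded copy matches this description is to load it and compare its shape with the stated 30,000 records and 19 columns. The sketch below uses the standard library csv module; the function names are hypothetical, and the UTF-8 encoding and the presence of a header row are assumptions that the listing does not state.

```python
import csv

EXPECTED_ROWS = 30_000    # record count stated in the listing
EXPECTED_COLUMNS = 19     # column count stated in the listing


def load_rows(handle):
    """Read a CSV file object into (column_names, list_of_row_dicts).

    Assumption: the first line of the file is a header row.
    """
    reader = csv.DictReader(handle)
    rows = list(reader)
    return list(reader.fieldnames or []), rows


def shape_matches(columns, rows, n_rows=EXPECTED_ROWS, n_cols=EXPECTED_COLUMNS):
    """Return True when the loaded data has the expected row and column counts."""
    return len(rows) == n_rows and len(columns) == n_cols


# Usage sketch (assumes the file is in the working directory and is UTF-8):
# with open("social_media_vs_productivity.csv", newline="", encoding="utf-8") as f:
#     columns, rows = load_rows(f)
# print(len(rows), len(columns), shape_matches(columns, rows))
```

All values come back as strings, so numeric columns such as sleep_hours still need converting before analysis.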

Usage

This dataset is ideal for a variety of analytical and machine learning applications, including:
  • Exploratory Data Analysis (EDA).
  • Developing and testing feature engineering pipelines.
  • Benchmarking machine learning models.
  • Performing statistical hypothesis testing.
  • Modelling productivity, stress, or job satisfaction based on digital behaviour patterns and exposure.
  • Projects focused on predicting burnout and mental health outcomes.
  • Practising data cleaning, preprocessing, feature scaling, encoding, binning, and creating interaction terms.
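
Several of the preparation steps listed above (imputation, encoding, binning, scaling, and interaction terms) can be sketched in a few lines. This is an illustrative standard-library version and not a recommended pipeline; a library such as pandas or scikit-learn would normally be used. The function names are hypothetical, and median imputation is one possible strategy among several.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def one_hot(values):
    """Encode a categorical column as one 0/1 indicator list per category."""
    categories = sorted(set(values))
    return {cat: [int(v == cat) for v in values] for cat in categories}


def bin_value(value, edges, labels):
    """Return the label of the first bin whose upper edge is >= value.

    `labels` has one more entry than `edges`; the last label catches
    everything above the final edge.
    """
    for edge, label in zip(edges, labels):
        if value <= edge:
            return label
    return labels[-1]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def interaction(xs, ys):
    """Element-wise product of two numeric columns."""
    return [x * y for x, y in zip(xs, ys)]
```

As an example, impute_median could fill gaps in sleep_hours, one_hot could encode job_type, and interaction could combine daily_social_media_time with number_of_notifications. A call such as bin_value(age, [25, 40], ["18-25", "26-40", "41-65"]) groups ages into bands; those edges are illustrative and are not part of the dataset.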

Coverage

The dataset simulates behavioural patterns of 30,000 individuals with diverse job types (e.g., IT, Education, Student), social habits, and lifestyle choices. The age range covered is 18 to 65 years, and gender identities include Male, Female, and Other. No specific geographical or time-range scope is detailed, so the data should be read as general digital behaviour rather than that of a particular region or period.

License

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

Who Can Use It

This dataset is suitable for data scientists, machine learning engineers, academic researchers, and students. It offers a practical resource for those looking to analyse the relationships between digital habits and human factors like productivity, stress, and well-being. It is particularly useful for hands-on experience with data preparation, model building, and deriving insights into digital behaviour.

Dataset Name Suggestions

  • Digital Habits and Productivity Factors
  • Social Media Usage and Wellbeing Data
  • Individual Productivity and Digital Exposure
  • Work-Life Balance and Online Habits Dataset
  • Behavioural Productivity Metrics

Attributes

Listing Stats

LISTED

13/08/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0
