Opendatabay APP

Health and Lifestyle Factors Dataset

Mental Health & Wellness

Tags and Keywords

Health

Lifestyle

Psychology

Socioeconomic

Mental


Free

About

This synthetic dataset supports the analysis of health, lifestyle, and socio-economic factors [3, 4]. Each record describes one individual through attributes covering personal well-being, lifestyle choices, and socio-economic status [3].

Columns

The dataset includes the following attributes:
  • Name: The full name of the individual. This column has 196,851 unique values and is 100% valid across approximately 414,000 records [3, 5].
  • Age: The age of the individual in years, ranging from 18 to 80, with a mean of 49 years [3, 6]. This column is 100% valid [6].
  • Marital Status: The marital status of the individual, with possible values including Single, Married, Divorced, and Widowed [3]. The most common status is Married, accounting for 58% of records [7].
  • Education Level: The highest level of education attained, including High School, Associate Degree, Bachelor's Degree, Master's Degree, and PhD [8]. Bachelor's Degree is the most common at 30% [7].
  • Number of Children: The count of children the individual has, ranging from 0 to 4, with a mean of 1.3 [7-9].
  • Smoking Status: Indicates whether the individual is a Smoker, Former, or Non-smoker [8]. Non-smokers represent 60% of the dataset [9].
  • Physical Activity Level: The individual's physical activity, categorised as Sedentary, Moderate, or Active [8]. Sedentary is the most prevalent level at 43% [9, 10].
  • Employment Status: The individual's employment situation, either Employed or Unemployed [11]. Employed individuals make up 64% of the dataset [10].
  • Income: The annual income of the individual in USD, ranging from 0.41 to 210,000 USD, with a mean of 50,700 USD [11, 12].
  • Alcohol Consumption: The level of alcohol consumption, specified as Low, Moderate, or High [11]. Moderate consumption is the most common at 42% [12].
  • Dietary Habits: The individual's dietary habits, categorised as Healthy, Moderate, or Unhealthy [11]. Unhealthy and Moderate are the most common, at 41% each [13].
  • Sleep Patterns: The quality of sleep, defined as Good, Fair, or Poor [11]. Fair sleep patterns are most common at 48% [13].
  • History of Mental Illness: Indicates whether the individual has a history of mental illness (Yes or No) [11]. Approximately 30% of individuals have a history of mental illness [13, 14].
  • History of Substance Abuse: Indicates whether the individual has a history of substance abuse (Yes or No) [4]. Approximately 31% of individuals have a history of substance abuse [14].
  • Family History of Depression: Indicates if there is a family history of depression (Yes or No) [4]. About 27% of individuals have a family history of depression [15].
  • Chronic Medical Conditions: Indicates whether the individual has chronic medical conditions (Yes or No) [4]. Approximately 33% of individuals have chronic medical conditions [15].
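
The value sets and ranges listed above can be turned into a simple per-record check. The following is a minimal sketch using only the Python standard library. It assumes the CSV headers are spelled exactly like the attribute names in this listing, which the listing does not confirm. The helper `validate_row` and the sample record are hypothetical. The upper income bound is left open because the listed figure of 210,000 USD appears to be rounded.

```python
# Allowed values as described in the column list of this listing.
# Assumption: the CSV headers match these attribute names exactly.
CATEGORICAL = {
    "Marital Status": {"Single", "Married", "Divorced", "Widowed"},
    "Education Level": {"High School", "Associate Degree",
                        "Bachelor's Degree", "Master's Degree", "PhD"},
    "Smoking Status": {"Smoker", "Former", "Non-smoker"},
    "Physical Activity Level": {"Sedentary", "Moderate", "Active"},
    "Employment Status": {"Employed", "Unemployed"},
    "Alcohol Consumption": {"Low", "Moderate", "High"},
    "Dietary Habits": {"Healthy", "Moderate", "Unhealthy"},
    "Sleep Patterns": {"Good", "Fair", "Poor"},
    "History of Mental Illness": {"Yes", "No"},
    "History of Substance Abuse": {"Yes", "No"},
    "Family History of Depression": {"Yes", "No"},
    "Chronic Medical Conditions": {"Yes", "No"},
}

# (low, high) bounds; None means "no upper bound checked".
NUMERIC_RANGES = {
    "Age": (18, 80),
    "Number of Children": (0, 4),
    "Income": (0, None),
}


def validate_row(row):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for column, allowed in CATEGORICAL.items():
        value = row.get(column)
        if value not in allowed:
            problems.append(f"{column}: unexpected value {value!r}")
    for column, (low, high) in NUMERIC_RANGES.items():
        try:
            number = float(row.get(column, ""))
        except (TypeError, ValueError):
            problems.append(f"{column}: not numeric")
            continue
        if number < low or (high is not None and number > high):
            problems.append(f"{column}: {number} outside documented range")
    return problems


# Made-up example record, not taken from the dataset.
sample = {
    "Name": "Jane Doe", "Age": "45", "Marital Status": "Married",
    "Education Level": "Bachelor's Degree", "Number of Children": "2",
    "Smoking Status": "Non-smoker", "Physical Activity Level": "Moderate",
    "Employment Status": "Employed", "Income": "50700.0",
    "Alcohol Consumption": "Low", "Dietary Habits": "Healthy",
    "Sleep Patterns": "Fair", "History of Mental Illness": "No",
    "History of Substance Abuse": "No",
    "Family History of Depression": "No",
    "Chronic Medical Conditions": "No",
}
bad = {**sample, "Age": "17", "Sleep Patterns": "Great"}

print(validate_row(sample))
print(validate_row(bad))
```

Records read with `csv.DictReader` arrive as dictionaries of strings, so they can be passed to `validate_row` directly.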

Distribution

This dataset is provided as a CSV (comma-separated values) file named depression_data.csv [1, 5]. The file size is 47.29 MB [5]. It contains 16 columns and approximately 414,000 records [5]. All columns are 100% valid, with no missing or mismatched values [5-7, 9, 10, 12-18]. The dataset is described as synthetic [3].
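
The stated shape and completeness can be checked after download. The sketch below counts columns, rows, and empty cells with the standard `csv` module. The helper `summarize_csv` is hypothetical, and the in-memory demo is a three-column stand-in, not real data. The UTF-8 encoding in the commented usage is an assumption.

```python
import csv
import io


def summarize_csv(handle):
    """Count columns, rows, and empty cells in an open CSV file handle."""
    reader = csv.DictReader(handle)
    columns = reader.fieldnames or []
    rows = 0
    empty_cells = 0
    for record in reader:
        rows += 1
        # Treat missing or whitespace-only cells as empty.
        empty_cells += sum(
            1 for name in columns if not (record.get(name) or "").strip()
        )
    return {"columns": len(columns), "rows": rows, "empty_cells": empty_cells}


# Tiny in-memory stand-in for the real file (made-up values).
demo = io.StringIO("Name,Age,Income\nJane Doe,45,50700.0\nJohn Roe,,31000.5\n")
summary = summarize_csv(demo)
print(summary)  # {'columns': 3, 'rows': 2, 'empty_cells': 1}

# Against the downloaded file (encoding assumed to be UTF-8):
# with open("depression_data.csv", newline="", encoding="utf-8") as f:
#     print(summarize_csv(f))
```

If the listing is accurate, the real file should report 16 columns, roughly 414,000 rows, and no empty cells.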

Usage

This dataset is suited to analysing health, lifestyle, and socio-economic factors [4], in particular through:
  • Predictive modelling: For example, predicting health outcomes based on lifestyle.
  • Clustering: Grouping individuals with similar health and lifestyle profiles.
  • Exploratory data analysis: Gaining insights into the relationships between different factors [4].
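
As one illustration of the exploratory analysis mentioned above, the share of "Yes" answers in a Yes/No column can be compared across the levels of a categorical column. This is a minimal sketch with the standard library. The helper `rate_by_group` is hypothetical, the four records are made up, and the column names are assumed to match the attribute names in this listing.

```python
from collections import Counter


def rate_by_group(records, group_column, flag_column, positive="Yes"):
    """Return the share of `positive` values in flag_column per group."""
    totals = Counter()
    positives = Counter()
    for record in records:
        group = record[group_column]
        totals[group] += 1
        if record[flag_column] == positive:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


# Made-up records for illustration only; not drawn from the dataset.
records = [
    {"Sleep Patterns": "Good", "History of Mental Illness": "No"},
    {"Sleep Patterns": "Good", "History of Mental Illness": "No"},
    {"Sleep Patterns": "Poor", "History of Mental Illness": "Yes"},
    {"Sleep Patterns": "Poor", "History of Mental Illness": "No"},
]

rates = rate_by_group(records, "Sleep Patterns", "History of Mental Illness")
print(rates)  # {'Good': 0.0, 'Poor': 0.5}
```

Because the dataset is described as synthetic, any rates computed this way describe the generated data rather than a real population.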

Coverage

The dataset is synthetic and covers individual-level attributes related to health, lifestyle, and socio-economic status [3]. No geographic coverage, time range, or demographic scope is specified beyond the attributes listed in the columns.

License

CC BY-SA 4.0

Who Can Use It

This dataset is suitable for a wide range of users, including:
  • Data scientists and machine learning engineers for developing predictive models.
  • Researchers in public health, sociology, and behavioural sciences for exploratory data analysis and understanding correlations.
  • Healthcare analysts and policymakers for identifying trends and informing health initiatives.
  • Students and academics for educational purposes and research projects related to health and socio-economic factors.

Dataset Name Suggestions

  • Health and Lifestyle Factors Dataset
  • Socio-Economic Health Indicators
  • Individual Health Attributes Dataset
  • Synthetic Lifestyle Data
  • Well-being Factors Dataset

Attributes

Listing Stats

  • Views: 1
  • Downloads: 1
  • Listed: 26/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
