Health and Lifestyle Factors Dataset
Mental Health & Wellness
Free
About
This dataset supports the analysis of health, lifestyle, and socio-economic factors [3, 4]. Each record describes one individual through attributes related to personal well-being and lifestyle choices [3]. It is intended for in-depth analysis of general health, personal lifestyle patterns, and socio-economic status [3].
Columns
The dataset includes the following attributes:
- Name: The full name of the individual. This column has 196,851 unique values across approximately 414,000 records and is 100% valid [3, 5].
- Age: The age of the individual in years, ranging from 18 to 80, with a mean of 49 years [3, 6]. This column is 100% valid [6].
- Marital Status: The marital status of the individual, with possible values including Single, Married, Divorced, and Widowed [3]. The most common status is Married, accounting for 58% of records [7].
- Education Level: The highest level of education attained, including High School, Associate Degree, Bachelor's Degree, Master's Degree, and PhD [8]. Bachelor's Degree is the most common at 30% [7].
- Number of Children: The count of children the individual has, ranging from 0 to 4, with a mean of 1.3 [7-9].
- Smoking Status: Indicates whether the individual is a Smoker, Former, or Non-smoker [8]. Non-smokers represent 60% of the dataset [9].
- Physical Activity Level: The individual's physical activity, categorised as Sedentary, Moderate, or Active [8]. Sedentary is the most prevalent level at 43% [9, 10].
- Employment Status: The individual's employment situation, either Employed or Unemployed [11]. Employed individuals make up 64% of the dataset [10].
- Income: The annual income of the individual in USD, ranging from 0.41 to 210,000 USD, with a mean of 50,700 USD [11, 12].
- Alcohol Consumption: The level of alcohol consumption, specified as Low, Moderate, or High [11]. Moderate consumption is the most common at 42% [12].
- Dietary Habits: The individual's dietary habits, categorised as Healthy, Moderate, or Unhealthy [11]. Unhealthy and Moderate are the most common categories, at 41% each [13].
- Sleep Patterns: The quality of sleep, defined as Good, Fair, or Poor [11]. Fair sleep patterns are most common at 48% [13].
- History of Mental Illness: Indicates whether the individual has a history of mental illness (Yes or No) [11]. Approximately 30% of individuals have a history of mental illness [13, 14].
- History of Substance Abuse: Indicates whether the individual has a history of substance abuse (Yes or No) [4]. Approximately 31% of individuals have a history of substance abuse [14].
- Family History of Depression: Indicates if there is a family history of depression (Yes or No) [4]. About 27% of individuals have a family history of depression [15].
- Chronic Medical Conditions: Indicates whether the individual has chronic medical conditions (Yes or No) [4]. Approximately 33% of individuals have chronic medical conditions [15].
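The column descriptions above can be turned into a simple row-level check. The following is a minimal sketch, not an official validator: it assumes the CSV header names match the attribute names in the list exactly, which the sources do not confirm. The names `ALLOWED`, `NUMERIC_RANGES`, `check_row`, and `check_csv` are hypothetical helpers introduced for illustration. Income is left out because its reported range may be rounded.

```python
import csv

# Allowed categories transcribed from the column descriptions above.
# ASSUMPTION: header spellings in the CSV match these keys exactly.
ALLOWED = {
    "Marital Status": {"Single", "Married", "Divorced", "Widowed"},
    "Education Level": {
        "High School", "Associate Degree", "Bachelor's Degree",
        "Master's Degree", "PhD",
    },
    "Smoking Status": {"Smoker", "Former", "Non-smoker"},
    "Physical Activity Level": {"Sedentary", "Moderate", "Active"},
    "Employment Status": {"Employed", "Unemployed"},
    "Alcohol Consumption": {"Low", "Moderate", "High"},
    "Dietary Habits": {"Healthy", "Moderate", "Unhealthy"},
    "Sleep Patterns": {"Good", "Fair", "Poor"},
    "History of Mental Illness": {"Yes", "No"},
    "History of Substance Abuse": {"Yes", "No"},
    "Family History of Depression": {"Yes", "No"},
    "Chronic Medical Conditions": {"Yes", "No"},
}

# Inclusive numeric ranges stated in the column descriptions.
NUMERIC_RANGES = {
    "Age": (18, 80),
    "Number of Children": (0, 4),
}


def check_row(row):
    """Return a list of human-readable problems found in one CSV row."""
    problems = []
    for column, allowed in ALLOWED.items():
        value = row.get(column)
        if value not in allowed:
            problems.append(f"{column}: unexpected value {value!r}")
    for column, (low, high) in NUMERIC_RANGES.items():
        try:
            number = float(row.get(column, ""))
        except (TypeError, ValueError):
            problems.append(f"{column}: not numeric")
            continue
        if not low <= number <= high:
            problems.append(f"{column}: {number} outside {low}-{high}")
    return problems


def check_csv(handle):
    """Yield (line_number, problems) for each row that fails a check.

    Line numbers start at 2 because line 1 is the header; this assumes
    no field contains an embedded newline.
    """
    for line_number, row in enumerate(csv.DictReader(handle), start=2):
        problems = check_row(row)
        if problems:
            yield line_number, problems
```

To run it against the file, open it with `open("depression_data.csv", newline="")` and iterate over `check_csv(handle)`. If the listing's claim of 100% validity holds and the header assumption is right, the generator should yield nothing.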
Distribution
This dataset is provided in CSV (Comma Separated Values) format, in a file named depression_data.csv [1, 5]. The file size is 47.29 MB [5]. It contains 16 columns and approximately 414,000 records (rows) [5]. All columns are 100% valid, with no missing or mismatched values [5-7, 9, 10, 12-18]. The dataset is described as synthetic [3].
Usage
This dataset is suited to analysing health, lifestyle, and socio-economic factors [4]. It is particularly suitable for tasks such as:
- Predictive modelling: For example, predicting health outcomes based on lifestyle.
- Clustering: Grouping individuals with similar health and lifestyle profiles.
- Exploratory data analysis: Gaining insights into the relationships between different factors [4].
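As one illustration of the exploratory analysis mentioned above, the sketch below computes the share of "Yes" values in one column within each category of another. It is only an example under stated assumptions: `share_by_group` is a hypothetical helper, the column names are assumed to match the CSV headers, and the sample rows are made up for demonstration rather than taken from the dataset.

```python
from collections import Counter


def share_by_group(rows, group_col, target_col, positive="Yes"):
    """Return {group: share of rows whose target_col equals `positive`}."""
    totals = Counter()
    hits = Counter()
    for row in rows:
        group = row[group_col]
        totals[group] += 1
        if row[target_col] == positive:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Made-up rows for illustration only; NOT taken from the dataset.
sample = [
    {"Sleep Patterns": "Poor", "History of Mental Illness": "Yes"},
    {"Sleep Patterns": "Poor", "History of Mental Illness": "No"},
    {"Sleep Patterns": "Good", "History of Mental Illness": "No"},
]

print(share_by_group(sample, "Sleep Patterns", "History of Mental Illness"))
# prints {'Poor': 0.5, 'Good': 0.0}

# With the real file (header names are an assumption):
# import csv
# with open("depression_data.csv", newline="") as handle:
#     shares = share_by_group(
#         csv.DictReader(handle), "Sleep Patterns", "History of Mental Illness"
#     )
```

Because the function consumes rows one at a time, it can stream the 47.29 MB file through `csv.DictReader` without loading it all into memory. Since the data is described as synthetic, any pattern found this way says something about how the data was generated, not necessarily about real populations.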
Coverage
The dataset is described as synthetic and focuses on individual attributes related to health, lifestyle, and socio-economic status [3]. The sources do not specify geographic coverage, time range, or demographic scope beyond the attributes listed under Columns.
License
CC BY-SA 4.0
Who Can Use It
This dataset is suitable for a wide range of users, including:
- Data scientists and machine learning engineers for developing predictive models.
- Researchers in public health, sociology, and behavioural sciences for exploratory data analysis and understanding correlations.
- Healthcare analysts and policymakers for identifying trends and informing health initiatives.
- Students and academics for educational purposes and research projects related to health and socio-economic factors.
Dataset Name Suggestions
- Health and Lifestyle Factors Dataset
- Socio-Economic Health Indicators
- Individual Health Attributes Dataset
- Synthetic Lifestyle Data
- Well-being Factors Dataset
Attributes
Original Data Source: Health and Lifestyle Factors Dataset