Synthetic Lung Cancer Patient Records Prediction Dataset

Patient Health Records & Digital Health

Tags and Keywords

Lung, Cancer, Patient, Records, Prediction, Synthetic, Smoking, LLM, AI, Training

£179.99

About

This synthetic Lung Cancer Risk Prediction Dataset is designed for educational and research purposes in the fields of data science, public health, and cancer research. It contains essential health and lifestyle indicators such as smoking habits, chronic diseases, and respiratory symptoms, which can be used to analyze and predict the risk of lung cancer. The dataset is ideal for building predictive models, conducting risk assessments, and exploring the relationships between lifestyle factors and lung health.

Dataset Features

  • Gender: The biological sex of the individual (Male/Female).
  • Age: The age of the individual in years.
  • Smoking: Whether the individual smokes (Yes/No).
  • Yellow Fingers: Whether the individual has yellow fingers (Yes/No).
  • Anxiety: Whether the individual has anxiety (Yes/No).
  • Peer Pressure: Whether the individual experiences peer pressure (Yes/No).
  • Chronic Disease: Whether the individual has any chronic diseases (Yes/No).
  • Fatigue: Whether the individual experiences fatigue (Yes/No).
  • Allergy: Whether the individual has allergies (Yes/No).
  • Wheezing: Whether the individual experiences wheezing (Yes/No).
  • Alcohol Consuming: Whether the individual consumes alcohol (Yes/No).
  • Coughing: Whether the individual experiences coughing (Yes/No).
  • Shortness of Breath: Whether the individual experiences shortness of breath (Yes/No).
  • Swallowing Difficulty: Whether the individual experiences difficulty swallowing (Yes/No).
  • Chest Pain: Whether the individual experiences chest pain (Yes/No).
  • Lung Cancer: Binary classification indicating lung cancer risk:
      • YES: At risk of lung cancer.
      • NO: Not at risk of lung cancer.
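The feature list above maps naturally onto numeric columns. A minimal sketch of that encoding follows, assuming the file is a CSV whose headers match the feature names and whose values are spelled Yes/No and Male/Female. The header names, the subset of columns, and the sample rows are all assumptions made for illustration; the rows are invented and are not taken from the dataset.

```python
import csv
import io

# Hypothetical subset of Yes/No columns; the real file's headers may differ.
BINARY_COLUMNS = ["Smoking", "Yellow Fingers", "Anxiety", "Coughing", "Lung Cancer"]

# Made-up rows for illustration only; they are NOT taken from the dataset.
SAMPLE_CSV = """Gender,Age,Smoking,Yellow Fingers,Anxiety,Coughing,Lung Cancer
Male,64,Yes,Yes,No,Yes,YES
Female,51,No,No,Yes,No,NO
Male,72,Yes,No,No,Yes,YES
"""

GENDER_CODES = {"male": 1, "female": 0}
YES_NO_CODES = {"yes": 1, "no": 0}


def encode_row(row):
    """Convert one CSV row (a dict of strings) into numeric values."""
    gender = row["Gender"].strip().lower()
    if gender not in GENDER_CODES:
        raise ValueError(f"Unexpected Gender value: {row['Gender']!r}")
    encoded = {"Gender": GENDER_CODES[gender], "Age": int(row["Age"])}
    for column in BINARY_COLUMNS:
        # Comparison is case-insensitive so that "Yes" and "YES" both map to 1.
        value = row[column].strip().lower()
        if value not in YES_NO_CODES:
            raise ValueError(f"Unexpected value {row[column]!r} in {column!r}")
        encoded[column] = YES_NO_CODES[value]
    return encoded


def load_records(text):
    """Parse CSV text and return a list of numerically encoded rows."""
    return [encode_row(row) for row in csv.DictReader(io.StringIO(text))]


records = load_records(SAMPLE_CSV)
```

Raising on unexpected values, rather than silently mapping them to 0, makes it easier to notice if the real file uses a different spelling or numeric codes.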

Usage

This dataset is ideal for various lung cancer-related applications:
  • Lung Cancer Risk Prediction: Develop machine learning models to classify individuals as at risk or not at risk of lung cancer.
  • Risk Factor Analysis: Identify key factors contributing to lung cancer risks and prioritize lifestyle interventions.
  • Predictive Modeling: Build predictive models using health and lifestyle indicators to assess lung health.
  • Public Health Research: Study the relationships between health metrics, lifestyle factors, and lung cancer risks.
  • Preventive Healthcare: Inform public health campaigns and individual preventive measures.
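As a rough illustration of the risk factor analysis use case, the sketch below compares the share of at-risk rows between individuals with and without a given feature. It assumes the Yes/No columns have already been encoded as 1/0. The column names and the toy records are hypothetical and invented for this example; they are not taken from the dataset, and a raw rate difference is only a first look, not a substitute for a fitted model.

```python
from collections import defaultdict

# Made-up, already-encoded rows (1 = Yes, 0 = No); NOT taken from the dataset.
TOY_RECORDS = [
    {"Smoking": 1, "Wheezing": 1, "Lung Cancer": 1},
    {"Smoking": 1, "Wheezing": 0, "Lung Cancer": 1},
    {"Smoking": 0, "Wheezing": 1, "Lung Cancer": 0},
    {"Smoking": 0, "Wheezing": 0, "Lung Cancer": 0},
    {"Smoking": 1, "Wheezing": 1, "Lung Cancer": 0},
    {"Smoking": 0, "Wheezing": 1, "Lung Cancer": 1},
]


def positive_rate_by_feature(records, feature, target="Lung Cancer"):
    """Return {feature_value: share of rows where target == 1}."""
    counts = defaultdict(lambda: [0, 0])  # feature value -> [positives, total]
    for row in records:
        counts[row[feature]][0] += row[target]
        counts[row[feature]][1] += 1
    return {value: pos / total for value, (pos, total) in counts.items()}


def rate_difference(records, feature, target="Lung Cancer"):
    """At-risk rate with the feature minus the rate without it.

    A group that never occurs in the data contributes a rate of 0.0.
    """
    rates = positive_rate_by_feature(records, feature, target)
    return rates.get(1, 0.0) - rates.get(0, 0.0)


smoking_rates = positive_rate_by_feature(TOY_RECORDS, "Smoking")
wheezing_gap = rate_difference(TOY_RECORDS, "Wheezing")
```

Ranking features by `rate_difference` gives a quick, model-free ordering of candidate risk factors that can then be checked with a proper classifier.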

Coverage

This dataset is synthetic and anonymized, which supports compliance with data privacy standards. It is designed for research and learning, and provides a range of health conditions and demographic data for analysis and model building.

License

CC0 (Public Domain)

Who Can Use It

  • Data Science Practitioners: For practicing data preprocessing, classification, and regression tasks related to lung health.
  • Healthcare Professionals and Researchers: To explore relationships between health metrics and lung cancer risks.
  • Public Health Analysts: To understand trends and develop interventions for reducing lung cancer risks.
  • Policy Makers and Regulators: For data-driven decision-making in preventive healthcare policies.

Listing Stats

  • Views: 5
  • Downloads: 0
  • Listed: 26/01/2025
  • Region: Global
  • UDQS Quality (Universal Data Quality Score): 5 / 5
  • Version: 1.0
