Lung Cancer Risk Prediction Dataset

Patient Health Records & Digital Health

Tags and Keywords

Cancer, Lung, Risk, Symptoms, Prediction

Free

About

This dataset supports lung cancer risk prediction. It is intended as a low-cost way for individuals to assess their potential cancer risk and make informed decisions about their health. The data was collected from an online lung cancer prediction system and covers symptoms and risk factors, such as smoking, that can be examined in relation to a lung cancer diagnosis.

Columns

The dataset comprises 16 attributes, each detailing a specific symptom or characteristic relevant to lung cancer prediction:
  • Gender: Patient's gender (M for male, F for female).
  • Age: Patient's numerical age.
  • Smoking: Indicates if the patient smokes (2 for YES, 1 for NO).
  • Yellow Fingers: Indicates the presence of yellow fingers (2 for YES, 1 for NO).
  • Anxiety: Indicates the presence of anxiety (2 for YES, 1 for NO).
  • Peer Pressure: Indicates the influence of peer pressure (2 for YES, 1 for NO).
  • Chronic Disease: Indicates the presence of a chronic disease (2 for YES, 1 for NO).
  • Fatigue: Indicates the presence of fatigue (2 for YES, 1 for NO).
  • Allergy: Indicates the presence of allergies (2 for YES, 1 for NO).
  • Wheezing: Indicates the presence of wheezing (2 for YES, 1 for NO).
  • Alcohol Consuming: Indicates alcohol consumption (2 for YES, 1 for NO).
  • Coughing: Indicates the presence of coughing (2 for YES, 1 for NO).
  • Shortness of Breath: Indicates the presence of shortness of breath (2 for YES, 1 for NO).
  • Swallowing Difficulty: Indicates the presence of swallowing difficulty (2 for YES, 1 for NO).
  • Chest Pain: Indicates the presence of chest pain (2 for YES, 1 for NO).
  • Lung Cancer: The target variable, indicating a lung cancer diagnosis (YES/NO).
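
The 2/1 coding used for the symptom columns is unusual, so a common first step is to recode it to 1/0 before analysis. The following is a minimal sketch using only the Python standard library. The column names are assumed to match the attribute names listed above; the header in the actual CSV file may use different spelling, case, or separators. The two sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Assumed column names, copied from the attribute list in this listing.
# The real CSV header may differ and should be checked first.
BINARY_COLUMNS = [
    "Smoking", "Yellow Fingers", "Anxiety", "Peer Pressure",
    "Chronic Disease", "Fatigue", "Allergy", "Wheezing",
    "Alcohol Consuming", "Coughing", "Shortness of Breath",
    "Swallowing Difficulty", "Chest Pain",
]
TARGET_COLUMN = "Lung Cancer"


def recode_row(row):
    """Return a copy of one CSV row with 2/1 and YES/NO mapped to 1/0."""
    out = dict(row)
    out["Age"] = int(row["Age"])
    for name in BINARY_COLUMNS:
        # 2 means YES and 1 means NO, as described in the column list.
        out[name] = {"2": 1, "1": 0}[row[name].strip()]
    out[TARGET_COLUMN] = {"YES": 1, "NO": 0}[row[TARGET_COLUMN].strip().upper()]
    return out


def load_rows(text):
    """Parse CSV text with a header line and recode every row."""
    return [recode_row(r) for r in csv.DictReader(io.StringIO(text))]


# Made-up sample rows, for illustration only (not real records).
header = ["Gender", "Age"] + BINARY_COLUMNS + [TARGET_COLUMN]
sample_text = "\n".join([
    ",".join(header),
    ",".join(["M", "60"] + ["2"] * 13 + ["YES"]),
    ",".join(["F", "45"] + ["1"] * 13 + ["NO"]),
])
rows = load_rows(sample_text)
print(rows[0]["Smoking"], rows[1]["Lung Cancer"])  # prints: 1 0
```

A value outside the documented coding raises a KeyError here, which is deliberate: it surfaces unexpected codes instead of silently passing them through.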

Distribution

The dataset is provided as a single CSV file, lung cancer data.csv (11.28 kB), containing 284 rows (instances) and 16 columns (attributes).
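
After downloading, the stated dimensions can be checked with a few lines of Python. This is a sketch under two assumptions: that the file has a single header line, and that the 284 figure counts data rows excluding that header. The helper names `csv_shape` and `matches_listing` are hypothetical.

```python
import csv

EXPECTED_ROWS = 284     # row count stated in the listing
EXPECTED_COLUMNS = 16   # column count stated in the listing


def csv_shape(lines):
    """Return (data_rows, columns) for an iterable of CSV lines with a header."""
    reader = csv.reader(lines)
    header = next(reader)
    n_rows = sum(1 for row in reader if row)  # blank lines are skipped
    return n_rows, len(header)


def matches_listing(shape):
    """True if the shape equals the dimensions stated in the listing."""
    return shape == (EXPECTED_ROWS, EXPECTED_COLUMNS)


# Tiny in-memory demonstration with a header and two data rows.
print(csv_shape(["a,b", "1,2", "3,4"]))  # prints: (2, 2)

# With the downloaded file (the path is an assumption):
# with open("lung cancer data.csv", newline="") as f:
#     print(matches_listing(csv_shape(f)))
```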

Usage

This dataset is ideal for:
  • Developing cancer prediction systems.
  • Assessing individual lung cancer risk based on symptoms and lifestyle factors.
  • Informing healthcare decisions and preventative measures.
  • Conducting research on the correlation between various symptoms, habits, and lung cancer.
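
For the correlation use case, one simple option for a pair of yes/no columns is the phi coefficient, which is the Pearson correlation applied to two binary variables. The sketch below is one possible approach, not a method prescribed by the dataset's authors. It assumes both columns have already been recoded to 1/0, and the function name `phi_coefficient` is hypothetical.

```python
import math


def phi_coefficient(feature, target):
    """Phi coefficient between two equal-length sequences of 0/1 values."""
    if len(feature) != len(target):
        raise ValueError("sequences must have the same length")
    pairs = list(zip(feature, target))
    # Cells of the 2x2 contingency table.
    n11 = sum(1 for f, t in pairs if f == 1 and t == 1)
    n10 = sum(1 for f, t in pairs if f == 1 and t == 0)
    n01 = sum(1 for f, t in pairs if f == 0 and t == 1)
    n00 = sum(1 for f, t in pairs if f == 0 and t == 0)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # Undefined when either variable takes only one value.
        return float("nan")
    return (n11 * n00 - n10 * n01) / denom


# Perfect agreement between the two made-up sequences gives 1.0.
print(phi_coefficient([1, 1, 0, 0], [1, 1, 0, 0]))  # prints: 1.0
```

Values near 0 indicate little linear association. An association in this dataset does not by itself establish a causal link between a symptom or habit and lung cancer.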

Coverage

  • Demographic Scope: Patients of both genders (52% male, 48% female) across a range of ages.
  • Geographic Scope: Not specified.
  • Time Range: Not specified.
  • Data Availability Notes: The dataset features attributes detailing several symptoms and risk factors, providing a focused view on indicators related to lung cancer.

License

CC0: Public Domain

Who Can Use It

  • Medical Researchers: For epidemiological studies and understanding disease patterns.
  • Data Scientists and Machine Learning Engineers: To build, test, and refine predictive models for health outcomes.
  • Public Health Professionals: To inform campaigns and strategies for early detection and prevention.
  • Healthcare Technology Developers: For creating applications that assess personal health risks.

Dataset Name Suggestions

  • Lung Cancer Risk Prediction Dataset
  • Symptom-Based Lung Cancer Indicators
  • Patient Lung Cancer Risk Data
  • Health Risk Factors for Lung Cancer
  • Lung Cancer Diagnostic Dataset

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 11/08/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
