Opendatabay APP

Lung Cancer Risk Factors Dataset

Patient Health Records & Digital Health

Tags and Keywords

Cancer

Lung

Prediction

Symptoms

Risk

Free

About

This dataset relates lung cancer risk to symptoms and patient characteristics. It is intended as a foundation for early prediction systems that help individuals gauge their likelihood of lung cancer and make informed health decisions. It records physiological symptoms and lifestyle risk factors associated with the disease.

Columns

  • Gender: Indicates the patient's biological sex (Male or Female).
  • Age: The patient's age, a continuous numerical value.
  • Smoking: Denotes smoking status (YES = 2, NO = 1).
  • Yellow_fingers: Indicates the presence of yellow fingers (YES = 2, NO = 1).
  • Anxiety: Signifies if the patient experiences anxiety (YES = 2, NO = 1).
  • Peer_pressure: Shows if the patient is under peer pressure (YES = 2, NO = 1).
  • Chronic Disease: Indicates the presence of a chronic disease (YES = 2, NO = 1).
  • Fatigue: Denotes if the patient experiences fatigue (YES = 2, NO = 1).
  • Allergy: Signifies if the patient has allergies (YES = 2, NO = 1).
  • Wheezing: Indicates the presence of wheezing (YES = 2, NO = 1).
  • Alcohol_consuming: Denotes alcohol consumption (YES = 2, NO = 1).
  • Coughing: Signifies if the patient experiences coughing (YES = 2, NO = 1).
  • Shortness_of_Breath: Indicates the presence of shortness of breath (YES = 2, NO = 1).
  • Swallowing_Difficulty: Denotes if the patient has difficulty swallowing (YES = 2, NO = 1).
  • Chest_pain: Signifies the presence of chest pain (YES = 2, NO = 1).
  • Lung_Cancer: The target variable, indicating a diagnosis of lung cancer (YES or NO).
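
The binary columns above are stored as 2 for YES and 1 for NO, which is easy to misread as a count or a scale. A minimal sketch of decoding them into booleans follows. It assumes the header names match the column names listed here, uses only a subset of the columns, and runs on a made-up two-row sample rather than the real file; `decode_row` and `BINARY_COLUMNS` are hypothetical names introduced for illustration.

```python
import csv
import io

# Made-up sample rows for illustration only; NOT taken from the real dataset.
SAMPLE = """Age,Smoking,Coughing,Lung_Cancer
69,2,2,YES
55,1,1,NO
"""

# Subset of the binary columns; extend with the other 2/1-coded columns as needed.
BINARY_COLUMNS = ["Smoking", "Coughing"]


def decode_row(row):
    """Convert one raw CSV row (all strings) into typed Python values."""
    decoded = dict(row)
    decoded["Age"] = int(row["Age"])
    for col in BINARY_COLUMNS:
        # 2 means YES, 1 means NO
        decoded[col] = row[col].strip() == "2"
    # The target is stored as the text YES or NO (assumed upper- or mixed-case)
    decoded["Lung_Cancer"] = row["Lung_Cancer"].strip().upper() == "YES"
    return decoded


rows = [decode_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for r in rows:
    print(r)
```

Decoding to booleans (or 0/1) before modelling avoids treating the 1/2 codes as if they were magnitudes.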

Distribution

The dataset is a single CSV file named 'lung cancer survey.csv' with a size of 11.28 kB. It contains 309 instances (rows) and 16 attributes (columns): 15 symptom, lifestyle, and demographic fields plus the Lung_Cancer target. The structure is tabular, with each row representing one individual and each column representing a specific symptom, risk factor, or patient characteristic.
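
A quick sanity check after download is to confirm that the file has the stated shape. The sketch below counts data rows and header columns using only the standard library; `csv_shape` is a hypothetical helper, and the demonstration runs on a small made-up in-memory sample rather than the real file.

```python
import csv
import io


def csv_shape(fileobj):
    """Return (data_rows, columns) for a CSV with a header line."""
    reader = csv.reader(fileobj)
    header = next(reader)
    # Skip completely empty lines so a trailing newline is not counted as a row
    n_rows = sum(1 for row in reader if row)
    return n_rows, len(header)


# Made-up sample for illustration only; NOT taken from the real dataset.
sample = io.StringIO("Age,Smoking,Lung_Cancer\n69,2,YES\n55,1,NO\n61,2,YES\n")
shape = csv_shape(sample)
print(shape)
```

For the real file, the same function could be called on `open("lung cancer survey.csv", newline="")`; according to the figures stated above, the result should be 309 rows and 16 columns.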

Usage

The dataset is suited to developing machine learning models that predict lung cancer risk. It can be used in health prediction systems, in academic research on correlations between symptoms and disease, and for teaching about risk factors. Potential applications include diagnostic support tools and public health awareness systems.
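
As an illustration of the kind of model this data could feed, here is a minimal sketch of logistic regression trained by batch gradient descent in plain Python. It is not a method prescribed by the listing. The six training rows are made up, use only two hypothetical 0/1 features (smoking, coughing), and are constructed so that the label simply equals the smoking flag; all function names are introduced for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this row: p - y
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step against the mean gradient
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, xi):
    """Return 1 if the estimated probability is at least 0.5, else 0."""
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
    return 1 if p >= 0.5 else 0


# Made-up toy rows: [smoking, coughing], already decoded to 0/1.
X = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]]
y = [1, 1, 0, 0, 1, 0]  # toy label equals the smoking flag

w, b = train_logistic(X, y)
preds = [predict(w, b, xi) for xi in X]
print(preds)
```

With only 309 rows in the real file, any such model should be evaluated on held-out data (for example with cross-validation) before its outputs are interpreted.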

Coverage

The dataset covers common symptoms and risk factors associated with lung cancer. Beyond gender and age, the listing does not state the geographic or demographic range of the individuals surveyed, nor the period over which the data was collected. It should be read as a snapshot of patient attributes rather than a time series.

License

CC0: Public Domain

Who Can Use It

Researchers and data scientists interested in medical diagnostics and predictive modelling can utilise this data. Healthcare professionals may find it useful for understanding the interplay of symptoms. Students in health informatics or statistics can use it for learning and project development related to disease prediction.

Dataset Name Suggestions

  • Lung Cancer Risk Factors Dataset
  • Symptoms-Based Lung Cancer Prediction Data
  • Patient Lung Cancer Indicators
  • Medical Lung Cancer Diagnosis Dataset

Attributes

Original Data Source: Lung Cancer Risk Factors Dataset

Listing Stats

VIEWS

2

DOWNLOADS

0

LISTED

12/09/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0
