Lung Cancer Risk Prediction Dataset
Patient Health Records & Digital Health
Free
About
This dataset is designed to support lung cancer risk prediction. It gives individuals a low-cost way to assess their potential cancer risk and make informed decisions based on their health status. The data was collected from an online lung cancer prediction system and records symptoms and risk factors, allowing the relationship between common factors, such as smoking, and a lung cancer diagnosis to be explored.
Columns
The dataset comprises 16 attributes, each detailing a specific symptom or characteristic relevant to lung cancer prediction:
- Gender: Patient's gender (M for male, F for female).
- Age: Patient's numerical age.
- Smoking: Indicates if the patient smokes (2 for YES, 1 for NO).
- Yellow Fingers: Indicates the presence of yellow fingers (2 for YES, 1 for NO).
- Anxiety: Indicates the presence of anxiety (2 for YES, 1 for NO).
- Peer Pressure: Indicates the influence of peer pressure (2 for YES, 1 for NO).
- Chronic Disease: Indicates the presence of a chronic disease (2 for YES, 1 for NO).
- Fatigue: Indicates the presence of fatigue (2 for YES, 1 for NO).
- Allergy: Indicates the presence of allergies (2 for YES, 1 for NO).
- Wheezing: Indicates the presence of wheezing (2 for YES, 1 for NO).
- Alcohol Consuming: Indicates alcohol consumption (2 for YES, 1 for NO).
- Coughing: Indicates the presence of coughing (2 for YES, 1 for NO).
- Shortness of Breath: Indicates the presence of shortness of breath (2 for YES, 1 for NO).
- Swallowing Difficulty: Indicates the presence of swallowing difficulty (2 for YES, 1 for NO).
- Chest Pain: Indicates the presence of chest pain (2 for YES, 1 for NO).
- Lung Cancer: The target variable, indicating a lung cancer diagnosis (YES/NO).
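The 2/1 encoding above is easy to misread as a count, so it usually helps to decode it on load. The following is a minimal sketch using only the Python standard library. The header names (`GENDER`, `SMOKING`, and so on) and the two sample rows are assumptions made for illustration; check them against the actual file before reuse.

```python
import csv
import io

# Hypothetical header names and made-up rows, for illustration only;
# the real file's headers and values may differ.
SAMPLE_CSV = """GENDER,AGE,SMOKING,CHEST_PAIN,LUNG_CANCER
M,69,1,2,YES
F,59,2,1,NO
"""

BINARY_CODES = {"2": True, "1": False}     # 2 for YES, 1 for NO
TARGET_CODES = {"YES": True, "NO": False}  # target column uses YES/NO text


def decode_row(row, target="LUNG_CANCER"):
    """Convert one raw CSV row (all strings) into typed Python values."""
    decoded = {}
    for name, raw in row.items():
        value = raw.strip()
        if name == "GENDER":
            decoded[name] = value                      # "M" or "F"
        elif name == "AGE":
            decoded[name] = int(value)
        elif name == target:
            decoded[name] = TARGET_CODES[value.upper()]
        else:
            decoded[name] = BINARY_CODES[value]        # symptom / habit flags
    return decoded


def load_records(handle):
    """Read every row from an open CSV file handle and decode it."""
    return [decode_row(row) for row in csv.DictReader(handle)]


records = load_records(io.StringIO(SAMPLE_CSV))
```

To read the real file instead of the sample, pass an open file handle, for example `open("lung cancer data.csv", newline="")`, to `load_records`. An unexpected code raises a `KeyError`, which flags malformed values early rather than silently passing them on.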
Distribution
The dataset is provided as a CSV file named lung cancer data.csv, with a file size of 11.28 kB. It contains 284 instances (rows) and 16 attributes (columns).
Usage
This dataset is ideal for:
- Developing cancer prediction systems.
- Assessing individual lung cancer risk based on symptoms and lifestyle factors.
- Informing healthcare decisions and preventative measures.
- Conducting research on the correlation between various symptoms, habits, and lung cancer.
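As a first look at how a habit or symptom relates to diagnosis, one simple option is to compare the share of positive diagnoses between patients with and without that factor. The sketch below assumes records have already been decoded into dictionaries of booleans; the function name `positive_rate_by_factor`, the key names, and the toy records are hypothetical and not taken from the dataset.

```python
from collections import defaultdict


def positive_rate_by_factor(records, factor, target="lung_cancer"):
    """Return {factor value: share of records with a positive target}."""
    counts = defaultdict(lambda: [0, 0])  # value -> [positives, total]
    for rec in records:
        bucket = counts[rec[factor]]
        if rec[target]:
            bucket[0] += 1
        bucket[1] += 1
    return {value: pos / total for value, (pos, total) in counts.items()}


# Made-up toy records, for illustration only.
toy_records = [
    {"smoking": True, "lung_cancer": True},
    {"smoking": True, "lung_cancer": True},
    {"smoking": True, "lung_cancer": False},
    {"smoking": False, "lung_cancer": False},
]

rates = positive_rate_by_factor(toy_records, "smoking")
```

A gap between the two rates is only descriptive. With 284 rows, differences should be checked with a proper statistical test before any conclusion is drawn.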
Coverage
- Demographic Scope: Participants are 52% male and 48% female and span a range of ages.
- Geographic Scope: Not specified.
- Time Range: Not specified.
- Data Availability Notes: Apart from Gender, Age, and the Lung Cancer target, every attribute is a binary YES/NO indicator of a symptom or risk factor related to lung cancer.
License
CC0: Public Domain
Who Can Use It
- Medical Researchers: For epidemiological studies and understanding disease patterns.
- Data Scientists and Machine Learning Engineers: To build, test, and refine predictive models for health outcomes.
- Public Health Professionals: To inform campaigns and strategies for early detection and prevention.
- Healthcare Technology Developers: For creating applications that assess personal health risks.
Dataset Name Suggestions
- Lung Cancer Risk Prediction Dataset
- Symptom-Based Lung Cancer Indicators
- Patient Lung Cancer Risk Data
- Health Risk Factors for Lung Cancer
- Lung Cancer Diagnostic Dataset
Attributes
Original Data Source: Lung Cancer Risk Prediction Dataset