Opendatabay APP

Healthcare Cancer Analysis Dataset

Patient Health Records & Digital Health

Tags and Keywords

Cancer

Health

Global

Trends

Patients


Free

About

This dataset presents global cancer patient data covering 2015 to 2024 and is intended to simulate the main factors that influence cancer diagnosis, treatment, and patient survival. It includes features commonly examined in the medical domain, such as patient age, gender, cancer type, environmental factors, and lifestyle behaviours. The result is a broad view of worldwide cancer trends, suited to data science, machine learning, and statistical analysis work in the healthcare sector.

Columns

  • Patient_ID: A unique identifier for each patient. There are 50,000 distinct patient IDs.
  • Age: Represents the patient's age, ranging from 20 to 90 years. The mean age is approximately 54.4 years.
  • Gender: Categorises the patient's gender as Male, Female, or Other.
  • Country_Region: Indicates the patient's country or region, with ten distinct locations recorded. Australia and the UK are among the listed regions.
  • Year: The year the data was reported, spanning from 2015 to 2024.
  • Genetic_Risk: A numerical score from 0 to 10 representing genetic risk factors. The mean score is 5.
  • Air_Pollution: A numerical score from 0 to 10 indicating the impact of air pollution. The mean score is approximately 5.01.
  • Alcohol_Use: A numerical score from 0 to 10 detailing alcohol consumption. The mean score is approximately 5.01.
  • Smoking: A numerical score from 0 to 10 reflecting smoking habits. The mean score is approximately 4.99.
  • Obesity_Level: A numerical score from 0 to 10 indicating obesity levels. The mean score is approximately 4.99.
  • Cancer_Type: Specifies various types of cancer, such as Breast, Lung, and Colon. Eight distinct cancer types are included.
  • Cancer_Stage: Denotes the stage of cancer, from Stage 0 to Stage IV. Five unique stages are recorded.
  • Treatment_Cost_USD: The estimated cost of cancer treatment in US Dollars, ranging from $5,000 to $100,000. The average cost is around $52,500.
  • Survival_Years: The number of years a patient survived since diagnosis, ranging from 0 to 10 years. The mean survival is approximately 5.01 years.
  • Target_Severity_Score: A combined score representing the severity of cancer, ranging from 0.9 to 9.16. The average severity score is approximately 4.95.
All columns have 50,000 valid entries, with no mismatched or missing values.
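
The column descriptions above can be turned into a quick sanity check before any analysis. The following is a minimal sketch, assuming the CSV header uses exactly the column names listed above and that the documented ranges are inclusive. The function name `validate`, the `SAMPLE` rows, and the `P1`-style patient IDs are hypothetical and not taken from the real file. The ranges in the listing are rounded, so a small tolerance may be needed on real data.

```python
import csv
import io

# Header expected from the column list above.
EXPECTED_COLUMNS = [
    "Patient_ID", "Age", "Gender", "Country_Region", "Year",
    "Genetic_Risk", "Air_Pollution", "Alcohol_Use", "Smoking",
    "Obesity_Level", "Cancer_Type", "Cancer_Stage",
    "Treatment_Cost_USD", "Survival_Years", "Target_Severity_Score",
]

# Documented (min, max) ranges for the numeric columns, assumed inclusive.
RANGES = {
    "Age": (20, 90),
    "Year": (2015, 2024),
    "Genetic_Risk": (0, 10),
    "Air_Pollution": (0, 10),
    "Alcohol_Use": (0, 10),
    "Smoking": (0, 10),
    "Obesity_Level": (0, 10),
    "Treatment_Cost_USD": (5000, 100000),
    "Survival_Years": (0, 10),
    "Target_Severity_Score": (0.9, 9.16),
}


def validate(csv_file):
    """Return a list of problems found in an open CSV file (empty if none)."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    seen_ids = set()
    # Data starts on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        patient_id = row["Patient_ID"]
        if patient_id in seen_ids:
            problems.append(f"line {line_no}: duplicate Patient_ID {patient_id}")
        seen_ids.add(patient_id)
        for column, (low, high) in RANGES.items():
            try:
                value = float(row[column])
            except (TypeError, ValueError):
                problems.append(f"line {line_no}: {column} is not numeric")
                continue
            if not low <= value <= high:
                problems.append(
                    f"line {line_no}: {column}={value} outside {low}-{high}"
                )
    return problems


# Made-up rows for illustration only; they are not taken from the real file.
SAMPLE = (
    ",".join(EXPECTED_COLUMNS) + "\n"
    "P1,54,Male,Australia,2015,5,5,5,5,5,Lung,Stage II,52500,5,4.95\n"
    "P2,19,Female,UK,2024,5,5,5,5,5,Breast,Stage IV,52500,5,4.95\n"
)

if __name__ == "__main__":
    for problem in validate(io.StringIO(SAMPLE)):
        print(problem)
```

To run the same check on the downloaded file, pass an open file handle instead of the in-memory sample, for example `validate(open("global_cancer_patients_2015_2024.csv", newline=""))`.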

Distribution

This dataset is typically provided as a CSV file. The sample file, global_cancer_patients_2015_2024.csv, is 4.21 MB in size and contains 50,000 records (rows) across 15 columns.

Usage

This dataset is well-suited for a variety of analytical and modelling applications, including:
  • Exploratory Data Analysis (EDA) to uncover patterns and insights.
  • Multiple Linear Regression and other advanced statistical modelling tasks.
  • Feature Selection and Correlation Analysis to identify key variables.
  • Predictive Modelling for forecasting cancer severity, treatment expenditure, and patient survival outcomes.
  • Data Visualisation to create impactful graphs and charts.
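
As one illustration of the correlation analysis mentioned in the list above, the sketch below ranks numeric columns by their Pearson correlation with a target column. It is a minimal example using only the standard library, assuming rows are dictionaries such as those produced by `csv.DictReader`. The helper names `pearson` and `rank_features` and the `DEMO_ROWS` values are hypothetical, and returning 0.0 for a constant column is a convention chosen here, since the coefficient is undefined in that case.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        # The coefficient is undefined for a constant column; 0.0 is a
        # convention used in this sketch.
        return 0.0
    return cov / (spread_x * spread_y)


def rank_features(rows, features, target):
    """Sort features by absolute correlation with the target, strongest first."""
    target_values = [float(r[target]) for r in rows]
    scored = []
    for name in features:
        values = [float(r[name]) for r in rows]
        scored.append((name, pearson(values, target_values)))
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Made-up rows for illustration only; they are not taken from the real file.
DEMO_ROWS = [
    {"Smoking": 1, "Air_Pollution": 5, "Target_Severity_Score": 2.0},
    {"Smoking": 3, "Air_Pollution": 5, "Target_Severity_Score": 4.0},
    {"Smoking": 5, "Air_Pollution": 5, "Target_Severity_Score": 6.0},
    {"Smoking": 7, "Air_Pollution": 5, "Target_Severity_Score": 8.0},
]

if __name__ == "__main__":
    ranked = rank_features(
        DEMO_ROWS, ["Air_Pollution", "Smoking"], "Target_Severity_Score"
    )
    for name, r in ranked:
        print(f"{name}: {r:.3f}")
```

A ranking like this is only a first pass for feature selection. It captures linear relationships one column at a time and says nothing about interactions, which a multiple regression or tree-based model would handle.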

Coverage

The dataset covers global cancer patient data from 2015 to 2024. Demographic scope includes patient ages from 20 to 90 years and categorisation by gender (Male, Female, or Other). Geographic information is captured in the Country_Region column, which records ten locations, including Australia and the UK; the other eight are not named here. The data focuses on key factors influencing cancer diagnosis, treatment, and survival.
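
The coverage described here can be confirmed directly from the data. The following is a small sketch, assuming rows are dictionaries keyed by the column names Year, Age, Country_Region, and Gender, and that Year and Age hold whole-number values. The function name `summarise_coverage` and the `DEMO_ROWS` values are hypothetical.

```python
from collections import Counter


def summarise_coverage(rows):
    """Summarise the time, age, region, and gender coverage of the rows."""
    # int(float(...)) tolerates values stored as "54" or "54.0".
    years = [int(float(r["Year"])) for r in rows]
    ages = [int(float(r["Age"])) for r in rows]
    return {
        "year_range": (min(years), max(years)),
        "age_range": (min(ages), max(ages)),
        "regions": Counter(r["Country_Region"] for r in rows),
        "genders": Counter(r["Gender"] for r in rows),
    }


# Made-up rows for illustration only; they are not taken from the real file.
DEMO_ROWS = [
    {"Year": "2015", "Age": "20", "Country_Region": "Australia", "Gender": "Male"},
    {"Year": "2020", "Age": "54", "Country_Region": "UK", "Gender": "Female"},
    {"Year": "2024", "Age": "90", "Country_Region": "UK", "Gender": "Other"},
]

if __name__ == "__main__":
    summary = summarise_coverage(DEMO_ROWS)
    print("Years:", summary["year_range"])
    print("Ages:", summary["age_range"])
    print("Regions:", dict(summary["regions"]))
    print("Genders:", dict(summary["genders"]))
```

On the full file, the length of the `regions` counter would be expected to equal ten if the listing's description holds.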

License

CC BY-NC-SA 4.0

Who Can Use It

This dataset is particularly useful for:
  • Individuals and students learning data science, machine learning, and statistical analysis within the healthcare domain.
  • Researchers and analysts interested in global cancer trends.
  • Healthcare professionals seeking insights into patient demographics, risk factors, and treatment outcomes.
  • Anyone looking to develop predictive models related to cancer.

Dataset Name Suggestions

  • Global Cancer Patient Insights 2015-2024
  • Worldwide Cancer Data Trends
  • Healthcare Cancer Analysis Dataset
  • Patient Cancer Statistics (2015-2024)

Listing Stats

LISTED

30/07/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0
