Opendatabay APP

Clinical Breast Cancer Analysis Dataset

Patient Health Records & Digital Health

Tags and Keywords

  • Cancer
  • Health
  • Breast
  • Medical
  • Patient

Free

About

This dataset contains patient-level records related to breast cancer and is intended to raise awareness of breast cancer risks. It covers patient demographics, protein expression levels, cancer staging, tumour characteristics, and treatment information, and supports analysis of the interplay between protein expression, cancer stage, and patient outcomes.

Columns

  • Patient_ID: A unique identifier for each patient.
  • Age: The age of the patient, ranging from 29 to 90 years.
  • Gender: The gender of the patient.
  • Protein1: Expression levels of Protein 1.
  • Protein2: Expression levels of Protein 2.
  • Protein3: Expression levels of Protein 3.
  • Protein4: Expression levels of Protein 4.
  • Tumour_Stage: The stage of the breast cancer (e.g., II, III).
  • Histology: The histological type of cancer, such as Infiltrating Ductal Carcinoma.
  • ER status: The status of the oestrogen receptor (Positive/Negative).
  • PR status: The status of the progesterone receptor (Positive/Negative).
  • HER2 status: The status of the human epidermal growth factor receptor 2 (Positive/Negative).
  • Surgery_type: The type of surgery performed (e.g., Modified Radical Mastectomy).
  • Date_of_Surgery: The date when the surgery was performed.
  • Date_of_Last_Visit: The date of the patient's most recent visit.
  • Patient_Status: The survival status of the patient (Alive/Dead).

Distribution

The dataset is provided in a single CSV file named breast_cancer_survival.csv with a size of 48.51 kB. It is structured with 15 columns and contains 334 records. There are missing values in the Date_of_Last_Visit and Patient_Status columns.
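
The figures above (record count, column count, and which columns have gaps) can be checked once the file is downloaded. The following is a minimal sketch using only the Python standard library. The function name `summarise_csv` and the sample rows are illustrative assumptions, and the sample is made up rather than taken from the real file.

```python
import csv
import io


def summarise_csv(handle):
    """Return (row_count, column_names, missing_counts) for a CSV file object."""
    reader = csv.DictReader(handle)
    columns = list(reader.fieldnames or [])
    missing = {name: 0 for name in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            value = row.get(name)
            # Treat absent, empty, or whitespace-only cells as missing.
            if value is None or value.strip() == "":
                missing[name] += 1
    return row_count, columns, missing


# Made-up sample rows for illustration only; not taken from the real file.
SAMPLE = """Age,Gender,Date_of_Last_Visit,Patient_Status
42,FEMALE,2018-06-01,Alive
55,FEMALE,,
61,MALE,2019-02-10,Dead
"""

rows, cols, missing = summarise_csv(io.StringIO(SAMPLE))
print(rows, len(cols), missing)
```

For the real file, the same function could be called on `open("breast_cancer_survival.csv", newline="")`, and the returned counts compared with the record and column figures stated in this listing.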

Usage

This information can be used to analyse the relationship between protein expression levels, cancer stage, and patient outcomes. It can also be used to understand the impact of different types of surgeries on patient survival and to identify potential risk factors for breast cancer progression. The dataset is suitable for building predictive models for patient survival and for academic research into breast cancer characteristics.
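
As a first descriptive look at outcomes by group, the share of patients recorded as Alive can be tabulated per tumour stage or surgery type. The sketch below is one possible approach, not a prescribed method. The helper `alive_share_by` is hypothetical, the records are made up, and rows with a missing Patient_Status are skipped on the assumption that they should not count towards either outcome.

```python
from collections import defaultdict


def alive_share_by(records, group_key, status_key="Patient_Status"):
    """Share of patients recorded as Alive within each value of group_key.

    Rows with a missing group value or missing status are skipped.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [alive, total]
    for rec in records:
        group = (rec.get(group_key) or "").strip()
        status = (rec.get(status_key) or "").strip()
        if not group or not status:
            continue
        counts[group][1] += 1
        if status.lower() == "alive":
            counts[group][0] += 1
    return {group: alive / total for group, (alive, total) in counts.items()}


# Made-up records for illustration only; not taken from the real file.
records = [
    {"Tumour_Stage": "II", "Patient_Status": "Alive"},
    {"Tumour_Stage": "II", "Patient_Status": "Alive"},
    {"Tumour_Stage": "II", "Patient_Status": "Dead"},
    {"Tumour_Stage": "III", "Patient_Status": "Alive"},
    {"Tumour_Stage": "III", "Patient_Status": "Dead"},
    {"Tumour_Stage": "III", "Patient_Status": ""},  # missing status, skipped
]

shares = alive_share_by(records, "Tumour_Stage")
print(shares)
```

Passing "Surgery_type" as `group_key` gives the equivalent table per surgery type. Raw proportions like these ignore how long each patient was followed, so they are a starting point rather than a survival analysis.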

Coverage

The data covers a patient cohort with ages ranging from 29 to 90. The vast majority of patients are female (99%), with only 1% male. Surgical procedures are recorded from January 2017 to November 2019. The dataset is not expected to be updated.
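
The coverage figures can be recomputed from the Age, Gender, and Date_of_Surgery columns. The sketch below assumes each of those columns has at least one non-empty value. The date format is also an assumption, since this listing does not state how dates are written in the file, so `date_format` is left as a parameter. The function name `coverage_summary` and the sample records are illustrative only.

```python
from collections import Counter
from datetime import datetime


def coverage_summary(records, date_format="%Y-%m-%d"):
    """Summarise age range, gender shares, and surgery date range.

    date_format is an assumption; adjust it to match the real file.
    """
    ages = [int(r["Age"]) for r in records if r.get("Age")]
    genders = Counter(
        r["Gender"].strip().upper() for r in records if r.get("Gender")
    )
    dates = [
        datetime.strptime(r["Date_of_Surgery"], date_format).date()
        for r in records
        if r.get("Date_of_Surgery")
    ]
    total = sum(genders.values())
    return {
        "age_range": (min(ages), max(ages)),
        "gender_share": {g: n / total for g, n in genders.items()},
        "surgery_range": (min(dates), max(dates)),
    }


# Made-up records for illustration only; not taken from the real file.
records = [
    {"Age": "29", "Gender": "Female", "Date_of_Surgery": "2017-01-15"},
    {"Age": "54", "Gender": "Female", "Date_of_Surgery": "2018-05-03"},
    {"Age": "61", "Gender": "Male", "Date_of_Surgery": "2018-09-09"},
    {"Age": "90", "Gender": "Female", "Date_of_Surgery": "2019-11-21"},
]

summary = coverage_summary(records)
print(summary)
```

Records read with `csv.DictReader` can be passed in directly, because the function only relies on string values keyed by column name.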

License

CC0: Public Domain

Who Can Use It

  • Medical Researchers: To investigate correlations between protein levels, tumour characteristics, and patient survival rates.
  • Data Scientists and Analysts: For developing machine learning models to predict patient outcomes or cancer progression.
  • Healthcare Professionals and Students: As a resource for understanding breast cancer data and its various clinical attributes.
  • Public Health Organisations: To create visuals and materials for breast cancer awareness campaigns.

Dataset Name Suggestions

  • Breast Cancer Patient Survival Data
  • Clinical Breast Cancer Analysis Dataset
  • Breast Cancer Protein Expression and Outcomes
  • Breast Cancer Tumour Characteristics and Patient Data

Listing Stats

  • Views: 1
  • Downloads: 0
  • Listed: 28/09/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
