NCI SEER Breast Cancer Analysis Dataset

Patient Health Records & Digital Health

Tags and Keywords

Cancer

Breast

Seer

Patient

Survival

Free

About

This dataset covers female patients diagnosed with invasive breast cancer between 2000 and 2017, sourced from the November 2017 update of the National Cancer Institute's (NCI) SEER Program. It offers population-based cancer statistics, with details such as patient age, race, ethnicity, cancer stage, tumour size, grade, and treatment received. The data is intended for studying trends and outcomes in breast cancer.

Columns

  • Age: The patient's age at the time of diagnosis.
  • Race: The racial background of the patient, including categories like White and Other (American Indian/AK Native, Asian/Pacific Islander).
  • Marital Status: The marital status of the patient (e.g., Married, Single).
  • T Stage: Describes the primary tumour size and extent.
  • N Stage: Describes the extent of regional lymph node involvement.
  • 6th Stage: Represents the overall stage group of the cancer (the column name suggests the AJCC 6th edition staging system).
  • Grade: The grade of the cancer, indicating how abnormal the cancer cells look under a microscope (e.g., Moderately differentiated; Grade II).
  • A Stage: Denotes the distant metastasis status of the cancer (Regional or Distant).
  • Tumor Size: The measured size of the tumour.
  • Estrogen Status: The estrogen receptor status of the cancer (Positive or Negative).
  • Progesterone Status: The progesterone receptor status of the cancer (Positive or Negative).
  • Regional Node Examined: The number of regional lymph nodes examined.
  • Reginol Node Positive: The number of regional lymph nodes that tested positive for cancer.
  • Survival Months: The duration, in months, that the patient survived following diagnosis.
  • Status: The patient's vital status, whether alive or dead.
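
As a quick sanity check on the columns listed above, the following sketch compares a CSV header row against those names. The spellings are copied verbatim from this listing (including "Reginol Node Positive") and are an assumption about the actual file header; `missing_columns` and the sample text are hypothetical and used only for illustration.

```python
import csv
import io

# Column names copied verbatim from the listing above; the real file's
# header may differ (assumption).
EXPECTED_COLUMNS = [
    "Age", "Race", "Marital Status", "T Stage", "N Stage", "6th Stage",
    "Grade", "A Stage", "Tumor Size", "Estrogen Status",
    "Progesterone Status", "Regional Node Examined",
    "Reginol Node Positive", "Survival Months", "Status",
]


def missing_columns(csv_text):
    """Return the expected column names absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    # Strip stray whitespace around header names before comparing.
    header = [name.strip() for name in next(reader, [])]
    return [name for name in EXPECTED_COLUMNS if name not in header]


# Hypothetical two-column sample, not real data.
sample = "Age,Race\n45,White\n"
print(missing_columns(sample))
```

An empty result would mean every listed column is present in the header.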

Distribution

The dataset is provided as a CSV file with a size of 520.03 kB. It contains 16 columns and 4,024 records. A sample file will be uploaded to the platform separately.
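
The stated record and column counts can be checked after download with a small helper such as the one below. This is a minimal sketch that assumes the file has a single header row; `csv_shape` and the sample text are hypothetical.

```python
import csv
import io


def csv_shape(csv_text):
    """Return (record_count, column_count) for CSV text with a header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return (0, 0)
    # The first row is the header, so it is not counted as a record.
    return (len(rows) - 1, len(rows[0]))


# Hypothetical sample, not real data. For the downloaded file, read its
# text first and pass it to csv_shape.
sample = "Age,Status\n45,Alive\n61,Dead\n"
records, columns = csv_shape(sample)
print(records, columns)
```

For the file described here, the expected result would be 4,024 records and 16 columns.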

Usage

This dataset is ideally suited for:
  • Epidemiological studies on breast cancer incidence and survival rates.
  • Analysing the impact of demographic factors (age, race, marital status) on cancer outcomes.
  • Investigating the correlation between tumour characteristics (size, grade, stage) and patient survival.
  • Researching treatment efficacy and its relationship to various patient and tumour attributes.
  • Developing predictive models for breast cancer prognosis.
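
For the survival-related uses above, one common starting point is a Kaplan-Meier estimate built from the Survival Months and Status columns. The sketch below is a plain-Python illustration, not a method prescribed by the dataset's authors. It assumes Status holds the labels "Alive" and "Dead" and treats patients who are alive as censored at their last recorded month; the function name and toy values are hypothetical.

```python
from collections import Counter


def kaplan_meier(months, died):
    """Kaplan-Meier survival estimate.

    months: follow-up time per patient (e.g. the Survival Months column).
    died:   True if the patient died, False if censored (still alive).
    Returns a list of (time, survival_probability), one per death time.
    """
    deaths = Counter(t for t, d in zip(months, died) if d)
    leaving = Counter(months)  # deaths plus censored patients, per time
    at_risk = len(months)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        if deaths[t]:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= (at_risk - deaths[t]) / at_risk
            curve.append((t, survival))
        # Everyone observed at time t leaves the risk set afterwards.
        at_risk -= leaving[t]
    return curve


# Hypothetical toy values, not drawn from the dataset.
months = [2, 3, 3, 5]
status = ["Dead", "Dead", "Alive", "Dead"]  # assumed labels
curve = kaplan_meier(months, [s == "Dead" for s in status])
for t, s in curve:
    print(t, round(s, 3))
```

Grouping patients by a column such as Grade or 6th Stage and computing one curve per group gives a first view of how tumour characteristics relate to survival.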

Coverage

The dataset focuses on female patients with invasive breast cancer diagnosed within the United States, reflecting population-based cancer statistics from the NCI's SEER Program. The data spans a time range from 2000 to 2017. Demographic coverage includes various age groups (from 30 to 69 years), racial categories (primarily White, with an 'Other' category for American Indian/AK Native and Asian/Pacific Islander), and marital statuses.

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

  • Medical Researchers and Oncologists: To study disease progression and treatment outcomes.
  • Epidemiologists and Public Health Professionals: For population-level health assessments and policy development.
  • Data Scientists and Statisticians: To perform statistical analysis, machine learning, and predictive modelling.
  • Academics and Students: For educational purposes and academic research projects in health informatics and biostatistics.

Dataset Name Suggestions

  • SEER Breast Cancer Patient Data and Survival Statistics
  • Invasive Breast Cancer Patient Outcomes (2000-2017)
  • US Breast Cancer Registry Data
  • NCI SEER Breast Cancer Analysis Dataset

Attributes

Listing Stats

LISTED

10/08/2025

REGION

GLOBAL

Universal Data Quality Score (UDQS)

5 / 5

VERSION

1.0
