Opendatabay APP

Cancer Visual Characteristics Dataset

Clinical Trials & Research

Tags and Keywords

Cancer, Classification, Diagnosis, Medical, Tumour

Free

About

This dataset provides characteristics of patients diagnosed with cancer and serves as a resource for developing and evaluating cancer diagnosis models [1, 2]. It is designed to support the analysis of visual features of cancer in order to improve diagnostic methods [2].

Columns

The dataset includes the following columns:
  • id: A unique identifier for each patient [1].
  • diagnosis: Indicates the type of cancer, with 'M' for Malignant and 'B' for Benign [1].
  • radius_mean: The mean value of the cancer's radius [3, 4].
  • texture_mean: The mean value of the cancer's texture [3, 5].
  • perimeter_mean: The mean value of the cancer's perimeter [3, 5].
  • area_mean: The mean value of the cancer's area [3, 6].
  • smoothness_mean: The mean value of the cancer's smoothness [3, 7].
  • compactness_mean: The mean value of the cancer's compactness [3, 7].
  • concavity_mean: The mean value of the cancer's concavity [3, 8].
  • concave points_mean: The mean value of the cancer's concave points [3, 9].
  • symmetry_mean: The mean value of the cancer's symmetry [9].
  • fractal_dimension_mean: The mean value of the cancer's fractal dimension [10].
  • radius_se: The standard error of the cancer's radius [11].
  • texture_se: The standard error of the cancer's texture [11].
  • perimeter_se: The standard error of the cancer's perimeter [12].
  • area_se: The standard error of the cancer's area [13].
  • smoothness_se: The standard error of the cancer's smoothness [13].
  • compactness_se: The standard error of the cancer's compactness [14].
  • concavity_se: The standard error of the cancer's concavity [14].
  • concave points_se: The standard error of the cancer's concave points [15].
  • symmetry_se: The standard error of the cancer's symmetry [15].
  • fractal_dimension_se: The standard error of the cancer's fractal dimension [16].
  • radius_worst: The "worst" or largest mean value of the cancer's radius [16].
  • texture_worst: The "worst" or largest mean value of the cancer's texture [17].
  • perimeter_worst: The "worst" or largest mean value of the cancer's perimeter [18].
  • area_worst: The "worst" or largest mean value of the cancer's area [18].
  • smoothness_worst: The "worst" or largest mean value of the cancer's smoothness [19].
  • compactness_worst: The "worst" or largest mean value of the cancer's compactness [20].
  • concavity_worst: The "worst" or largest mean value of the cancer's concavity [20].
  • concave points_worst: The "worst" or largest mean value of the cancer's concave points [21].
  • symmetry_worst: The "worst" or largest mean value of the cancer's symmetry [21].
  • fractal_dimension_worst: The "worst" or largest mean value of the cancer's fractal dimension [22].
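
The column list above follows a regular pattern: ten base measurements, each reported as a mean, a standard error, and a "worst" value, plus id and diagnosis. As a minimal sketch, the names can be generated and checked against a CSV header rather than typed by hand. The helper names `expected_columns` and `missing_columns` are hypothetical and chosen only for illustration.

```python
# Base measurements as listed above; note that "concave points" contains a space.
BASE_FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave points", "symmetry",
    "fractal_dimension",
]
SUFFIXES = ["mean", "se", "worst"]


def expected_columns():
    """Return the 32 column names in the order they are listed above."""
    columns = ["id", "diagnosis"]
    for suffix in SUFFIXES:
        for base in BASE_FEATURES:
            columns.append(f"{base}_{suffix}")
    return columns


def missing_columns(header):
    """Return the expected column names that are absent from a CSV header."""
    present = set(header)
    return [name for name in expected_columns() if name not in present]
```

Calling `missing_columns` on the header row of a downloaded file gives a quick way to spot renamed or dropped columns before any modelling.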

Distribution

The dataset is provided as a CSV file named Cancer_Data.csv [23]. It has a size of 125.2 kB and contains 32 columns [23]. There are 569 valid records within the dataset [4].
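
A minimal sketch of reading the file with Python's standard `csv` module is shown here, assuming the layout described above (an id column, a diagnosis column holding 'M' or 'B', and numeric feature columns). The function name `load_records` and the 1/0 label encoding are illustrative choices, not part of the dataset.

```python
import csv


def load_records(lines):
    """Parse CSV text lines into (feature_names, X, y).

    `lines` is any iterable of strings, such as an open file.
    y holds 1 for 'M' (malignant) and 0 for 'B' (benign).
    """
    reader = csv.DictReader(lines)
    # Skip the id, the label, and any blank header cell.
    feature_names = [
        name for name in reader.fieldnames
        if name and name not in ("id", "diagnosis")
    ]
    X, y = [], []
    for row in reader:
        X.append([float(row[name]) for name in feature_names])
        y.append(1 if row["diagnosis"] == "M" else 0)
    return feature_names, X, y
```

With the file downloaded, passing `open("Cancer_Data.csv", newline="")` to `load_records` should return 569 rows of 30 features each, if the counts stated above hold.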

Usage

This dataset is ideal for training or testing machine learning models and algorithms used to make cancer diagnoses [2]. Specific use cases include:
  • Binary classification problems, such as predicting cancer type (Malignant or Benign) using visual features [24].
  • Applying K-Nearest Neighbors (KNN) to diagnose cancer by considering neighbourhood relationships among patients with similar characteristics [24].
  • Utilising Support Vector Machines (SVM) for classification tasks, particularly in two-class problems, by focusing on the clear separation of cancer types [25].
  • Applying Logistic Regression to predict cancer type [24].
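
To make the KNN use case concrete, here is a small standard-library sketch of the idea: scale the features, then label a new record by majority vote among its k nearest training records. It is an illustration under stated assumptions, not a recommended pipeline. The function names are hypothetical, and scaling is included on the assumption that features such as area and smoothness sit on very different numeric ranges.

```python
import math
from collections import Counter


def standardise(rows):
    """Scale each feature column to zero mean and unit variance.

    Returns the scaled rows plus the per-column means and standard
    deviations, so that new records can be scaled the same way.
    """
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
        for col, m in zip(columns, means)
    ]
    scaled = [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
    return scaled, means, stds


def scale_row(row, means, stds):
    """Apply previously computed means and deviations to one record."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training rows nearest to query."""
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

In practice the training rows would come from the feature columns of the CSV and the labels from the diagnosis column; an odd k avoids tied votes in a two-class problem.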

Coverage

The sources give no explicit details on the geographic, temporal, or demographic scope of the patient data in this dataset.

License

CC BY-NC-SA 4.0

Who Can Use It

This dataset is suitable for:
  • Machine learning practitioners and researchers who aim to train or test algorithms for medical diagnosis [2].
  • Data scientists looking to apply classification techniques like Logistic Regression, KNN, or SVM to real-world medical data [24, 25].
  • Students and educators for educational purposes, exploring data analysis and machine learning in a healthcare context [26].
  • Anyone interested in improving the analysis of cancer-related visual features and diagnostic accuracy [2].

Dataset Name Suggestions

  • Breast Cancer Diagnostic Features
  • Breast Cancer Classification Data
  • Cancer Visual Characteristics Dataset
  • Medical Image Feature Analysis for Cancer
  • Wisconsin Breast Cancer (Diagnostic) Data

Attributes

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 08/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0