Synthetic Breast Cancer Patient Records Dataset
£145
About
This synthetic healthcare dataset was generated as an educational resource for data science, machine learning, and data analysis, specifically in the context of breast cancer treatment and outcomes. It mirrors the structure of real-world patient records, so users can practise data manipulation and develop analytical skills relevant to breast cancer research.
Dataset Features:
- Age: Age of the patient at diagnosis (in years).
- Gender: Gender of the patient.
- Protein1, Protein2, Protein3, Protein4: Expression levels of specific proteins related to breast cancer.
- Tumour_Stage: Stage of the tumour (I, II, III).
- Histology: Type of breast carcinoma (Infiltrating Ductal Carcinoma, Infiltrating Lobular Carcinoma, Mucinous Carcinoma).
- ER status: Estrogen Receptor status (Positive/Negative).
- PR status: Progesterone Receptor status (Positive/Negative).
- HER2 status: Human Epidermal Growth Factor Receptor 2 (HER2) status (Positive/Negative).
- Surgery_type: Type of surgery performed (Lumpectomy, Simple Mastectomy, Modified Radical Mastectomy, Other).
- Date_of_Surgery: Date on which surgery was performed (in DD-MON-YY format).
- Date_of_Last_Visit: Date of the last visit (in DD-MON-YY format, can be null if the patient did not visit again after surgery).
- Patient_Status: Patient's status post-surgery (Alive/Dead).
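The date columns are the ones most likely to need handling on load, so here is a minimal sketch of reading records and parsing them. It assumes the file is a CSV whose header names match the feature names above and that "DD-MON-YY" means values such as 15-Jan-17. The helper names (parse_date, load_records) and the two sample rows are hypothetical and exist only for illustration; they are not taken from the dataset.

```python
import csv
import io
from datetime import datetime

# Two made-up rows for illustration only; they are NOT taken from the dataset.
SAMPLE_CSV = """Age,Gender,Tumour_Stage,Date_of_Surgery,Date_of_Last_Visit,Patient_Status
54,FEMALE,II,15-Jan-17,19-Jun-17,Alive
61,FEMALE,III,26-Apr-17,,Dead
"""

# Assumed reading of "DD-MON-YY", e.g. 15-Jan-17.
# Note: %b depends on the locale (English month names assumed) and
# %y maps 00-68 to 2000-2068 and 69-99 to 1969-1999.
DATE_FORMAT = "%d-%b-%y"


def parse_date(text):
    """Return a date for a DD-MON-YY string, or None for a null/blank value."""
    text = (text or "").strip()
    if not text:
        return None
    return datetime.strptime(text, DATE_FORMAT).date()


def load_records(handle):
    """Read CSV rows, parse both date columns, and add follow-up length in days."""
    records = []
    for row in csv.DictReader(handle):
        surgery = parse_date(row["Date_of_Surgery"])
        last_visit = parse_date(row["Date_of_Last_Visit"])
        row["Age"] = int(row["Age"])
        row["surgery_date"] = surgery
        row["last_visit_date"] = last_visit
        # Follow-up is undefined when the patient never returned after surgery.
        if surgery and last_visit:
            row["followup_days"] = (last_visit - surgery).days
        else:
            row["followup_days"] = None
        records.append(row)
    return records


records = load_records(io.StringIO(SAMPLE_CSV))
```

To read the real file, pass an open file handle instead of the in-memory sample, for example open(path, newline="") with whatever path the download uses.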
Usage:
This dataset can be used for:
- Breast Cancer Research: To explore trends and patterns in treatment outcomes, recurrence rates, and other health-related metrics specific to breast cancer.
- Educational Training: To teach data cleaning, transformation, and visualization techniques specific to healthcare data, particularly in oncology.
- Predictive Modeling: To develop models that predict patient outcomes (Patient_Status) from factors such as age, tumour stage, and receptor status.
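As a starting point for the outcome exploration described above, the sketch below tabulates Patient_Status by Tumour_Stage and computes the share of patients recorded as Alive in each stage. It assumes each record is a dict keyed by the feature names listed earlier. The function names and the four sample rows are hypothetical and made up for illustration, so the resulting figures say nothing about the actual dataset.

```python
from collections import Counter, defaultdict

# Made-up rows for illustration only; they are NOT taken from the dataset.
rows = [
    {"Tumour_Stage": "I", "Patient_Status": "Alive"},
    {"Tumour_Stage": "II", "Patient_Status": "Alive"},
    {"Tumour_Stage": "II", "Patient_Status": "Dead"},
    {"Tumour_Stage": "III", "Patient_Status": "Dead"},
]


def status_by_stage(rows):
    """Count Patient_Status values within each Tumour_Stage."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["Tumour_Stage"]][row["Patient_Status"]] += 1
    return counts


def alive_share(counts):
    """Return the fraction of patients recorded as Alive for each stage."""
    return {
        stage: status_counts["Alive"] / sum(status_counts.values())
        for stage, status_counts in counts.items()
    }


counts = status_by_stage(rows)
shares = alive_share(counts)
```

A per-stage baseline like this is a useful sanity check before fitting any model: a classifier that cannot beat the majority status within each stage is not adding information.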



Coverage:
This dataset is synthetic and anonymized, making it a safe tool for experimentation and learning without compromising real patient privacy.

License:
CC0 (Public Domain)
Who can use it:
- Researchers and Educators: For studies or teaching purposes in breast cancer analytics and data science.
- Data Science Enthusiasts: For learning, practising, and applying healthcare data manipulation and analysis techniques in the field of oncology.