Heart Failure Survival Prediction Dataset

Patient Health Records & Digital Health

Tags and Keywords

Heart, Failure, Survival, Patient, Clinical

Free

About

This dataset contains the medical records of 299 patients diagnosed with heart failure, collected during their follow-up period from April to December 2015 at the Faisalabad Institute of Cardiology and Allied Hospital in Faisalabad, Punjab, Pakistan. Its primary purpose is to predict whether a patient with heart failure survives. Each patient record has 13 columns: 12 clinical and lifestyle features plus the DEATH_EVENT target. The data suits machine learning tasks such as classification, regression, and clustering. The central task is to determine whether a patient died or survived during the follow-up, with a focus on predictors such as serum creatinine and ejection fraction.

Columns

  • age: The patient's age in years, ranging from 40 to 95.
  • anaemia: A binary indicator (0 or 1) representing a decrease in red blood cells or haemoglobin. Anaemia is defined as haematocrit levels lower than 36%.
  • creatinine_phosphokinase (CPK): The level of the CPK enzyme in the blood, measured in mcg/L, with values between 23 and 7861. High levels may indicate heart failure or injury.
  • diabetes: A binary indicator (0 or 1) showing if the patient has diabetes.
  • ejection_fraction: The percentage of blood leaving the heart at each contraction, ranging from 14% to 80%. This measures the heart's pumping efficiency.
  • high_blood_pressure: A binary indicator (0 or 1) for whether the patient has hypertension.
  • platelets: The count of platelets in the blood, measured in kiloplatelets/mL, ranging from 25.01 to 850.00.
  • serum_creatinine: The level of creatinine in the blood, in mg/dL, with values between 0.50 and 9.40. High levels can suggest renal dysfunction.
  • serum_sodium: The level of sodium in the blood, in mEq/L, ranging from 114 to 148. Abnormal levels may indicate heart failure.
  • sex: A binary indicator (0 for woman, 1 for man), representing the patient's biological sex. The dataset includes 105 women and 194 men.
  • smoking: A binary indicator (0 or 1) showing if the patient smokes.
  • time: The follow-up period in days, ranging from 4 to 285 days, with an average of 130 days.
  • DEATH_EVENT: The target variable, a binary indicator (0 for survived, 1 for died) showing if the patient died during the follow-up period. The dataset contains 203 survived patients and 96 deceased patients.
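
The documented ranges above can double as a sanity check when loading the file. The following is a minimal sketch using only the Python standard library, not part of the dataset itself. The function name `validate_row` and the example record are hypothetical, and the check assumes platelets are expressed in kiloplatelets/mL as documented; if a given CSV stores raw platelet counts, they would need dividing by 1000 first.

```python
# Value ranges copied from the column descriptions in this listing.
# Assumption: platelets are in kiloplatelets/mL, as documented here.
NUMERIC_RANGES = {
    "age": (40, 95),
    "creatinine_phosphokinase": (23, 7861),
    "ejection_fraction": (14, 80),
    "platelets": (25.01, 850.00),
    "serum_creatinine": (0.50, 9.40),
    "serum_sodium": (114, 148),
    "time": (4, 285),
}
BINARY_COLUMNS = (
    "anaemia", "diabetes", "high_blood_pressure",
    "sex", "smoking", "DEATH_EVENT",
)


def validate_row(row):
    """Return a list of problems found in one patient record (empty if none)."""
    problems = []
    for name, (low, high) in NUMERIC_RANGES.items():
        if name not in row:
            problems.append(f"{name}: missing")
            continue
        value = float(row[name])
        if not low <= value <= high:
            problems.append(f"{name}: {value} outside [{low}, {high}]")
    for name in BINARY_COLUMNS:
        if name not in row:
            problems.append(f"{name}: missing")
        elif float(row[name]) not in (0.0, 1.0):
            problems.append(f"{name}: {row[name]} is not 0 or 1")
    return problems


# Hypothetical record, invented for illustration only (not taken from the data).
example = {
    "age": 60, "anaemia": 0, "creatinine_phosphokinase": 250,
    "diabetes": 1, "ejection_fraction": 38, "high_blood_pressure": 0,
    "platelets": 262.0, "serum_creatinine": 1.1, "serum_sodium": 137,
    "sex": 1, "smoking": 0, "time": 130, "DEATH_EVENT": 0,
}
print(validate_row(example))
```

Because the function accepts any mapping of column names to values, it should also work on rows produced by `csv.DictReader`, where every value arrives as a string.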

Distribution

This dataset is typically provided as a CSV file and is structured as a table. It contains 299 instances (rows), each representing a unique heart failure patient, and 13 columns per patient: 12 features plus the DEATH_EVENT target. The file size is approximately 12.24 KB. No missing values are reported in any column. The target classes are imbalanced: 203 patients (67.89%) survived and 96 (32.11%) died during the follow-up period.
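
The imbalance figures follow directly from the class counts given in this listing. As a small worked sketch, the snippet below recomputes the percentages and derives inverse-frequency class weights, one common (but not the only) way to compensate for imbalance when training a classifier. The helper names `class_share` and `balanced_weight` are illustrative, not part of any library.

```python
# Class counts as stated in this listing.
SURVIVED = 203
DIED = 96
TOTAL = SURVIVED + DIED  # 299 patients


def class_share(count, total=TOTAL):
    """Percentage of the cohort in a class, rounded to two decimals."""
    return round(100 * count / total, 2)


def balanced_weight(count, total=TOTAL, n_classes=2):
    """Inverse-frequency weight: total / (n_classes * class count)."""
    return total / (n_classes * count)


print(class_share(SURVIVED), class_share(DIED))          # 67.89 32.11
print(round(balanced_weight(SURVIVED), 3),
      round(balanced_weight(DIED), 3))                   # 0.736 1.557
```

One practical consequence: a model that always predicts "survived" already reaches 67.89% accuracy on this data, so accuracy alone is a weak yardstick and any classifier should be compared against that baseline.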

Usage

This dataset is ideal for a variety of applications, including:
  • Developing machine learning models to predict heart failure patient survival.
  • Conducting classification studies to identify key factors leading to patient mortality.
  • Performing regression analysis to understand the impact of clinical features on survival time.
  • Applying clustering techniques to identify subgroups of heart failure patients with similar characteristics or outcomes.
  • Medical research aimed at understanding and mitigating risk factors in heart failure.
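
For the predictive-modelling uses listed above, a stratified train/test split keeps the survived/died ratio roughly equal in both partitions, which matters with only 96 deaths among 299 records. The sketch below uses only the standard library; `stratified_split`, the 20% test fraction, and the seed are assumptions chosen for illustration, and the label list is rebuilt from the documented class counts rather than read from the actual file.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split row indices into train/test, preserving each class's share."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Stand-in labels built from the documented counts: 203 survived, 96 died.
labels = [0] * 203 + [1] * 96
train_idx, test_idx = stratified_split(labels)

deaths_in_test = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), deaths_in_test)
```

With real data, `labels` would be the DEATH_EVENT column, and the returned indices would select rows for training and evaluation.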

Coverage

The dataset covers clinical records from Faisalabad Institute of Cardiology and Allied Hospital in Faisalabad, Punjab, Pakistan, collected between April and December 2015. The patient demographic includes 299 individuals diagnosed with heart failure, specifically those with left ventricular systolic dysfunction, classified as NYHA classes III or IV. Patients' ages range from 40 to 95 years, and the cohort consists of 105 women and 194 men. The follow-up period for these patients ranged from 4 to 285 days.

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

This dataset is suitable for:
  • Data scientists and machine learning engineers interested in building predictive models for patient outcomes.
  • Medical researchers and clinicians seeking to understand the prognostic factors of heart failure.
  • Academics and students in health informatics, bioinformatics, or statistics for educational and research purposes.
  • Anyone looking to apply data analysis techniques to real-world healthcare challenges, particularly in cardiology.

Listing Stats

  • Views: 2
  • Downloads: 1
  • Listed: 25/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
