Opendatabay APP

Heart Disease Prediction Dataset

Patient Health Records & Digital Health

Tags and Keywords

Heart, Disease, Health, Medical, Patient

Free

About

This dataset is designed for the exploration and diagnosis of heart disease. It comprises various medical attributes collected from patients, primarily from the Cleveland database, and aims to facilitate machine learning experiments focused on distinguishing the presence or absence of heart disease in patients. The dataset is particularly useful for building predictive models for cardiac conditions.

Columns

  • Age: Patient's age.
  • Sex: Patient's sex (1 = male; 0 = female).
  • ChestPain: Type of chest pain experienced (typical, asymptomatic, nonanginal, nontypical).
  • RestBP: Resting blood pressure.
  • Chol: Serum cholesterol in mg/dl.
  • Fbs: Fasting blood sugar > 120 mg/dl (1 = true; 0 = false).
  • RestECG: Resting electrocardiographic results.
  • MaxHR: Maximum heart rate achieved.
  • ExAng: Exercise-induced angina (1 = yes; 0 = no).
  • Oldpeak: ST depression induced by exercise relative to rest.
  • Slope: Slope of the peak exercise ST segment.
  • Ca: Number of major vessels coloured by fluoroscopy (0 - 3).
  • Thal: Thallium stress test results (3 = normal; 6 = fixed defect; 7 = reversible defect).
  • target: Diagnosis of heart disease (1 = yes; 0 = no).
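The coded values listed above can be checked mechanically before any modelling. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses the column names exactly as written in this listing, and the chest-pain spellings in the code are likewise an assumption to confirm against the file. `validate_row` is a hypothetical helper, and the sample records are made up purely to exercise it; they are not taken from the dataset.

```python
import csv
import io

# Allowed values, assumed from the column descriptions in this listing.
BINARY_COLUMNS = ("Sex", "Fbs", "ExAng", "target")
ALLOWED = {
    "ChestPain": {"typical", "asymptomatic", "nonanginal", "nontypical"},
    "Ca": {"0", "1", "2", "3"},
    "Thal": {"3", "6", "7"},
}


def validate_row(row):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name in BINARY_COLUMNS:
        if row.get(name) not in ("0", "1"):
            problems.append(f"{name}: expected 0 or 1, got {row.get(name)!r}")
    for name, allowed in ALLOWED.items():
        if row.get(name) not in allowed:
            problems.append(f"{name}: unexpected value {row.get(name)!r}")
    return problems


# Made-up records for illustration only (not real patient data).
SAMPLE = (
    "Age,Sex,ChestPain,RestBP,Chol,Fbs,RestECG,MaxHR,ExAng,Oldpeak,Slope,Ca,Thal,target\n"
    "52,1,typical,130,220,0,0,160,0,1.2,2,0,3,0\n"
    "61,2,unknown,140,250,0,0,150,1,2.0,2,1,7,1\n"
)

for line_no, row in enumerate(csv.DictReader(io.StringIO(SAMPLE)), start=2):
    for problem in validate_row(row):
        print(f"line {line_no}: {problem}")
```

To run the same check on the downloaded file, pass an open file handle to `csv.DictReader` in place of the in-memory sample. If the file stores a categorical column in a different form than described here, adjust the `ALLOWED` sets accordingly.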

Distribution

The dataset is provided in CSV format and is approximately 11.33 KB in size. It contains 14 columns and consists of 303 individual records or rows.
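A quick way to confirm that a download is complete is to compare its shape with the figures stated above. This sketch assumes the file has a single header row followed by one record per line. `csv_shape` and `matches_listing` are hypothetical helper names.

```python
import csv
import io

# Figures stated in this listing.
EXPECTED_ROWS = 303
EXPECTED_COLUMNS = 14


def csv_shape(text):
    """Return (data_rows, columns) for CSV text with one header row."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    # Blank lines are parsed as empty lists and are not counted as records.
    n_rows = sum(1 for record in reader if record)
    return n_rows, len(header)


def matches_listing(text):
    """True if the CSV text has the row and column counts given in the listing."""
    return csv_shape(text) == (EXPECTED_ROWS, EXPECTED_COLUMNS)
```

With the downloaded file, read its contents into a string (for example with `open(path, newline="").read()`, where `path` is wherever the CSV was saved) and pass that string to `matches_listing`.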

Usage

This dataset is ideal for a variety of applications, including:
  • Developing and testing machine learning models for heart disease prediction.
  • Conducting exploratory data analysis to identify key indicators and patterns related to heart conditions.
  • Understanding the relationships between various medical attributes and the presence of heart disease.
  • Educational purposes, particularly for beginners in data science and machine learning.
  • Developing model explainability techniques for medical diagnostic models.
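For the prediction use case, it helps to know what accuracy a trivial model achieves before evaluating anything more elaborate. The sketch below computes a majority-class baseline from the `target` labels. It is not a method prescribed by the dataset, only a common sanity check, and `majority_baseline` is a hypothetical helper name.

```python
from collections import Counter


def majority_baseline(train_labels, test_labels):
    """Predict the most common training label for every test record.

    Returns (predicted_label, accuracy_on_test_labels).
    """
    majority_label = Counter(train_labels).most_common(1)[0][0]
    correct = sum(1 for label in test_labels if label == majority_label)
    return majority_label, correct / len(test_labels)
```

A model trained on the 13 input columns is only worth keeping if it clearly beats this figure on held-out records.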

Coverage

The source data includes contributions from medical centres in Budapest, Zurich, Basel, and Long Beach, as well as the Cleveland Clinic Foundation, so its origin is international. No collection period is stated, and the dataset is static: no updates are expected. Patients range in age from 29 to 77 years, and sex is recorded for each patient. Patient names and social security numbers have been replaced with dummy values for privacy.

License

CC BY-SA 4.0.

Who Can Use It

This dataset is well-suited for:
  • Machine Learning Researchers: To build and refine models for predicting heart disease.
  • Data Scientists: For conducting in-depth analyses of health data and understanding risk factors.
  • Healthcare Analysts: To gain insights into patient characteristics associated with cardiac conditions.
  • Students and Educators: As a foundational dataset for learning data analysis, machine learning, and medical informatics.
  • Beginners in Data Science: Due to its manageability and clear focus on a significant health condition.

Dataset Name Suggestions

  • Heart Disease Prediction Dataset
  • Cleveland Heart Disease Patient Data
  • Cardiac Condition Attributes
  • Heart Health Diagnosis Dataset
  • Medical Heart Disease Data

Attributes

Original Data Source: Heart Disease Prediction Dataset

Listing Stats

  • Views: 1
  • Downloads: 1
  • Listed: 27/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
