Coronary Artery Disease Prediction Data
Patient Health Records & Digital Health
Free
About
Focused on the prediction of heart disease, this data product provides essential clinical metrics related to coronary artery disease. It is suitable for analytical exploration and the development of machine learning models. The dataset is supplied in two formats: an original file for visualisation and data understanding, and a label encoded version specifically prepared for model building exercises. Updates to the dataset are expected to occur annually.
Columns
The product contains 14 columns detailing various patient health indicators:
- age: Patient's age (ranging from 29 to 77, with a mean of 54.5).
- sex: Gender of the patient, predominantly Male (68%).
- cp: Chest Pain type (Asymptomatic is the most frequent type, representing 48% of cases).
- trestbps: Resting Blood Pressure (ranging from 94 to 200, with a mean value of 132).
- chol: Serum Cholesterol level (ranging from 126 to 564, with a mean of 247).
- fbs: Fasting Blood Sugar, recorded as a boolean (86% of entries are False).
- restecg: Resting Electrocardiographic Results (Normal and Left ventricular hypertrophy are equally common at 49% each).
- thalach: Maximum Heart Rate Achieved (ranging from 71 to 202, with a mean of 150).
- exang: Exercise Induced Angina, recorded as a boolean (False for 67% of patients).
- oldpeak: ST depression induced by exercise relative to rest (ranging from 0 to 6.2, mean 1.06).
- slope: Slope of Peak exercise ST segment (Upsloping and Flat segments account for the majority).
- ca: Number of major Blood Vessels coloured by fluoroscopy (maximum value of 3).
- thal: Defect Type (Normal is the most frequent classification at 55%).
- class: Stage of Disease (The stage ranges from 0 to 4).
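The ranges listed above can double as a quick sanity check when new records are added. The following is a minimal sketch, not part of the data product: the function name `out_of_range_fields` and both example records are hypothetical, and the lower bound of 0 for `ca` is an assumption, since only its maximum is documented.

```python
# Value ranges copied from the column descriptions above.
# The lower bound for "ca" is an assumption (only the maximum of 3 is documented).
DOCUMENTED_RANGES = {
    "age": (29, 77),
    "trestbps": (94, 200),
    "chol": (126, 564),
    "thalach": (71, 202),
    "oldpeak": (0.0, 6.2),
    "ca": (0, 3),
    "class": (0, 4),
}


def out_of_range_fields(record):
    """Return the names of numeric fields that fall outside the documented ranges."""
    problems = []
    for name, (low, high) in DOCUMENTED_RANGES.items():
        if name not in record:
            continue  # fields absent from the record are not checked
        value = float(record[name])
        if not low <= value <= high:
            problems.append(name)
    return problems


# Hypothetical records for illustration only (not taken from the dataset).
typical = {"age": 54, "trestbps": 132, "chol": 247, "thalach": 150,
           "oldpeak": 1.0, "ca": 0, "class": 0}
suspicious = {"age": 54, "trestbps": 132, "chol": 900, "thalach": 150,
              "oldpeak": 1.0, "ca": 5, "class": 0}

print(out_of_range_fields(typical))      # []
print(out_of_range_fields(suspicious))   # ['chol', 'ca']
```

A value outside these ranges is not necessarily an error, since the ranges only describe the 297 records currently in the files. It is still a useful prompt to look at the record more closely.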
Distribution
The dataset structure includes two distinct files for different purposes:
Coronary_artery.csv contains the raw, original data, while data.csv is label encoded, making it immediately usable for model development. There are 297 valid records, with no missing or mismatched entries across the columns. Data types vary from numerical health measurements (e.g., cholesterol and heart rate) to categorical descriptors (e.g., chest pain and defect type).
Usage
This dataset is ideally suited for educational purposes related to health conditions and for machine learning applications, particularly those focused on classification and predictive modelling of cardiac outcomes. It provides excellent material for initial data visualisation and feature analysis pertaining to heart disease risk factors.
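For readers who want to see what label encoding involves before using the prepared file, here is a minimal sketch using only the Python standard library. Several things in it are assumptions: the three sample rows and their category spellings are invented for illustration, the integer codes follow sorted order and may not match the codes used in data.csv, and treating class 0 as "no disease" when building a binary target is a common convention rather than something stated in this description.

```python
import csv
import io

# Hypothetical three-row excerpt in the style of the original file; the
# values and category spellings are illustrative, not copied from the data.
SAMPLE_CSV = """age,sex,cp,thal,class
63,Male,Typical angina,Fixed defect,0
67,Male,Asymptomatic,Normal,2
41,Female,Atypical angina,Normal,0
"""

CATEGORICAL = ["sex", "cp", "thal"]


def label_encode(rows, columns):
    """Replace each category with an integer code (codes follow sorted order)."""
    mappings = {}
    for col in columns:
        categories = sorted({row[col] for row in rows})
        mappings[col] = {cat: code for code, cat in enumerate(categories)}
    encoded = [
        {key: (mappings[key][val] if key in mappings else val)
         for key, val in row.items()}
        for row in rows
    ]
    return encoded, mappings


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
encoded, mappings = label_encode(rows, CATEGORICAL)

# Assumption: class 0 means no disease, so stages 1 to 4 collapse to 1.
targets = [1 if int(row["class"]) > 0 else 0 for row in rows]

print(mappings["sex"])   # {'Female': 0, 'Male': 1}
print(targets)           # [0, 1, 0]
```

Keeping the `mappings` dictionary alongside the encoded rows makes it possible to translate model inputs back into readable categories during feature analysis.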
Coverage
The data covers demographic attributes including age, spanning from 29 to 77, and gender. The population is skewed towards males, who make up 68% of the records. The focus is entirely on clinical indicators relevant to cardiovascular health, such as cholesterol, blood sugar, and various cardiac response metrics. No specific geographical location or temporal range is available in the data description.
License
CC0: Public Domain
Who Can Use It
Intended users include data science practitioners building predictive models for patient risk assessment, academics researching cardiovascular health indicators, and students requiring real-world data for machine learning coursework in the health domain.
Dataset Name Suggestions
- Coronary Artery Disease Prediction Data
- Heart Condition Classification Dataset
- Cardiac Health Predictor
- Cardiovascular Risk Factor Analysis
Attributes
Original Data Source: Coronary Artery Disease Prediction Data
