Heart Disease Risk Factors Dataset
Clinical Trials & Research
Free
About
This dataset contains heart health records collected at a multispecialty hospital in India [1]. It is intended to support the development of early-stage heart disease detection models and predictive machine-learning models [1]. It covers 1000 subjects and has 14 columns (a patient ID, 12 clinical attributes, and a target label), making it a useful resource for research in cardiovascular health [1].
Columns
- Patient Identification Number (patientid): A unique numeric identifier for each patient [2].
- Age (age): The patient's age in years [2].
- Gender (gender): A binary indicator where 0 represents female and 1 represents male [2].
- Resting blood pressure (restingBP): Blood pressure measured at rest, ranging from 94 to 200 mm Hg [2].
- Serum cholesterol (serumcholestrol): Serum cholesterol level, ranging from 126 to 564 mg/dL [2].
- Fasting blood sugar (fastingbloodsugar): A binary indicator of whether fasting blood sugar is greater than 120 mg/dL, where 0 signifies false and 1 signifies true [2].
- Chest pain type (chestpain): A nominal variable categorising chest pain into 0 (typical angina), 1 (atypical angina), 2 (non-anginal pain), and 3 (asymptomatic) [2].
- Resting electrocardiogram results (restingelectro): Nominal results from a resting electrocardiogram, categorised as 0 (normal), 1 (ST-T wave abnormality), and 2 (probable or definite left ventricular hypertrophy) [2].
- Maximum heart rate achieved (maxheartrate): The highest heart rate achieved during exercise, ranging from 71 to 202 [3].
- Exercise induced angina (exerciseangina): A binary indicator where 0 means no and 1 means yes, referring to angina induced by exercise [3].
- ST depression (oldpeak): The ST depression induced by exercise relative to rest, a numeric value ranging from 0 to 6.2 [3].
- Slope of the peak exercise ST segment (slope): A nominal variable describing the slope of the peak exercise ST segment, categorised as 1 (upsloping), 2 (flat), and 3 (downsloping) [3].
- Number of major vessels (noofmajorvessels): The number of major vessels (0, 1, 2, or 3) coloured by fluoroscopy [3].
- Classification (target): The target variable, a binary classification indicating 0 for the absence of heart disease and 1 for the presence of heart disease [3].
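The documented ranges and category codes above can be turned into a simple row-level check. The following is a minimal sketch in Python, assuming each record arrives as a dictionary of strings (as `csv.DictReader` would produce). The `validate_row` helper and the sample record are hypothetical and only illustrate the ranges listed here; they are not part of the dataset.

```python
# Ranges and codes as documented in the column list.
# Columns without a documented range (patientid, age) are not checked.
NUMERIC_RANGES = {
    "restingBP": (94, 200),          # mm Hg
    "serumcholestrol": (126, 564),   # mg/dL (name spelled as in the dataset)
    "maxheartrate": (71, 202),
    "oldpeak": (0.0, 6.2),
}
ALLOWED_CODES = {
    "gender": {0, 1},
    "fastingbloodsugar": {0, 1},
    "chestpain": {0, 1, 2, 3},
    "restingelectro": {0, 1, 2},
    "exerciseangina": {0, 1},
    "slope": {1, 2, 3},
    "noofmajorvessels": {0, 1, 2, 3},
    "target": {0, 1},
}


def validate_row(row):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name, (low, high) in NUMERIC_RANGES.items():
        value = float(row[name])
        if not low <= value <= high:
            problems.append(f"{name}={row[name]} outside [{low}, {high}]")
    for name, allowed in ALLOWED_CODES.items():
        # float() tolerates values stored as "1" or "1.0"
        if float(row[name]) not in allowed:
            problems.append(f"{name}={row[name]} not in {sorted(allowed)}")
    return problems


# Hypothetical record for illustration only; not taken from the dataset.
sample = {
    "patientid": "1", "age": "52", "gender": "1", "restingBP": "130",
    "serumcholestrol": "240", "fastingbloodsugar": "0", "chestpain": "2",
    "restingelectro": "1", "maxheartrate": "150", "exerciseangina": "0",
    "oldpeak": "1.4", "slope": "2", "noofmajorvessels": "1", "target": "1",
}
print(validate_row(sample))
```

A row flagged by this check is not necessarily wrong; it may only mean that the documented ranges are incomplete, so flagged values are best reviewed rather than dropped automatically.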
Distribution
This dataset comprises 1000 subjects, each described by 12 attributes in addition to the patient ID and the target variable [1]. The data is typically available in CSV format [4]. File sizes are listed alongside the data; for instance, a description file is 429.71 kB [5].
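The column layout described here (one ID, 12 attributes, one target) can be checked when the CSV is loaded. This is a minimal sketch using Python's standard `csv` module. The header order and the single data row in `csv_text` are assumptions made up for illustration; the real file may order its columns differently.

```python
import csv
import io

ID_COLUMN = "patientid"
TARGET_COLUMN = "target"

# Stand-in for the real CSV file: an assumed header plus one made-up row.
csv_text = (
    "patientid,age,gender,chestpain,restingBP,serumcholestrol,"
    "fastingbloodsugar,restingelectro,maxheartrate,exerciseangina,"
    "oldpeak,slope,noofmajorvessels,target\n"
    "1,52,1,2,130,240,0,1,150,0,1.4,2,1,1\n"
)


def feature_columns(fieldnames):
    """Return every column that is neither the patient ID nor the target."""
    return [n for n in fieldnames if n not in (ID_COLUMN, TARGET_COLUMN)]


reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)
feature_names = feature_columns(reader.fieldnames)

# 14 columns in total, 12 of which are model inputs.
print(len(reader.fieldnames), len(feature_names), len(rows))
```

Keeping the patient ID out of the feature list matters for modelling, since an identifier carries no clinical information and should not be used as a predictor.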
Usage
This dataset is ideal for building early-stage heart disease detection models [1]. It is also suitable for generating predictive machine-learning models related to cardiovascular health [1]. Researchers can use it for exploratory data analysis in the field of health sciences [6].
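As one possible starting point for exploratory analysis, the share of positive cases can be compared across the values of a categorical column. The sketch below uses only the Python standard library. The `prevalence_by` helper is hypothetical and the records in `toy_rows` are made up for illustration; they do not reflect the actual data.

```python
from collections import defaultdict


def prevalence_by(rows, column, target="target"):
    """Fraction of records with target == 1 for each value of `column`."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        key = int(row[column])
        totals[key] += 1
        positives[key] += int(row[target])
    return {key: positives[key] / totals[key] for key in sorted(totals)}


# Made-up records for illustration only; real rows would come from the CSV.
toy_rows = [
    {"chestpain": "0", "target": "0"},
    {"chestpain": "0", "target": "1"},
    {"chestpain": "2", "target": "1"},
    {"chestpain": "3", "target": "0"},
]
print(prevalence_by(toy_rows, "chestpain"))
```

The same function can be applied to other coded columns such as `slope` or `restingelectro` before any model is trained.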
Coverage
The data originates from a multispecialty hospital in India [1]. It includes demographic information such as age and gender for 1000 subjects [1, 2]. No time range for data collection is specified; the dataset was published on 16 April 2021 [1].
License
CC0: Public Domain
Who Can Use It
- Machine Learning Engineers: For developing and training predictive models for heart disease [1].
- Data Scientists: For performing exploratory data analysis and extracting insights into cardiovascular health [6].
- Medical Researchers: To study risk factors and patterns associated with heart conditions [6].
- Academics and Students: For educational purposes, research projects, and developing new algorithms in health informatics [1, 6].
Dataset Name Suggestions
- Indian Heart Disease Predictor
- Cardiovascular Health Insights (India)
- Heart Disease Risk Factors Dataset
- Multi-feature Heart Health Dataset
- Cardiac Predictive Model Data
Attributes
Original Data Source: Heart Disease Risk Factors Dataset