Parkinsons Disease Voice Screening
Public Health & Epidemiology
About
This dataset contains biomedical voice measurements intended to support the detection and diagnosis of Parkinson's Disease (PD). PD is a degenerative neurological disorder that impairs movement and speech as dopamine levels in the brain fall. Because PD commonly produces characteristic vocal symptoms such as dysarthria, hypophonia, and monotone speech, voice recordings offer a valuable, non-invasive diagnostic signal. The data is designed for machine learning-based screening that could take place before a clinician's appointment, addressing the difficulty of diagnosing PD early.
Columns
- name: An ASCII subject name combined with the recording number.
- MDVP:Fo(Hz): Represents the average vocal fundamental frequency.
- MDVP:Fhi(Hz): Indicates the maximum vocal fundamental frequency.
- MDVP:Flo(Hz): Shows the minimum vocal fundamental frequency.
- MDVP:Jitter(%): A measure of variation in fundamental frequency, expressed as a percentage.
- MDVP:Jitter(Abs): An absolute measure of jitter.
- MDVP:RAP: The Relative Average Perturbation, another measure of vocal stability.
- MDVP:PPQ: The Five-point Period Perturbation Quotient, related to vocal period variability.
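- Jitter:DDP: The average absolute difference between consecutive differences of successive periods, a further measure of variation in fundamental frequency (numerically three times RAP).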
- MDVP:Shimmer: A measure of variation in amplitude.
- MDVP:Shimmer(dB): Shimmer expressed in decibels.
- Shimmer:APQ3: The Three-point Amplitude Perturbation Quotient.
- Shimmer:APQ5: The Five-point Amplitude Perturbation Quotient.
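- MDVP:APQ: The Amplitude Perturbation Quotient as reported by MDVP (an 11-point measure), describing variation in amplitude.
- Shimmer:DDA: The average absolute difference between consecutive differences of successive amplitudes, a further measure of variation in amplitude (numerically three times APQ3).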
- NHR: The Noise-to-Harmonics Ratio, indicating vocal noise levels.
- HNR: The Harmonics-to-Noise Ratio, indicating the clarity of vocalisation.
- status: The health status of the individual, where '0' represents healthy and '1' represents Parkinson's Disease. This is the primary target variable for discrimination.
- RPDE: Recurrence Period Density Entropy, a nonlinear dynamical complexity measure.
- DFA: Detrended Fluctuation Analysis, the signal fractal scaling exponent.
- spread1: A nonlinear measure of fundamental frequency variation.
- spread2: A further nonlinear measure of fundamental frequency variation.
- D2: The Correlation Dimension, a nonlinear dynamical complexity measure.
- PPE: Pitch Period Entropy, related to the variability of the vocal pitch period.
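As a minimal sketch of how the columns above might be separated for modelling, the following reads the CSV with Python's standard library and splits each row into its identifier (name), its numeric voice features, and the status target. The function name load_recordings and the file name in the usage comment are hypothetical, and the code assumes that every column other than name and status is numeric.

```python
import csv

# Columns that are not voice features: the identifier and the target.
NON_FEATURE_COLUMNS = {"name", "status"}


def load_recordings(lines):
    """Parse CSV lines into (names, feature_names, X, y).

    `lines` is any iterable of CSV text lines, such as an open file.
    Assumption: every column except `name` and `status` holds a number.
    """
    reader = csv.DictReader(lines)
    feature_names = [c for c in reader.fieldnames if c not in NON_FEATURE_COLUMNS]
    names, X, y = [], [], []
    for row in reader:
        names.append(row["name"])
        X.append([float(row[c]) for c in feature_names])
        y.append(int(row["status"]))  # 0 = healthy, 1 = Parkinson's Disease
    return names, feature_names, X, y


# Usage (the file name is a placeholder, not taken from the listing):
# with open("parkinsons.csv", newline="") as handle:
#     names, feature_names, X, y = load_recordings(handle)
```

Keeping name out of the feature matrix matters because it identifies the subject rather than describing the voice signal.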
Distribution
The dataset comprises 195 voice recordings from 31 individuals, 23 of whom have Parkinson's Disease. Each recording is one row, described by 24 columns of identifiers and voice metrics. The data file is typically in CSV format. No column contains missing or mismatched values. The dataset is not expected to be updated in the future.
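The figures in this section can be checked against a downloaded copy before any modelling. The sketch below is one possible sanity check, assuming the file has already been read into a header list and a list of string rows (for example with csv.reader); the names check_table and class_counts are hypothetical.

```python
from collections import Counter

# Figures stated in the dataset description.
EXPECTED_ROWS = 195
EXPECTED_COLUMNS = 24


def check_table(header, rows):
    """Return a list of problems; an empty list means the table matches."""
    problems = []
    if len(header) != EXPECTED_COLUMNS:
        problems.append(f"expected {EXPECTED_COLUMNS} columns, found {len(header)}")
    if len(rows) != EXPECTED_ROWS:
        problems.append(f"expected {EXPECTED_ROWS} rows, found {len(rows)}")
    for number, row in enumerate(rows, start=1):
        if len(row) != len(header):
            problems.append(f"row {number}: wrong number of cells")
        elif any(cell.strip() == "" for cell in row):
            problems.append(f"row {number}: empty cell")
    return problems


def class_counts(header, rows):
    """Count recordings per value of the `status` column."""
    index = header.index("status")
    return Counter(row[index] for row in rows)
```

Note that class_counts counts recordings, not individuals, so its totals are not directly comparable with the 23-of-31 figure for subjects.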
Usage
This dataset is ideally suited for:
- Developing and evaluating machine learning models for early Parkinson's Disease detection.
- Research into speech signal processing and its applications in neurological disorder diagnostics.
- Creating screening tools that utilise voice analysis to identify individuals at risk of PD.
- Educational projects demonstrating the use of biomedical data in predictive analytics.
- Exploring the characteristic vocal features associated with Parkinson's Disease progression.
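One caution when evaluating models on this data: with 195 recordings from 31 individuals, each subject contributes several rows, so a random row-level split can place recordings of the same person in both the training and test sets and inflate the measured accuracy. The sketch below shows one dependency-free way to build subject-wise folds. It assumes the recording number is the part of name after the last underscore, which should be verified against the actual file; subject_of and grouped_folds are hypothetical helper names. scikit-learn's GroupKFold implements the same idea.

```python
from collections import defaultdict


def subject_of(name):
    """Drop the trailing recording number (assumed format: '<subject>_<n>')."""
    return name.rsplit("_", 1)[0]


def grouped_folds(names, n_folds=5):
    """Yield (train_indices, test_indices) with each subject in one fold only."""
    by_subject = defaultdict(list)
    for index, name in enumerate(names):
        by_subject[subject_of(name)].append(index)

    # Greedy balancing: place the largest subjects first, always into the
    # fold that currently holds the fewest recordings.
    folds = [[] for _ in range(n_folds)]
    for subject in sorted(by_subject, key=lambda s: (-len(by_subject[s]), s)):
        min(folds, key=len).extend(by_subject[subject])

    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield sorted(train), sorted(test)
```

Each yielded pair can then index into the feature matrix and the status labels, so that every test fold contains only subjects the model has not seen during training.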
Coverage
The dataset includes voice measurements from 31 individuals and focuses on distinguishing healthy people from those with Parkinson's Disease. The source does not state geographical or time-range coverage and provides no demographic breakdown beyond the number of participants. The core focus is the biomedical speech characteristics of these two groups.
License
CC0: Public Domain
Who Can Use It
- Machine Learning Engineers and Data Scientists: To build and refine models for disease diagnosis.
- Medical Researchers: To study vocal biomarkers of Parkinson's Disease.
- Academics and Students: For teaching and learning in bioinformatics, signal processing, and AI in healthcare.
- Healthcare Innovators: To develop new non-invasive screening technologies.
- Public Health Professionals: To investigate new methods for population-level health screening.
Dataset Name Suggestions
- Parkinson's Vocal Biomarkers
- PD Voice Diagnostics Data
- Biomedical Speech for Parkinson's
- Parkinson's Disease Voice Screening
- Neurological Speech Analysis Dataset
Attributes
Original Data Source: Parkinsons Disease Voice Screening