Horse Survival Classification Data
Data Science and Analytics
Free
About
This dataset provides clinical measurements and medical conditions for predicting the survival outcome of horses. It frames equine survival prognostication as a classification problem based on historical medical records. The central challenge is the large volume of missing values (NAs) across several attributes, which calls for careful data preparation such as imputation. The outcome variable indicates whether the horse lived, died, or was euthanized.
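A first step toward that preparation is measuring how much is missing in each attribute. The following is a minimal sketch using only the Python standard library. The inline sample rows are illustrative and not taken from the dataset, and the column names and the `NA` token are assumptions about how the file encodes its fields.

```python
import csv
import io

# Tiny illustrative sample, NOT real rows from the dataset.
# Column names and the "NA" token are assumptions.
SAMPLE = """surgery,rectal_temp,respiratory_rate,outcome
yes,38.5,28,died
no,NA,20,lived
yes,37.9,NA,euthanized
no,NA,NA,lived
"""

def missing_share(rows, na_tokens=("NA", "")):
    """Return {column: fraction of rows whose value is a missing-value token}."""
    rows = list(rows)
    if not rows:
        return {}
    return {
        col: sum(1 for r in rows if (r[col] or "").strip() in na_tokens) / len(rows)
        for col in rows[0]
    }

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
shares = missing_share(rows)

# Print columns from most to least affected.
for col, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{col}: {share:.0%} missing")
```

To run the same check on the actual file, the `io.StringIO(SAMPLE)` argument could be swapped for an open file handle, for example `open("horse.csv", newline="")`.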
Columns
The dataset contains 28 attributes detailing the horse's medical state. All binary fields have been converted into descriptive words. Key attributes include:
- surgery?: Indicates if the horse received surgery (Yes: 60%, No: 40%).
- Age: Horse maturity (Adult: 92%; Young, under 6 months: 8%).
- Hospital Number: Numeric case ID (Mean 1.09 million, Std. Deviation 1.53 million).
- rectal temperature: Linear measure in degrees Celsius. Normal is 37.8; changes reflect infection or shock (20% NA).
- pulse: Linear heart rate in beats per minute. Normal range is 30–40 for adults; elevated rates reflect pain or circulatory shock.
- respiratory rate: Linear rate, though its usefulness is doubtful due to large fluctuations (19% NA).
- temperature of extremities: Subjective indication of peripheral circulation (Cool: 36%, Normal: 26%). Cool/Cold suggests possible shock.
- peripheral pulse: Subjective measurement; reduced or absent pulse indicates poor perfusion (Normal: 38%, Reduced: 34%).
- mucous membranes: Subjective colour measurement (e.g., normal pink, bright red/injected), which is indicative of circulatory status or septicemia.
- capillary refill time: Clinical judgement of circulation; measured as less than 3 seconds (63%) or greater than/equal to 3 seconds.
- pain: Subjective judgement of pain level (e.g., alert, continuous severe pain). It should not be treated as an ordered variable, although more severe pain often correlates with needing surgery.
- peristalsis: Indication of gut activity (e.g., hypermotile, absent).
- abdominal distension: Important parameter (e.g., none, slight, severe). Severe distension often necessitates surgery.
- nasogastric tube / nasogastric reflux / nasogastric reflux PH: Related to gas and fluid passage from the intestine, where greater reflux suggests serious obstruction.
- packed cell volume / total protein: Linear blood measurements; rising levels indicate dehydration or compromised circulation.
- outcome: The target variable, indicating if the horse lived, died, or was euthanized.
- surgical lesion?: Retrospectively indicates if the underlying problem was surgical.
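Because the pain attribute is described as unordered, one option is to encode it with indicator (one-hot) columns instead of integer codes, so that a model does not read a ranking into the levels. The sketch below is a standard-library illustration only. The helper name `one_hot` and the level labels are hypothetical, and the labels used in the file may differ.

```python
def one_hot(values, prefix):
    """Encode a nominal column as one 0/1 indicator per distinct level."""
    levels = sorted(set(values))
    return [
        {f"{prefix}={lvl}": int(v == lvl) for lvl in levels}
        for v in values
    ]

# Illustrative level names; the labels in the file may differ.
pain = ["alert", "severe", "alert", "mild"]
encoded = one_hot(pain, "pain")

print(encoded[1])
```

The same approach applies to other subjective categorical attributes, such as mucous membranes, where the levels have no natural order.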
Distribution
The material is distributed as a CSV file named horse.csv, with a size of 53.42 kB. The collection contains 299 valid records. A specific feature of this material is the high number of missing values (NAs), particularly in physiological measurements like rectal temperature and respiratory rate, which poses an immediate challenge for data cleanup.
Usage
This resource is ideally used to build classification models aimed at predicting the survival outcome of a horse. Analysts should focus on cleaning the data, including imputation of the many missing values, before proceeding with model training. It is excellent for comparing the evaluation metrics of various classification algorithms and fine-tuning hyperparameters in a medically relevant context.
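As one possible starting point for the imputation step, missing numeric values can be filled with the column median and missing categorical values with the most frequent level. This is a minimal sketch of that simple strategy, not a recommended method for this dataset. The helper `impute_column`, the `NA` token, and the sample values are all assumptions for illustration.

```python
from collections import Counter
from statistics import median

NA = "NA"  # assumed missing-value token

def impute_column(values, numeric):
    """Fill NA entries with the median (numeric) or the most frequent level (categorical)."""
    observed = [v for v in values if v != NA]
    if not observed:
        return list(values)  # nothing to learn a fill value from
    if numeric:
        fill = median(float(v) for v in observed)
        return [fill if v == NA else float(v) for v in values]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v == NA else v for v in values]

# Illustrative values, not taken from the dataset.
pulse = ["40", "NA", "88", "60"]
extremities = ["cool", "NA", "cool", "normal"]

pulse_filled = impute_column(pulse, numeric=True)
extremities_filled = impute_column(extremities, numeric=False)

print(pulse_filled)        # [40.0, 60.0, 88.0, 60.0]
print(extremities_filled)  # ['cool', 'cool', 'cool', 'normal']
```

When comparing classifiers, the fill values should be computed on the training split only and then applied to the test split, so that the evaluation metrics are not inflated by information leaking from the held-out rows.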
Coverage
The scope covers clinical and medical data collected from horses, detailing physiological parameters, diagnostic findings (like abdominocentesis results), and the nature and location of any internal lesions. The data focuses on cases requiring prognostication based on medical conditions.
License
CC0: Public Domain
Who Can Use It
The dataset is intended for beginner data scientists, researchers, and students interested in classification problems, healthcare, and applying machine learning to real-world medical data, particularly in veterinary science. It is highly suitable for those looking to practice data cleaning and imputation methods.
Dataset Name Suggestions
- Horse Survival Classification Data
- Equine Prognosis Medical Records
- Veterinary Survival Prediction Dataset
Attributes
Original Data Source: Horse Survival Classification Data
