Opendatabay APP

Horse Survival Classification Data

Data Science and Analytics

Tags and Keywords

Classification

Healthcare

Prognosis

Hospitals

Treatment

Free

About

The data provides clinical measurements and medical conditions for predicting the survival outcome of horses. It frames equine survival prognostication as a classification problem based on recorded medical conditions. The central challenge is the large number of missing values (NAs) across several attributes, which calls for data preparation techniques such as imputation. The outcome variable indicates whether the horse lived, died, or was euthanized.
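As a minimal sketch of the imputation step mentioned above, the snippet below fills numeric gaps with the column median and categorical gaps with the most frequent label. It uses only the Python standard library. The helper name impute_column, the "NA" token, and the sample values are assumptions made for illustration and are not taken from horse.csv.

```python
from collections import Counter
from statistics import median

MISSING = "NA"  # assumed missing-value token; check how the file encodes gaps


def impute_column(values, numeric):
    """Return a copy of `values` with MISSING entries filled in.

    Numeric columns get the median of the observed values; categorical
    columns get the most frequent observed label (the mode).
    """
    observed = [v for v in values if v != MISSING]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if numeric:
        fill = median(float(v) for v in observed)
        return [fill if v == MISSING else float(v) for v in values]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v == MISSING else v for v in values]


# Illustrative values only, not rows from the real dataset.
rectal_temp = ["38.5", "NA", "37.8", "39.1", "NA"]
extremities = ["cool", "NA", "normal", "cool", "cold"]

print(impute_column(rectal_temp, numeric=True))
print(impute_column(extremities, numeric=False))
```

When a model is evaluated on held-out data, the fill values are normally computed on the training split only and then reused on the test split, so that no information leaks from the test rows.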

Columns

The dataset contains 28 attributes detailing the horse's medical state. All binary fields have been converted into descriptive words. Key attributes include:
  • surgery?: Indicates if the horse received surgery (Yes: 60%, No: 40%).
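  • Age: Horse maturity (Adult: 92%, Young (< 6 months): 8%).
  • Hospital Number: Numeric case ID. It is an identifier rather than a measurement, so the listed summary statistics (mean 1.09m, std. deviation 1.53m) carry no clinical meaning.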
  • rectal temperature: Linear measure in degrees Celsius. Normal is 37.8; elevated values may reflect infection, and reduced values may reflect late shock (20% NA).
  • pulse: Linear heart rate in beats per minute. Normal range is 30–40 for adults; elevated rates reflect pain or circulatory shock.
  • respiratory rate: Linear rate, though its usefulness is doubtful due to large fluctuations (19% NA).
  • temperature of extremities: Subjective indication of peripheral circulation (Cool: 36%, Normal: 26%). Cool/Cold suggests possible shock.
  • peripheral pulse: Subjective measurement; reduced or absent pulse indicates poor perfusion (Normal: 38%, Reduced: 34%).
  • mucous membranes: Subjective colour measurement (e.g., normal pink, bright red/injected), which is indicative of circulatory status or septicemia.
  • capillary refill time: Clinical judgement of circulation; measured as less than 3 seconds (63%) or greater than/equal to 3 seconds.
  • pain: Subjective judgement of pain level (e.g., alert, continuous severe pain). The values should not be treated as an ordered variable, although in general more severe pain makes surgery more likely.
  • peristalsis: Indication of gut activity (e.g., hypermotile, absent).
  • abdominal distension: Important parameter (e.g., none, slight, severe). Severe distension often necessitates surgery.
  • nasogastric tube / nasogastric reflux / nasogastric reflux PH: Related to gas and fluid passage from the intestine, where greater reflux suggests serious obstruction.
  • packed cell volume / total protein: Linear blood measurements; rising levels indicate dehydration or compromised circulation.
  • outcome: The target variable, indicating if the horse lived, died, or was euthanized.
  • surgical lesion?: Retrospectively indicates if the underlying problem was surgical.
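Because the pain attribute is described as unordered, one reasonable way to prepare it for a model is one-hot encoding rather than mapping the labels to integers. The following is a small standard-library sketch. The function name one_hot, the "NA" token, and the pain labels shown are hypothetical; the real file may spell its categories differently.

```python
def one_hot(values, missing="NA"):
    """One-hot encode a nominal column.

    Returns (categories, rows), where each row is a list of 0/1 flags in
    the order of `categories`. Missing entries become an all-zero row.
    """
    categories = sorted({v for v in values if v != missing})
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        if v != missing:
            row[index[v]] = 1
        rows.append(row)
    return categories, rows


# Hypothetical pain labels for illustration only.
pain = ["alert", "severe_pain", "NA", "mild_pain", "alert"]
categories, encoded = one_hot(pain)
print(categories)
for label, row in zip(pain, encoded):
    print(label, row)
```

Encoding each category as its own flag avoids implying that one pain label is numerically greater than another. The same approach would apply to the other subjective categorical attributes in the list.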

Distribution

The data is distributed as a single CSV file, horse.csv (53.42 kB), containing 299 valid records. Missing values (NAs) are frequent, particularly in physiological measurements such as rectal temperature (20% NA) and respiratory rate (19% NA), so data cleaning is the first task.
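A quick way to see where the gaps are is to compute the share of missing values per column. The sketch below does this with the standard csv module on a tiny in-memory stand-in for horse.csv. The column names, the sample rows, and the assumption that gaps appear as "NA" or an empty field are all illustrative and should be checked against the real file.

```python
import csv
import io

MISSING_TOKENS = {"", "NA"}  # assumed encodings of a missing value


def missing_share(rows):
    """Return {column: fraction of rows whose value is missing}."""
    rows = list(rows)
    if not rows:
        return {}
    counts = {col: 0 for col in rows[0]}
    for row in rows:
        for col, value in row.items():
            if value is None or value.strip() in MISSING_TOKENS:
                counts[col] += 1
    return {col: n / len(rows) for col, n in counts.items()}


# Made-up rows with assumed column names, not real records.
sample = io.StringIO(
    "surgery,rectal_temp,respiratory_rate,outcome\n"
    "yes,38.5,NA,lived\n"
    "no,NA,20,died\n"
    "yes,37.8,NA,euthanized\n"
    "no,39.1,24,lived\n"
)
shares = missing_share(csv.DictReader(sample))
# List the columns with the most gaps first.
for col, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{col}: {share:.0%}")
```

To run the same check on the downloaded file, the in-memory sample can be replaced with a file handle from open("horse.csv", newline="").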

Usage

This resource is best suited to building classification models that predict the survival outcome of a horse. Analysts should clean the data first, including imputation of the many missing values, before training a model. It is also well suited to comparing the evaluation metrics of different classification algorithms and to tuning hyperparameters in a medically relevant context.
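When comparing classifiers on a three-class outcome, accuracy alone can hide poor performance on the less frequent classes, so it helps to report per-class recall as well. The sketch below computes accuracy and macro-averaged recall for two hypothetical sets of predictions. The labels and predictions are made up for illustration and are not results from this dataset.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Return {label: share of that label's cases predicted correctly}."""
    recall = {}
    for label in sorted(set(y_true)):
        hits = [p == label for t, p in zip(y_true, y_pred) if t == label]
        recall[label] = sum(hits) / len(hits)
    return recall


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (also called balanced accuracy)."""
    recall = per_class_recall(y_true, y_pred)
    return sum(recall.values()) / len(recall)


# Made-up labels and predictions, for illustration only.
y_true = ["lived", "lived", "lived", "lived",
          "died", "died", "euthanized", "euthanized"]
model_a = ["lived"] * 8  # always predicts the most common class
model_b = ["lived", "lived", "lived", "died",
           "died", "lived", "euthanized", "died"]

for name, y_pred in [("model_a", model_a), ("model_b", model_b)]:
    print(name, accuracy(y_true, y_pred), round(macro_recall(y_true, y_pred), 3))
```

In this toy example the always-"lived" baseline reaches 50% accuracy while never identifying a horse that died or was euthanized, which macro recall makes visible. Libraries such as scikit-learn provide equivalent metrics; the hand-written versions here are only meant to show what is being measured.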

Coverage

The scope covers clinical and medical data collected from horses, detailing physiological parameters, diagnostic findings (like abdominocentesis results), and the nature and location of any internal lesions. The data focuses on cases requiring prognostication based on medical conditions.

License

CC0: Public Domain

Who Can Use It

The dataset is intended for beginner data scientists, researchers, and students interested in classification problems, healthcare, and applying machine learning to real-world medical data, particularly in veterinary science. It is highly suitable for those looking to practice data cleaning and imputation methods.

Dataset Name Suggestions

  • Horse Survival Classification Data
  • Equine Prognosis Medical Records
  • Veterinary Survival Prediction Dataset

Listing Stats

LISTED

17/12/2025

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5

VERSION

1.0
