AIDS Healthcare Statistics Dataset

Patient Health Records & Digital Health

Tags and Keywords

  • Health
  • AIDS
  • Patients
  • Prediction
  • Clinical

Free

About

This dataset supports predicting AIDS virus infection in patients. It contains healthcare statistics and categorical information about individuals who have been diagnosed with AIDS. First published in 1996, it is intended for classifying patients as infected or not infected based on their attributes [1].

Columns

The dataset includes 23 columns, each detailing specific patient information:
  • time: Time to failure or censoring [1, 2].
  • trt: Treatment indicator, specifying the type of therapy received: 0 = ZDV only, 1 = ZDV + ddI, 2 = ZDV + Zal, 3 = ddI only [1-3].
  • age: Age of the patient in years at baseline [1, 3, 4].
  • wtkg: Patient's weight in kilograms at baseline [1, 4].
  • hemo: Hemophilia status, where 0 = no and 1 = yes [1, 4, 5].
  • homo: Homosexual activity indicator, 0 = no and 1 = yes [5, 6].
  • drugs: History of intravenous (IV) drug use, 0 = no and 1 = yes [5, 6].
  • karnof: Karnofsky score, indicating performance status on a scale of 0 to 100 [6, 7].
  • oprior: Non-ZDV antiretroviral therapy received before study 175 (pre-175), 0 = no and 1 = yes [6, 7].
  • z30: ZDV (Zidovudine) therapy received in the 30 days before study 175, 0 = no and 1 = yes [6-8].
  • preanti: Number of days of antiretroviral therapy received before study 175 [6, 8, 9].
  • race: Patient's race, 0 = White and 1 = non-white [6, 9].
  • gender: Patient's gender, 0 = Female and 1 = Male [6, 9].
  • str2: Antiretroviral history, indicating if the patient is 0 = naive or 1 = experienced [6, 10].
  • strat: Antiretroviral history stratification: 1 = 'Antiretroviral Naive', 2 = '> 1 but <= 52 weeks of prior antiretroviral therapy', 3 = '> 52 weeks' [6, 10].
  • symptom: Symptomatic indicator, 0 = asymptomatic and 1 = symptomatic [11, 12].
  • treat: General treatment indicator, 0 = ZDV only and 1 = others [11, 12].
  • offtrt: Indicator of being off-treatment before 96 ± 5 weeks, 0 = no and 1 = yes [11-13].
  • cd40: CD4 cell count at baseline [11, 13].
  • cd420: CD4 cell count at 20 ± 5 weeks [11, 13, 14].
  • cd80: CD8 cell count at baseline [11, 14, 15].
  • cd820: CD8 cell count at 20 ± 5 weeks [11, 15, 16].
  • infected: The target variable, indicating if the patient is infected with AIDS, 0 = No and 1 = Yes [11, 16].
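The integer codes listed above can be turned into readable labels before analysis. The following is a minimal sketch, assuming rows are read as dictionaries keyed by the column names in this listing; the `CODEBOOK` and `decode_row` names are hypothetical and are not part of the dataset.

```python
# Code-to-label maps copied from the column descriptions in this listing.
YES_NO = {0: "no", 1: "yes"}
CODEBOOK = {
    "trt": {0: "ZDV only", 1: "ZDV + ddI", 2: "ZDV + Zal", 3: "ddI only"},
    "strat": {
        1: "Antiretroviral Naive",
        2: "> 1 but <= 52 weeks of prior antiretroviral therapy",
        3: "> 52 weeks",
    },
    "race": {0: "White", 1: "non-white"},
    "gender": {0: "Female", 1: "Male"},
    "str2": {0: "naive", 1: "experienced"},
    "symptom": {0: "asymptomatic", 1: "symptomatic"},
    "treat": {0: "ZDV only", 1: "others"},
    "infected": {0: "No", 1: "Yes"},
}
# Plain yes/no indicator columns share one map.
for name in ("hemo", "homo", "drugs", "oprior", "z30", "offtrt"):
    CODEBOOK[name] = YES_NO


def decode_row(row):
    """Return a copy of `row` with coded columns replaced by their labels.

    Values may be ints or numeric strings (as read from a CSV file).
    Codes that are not in the codebook are left unchanged.
    """
    decoded = dict(row)
    for column, labels in CODEBOOK.items():
        if column in decoded:
            code = int(float(decoded[column]))
            decoded[column] = labels.get(code, decoded[column])
    return decoded


# Made-up row for illustration only; not a record from the dataset.
example = {"age": "34", "trt": "2", "gender": "1", "hemo": "0", "infected": "1"}
print(decode_row(example))
```

Columns without a code map, such as age or the CD4 and CD8 counts, pass through untouched.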

Distribution

The dataset is provided as a single CSV file (AIDS_Classification.csv) [2]. It contains 2139 valid records across all 23 columns, with no mismatched or missing values reported for any attribute [2-5, 7-10, 12-16]. The file size is 142.8 kB [2].
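The figures above can be checked after download. The sketch below uses only the Python standard library and assumes the CSV header uses exactly the column names given in this listing; `summarize_csv` and `EXPECTED_COLUMNS` are hypothetical names introduced for illustration.

```python
import csv
import io

# The 23 column names as given in this listing (file column order is not assumed).
EXPECTED_COLUMNS = [
    "time", "trt", "age", "wtkg", "hemo", "homo", "drugs", "karnof",
    "oprior", "z30", "preanti", "race", "gender", "str2", "strat",
    "symptom", "treat", "offtrt", "cd40", "cd420", "cd80", "cd820",
    "infected",
]


def summarize_csv(handle):
    """Count records, absent expected columns, and empty cells in an open CSV."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    absent = [c for c in EXPECTED_COLUMNS if c not in header]
    n_rows = 0
    n_empty = 0
    for row in reader:
        n_rows += 1
        # A short row yields None for trailing fields; treat that as empty too.
        n_empty += sum(1 for c in header if (row.get(c) or "").strip() == "")
    return {"rows": n_rows, "absent_columns": absent, "empty_cells": n_empty}


# Tiny made-up sample, only to show the call; not data from the file.
demo = io.StringIO("time,trt,age\n948,2,48\n1002,3,\n")
print(summarize_csv(demo))
```

For the real file, pass a handle such as `open("AIDS_Classification.csv", newline="")`. If the listing's figures hold, the summary should report 2139 rows, no absent columns, and no empty cells.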

Usage

This dataset is ideal for various applications, including:
  • Binary classification tasks to predict AIDS virus infection [1, 17].
  • Developing machine learning models for disease prediction in healthcare [17].
  • Statistical analysis of patient demographics and medical history related to AIDS [1, 11].
  • Research into the effectiveness of different antiretroviral treatments [1, 6, 11].
  • Data visualization to explore patterns and relationships within healthcare statistics [17].
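As a first step toward the classification and treatment-comparison uses listed above, one can tabulate the share of positive `infected` labels per treatment arm. This is a minimal descriptive sketch, not a statistical test of treatment effect; `rate_by_group` is a hypothetical helper, and the rows shown are invented for illustration.

```python
from collections import defaultdict


def rate_by_group(rows, group_col="trt", target_col="infected"):
    """Return {group value: share of rows whose target equals 1}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        group = int(row[group_col])
        totals[group] += 1
        positives[group] += int(row[target_col])
    return {g: positives[g] / totals[g] for g in sorted(totals)}


# Made-up rows for illustration only; not values from the dataset.
toy_rows = [
    {"trt": "0", "infected": "1"},
    {"trt": "0", "infected": "0"},
    {"trt": "1", "infected": "0"},
    {"trt": "1", "infected": "0"},
]
print(rate_by_group(toy_rows))  # {0: 0.5, 1: 0.0}
```

The same function works for any coded column, for example `group_col="strat"` or `group_col="symptom"`, when the rows come from `csv.DictReader`.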

Coverage

The dataset covers patient information gathered in the context of AIDS diagnosis and clinical trials and was first published in 1996 [1]. Demographic details include age (12 to 70 years), gender (Female and Male), and race (White and non-white) [1, 3, 4, 6, 9]. Medical history covers hemophilia status and history of IV drug use [1, 5, 6]. Treatment history covers various ZDV-based therapies and prior antiretroviral use [1, 6, 11]. Lab results comprise CD4 and CD8 counts at baseline and at 20 ± 5 weeks [11]. No geographic scope is specified.
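Stated ranges such as the 12 to 70 year age span can be confirmed directly from the rows. The helper below is a small sketch under the assumption that rows are dictionaries with numeric or numeric-string values; `column_range` is a hypothetical name, and the sample rows are invented.

```python
def column_range(rows, column):
    """Return (min, max) of a numeric column across an iterable of dict rows."""
    values = [float(row[column]) for row in rows]
    if not values:
        raise ValueError("no rows to summarize")
    return min(values), max(values)


# Made-up rows for illustration only; not records from the dataset.
toy_rows = [{"age": "12"}, {"age": "35"}, {"age": "70"}]
print(column_range(toy_rows, "age"))  # (12.0, 70.0)
```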

License

CC0: Public Domain

Who Can Use It

  • Data scientists and machine learning engineers: For building and evaluating classification models to predict AIDS infection [1, 17].
  • Healthcare researchers: To study patient characteristics, treatment efficacy, and disease progression related to AIDS [1, 11].
  • Students and beginners in data science: As an accessible dataset for learning binary classification and data analysis, tagged as 'Beginner' [17].
  • Public health analysts: For understanding population-level health statistics related to AIDS.

Dataset Name Suggestions

  • AIDS Patient Infection Predictor
  • HIV/AIDS Clinical Trial Study 175 Data
  • Patient AIDS Status Classification
  • AIDS Healthcare Statistics Dataset
  • ZDV Treatment Outcome Data

Attributes

Listing Stats

  • Views: 2
  • Downloads: 0
  • Listed: 22/07/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0