Kyle MGUS Longitudinal Survival Data

Patient Health Records & Digital Health

Tags and Keywords

Health, Cancer, MGUS, Survival, Myeloma


Free

About

This collection documents the natural history of monoclonal gammopathy of undetermined significance (MGUS), a precursor condition to plasma cell disorders including multiple myeloma. It stems from a long-term clinical cohort established by Dr. Robert A. Kyle at the Mayo Clinic and has been a cornerstone of hematologic research for decades. MGUS is classified as a premalignant plasma cell disorder; although typically asymptomatic, it carries a persistent risk of progression to multiple myeloma or a related disorder, estimated at approximately 1% per year. Findings from the cohort were first published in the American Journal of Medicine in 1978, with long-term follow-up reported in the New England Journal of Medicine in 2002.
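To put the roughly 1% annual figure in perspective, the sketch below compounds a constant yearly risk over a follow-up horizon. The constant-risk assumption and the function name `cumulative_progression_risk` are illustrative and not part of the dataset; the calculation also ignores death as a competing event, so it is only a rough approximation.

```python
def cumulative_progression_risk(annual_risk: float, years: int) -> float:
    """Probability of at least one progression within `years`,
    assuming the same independent risk every year (a simplification)."""
    if not 0.0 <= annual_risk <= 1.0:
        raise ValueError("annual_risk must be between 0 and 1")
    if years < 0:
        raise ValueError("years must be non-negative")
    # Chance of staying progression-free every year, then take the complement.
    return 1.0 - (1.0 - annual_risk) ** years


for horizon in (10, 20, 30):
    risk = cumulative_progression_risk(0.01, horizon)
    print(f"{horizon} years: {risk:.1%}")
```

Under these assumptions the cumulative risk is a little under 10% at 10 years and about 26% at 30 years, which is why decades of follow-up matter for this condition.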

Columns

The data is typically structured into two main files, mgus1.csv and mgus2.csv, offering variables for detailed survival analysis:
  • id: Unique identifier for the subject.
  • age: Subject age, measured in years, at the time MGUS was detected (ranging from 34 to 90 years).
  • sex: Sex of the subject (male or female).
  • dxyr: The year the diagnosis was made.
  • pcdx / subtype: Specifies the subtype of plasma cell malignancy (e.g., multiple myeloma (MM), amyloidosis (AM), macroglobulinemia (MA), or other lymphoproliferative disorders (LP)) for subjects who experienced progression.
  • pctime: The number of days elapsed from MGUS detection until the diagnosis of a plasma cell malignancy.
  • futime: Time, in days, from diagnosis until the last recorded follow-up.
  • death: Binary indicator (1 = yes) showing if follow-up ended due to the subject's death.
  • alb: Albumin level recorded at the time of MGUS diagnosis.
  • creat: Creatinine level recorded at the time of MGUS diagnosis (missing for 18% of records).
  • hgb: Hemoglobin level recorded at the time of MGUS diagnosis.
  • mspike: The size of the monoclonal protein spike at diagnosis (ranging from 0.3 to 3.2).
  • ptime: Time until progression to a plasma cell malignancy (PCM) or last contact, measured in months (found in mgus2).
  • pstat: Occurrence of progression to PCM (0 = no, 1 = yes) (found in mgus2).
  • event: Event type at the end of the interval, commonly recorded as death (74%) or plasma cell malignancy (21%).
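As a minimal sketch of how the futime and death columns above feed a survival analysis, the following computes a Kaplan-Meier estimate using only the Python standard library. The inline rows are synthetic and for illustration only (they are not values from the real files), and the helper name `kaplan_meier` is hypothetical; in practice a dedicated survival library would usually be preferable.

```python
import csv
import io

# Synthetic rows for illustration only; NOT values from the real dataset.
SAMPLE_CSV = """id,futime,death
1,100,1
2,200,0
3,200,1
4,300,1
5,400,0
"""


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    `events` uses 1 for death and 0 for censoring. Subjects censored at a
    given time are still counted as at risk for deaths at that same time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored subjects both leave the risk set
    return curve


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
times = [int(r["futime"]) for r in rows]
events = [int(r["death"]) for r in rows]
curve = kaplan_meier(times, events)

for t, s in curve:
    print(f"day {t}: S(t) = {s:.2f}")
```

To run this against a real file, the `io.StringIO(SAMPLE_CSV)` handle would be replaced by an opened CSV file, assuming the column names match those listed above.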

Distribution

The data are usually distributed as CSV files. The core cohort comprises 305 records covering patient demographics and diagnostic variables such as age and sex. The files, including mgus1.csv, typically have 15 columns. The values may have been slightly perturbed to protect patient confidentiality when the data were packaged for statistical platforms such as the R survival package, but the essential statistical results are preserved.
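Before analysis, it can help to confirm that a downloaded file matches the record and column counts described here. The sketch below counts rows, columns, and the fraction of missing values per column. The inline CSV is a synthetic stand-in, the function name `summarise_csv` is hypothetical, and treating empty strings and "NA" as missing is an assumption about how the files encode gaps.

```python
import csv
import io

# Synthetic stand-in for a file such as mgus1.csv; NOT real patient data.
SAMPLE_CSV = """id,age,sex,creat
1,70,female,1.1
2,65,male,
3,80,male,NA
4,55,female,0.9
"""

MISSING_MARKERS = {"", "NA"}  # assumption: how missing values are encoded


def summarise_csv(handle):
    """Return (row_count, column_count, {column: fraction_missing})."""
    reader = csv.DictReader(handle)
    columns = reader.fieldnames or []
    missing = {name: 0 for name in columns}
    n_rows = 0
    for row in reader:
        n_rows += 1
        for name in columns:
            value = row[name]
            if value is None or value.strip() in MISSING_MARKERS:
                missing[name] += 1
    fractions = {
        name: (missing[name] / n_rows if n_rows else 0.0) for name in columns
    }
    return n_rows, len(columns), fractions


n_rows, n_cols, missing_fraction = summarise_csv(io.StringIO(SAMPLE_CSV))
print(f"{n_rows} rows, {n_cols} columns")
print(f"creat missing: {missing_fraction['creat']:.0%}")
```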

Usage

This longitudinal data set is considered foundational in medical statistics and is extensively cited in medical literature and textbooks. Researchers and analysts utilise this information to:
  • Study the specific natural history and pathway of MGUS progression.
  • Identify significant risk factors contributing to progression toward multiple myeloma and related malignancies.
  • Develop and rigorously test prognostic models for plasma cell disorders.
  • Practise and demonstrate advanced survival analysis techniques, such as extensions of the Cox model.
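Many of these analyses start by reducing each subject to a single first event. The sketch below shows one common convention for a competing-risks setup, built from the pctime, futime, and death columns: progression counts if it was observed, otherwise death, otherwise the subject is censored at last follow-up. The function name `first_event`, the labels "pcm", "death", and "censored", and the use of `None` for a missing pctime are all assumptions made for illustration; the records are synthetic.

```python
from collections import Counter
from typing import Optional, Tuple


def first_event(pctime: Optional[int], futime: int, death: int) -> Tuple[int, str]:
    """Return (time_in_days, label) for the first event a subject experienced."""
    if pctime is not None and pctime <= futime:
        return pctime, "pcm"       # progressed to a plasma cell malignancy
    if death == 1:
        return futime, "death"     # died without a recorded progression
    return futime, "censored"      # alive and progression-free at last contact


# Synthetic (pctime, futime, death) triples; NOT real patient data.
records = [
    (500, 900, 1),    # progressed at day 500, died at day 900
    (None, 1200, 1),  # died without progression
    (None, 3000, 0),  # alive without progression
    (2000, 2000, 0),  # progression found at last follow-up
]

outcomes = [first_event(*record) for record in records]
counts = Counter(label for _, label in outcomes)
print(outcomes)
print(dict(counts))
```

Coding death without progression as its own event, rather than as censoring, is what lets a cumulative incidence or multi-state model treat the two outcomes separately.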

Coverage

The data originate from sequential patients followed longitudinally at the Mayo Clinic in Rochester, Minnesota, USA. Subjects were aged 34 to 90 years at detection, and diagnosis years cluster around the late 1960s and early 1970s. The long-term follow-up gives a decades-long view of the condition.

License

CC0: Public Domain

Who Can Use It

This material is ideal for:
  • Hematologists and Oncologists: To inform clinical research on plasma cell disorder dynamics and early detection protocols.
  • Biostatisticians: For methodological studies, especially for developing and testing novel survival models.
  • Academic Researchers: To explore disease associations and validate existing prognostic indicators, particularly those concerning M-protein concentration and light-chain ratio.
  • Students and Educators: As a recognised standard dataset for teaching statistical methods, particularly in medical data analysis.

Dataset Name Suggestions

  • Mayo Clinic MGUS Progression Study
  • Kyle MGUS Longitudinal Survival Data
  • Monoclonal Gammopathy Long-Term Follow-up
  • Plasma Cell Disorder Progression Risk Factors

Listing Stats

  • Views: 2
  • Downloads: 0
  • Listed: 11/11/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
