Opendatabay APP

Global SARS-CoV-2 Variant Sequencing Data

Patient Health Records & Digital Health

Tags and Keywords

Variants, Covid, Gisaid, Sequencing, Global


Free

About

This dataset describes the global distribution and prevalence of SARS-CoV-2 variants. It supports tracking the spread of variants worldwide, including Variants of Concern (VoC) such as Alpha, Beta, and Gamma. The data enables analysis of sequence counts and their relative frequencies by geographic location and observation date, making it a useful resource for epidemiological research and for understanding public health dynamics during the pandemic.

Columns

  • location: The name of the country or region where the observation was made.
  • date: The specific date of the observation.
  • variant: The name of the SARS-CoV-2 variant. This column uses the WHO label for Variants of Concern (VoC) and Variants of Interest (VoI), and the Pango Lineage for other classifications.
  • num_sequences: The absolute number of sequenced samples identified as belonging to the specified variant category.
  • perc_sequences: The calculated percentage of sequenced samples that correspond to the specified variant category.
  • num_sequences_total: The overall total number of samples sequenced during the preceding two weeks.
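The relationship between the three numeric columns can be checked with a short sketch. It assumes that perc_sequences is num_sequences divided by num_sequences_total, expressed as a percentage; the listing does not state the denominator explicitly, so treat this as an assumption to verify against the file. The sample rows and the helper name recompute_share are illustrative placeholders, not values or names from the dataset.

```python
import csv
import io

# Synthetic rows for illustration only; these are NOT values from the dataset.
SAMPLE_CSV = """location,date,variant,num_sequences,perc_sequences,num_sequences_total
Exampleland,2021-01-04,Alpha,30,60.0,50
Exampleland,2021-01-04,Beta,20,40.0,50
"""


def recompute_share(row):
    """Return num_sequences as a percentage of num_sequences_total.

    Assumption: this is how perc_sequences is derived.
    """
    total = int(row["num_sequences_total"])
    if total == 0:
        return 0.0  # avoid division by zero when nothing was sequenced
    return 100.0 * int(row["num_sequences"]) / total


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
shares = [round(recompute_share(r), 2) for r in rows]

# Compare the recomputed share with the value stored in the row.
mismatches = [
    r for r, s in zip(rows, shares)
    if abs(float(r["perc_sequences"]) - s) > 0.01
]
```

If mismatches is non-empty on the real file, the assumed formula or the rounding tolerance probably needs adjusting.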

Distribution

The data is delivered as a single CSV file, covid-variants.csv, of approximately 1.37 MB. It contains six columns and approximately 37,400 valid records. The expected update frequency is weekly.
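A minimal loading sketch, using only the Python standard library, might look like the following. It assumes the CSV header uses exactly the six column names listed under Columns; the function name load_variant_rows is hypothetical, and the demo input is a synthetic one-row file rather than real data.

```python
import csv
import io

# Column names as given in the listing; the exact header is an assumption.
EXPECTED_COLUMNS = {
    "location",
    "date",
    "variant",
    "num_sequences",
    "perc_sequences",
    "num_sequences_total",
}


def load_variant_rows(handle):
    """Read all rows from an open CSV handle after checking the header."""
    reader = csv.DictReader(handle)
    found = set(reader.fieldnames or [])
    if found != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {sorted(found)}")
    return list(reader)


# Synthetic one-row file for illustration only.
demo = io.StringIO(
    "location,date,variant,num_sequences,perc_sequences,num_sequences_total\n"
    "Exampleland,2021-01-04,Alpha,30,60.0,50\n"
)
demo_rows = load_variant_rows(demo)
```

With the downloaded file in the working directory, the same function could be called as `load_variant_rows(open("covid-variants.csv", newline="", encoding="utf-8"))`; the UTF-8 encoding is also an assumption.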

Usage

This dataset is ideal for epidemiological modelling of infectious disease spread and mutation rates. It can be used for spatial-temporal analysis of variant emergence and dominance. Public health organisations can use it for real-time monitoring and setting policy. Furthermore, it serves as valuable training data for Artificial Intelligence projects focused on predicting future outbreak trends or variant risks.
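As one illustration of analysing variant dominance over space and time, the sketch below picks the variant with the highest num_sequences for each (location, date) pair. The function name dominant_variants and the sample rows are hypothetical; the rows are synthetic and only mirror the column layout described above.

```python
import csv
import io

# Synthetic rows for illustration only; these are NOT values from the dataset.
SAMPLE_CSV = """location,date,variant,num_sequences,perc_sequences,num_sequences_total
Exampleland,2021-01-04,Alpha,30,60.0,50
Exampleland,2021-01-04,Beta,20,40.0,50
Exampleland,2021-01-18,Alpha,10,20.0,50
Exampleland,2021-01-18,Delta,40,80.0,50
"""


def dominant_variants(rows):
    """Map each (location, date) to the variant with the most sequences."""
    best = {}
    for row in rows:
        key = (row["location"], row["date"])
        count = int(row["num_sequences"])
        # Keep the first variant seen on ties.
        if key not in best or count > best[key][1]:
            best[key] = (row["variant"], count)
    return {key: variant for key, (variant, _count) in best.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
dominant = dominant_variants(rows)
```

Tracking how the dominant variant changes between consecutive dates for a location gives a simple view of variant emergence and replacement.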

Coverage

The data covers 89 unique countries or regions globally. The temporal scope of the observations ranges from 11 May 2020 through to 23 August 2021.
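The coverage figures can be recomputed from the file with a sketch like the one below. It assumes the date column holds ISO 8601 strings (YYYY-MM-DD), which the listing does not confirm; the function name coverage_summary and the sample rows are illustrative.

```python
from datetime import date

# Synthetic rows for illustration only; these are NOT values from the dataset.
sample_rows = [
    {"location": "Exampleland", "date": "2021-01-04"},
    {"location": "Exampleland", "date": "2021-01-18"},
    {"location": "Samplestan", "date": "2020-12-21"},
]


def coverage_summary(rows):
    """Return (number of unique locations, earliest date, latest date)."""
    locations = {row["location"] for row in rows}
    # Assumption: dates are ISO 8601 strings such as "2021-01-04".
    dates = [date.fromisoformat(row["date"]) for row in rows]
    return len(locations), min(dates), max(dates)


n_locations, first_date, last_date = coverage_summary(sample_rows)
```

Run against the full file, the result would be expected to match the figures stated above: 89 locations, from 11 May 2020 to 23 August 2021.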

License

CC0: Public Domain

Who Can Use It

  • Public Health Officials: To monitor the geographical spread and impact of new variants.
  • Epidemiologists: To conduct research into mutation kinetics and transmission rates.
  • Academic Researchers: For studying global infectious disease patterns using Computer Science and Biology methodologies.
  • Data Journalists: To visualise and report on critical health crisis developments worldwide.

Dataset Name Suggestions

  • Global SARS-CoV-2 Variant Sequencing Data
  • Worldwide Covid Variant Prevalence Tracker
  • GISAID-Derived Variant Monitoring Data
  • Variant Sequencing Data by Country

Attributes

Listing Stats

  • Views: 2
  • Downloads: 0
  • Listed: 26/10/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
