Global SARS-CoV-2 Variant Sequencing Data
Patient Health Records & Digital Health
Free
About
This dataset describes the global distribution and prevalence of SARS-CoV-2 variants. It supports tracking the spread of COVID-19 mutations worldwide, including Variants of Concern (VoC) such as Alpha, Beta, and Gamma. The data allows analysis of sequence counts and their relative frequencies by geographic location and observation date, which makes it useful for epidemiological research and for understanding public health dynamics during the pandemic.
Columns
- location: The name of the country or region where the observation was made.
- date: The specific date of the observation.
- variant: The name of the SARS-CoV-2 variant. This column uses the WHO label for Variants of Concern (VoC) and Variants of Interest (VoI), and the Pango Lineage for other classifications.
- num_sequences: The absolute number of sequenced samples identified as belonging to the specified variant category.
- perc_sequences: The calculated percentage of sequenced samples that correspond to the specified variant category.
- num_sequences_total: The overall total number of samples sequenced during the preceding two weeks.
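As a rough illustration of how the columns relate, the sketch below parses a few rows in the six-column layout and recomputes the percentage from the counts. The sample rows are invented for illustration, and the formula perc_sequences = 100 * num_sequences / num_sequences_total is an assumption inferred from the column descriptions, not a documented definition.

```python
import csv
import io

# Hypothetical sample rows in the six-column layout described above.
# The values are invented for illustration and are not taken from the dataset.
SAMPLE_CSV = """location,date,variant,num_sequences,perc_sequences,num_sequences_total
Exampleland,2021-01-04,Alpha,30,60.0,50
Exampleland,2021-01-04,Beta,5,10.0,50
Exampleland,2021-01-04,Gamma,15,30.0,50
"""


def recompute_share(num_sequences, num_sequences_total):
    """Return the variant's share of sequenced samples as a percentage.

    Assumes perc_sequences = 100 * num_sequences / num_sequences_total.
    """
    if num_sequences_total == 0:
        return 0.0
    return 100.0 * num_sequences / num_sequences_total


def load_rows(handle):
    """Parse CSV rows and convert the numeric columns to int or float."""
    rows = []
    for rec in csv.DictReader(handle):
        rows.append({
            "location": rec["location"],
            "date": rec["date"],
            "variant": rec["variant"],
            "num_sequences": int(rec["num_sequences"]),
            "perc_sequences": float(rec["perc_sequences"]),
            "num_sequences_total": int(rec["num_sequences_total"]),
        })
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    recomputed = recompute_share(row["num_sequences"], row["num_sequences_total"])
    # Print the stored and recomputed percentages side by side.
    print(row["variant"], row["perc_sequences"], round(recomputed, 2))
```

To run the same check on the real file, replace `io.StringIO(SAMPLE_CSV)` with an open handle on covid-variants.csv. Any mismatch between stored and recomputed values would suggest the assumed formula does not hold exactly.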
Distribution
The data is delivered in a CSV file format. The file, named covid-variants.csv, is approximately 1.37 MB in size. The structure contains six columns and holds approximately 37,400 valid records. The expected update frequency is weekly.
Usage
This dataset is ideal for epidemiological modelling of infectious disease spread and mutation rates. It can be used for spatial-temporal analysis of variant emergence and dominance. Public health organisations can use it for real-time monitoring and setting policy. Furthermore, it serves as valuable training data for Artificial Intelligence projects focused on predicting future outbreak trends or variant risks.
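One simple form of the variant dominance analysis mentioned above is to pick, for each location and date, the variant with the highest share of sequences. The following is a minimal sketch of that idea. The sample rows and the `dominant_variants` helper are hypothetical, and choosing the maximum of perc_sequences is only one possible definition of dominance.

```python
import csv
import io

# Hypothetical sample rows; values are invented for illustration.
SAMPLE_CSV = """location,date,variant,num_sequences,perc_sequences,num_sequences_total
Exampleland,2021-01-04,Alpha,10,20.0,50
Exampleland,2021-01-04,Beta,35,70.0,50
Exampleland,2021-01-04,Gamma,5,10.0,50
Exampleland,2021-01-18,Alpha,65,65.0,100
Exampleland,2021-01-18,Beta,30,30.0,100
Exampleland,2021-01-18,Gamma,5,5.0,100
"""


def dominant_variants(handle):
    """Map (location, date) to the (variant, share) pair with the highest share."""
    best = {}
    for rec in csv.DictReader(handle):
        key = (rec["location"], rec["date"])
        share = float(rec["perc_sequences"])
        # Keep the first variant seen for a key, then replace it only if
        # a later variant has a strictly larger share.
        if key not in best or share > best[key][1]:
            best[key] = (rec["variant"], share)
    return best


result = dominant_variants(io.StringIO(SAMPLE_CSV))
for (location, day), (variant, share) in sorted(result.items()):
    print(location, day, variant, share)
```

Ties are resolved in favour of the variant that appears first in the file. An analysis that needs a different tie-breaking rule, or a minimum sequence count before calling a variant dominant, would have to add that logic.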
Coverage
The data covers 89 unique countries or regions globally. The temporal scope of the observations ranges from 11 May 2020 through to 23 August 2021.
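The coverage figures above can be checked against a local copy of the file with a short script. The sketch below counts unique locations and finds the earliest and latest observation dates. It assumes the date column is in ISO format (YYYY-MM-DD), which the listing does not state, and the sample rows and the `summarise_coverage` helper are hypothetical.

```python
import csv
import io
from datetime import date

# Hypothetical sample rows; values are invented for illustration.
# Assumption: the date column uses ISO format (YYYY-MM-DD).
SAMPLE_CSV = """location,date,variant,num_sequences,perc_sequences,num_sequences_total
Exampleland,2020-05-11,Alpha,0,0.0,12
Samplestan,2020-05-25,Alpha,1,5.0,20
Exampleland,2021-08-23,Alpha,40,80.0,50
"""


def summarise_coverage(handle):
    """Return (number of unique locations, earliest date, latest date)."""
    locations = set()
    first = None
    last = None
    for rec in csv.DictReader(handle):
        locations.add(rec["location"])
        day = date.fromisoformat(rec["date"])
        if first is None or day < first:
            first = day
        if last is None or day > last:
            last = day
    return len(locations), first, last


n_locations, first_day, last_day = summarise_coverage(io.StringIO(SAMPLE_CSV))
print(n_locations, first_day, last_day)
```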
License
CC0: Public Domain
Who Can Use It
- Public Health Officials: To monitor the geographical spread and impact of new variants.
- Epidemiologists: To conduct research into mutation kinetics and transmission rates.
- Academic Researchers: For studying global infectious disease patterns using Computer Science and Biology methodologies.
- Data Journalists: To visualise and report on critical health crisis developments worldwide.
Dataset Name Suggestions
Global SARS-CoV-2 Variant Sequencing Data
Worldwide Covid Variant Prevalence Tracker
GISAID-Derived Variant Monitoring Data
Variant Sequencing Data by Country
Attributes
Original Data Source: Global SARS-CoV-2 Variant Sequencing Data