CoVariants Nextstrain Data Overview
About
Provides a detailed overview of SARS-CoV-2 variants and specific mutations that are currently of interest. Users can identify the mutations that define a variant, follow links to academic papers and other resources on the potential impact of these variants, track where variants have been found worldwide, and view the variants within Nextstrain. Lineage information is standardised using the Nextstrain naming system.
Columns
- variant: Identifies the specific SARS-CoV-2 variant grouping or mutation (e.g., S.P681 or S.N501). There are 58 unique variant types catalogued.
- Country: Indicates the specific geographic location where the sequence data originated. This field covers 182 distinct countries or regions, including the USA and Canada.
- first_seq: The date on which the earliest sequence corresponding to the variant grouping was recorded (minimum date recorded is 22 October 2019).
- last_seq: The date on which the most recent sequence corresponding to the variant grouping was recorded (maximum date recorded is 28 November 2021).
- num_seqs: The total count of sequences associated with that specific variant grouping, with sequence counts reaching up to 1.29 million.
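The columns above map naturally onto typed fields when the file is loaded. The following is a minimal sketch using pandas, assuming the dates are stored in an ISO-like format that `read_csv` can parse. The `SAMPLE_CSV` rows and the `load_variants` helper are hypothetical and exist only to make the example runnable; in practice the path to variants.csv would be passed instead.

```python
import io

import pandas as pd

# Illustrative rows only: these values are made up and are NOT taken from variants.csv.
SAMPLE_CSV = """variant,Country,first_seq,last_seq,num_seqs
S.N501,USA,2020-04-01,2021-11-20,1000
S.N501,Canada,2020-06-15,2021-11-10,250
S.P681,USA,2020-03-10,2021-11-28,400
"""


def load_variants(source):
    """Read the CSV, parsing the two date columns and typing the sequence count."""
    df = pd.read_csv(source, parse_dates=["first_seq", "last_seq"])
    df["num_seqs"] = df["num_seqs"].astype("int64")
    return df


# Swap the StringIO object for the real file path when working with the download.
df = load_variants(io.StringIO(SAMPLE_CSV))
print(df.dtypes)
```

Parsing the dates at load time keeps later comparisons and range calculations on `first_seq` and `last_seq` free of string handling.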
Distribution
The data is delivered in CSV format as the file variants.csv, which is approximately 208.65 kB in size. This resource contains 6 columns and holds 4,314 valid records. The data is expected to be updated on a monthly basis, ensuring timely access to information regarding emerging viral strains.
Usage
This resource supports public health analysis by letting users track the global emergence and spread of specific viral mutations. It can serve as a foundation for forecasting variant expansion across countries or regions. Researchers can use the data to create graphics, conduct epidemiological studies, and develop predictive models of future trends in viral evolution.
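As a starting point for tracking spread, the per-country rows can be rolled up into one summary per variant. The sketch below assumes each record describes one variant in one country, which the column descriptions suggest but do not state outright. The `records` values are made up for illustration, and `summarise_by_variant` is a hypothetical helper name.

```python
import pandas as pd

# Made-up rows for illustration only; NOT taken from variants.csv.
records = pd.DataFrame(
    {
        "variant": ["S.N501", "S.N501", "S.P681"],
        "Country": ["USA", "Canada", "USA"],
        "first_seq": pd.to_datetime(["2020-04-01", "2020-06-15", "2020-03-10"]),
        "last_seq": pd.to_datetime(["2021-11-20", "2021-11-10", "2021-11-28"]),
        "num_seqs": [1000, 250, 400],
    }
)


def summarise_by_variant(df):
    """Aggregate per-country rows into one row per variant."""
    summary = df.groupby("variant").agg(
        total_seqs=("num_seqs", "sum"),    # sequences across all countries
        countries=("Country", "nunique"),  # how widely the variant was seen
        first_seen=("first_seq", "min"),   # earliest sequence anywhere
        last_seen=("last_seq", "max"),     # most recent sequence anywhere
    )
    return summary.sort_values("total_seqs", ascending=False)


summary = summarise_by_variant(records)
print(summary)
```

The `countries` and `first_seen` columns give a rough picture of how far and how early a variant spread, which could feed a chart or a simple forecasting model.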
Coverage
Geographic coverage is broad, incorporating sequence data originating from 182 unique countries and regions worldwide. The temporal scope spans over two years, with sequence records beginning on 22 October 2019 and the latest sequence records concluding on 28 November 2021. The dataset strictly focuses on viral genomic sequencing information.
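The coverage figures quoted above can be checked against a downloaded copy of the file. This is a minimal sketch that assumes the date columns have already been parsed as datetimes. `coverage_report` is a hypothetical helper and the `sample` rows are made up, so the numbers it prints here do not reflect the real file. Run against variants.csv, the listing suggests it should report 182 countries, 58 variants, and a range from 22 October 2019 to 28 November 2021.

```python
import pandas as pd

# Made-up rows for illustration only; NOT taken from variants.csv.
sample = pd.DataFrame(
    {
        "variant": ["S.N501", "S.N501", "S.P681"],
        "Country": ["USA", "Canada", "USA"],
        "first_seq": pd.to_datetime(["2020-04-01", "2020-06-15", "2020-03-10"]),
        "last_seq": pd.to_datetime(["2021-11-20", "2021-11-10", "2021-11-28"]),
        "num_seqs": [1000, 250, 400],
    }
)


def coverage_report(df):
    """Summarise the geographic and temporal coverage of the records."""
    earliest = df["first_seq"].min()
    latest = df["last_seq"].max()
    return {
        "countries": int(df["Country"].nunique()),
        "variants": int(df["variant"].nunique()),
        "earliest": earliest,
        "latest": latest,
        "span_days": (latest - earliest).days,
    }


report = coverage_report(sample)
for key, value in report.items():
    print(f"{key}: {value}")
```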
License
CC0: Public Domain
Who Can Use It
- Virologists and Epidemiologists: To study mutational impact and track viral lineage evolution.
- Public Health Agencies: For monitoring risk, informing policy decisions, and assessing the spread of variants.
- Data Scientists: For developing predictive models that chart the geographic expansion of variants.
- Researchers: To perform large-scale epidemiological studies and generate detailed analysis.
Dataset Name Suggestions
- SARS-CoV-2 Global Variant Tracker
- CoVariants Nextstrain Data Overview
- COVID-19 Viral Mutation Sequence Totals
- Global Pathogen Genomic Surveillance
Attributes
Original Data Source: CoVariants Nextstrain Data Overview