CoVariants Nextstrain Data Overview

Patient Health Records & Digital Health

Tags and Keywords

Coronavirus

Variants

Nextstrain

Mutations

Sequences

Free

About

This dataset provides an overview of SARS-CoV-2 variants and specific mutations that are currently of interest. It lets users identify the mutations that define a variant, follow links to academic papers and other resources on each variant's potential impact, see where in the world each variant has been found, and visualise the variants in Nextstrain. Lineage information is standardised using the Nextstrain naming system.

Columns

  • variant: Identifies the specific SARS-CoV-2 variant grouping or mutation (e.g., S.P681 or S.N501). There are 58 unique variant types catalogued.
  • Country: Indicates the specific geographic location where the sequence data originated. This field covers 182 distinct countries or regions, including the USA and Canada.
  • first_seq: The date on which the earliest sequence corresponding to the variant grouping was recorded (minimum date recorded is 22 October 2019).
  • last_seq: The date on which the most recent sequence corresponding to the variant grouping was recorded (maximum date recorded is 28 November 2021).
  • num_seqs: The total number of sequences associated with that variant grouping. The largest count is about 1.29 million.
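
The column descriptions above can be turned into a small typed loader. The following is a minimal sketch, not the publisher's own tooling: the sample rows are invented for illustration, and both the exact header spelling and the ISO date format (YYYY-MM-DD) are assumptions that should be checked against the real variants.csv.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date

# Invented sample rows for illustration only; they are NOT taken from variants.csv.
# Header names and the ISO date format are assumptions based on the column list.
SAMPLE_CSV = """variant,Country,first_seq,last_seq,num_seqs
S.N501,USA,2020-10-01,2021-11-20,1200
S.N501,Canada,2020-11-15,2021-11-18,300
S.P681,USA,2020-09-05,2021-11-25,2500
"""


@dataclass(frozen=True)
class VariantRecord:
    variant: str
    country: str
    first_seq: date
    last_seq: date
    num_seqs: int


def parse_records(handle):
    """Parse CSV text into VariantRecord objects.

    Header names are lower-cased so that "Country" and "country" both work.
    """
    records = []
    for raw in csv.DictReader(handle):
        # Skip the None key that DictReader uses for surplus, unnamed fields.
        row = {k.strip().lower(): (v or "").strip()
               for k, v in raw.items() if k is not None}
        records.append(VariantRecord(
            variant=row["variant"],
            country=row["country"],
            first_seq=date.fromisoformat(row["first_seq"]),
            last_seq=date.fromisoformat(row["last_seq"]),
            num_seqs=int(row["num_seqs"]),
        ))
    return records


records = parse_records(io.StringIO(SAMPLE_CSV))
```

To read the downloaded file instead of the sample, pass an open file handle, for example `parse_records(open("variants.csv", newline="", encoding="utf-8"))`. If the real dates are not in ISO format, `date.fromisoformat` will raise a `ValueError`, which is a useful early signal that the assumption does not hold.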

Distribution

The data is delivered as a single CSV file, variants.csv, of approximately 208.65 kB. It contains 6 columns (five of which are described under Columns) and 4,314 valid records. Updates are expected monthly.

Usage

This resource supports public health analysis by letting users track the global emergence and spread of specific viral mutations. It can serve as a foundation for forecasting variant expansion across countries or regions. Researchers can use the data to create graphics, conduct epidemiological studies, and develop predictive models of future trends in viral evolution.
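
As a starting point for the kind of tracking described above, the sketch below totals sequence counts per variant and ranks countries by their share of one variant. It is illustrative only: the sample rows are invented, the helper names (`load_rows`, `totals_by_variant`, `country_shares`) are hypothetical, and the lower-cased column names are an assumption about the real file.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows for illustration only; NOT taken from variants.csv.
SAMPLE_CSV = """variant,Country,first_seq,last_seq,num_seqs
S.N501,USA,2020-10-01,2021-11-20,1200
S.N501,Canada,2020-11-15,2021-11-18,300
S.P681,USA,2020-09-05,2021-11-25,2500
"""


def load_rows(handle):
    """Read CSV rows as dicts with lower-cased header names."""
    return [
        {key.strip().lower(): value for key, value in row.items() if key is not None}
        for row in csv.DictReader(handle)
    ]


def totals_by_variant(rows):
    """Sum num_seqs over all countries for each variant grouping."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["variant"]] += int(row["num_seqs"])
    return dict(totals)


def country_shares(rows, variant):
    """Return (country, share) pairs for one variant, largest share first."""
    counts = defaultdict(int)
    for row in rows:
        if row["variant"] == variant:
            counts[row["country"]] += int(row["num_seqs"])
    total = sum(counts.values())
    if total == 0:
        return []  # Unknown variant, or no sequences recorded.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(country, count / total) for country, count in ranked]


rows = load_rows(io.StringIO(SAMPLE_CSV))
totals = totals_by_variant(rows)
shares = country_shares(rows, "S.N501")
```

The listing does not say whether a single sequence can be counted under more than one variant grouping. If it can, per-variant totals remain meaningful, but adding totals across different groupings would double count.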

Coverage

Geographic coverage is broad, with sequence data from 182 unique countries and regions worldwide. The temporal scope spans just over two years: the earliest sequence record is dated 22 October 2019 and the latest 28 November 2021. The dataset contains only viral genomic sequencing information.
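
The coverage figures above can be recomputed from the file itself. This is a minimal sketch under the same assumptions as before (invented sample rows, lower-cased column names, ISO dates); `summarise_coverage` is a hypothetical helper name. The last line also works out the span implied by the two dates stated in this listing.

```python
import csv
import io
from datetime import date

# Invented sample rows for illustration only; NOT taken from variants.csv.
SAMPLE_CSV = """variant,Country,first_seq,last_seq,num_seqs
S.N501,USA,2020-10-01,2021-11-20,1200
S.N501,Canada,2020-11-15,2021-11-18,300
S.P681,USA,2020-09-05,2021-11-25,2500
"""


def summarise_coverage(handle):
    """Return distinct-country count, earliest first_seq, latest last_seq, and span in days."""
    countries = set()
    earliest = None
    latest = None
    for raw in csv.DictReader(handle):
        row = {k.strip().lower(): (v or "").strip()
               for k, v in raw.items() if k is not None}
        countries.add(row["country"])
        first = date.fromisoformat(row["first_seq"])
        last = date.fromisoformat(row["last_seq"])
        earliest = first if earliest is None else min(earliest, first)
        latest = last if latest is None else max(latest, last)
    span_days = (latest - earliest).days if earliest and latest else None
    return {
        "countries": len(countries),
        "earliest": earliest,
        "latest": latest,
        "span_days": span_days,
    }


coverage = summarise_coverage(io.StringIO(SAMPLE_CSV))

# Span implied by the dates stated in the listing (22 October 2019 to 28 November 2021).
STATED_SPAN_DAYS = (date(2021, 11, 28) - date(2019, 10, 22)).days
```

Run against the real file, the `countries`, `earliest`, and `latest` values should match the 182 countries and regions and the two dates quoted above. The stated dates are 768 days apart, or roughly 2.1 years.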

License

CC0: Public Domain

Who Can Use It

  • Virologists and Epidemiologists: To study mutational impact and track viral lineage evolution.
  • Public Health Agencies: For monitoring risk, informing policy decisions, and assessing the spread of variants.
  • Data Scientists: For developing predictive models that chart the geographic expansion of variants.
  • Researchers: To perform large-scale epidemiological studies and produce detailed analyses.

Dataset Name Suggestions

  • SARS-CoV-2 Global Variant Tracker
  • CoVariants Nextstrain Data Overview
  • COVID-19 Viral Mutation Sequence Totals
  • Global Pathogen Genomic Surveillance

Attributes

Listing Stats

  • Views: 0
  • Downloads: 0
  • Listed: 04/10/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0