Opendatabay APP

Open-Source CVE and CWE Analysis Data

Public Safety & Security

Tags and Keywords

Vulnerability

Cybersecurity

Software

NVD

CVE

Free

About

This data product contains vulnerabilities in Open-Source Software (OSS) reported to the National Vulnerability Database (NVD). It covers vulnerabilities reported since 1 January 2010 and is refreshed automatically on the first day of each month. The data is useful for general analysis and for Natural Language Processing (NLP) projects, such as predicting vulnerability types from descriptions, and it suits both beginner and intermediate practitioners.

Columns

  • cve_id: A unique identifier for each vulnerability reported to the NVD, known as a Common Vulnerabilities and Exposures (CVE) identifier.
  • cwe_id: A unique identifier for the type of software weakness, such as a buffer overflow. This is known as Common Weakness Enumeration.
  • cpe_id: An identifier for the specific Open-Source Software project and the latest version affected by the vulnerability. This is known as Common Platform Enumeration.
  • description: A text description of the vulnerability within the specified software project.
  • status: The analysis stage of the reported vulnerability (e.g., Analyzed, Modified).
  • created_at: The date and time when the vulnerability report was created.
  • modified_at: The date and time when the vulnerability analysis was last modified.
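To make the column list concrete, the seven fields could be modelled as a typed record. This is a minimal sketch, not the publisher's schema: the class name VulnerabilityRecord, the helper parse_record, the ISO-style timestamp format, and the sample values are all assumptions for illustration. The CVE pattern follows the standard CVE-YYYY-NNNN shape (four or more digits in the sequence part).

```python
import re
from dataclasses import dataclass
from datetime import datetime

# Standard CVE ID shape: "CVE-", a four-digit year, then four or more digits.
CVE_PATTERN = re.compile(r"^CVE-\d{4}-\d{4,}$")


@dataclass
class VulnerabilityRecord:
    """One row of the dataset (hypothetical typed view of the 7 columns)."""
    cve_id: str
    cwe_id: str
    cpe_id: str
    description: str
    status: str
    created_at: datetime
    modified_at: datetime


def parse_record(row):
    """Build a typed record from one CSV row given as a dict of strings.

    Assumes timestamps are ISO-like, e.g. "2010-01-04 10:30:00"; adjust the
    parsing if the downloaded file uses a different format.
    """
    if not CVE_PATTERN.match(row["cve_id"]):
        raise ValueError(f"unexpected cve_id: {row['cve_id']!r}")
    return VulnerabilityRecord(
        cve_id=row["cve_id"],
        cwe_id=row["cwe_id"],
        cpe_id=row["cpe_id"],
        description=row["description"],
        status=row["status"],
        created_at=datetime.fromisoformat(row["created_at"]),
        modified_at=datetime.fromisoformat(row["modified_at"]),
    )


# Placeholder row with made-up values, not taken from the dataset.
example_row = {
    "cve_id": "CVE-2099-0001",
    "cwe_id": "CWE-79",
    "cpe_id": "cpe:2.3:a:example_vendor:example_project:1.0:*:*:*:*:*:*:*",
    "description": "Made-up description for illustration.",
    "status": "Analyzed",
    "created_at": "2099-01-04 10:30:00",
    "modified_at": "2099-02-01 08:00:00",
}
record = parse_record(example_row)
print(record.cve_id, record.created_at.year)
```

The cwe_id value is deliberately not validated here, since NVD data can also carry placeholder weakness values rather than a numbered CWE.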

Distribution

The data is provided as a single CSV file named cve_data.csv with a size of 70.63 MB. It contains 7 columns and approximately 169,000 records.
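Because the file is around 70 MB, it may be convenient to stream it row by row rather than load it all at once. The sketch below uses only Python's standard csv module; the function name iter_rows, the UTF-8 encoding, and the local path in the comment are assumptions, and the in-memory sample uses made-up values so the code runs without the download.

```python
import csv
import io
from typing import Iterator, TextIO

# Column names as given in the listing; their order in the file is not assumed.
EXPECTED_COLUMNS = [
    "cve_id", "cwe_id", "cpe_id", "description",
    "status", "created_at", "modified_at",
]


def iter_rows(handle: TextIO) -> Iterator[dict]:
    """Yield each CSV row as a dict after checking that all columns exist."""
    reader = csv.DictReader(handle)
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    yield from reader


# With the real file (path and encoding are assumptions):
# with open("cve_data.csv", newline="", encoding="utf-8") as f:
#     for row in iter_rows(f):
#         ...

# Tiny in-memory stand-in with a made-up row.
sample = io.StringIO(
    "cve_id,cwe_id,cpe_id,description,status,created_at,modified_at\n"
    "CVE-2099-0001,CWE-79,cpe:2.3:a:example:proj:1.0,Made-up text,Analyzed,"
    "2099-01-04 10:30:00,2099-02-01 08:00:00\n"
)
rows = list(iter_rows(sample))
print(len(rows), rows[0]["status"])
```

The header check runs when the first row is requested, so a file with unexpected columns fails early rather than midway through an analysis.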

Usage

Ideal applications for this dataset include:
  • Performing data analysis on software vulnerability trends over time.
  • Developing machine learning models to classify or predict vulnerability types from their textual descriptions.
  • Serving as a practical project for individuals at beginner and intermediate levels in Natural Language Processing (NLP).
  • Informing cybersecurity research and threat intelligence activities.
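As a starting point for the classification use case, a small multinomial naive Bayes model can map description text to a cwe_id. This is a minimal standard-library sketch rather than a recommended pipeline: the class NaiveBayes, the tokenizer, and the four training sentences are made up for illustration and do not come from the dataset. A real project would train on the description and cwe_id columns and would likely use a dedicated machine learning library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words unseen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training sentences; labels use two well-known CWE identifiers
# (CWE-79: cross-site scripting, CWE-119: memory buffer bounds errors).
train_texts = [
    "cross-site scripting allows remote attackers to inject arbitrary web script",
    "stored XSS in comment field allows injection of script",
    "buffer overflow allows attackers to execute arbitrary code via a long string",
    "heap-based buffer overflow causes memory corruption",
]
train_labels = ["CWE-79", "CWE-79", "CWE-119", "CWE-119"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("reflected XSS via script injection in search page"))
print(model.predict("stack buffer overflow leads to memory corruption"))
```

With only four training sentences the model is a toy; on the full data, a held-out split would be needed to measure how well descriptions actually predict the weakness type.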

Coverage

  • Geographic: Not applicable, as open-source software vulnerabilities are global by nature.
  • Time Range: The dataset includes vulnerabilities reported from 4 January 2010 to 6 March 2023. It receives monthly updates.
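Since the file is refreshed monthly, the range in a downloaded copy may differ from the dates stated above. The following sketch shows one way to check the earliest and latest created_at values and count records per year. The function name summarize_dates and the ISO-like timestamp format are assumptions, and the sample rows are placeholders rather than real records.

```python
from collections import Counter
from datetime import datetime


def summarize_dates(rows):
    """Return (earliest, latest, per-year counts) for created_at values.

    Each row is a dict with a "created_at" string such as
    "2010-01-04 09:00:00" (format assumed, not confirmed by the listing).
    """
    stamps = [datetime.fromisoformat(r["created_at"]) for r in rows]
    if not stamps:
        raise ValueError("no rows to summarize")
    per_year = Counter(s.year for s in stamps)
    return min(stamps), max(stamps), dict(sorted(per_year.items()))


# Placeholder rows; only the created_at field is needed here.
sample_rows = [
    {"created_at": "2010-01-04 09:00:00"},
    {"created_at": "2010-07-01 12:00:00"},
    {"created_at": "2023-03-06 00:00:00"},
]
earliest, latest, per_year = summarize_dates(sample_rows)
print(earliest.date(), latest.date(), per_year)
```

The per-year counts also give a first view of vulnerability trends over time before any deeper analysis.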

License

CC0: Public Domain

Who Can Use It

  • Data Scientists: For building predictive models and conducting trend analysis on software weaknesses.
  • Cybersecurity Analysts: For research into vulnerability patterns and for threat intelligence.
  • Students and Beginners: As a foundational dataset for projects in data analysis and NLP.
  • Software Developers: To better understand common vulnerabilities present in open-source components.

Dataset Name Suggestions

  • NVD Open-Source Software Vulnerabilities
  • Cybersecurity Vulnerability Records (2010-Present)
  • Open-Source CVE and CWE Analysis Data
  • National Vulnerability Database OSS Records

Listing Stats

  • Views: 2
  • Downloads: 0
  • Listed: 28/09/2025
  • Region: Global
  • Universal Data Quality Score (UDQS): 5 / 5
  • Version: 1.0
