Long-term Malware Detonation PCAPs Feed Annual License

Software and Technology

Tags and Keywords

Pcap

Packets

Network

Traffic

Honeynet

Deception

Activedirectory

Windows

Adversary

Telemetry

Cybersecurity

Infosec

Threatintel

Forensics

Malware

C2

Exfiltration

Intrusion

Attack

Tls

Dns

Http

Smb

Kerberos

Ldap

Rdp

Groundtruth

Real

Authentic

£22,197

About

Deception.Pro Adversary PCAP Dataset

Full-packet captures (PCAP) collected from persistent, instrumented Windows Active Directory honeynet environments operated by Deception.Pro. Every byte in this dataset originates from real adversary activity against production-grade decoy infrastructure: no synthetic traffic, no simulated attacks, no replayed samples. The captures document genuine intrusions, including initial access, credential theft, lateral movement, command-and-control (C2) beaconing, and data exfiltration. They provide ground-truth network traffic for training detection models, validating NDR/IDS systems, and developing AI-driven security tooling.

Data Product Features

  • Full-fidelity packet captures with complete payloads, headers, and timing preserved
  • TLS-inspected sessions where applicable, with decrypted payloads for in-honeynet traffic
  • Multi-protocol coverage spanning Windows enterprise environments (SMB, Kerberos, LDAP, RDP, DNS, HTTP/S)
  • Clean ground truth — every flow is adversarial or adversary-adjacent, with minimal benign noise
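
As a rough illustration of the multi-protocol coverage listed above, the sketch below labels a flow by its well-known server port. The port numbers are the IANA defaults for the named protocols; the `label_flow` helper and the "other" fallback are hypothetical and not part of the data product. Port-based labels are only a first approximation, since adversaries often run services on non-standard ports.

```python
# IANA default server ports for the protocols named in the feature list.
WELL_KNOWN_PORTS = {
    53: "DNS",
    80: "HTTP",
    88: "Kerberos",
    389: "LDAP",
    443: "HTTPS",
    445: "SMB",
    3389: "RDP",
}


def label_flow(src_port: int, dst_port: int) -> str:
    """Return a coarse protocol label based on either endpoint's port."""
    # Check the destination port first, then fall back to the source port
    # (useful when the packet seen is a server-to-client reply).
    for port in (dst_port, src_port):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    return "other"
```

For example, `label_flow(49152, 445)` returns "SMB", while a flow between two ephemeral ports falls through to "other" and would need payload inspection to classify.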

Distribution

  • Delivery: PCAP files delivered via secure download (S3-compatible presigned URLs) or direct transfer
  • Data Volume: Multi-gigabyte corpus across hundreds of distinct intrusion sessions; ongoing collection with monthly delta releases
  • File Format: Raw .pcap (libpcap-compatible)
  • Structure: Captures organized by honeynet operation ID and date
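
Because the files are described as raw, libpcap-compatible .pcap, they can be read with nothing more than the standard library. The following is a minimal sketch, assuming the classic pcap layout (24-byte global header, 16-byte per-record headers); it does not handle pcapng, and the `read_pcap` name is hypothetical.

```python
import struct
from typing import BinaryIO, Iterator, Tuple

# Magic numbers of the classic pcap format: microsecond and nanosecond variants.
MAGIC_USEC = 0xA1B2C3D4
MAGIC_NSEC = 0xA1B23C4D


def read_pcap(stream: BinaryIO) -> Iterator[Tuple[float, bytes]]:
    """Yield (timestamp_seconds, packet_bytes) for each record in a pcap stream."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    # The magic number reveals the byte order used by the capturing host.
    for endian in ("<", ">"):
        magic = struct.unpack(endian + "I", header[:4])[0]
        if magic in (MAGIC_USEC, MAGIC_NSEC):
            break
    else:
        raise ValueError("not a classic pcap file")
    divisor = 1e9 if magic == MAGIC_NSEC else 1e6
    while True:
        record = stream.read(16)
        if len(record) < 16:
            return  # clean end of file (or truncated record header)
        ts_sec, ts_frac, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            return  # truncated final packet
        yield ts_sec + ts_frac / divisor, data
```

A caller would open a downloaded capture in binary mode and iterate, for example `for ts, frame in read_pcap(open(path, "rb"))`, where `path` is whatever location the file was saved to. Established libraries such as Scapy or dpkt cover more link types and edge cases.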

Usage

This data product is ideal for a variety of applications:
  • NDR/IDS Model Training: Train and benchmark anomaly detection, intrusion detection, and traffic classification models on real adversary behavior rather than synthetic or lab-generated samples
  • LLM/AI SOC Development: Provide ground-truth packet evidence to LLM-driven security analysts for retrieval, reasoning, and triage training
  • Detection Engineering: Develop and validate Suricata, Snort, and Zeek signatures against authentic attacker tradecraft
  • Threat Intelligence Enrichment: Extract IOCs, TTPs, and behavioral fingerprints from real intrusions to enhance threat intelligence platforms
  • Academic Research: Support published research on attacker behavior, network forensics, and machine learning for security
  • Red Team / Blue Team Exercises: Use real captures as reference traffic for training, tabletop exercises, and detection validation
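
Several of the uses above (IOC extraction, traffic classification) start from the same step: pulling addresses and ports out of a captured frame. The sketch below shows one way to do that, assuming Ethernet II framing with no VLAN tag and an unfragmented IPv4 packet carrying TCP or UDP. The `five_tuple` helper is hypothetical, and the link type of the delivered captures is an assumption, not something the listing states.

```python
import socket
import struct
from typing import Optional, Tuple

FiveTuple = Tuple[str, str, int, int, int]


def five_tuple(frame: bytes) -> Optional[FiveTuple]:
    """Return (src_ip, dst_ip, protocol, src_port, dst_port) or None."""
    # 14-byte Ethernet header plus the minimum 20-byte IPv4 header.
    if len(frame) < 34 or frame[12:14] != b"\x08\x00":
        return None  # not Ethernet II carrying IPv4
    ihl = (frame[14] & 0x0F) * 4  # IPv4 header length in bytes
    proto = frame[23]             # 6 = TCP, 17 = UDP
    if ihl < 20 or proto not in (6, 17):
        return None
    l4 = 14 + ihl                 # offset of the TCP/UDP header
    if len(frame) < l4 + 4:
        return None
    src_ip = socket.inet_ntoa(frame[26:30])
    dst_ip = socket.inet_ntoa(frame[30:34])
    src_port, dst_port = struct.unpack("!HH", frame[l4:l4 + 4])
    return src_ip, dst_ip, proto, src_port, dst_port
```

Collecting the distinct source addresses returned by such a function gives a first-pass list of candidate adversary IPs, which would still need vetting before being published as indicators.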

Coverage

  • Geographic Coverage: Global — adversary source IPs span all populated continents; honeynet infrastructure hosted across multiple regions
  • Time Range: 2024 – Present (ongoing collection)
  • Environment Coverage: Windows Server 2019/2022 Active Directory environments, member workstations, Linux edge services, and instrumented decoy services (file shares, version control, backup, object storage)
  • Threat Coverage: Initial access via exposed services and credentials, post-compromise enumeration, credential theft, lateral movement, C2 beaconing, data staging and exfiltration
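
C2 beaconing, one of the behaviors listed above, often shows up as connections at near-regular intervals. A simple way to quantify that regularity is the coefficient of variation of the gaps between connection timestamps. This is a minimal sketch of that one heuristic, not the vendor's detection method; the `beacon_score` name is hypothetical, and any alerting threshold would have to be tuned on real data.

```python
from statistics import mean, pstdev
from typing import Sequence


def beacon_score(timestamps: Sequence[float]) -> float:
    """Coefficient of variation of inter-arrival gaps; lower means more periodic."""
    if len(timestamps) < 3:
        raise ValueError("need at least three timestamps")
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap == 0:
        return 0.0  # all events share one timestamp
    return pstdev(gaps) / average_gap
```

Connections exactly 60 seconds apart score 0.0, while irregular, human-driven traffic scores much higher. Real implants usually add jitter, so a low but non-zero score is the more typical signal.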

License

Proprietary, Annual License

AI Training Rights

Licensee is granted a non-exclusive, worldwide, and perpetual right to:
  • Use the Data Product to train, fine-tune, and evaluate machine learning models, including large language models.
  • Incorporate Data Product content into models and commercialize resulting model outputs.
  • Create derivative works (model weights, embeddings, etc.) for any lawful purpose.

Restrictions:

  • The Data Product itself may not be sold, redistributed, or shared outside of licensed usage.
  • Licensee must comply with all applicable laws, including data protection and privacy regulations.

Who Can Use It

  • AI SOC Companies: For training LLM-based security analysts, alert triage models, and autonomous detection agents on real-world attack telemetry
  • Detection Engineering Teams: For signature development, rule validation, and false-positive reduction against authentic adversary traffic
  • ML/Data Science Teams: For training and benchmarking network traffic classifiers, anomaly detectors, and behavioral models
  • Threat Intelligence Vendors: For enriching feeds with adversary infrastructure, tooling fingerprints, and behavioral indicators
  • Academic Researchers: For peer-reviewed work in network security, intrusion detection, and applied machine learning
  • NDR/XDR Vendors: For product training, validation, and competitive benchmarking

Additional Notes:

  • All captures are sourced from honeynet infrastructure with no real user data. Any credentials, hostnames, or document content observed in traffic are deception artifacts intentionally seeded by Deception.Pro.
  • Correlated EDR and Suricata telemetry for the same intrusions are available as companion data products, enabling multi-sensor training datasets.
  • Custom collection windows, targeted attacker profiles, or environment-specific captures are available on request for enterprise licensees.
  • Data is delivered with chain-of-custody documentation suitable for research publication and product validation contexts.
NOTE: Our other datasets are complementary and work well together!
https://blog.deception.pro

Listing Stats

VIEWS

4

DELIVERY

STREAM, API

LISTED

11/05/2026

UPDATED

13/05/2026

REGION

GLOBAL

UNIVERSAL DATA QUALITY SCORE (UDQS)

5 / 5
