Zero-Day Ransomware Detection Dataset
Data Science and Analytics
Free
About
The UGRansome dataset is a cybersecurity resource designed for analysing ransomware and zero-day cyber-attacks, particularly those exhibiting cyclostationary behaviour. It supports anomaly detection research and gives researchers and practitioners data for detecting and classifying ransomware and zero-day threats. The data has been deduplicated and transformed to make it easier to analyse.
Columns
The original UGRansome dataset comprises 14 columns. Following pre-processing steps, additional features are generated.
- Time: Timestamps for tracking attack occurrences.
- Protocol: Data relating to the network protocol used, essential for understanding attack vectors. (Note: Originally named 'Protcol' and renamed to 'Protocol' during pre-processing).
- Flag: Classifies the types of attacks.
- Family: Categorisation of ransomware families.
- Clusters: Numeric clustering information for pattern recognition.
- SeedAddress: An address related to the source of the attack. (Note: Originally named 'SeddAddress' and renamed to 'SeedAddress' during pre-processing).
- ExpAddress: An address related to the exploitation.
- BTC: Quantifies financial damage in bitcoins.
- USD: Quantifies financial damage in US Dollars.
- Netflow_Bytes: Provides network flow details to observe data transfer patterns.
- IPaddress: Internet Protocol address information.
- Threats: Details about identified threats. (Note: 'Bonet' and 'NerisBonet' values are relabelled to 'Botnet' and 'NerisBotnet' respectively during pre-processing).
- Port: Port number used.
- Prediction: The predicted outcome.
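The renames and relabels mentioned in the notes above can be reproduced with a few lines of pandas. This is a minimal sketch, not the dataset authors' own script: the helper name `fix_labels` and the sample rows are made up for illustration, and only the misspelled and corrected names come from the column notes.

```python
import pandas as pd

# Misspellings noted in the column list, mapped to their corrected forms.
COLUMN_RENAMES = {"Protcol": "Protocol", "SeddAddress": "SeedAddress"}
THREAT_RELABELS = {"Bonet": "Botnet", "NerisBonet": "NerisBotnet"}


def fix_labels(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with corrected column names and 'Threats' values.

    Hypothetical helper: the name and structure are illustrative only.
    """
    out = df.rename(columns=COLUMN_RENAMES)
    # Series.replace with a dict swaps whole cell values that match a key.
    out["Threats"] = out["Threats"].replace(THREAT_RELABELS)
    return out


# Tiny made-up frame for illustration (not real UGRansome rows).
raw = pd.DataFrame(
    {
        "Protcol": ["TCP", "UDP"],
        "SeddAddress": ["addr_a", "addr_b"],
        "Threats": ["Bonet", "SSH"],
    }
)
clean = fix_labels(raw)
print(clean.columns.tolist())
print(clean["Threats"].tolist())
```

Applying the helper to a file that already uses the corrected names leaves it unchanged, since `rename` and `replace` ignore keys that are not present.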
Additional features created during feature engineering include:
- BTC_USD_Ratio: Ratio of BTC to USD financial damage.
- Total_Bytes: Aggregated netflow bytes per IP address.
- hour_of_day: Extracted hour from the 'Time' column.
- day_of_week: Extracted day of the week from the 'Time' column.
- Mean_BTC: Mean Bitcoin value per ransomware family.
- Mean_USD: Mean USD value per ransomware family.
- Mean_Netflow_Bytes: Mean Netflow Bytes per ransomware family.
- QFM_0 to QFM_6: Quantum Feature Mapping (QFM) features, generated by encoding classical features into a quantum state.
- RZ_BTC_USD_Ratio to RZ_Mean_Netflow_Bytes: Rotation angles applied when generating the QFM features, kept for reference.
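The classical engineered features listed above could be derived roughly as follows. This is a sketch under stated assumptions rather than the original pipeline: it assumes 'Time' holds seconds since the Unix epoch (the listing does not state the unit), it maps a zero USD amount to a missing ratio, and the helper name `add_features` is hypothetical. The QFM and RZ columns are left out because the encoding is not described here.

```python
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add the classical engineered features to a copy of df (illustrative)."""
    out = df.copy()

    # BTC to USD ratio; rows with USD == 0 get NaN instead of inf.
    out["BTC_USD_Ratio"] = out["BTC"] / out["USD"].where(out["USD"] != 0)

    # Sum of Netflow_Bytes over all rows that share an IP address.
    out["Total_Bytes"] = out.groupby("IPaddress")["Netflow_Bytes"].transform("sum")

    # ASSUMPTION: 'Time' is seconds since the Unix epoch.
    ts = pd.to_datetime(out["Time"], unit="s")
    out["hour_of_day"] = ts.dt.hour
    out["day_of_week"] = ts.dt.dayofweek  # Monday = 0, Sunday = 6

    # Per-family means, broadcast back onto every row of that family.
    for col in ("BTC", "USD", "Netflow_Bytes"):
        out[f"Mean_{col}"] = out.groupby("Family")[col].transform("mean")
    return out


# Made-up rows for illustration only.
demo = pd.DataFrame(
    {
        "Time": [0, 3600, 90000],
        "Family": ["A", "A", "B"],
        "IPaddress": ["X", "Y", "X"],
        "BTC": [1.0, 3.0, 2.0],
        "USD": [10.0, 0.0, 4.0],
        "Netflow_Bytes": [100, 200, 300],
    }
)
feats = add_features(demo)
print(feats[["BTC_USD_Ratio", "Total_Bytes", "hour_of_day", "day_of_week"]])
```

Using `groupby(...).transform` keeps one row per observation, which matches the description of these features as per-row columns rather than a separate summary table.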
Distribution
The original dataset contains 207,533 observations and 14 columns. Initial pre-processing, which includes handling negative 'Time' values and invalid 'ExpAddress' entries and removing outliers, reduces the dataset to 132,115 rows. The dataset is provided in CSV format and is expected to be updated annually.
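A row-reduction pass of this kind might look like the sketch below. The listing does not say how negative 'Time' values were handled or which outlier rule was used, so dropping negative rows and the 1.5 x IQR fence are assumptions, and `basic_clean` is a hypothetical helper. The 'ExpAddress' check is omitted because the validity criterion is not stated, so this sketch is not expected to reproduce the 132,115-row figure.

```python
import pandas as pd


def basic_clean(df: pd.DataFrame, outlier_cols: list[str], k: float = 1.5) -> pd.DataFrame:
    """Drop duplicates, negative 'Time' rows, and IQR outliers (illustrative)."""
    out = df.drop_duplicates()

    # ASSUMPTION: rows with a negative 'Time' are dropped rather than repaired.
    out = out[out["Time"] >= 0]

    # ASSUMPTION: outliers fall outside [Q1 - k*IQR, Q3 + k*IQR].
    # Columns are filtered one after another, so later quartiles are
    # computed on rows that survived the earlier columns.
    for col in outlier_cols:
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out = out[out[col].between(q1 - k * iqr, q3 + k * iqr)]
    return out.reset_index(drop=True)


# Made-up rows for illustration only.
demo = pd.DataFrame(
    {
        "Time": [5, 5, -1, 7, 8, 9, 10],
        "USD": [10, 10, 11, 12, 11, 13, 10_000],
    }
)
cleaned = basic_clean(demo, outlier_cols=["USD"])
print(len(demo), "->", len(cleaned))
```

With the real file, the same function would be called on the frame returned by `pd.read_csv`, passing whichever numeric columns should be screened for outliers.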
Usage
This dataset is ideal for:
- Anomaly detection in zero-day attacks and ransomware.
- Ransomware and zero-day threat detection and classification.
- Developing and testing cybersecurity defences through synthetic attack signatures.
- Cybersecurity research and enhancing preparedness against emerging threats.
- Machine learning model development for intrusion detection.
Coverage
The geographic, temporal, and demographic scope of the data is not specified.
License
CC BY-SA 4.0
Who Can Use It
- Cybersecurity Researchers: For advanced studies in anomaly detection, ransomware behaviour, and zero-day exploits.
- Data Scientists and Machine Learning Practitioners: To develop and evaluate machine learning models for threat intelligence and cybersecurity.
- Academics and Students: For master's dissertations, reports, and academic research in cloud computing, information security, and related fields.
- Security Analysts: For understanding data transfer patterns, attack vectors, and financial impacts of cyber threats.
Dataset Name Suggestions
- UGRansome Threat Analytics
- Zero-Day Ransomware Detection Dataset
- Cybersecurity Anomaly Detection Resource
- Ransomware & Zero-Day Attack Log
- RansomFEN-QFM Pre-Processed Dataset
Attributes
Original Data Source: Zero-Day Ransomware Detection Dataset