Permission and API-Based Malware Data

Data Science and Analytics

Tags and Keywords

Malware

Detection

Security

Android

Classification

Free

About

This dataset is designed to support malware research and the validation of detection methods. It contains preprocessed features extracted from binaries, structured specifically for classification tasks. The primary objective is to distinguish 'Malware' from 'Goodware' based on those features, which makes the data a practical starting point for building and testing security-focused machine learning models.

Columns

The dataset consists of 241 attributes per instance. These features are grouped into two primary categories:
  • Permission-based features: Attributes 1 through 214 relate to permission requests and characteristics.
  • API-based features: Attributes 215 through 241 relate to Application Programming Interface calls.
The target variable is a class label with two values: Malware or Goodware. The dataset is reported to have no missing values; however, some per-column value counts on the listing suggest a small number of missing entries in individual columns, so it is worth checking for missing values after loading.
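The column layout described above can be checked right after loading. The following is a minimal standard-library sketch, not an official loader. It assumes the feature columns appear in the order described (permissions first, then API calls) and that any class label sits in a column after the 241 features; the listing confirms neither. The helper names `split_feature_groups` and `count_missing` and the sample values are illustrative only.

```python
import csv
import io

N_PERMISSION = 214  # attributes 1-214, per the listing
N_FEATURES = 241    # attributes 1-241, per the listing


def split_feature_groups(header):
    """Split column names by position into (permission, api, remainder)."""
    permission = header[:N_PERMISSION]
    api = header[N_PERMISSION:N_FEATURES]
    remainder = header[N_FEATURES:]  # assumed to hold the class label, if any
    return permission, api, remainder


def count_missing(csv_file):
    """Return {column name: number of empty cells} for columns with gaps."""
    missing = {}
    for row in csv.DictReader(csv_file):
        for name, value in row.items():
            if name is None:
                continue  # extra cells beyond the header; ignored here
            if value is None or value.strip() == "":
                missing[name] = missing.get(name, 0) + 1
    return missing


# Tiny in-memory stand-in for the real file (column names and values are made up).
sample = io.StringIO("perm_a,api_b,Label\n1,0,malware\n,1,goodware\n")
missing_counts = count_missing(sample)
```

For the real file, the same function could be called on `open("TUANDROMD.csv", newline="")`, and `split_feature_groups` on the header row, to confirm the 214/27 split and see which columns, if any, contain gaps.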

Distribution

The data is provided as a static release and is not expected to be updated. It consists of 4,465 instances (records) and 241 attributes. The file is distributed as CSV (TUANDROMD.csv, approximately 2.17 MB). This is the preprocessed version of the original TUANDROMD data. No official train/test splits are provided.
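Because no official split is provided, users need to create their own. One common choice is a stratified split, which keeps the Malware/Goodware ratio roughly equal in the training and test portions. The sketch below uses only the standard library; the function name, the 80/20 ratio, the fixed seed, and the label strings are assumptions for illustration, not part of the dataset.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with class proportions roughly preserved."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Made-up labels standing in for the real class column.
labels = ["malware"] * 80 + ["goodware"] * 20
train_idx, test_idx = stratified_split(labels)
```

Recording the seed alongside any reported results makes comparisons between models reproducible, since each user would otherwise evaluate on a different subset.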

Usage

Ideal applications for this data include:
  • Developing novel malware detection methods and algorithms.
  • Training and testing binary classification models (Malware vs Goodware).
  • Conducting security research on permission and API-based indicators of malicious activity.
  • Evaluating the effectiveness and robustness of existing malware classifiers.
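For the evaluation use case, accuracy alone can mislead when the two classes are unevenly represented, so precision and recall on the Malware class are usually reported as well. The following is a small sketch of those metrics in plain Python. The function name and the label strings are assumptions; the actual encoding of the class column should be checked in the file.

```python
def binary_metrics(y_true, y_pred, positive):
    """Compute accuracy, precision, recall and F1 for the `positive` class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy predictions (made up) to show the call shape.
y_true = ["malware", "malware", "malware", "goodware", "goodware"]
y_pred = ["malware", "malware", "goodware", "malware", "goodware"]
metrics = binary_metrics(y_true, y_pred, positive="malware")
```

In a malware setting, recall on the Malware class reflects how many malicious samples are caught, while precision reflects how often a Malware verdict is correct; which one to prioritise depends on the deployment.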

Coverage

Coverage is limited to the technical characteristics of the binaries. No geographic, temporal, or demographic limitations are specified for this dataset. The scope is confined to the feature set (241 attributes) describing the collected instances.

License

Attribution 4.0 International (CC BY 4.0)

Who Can Use It

  • Cybersecurity Researchers: For building state-of-the-art detection systems.
  • Data Scientists/Machine Learning Engineers: For practicing binary classification and feature engineering on security data.
  • Academics and Students: For teaching and coursework in computer science and programming, particularly on cybercrime and classification.

Dataset Name Suggestions

  1. TUANDROMD Malware Classifier
  2. Android Binary Security Features
  3. Permission and API-Based Malware Data
  4. Goodware vs Malware Classification Set

Listing Stats

VIEWS

1

DOWNLOADS

0

LISTED

11/11/2025

REGION

GLOBAL

UDQS QUALITY (Universal Data Quality Score)

5 / 5

VERSION

1.0
