Throttling Malware Families in 2D
Document Type
Article
Publication Date
2019
Subject: LCSH
Malware (Computer software), Computer algorithms, Information visualization
Disciplines
Computer Engineering | Computer Sciences | Electrical and Computer Engineering
Abstract
Malicious software is categorized into families based on static and dynamic characteristics, infection methods, and the nature of the threat. Visual exploration of malware instances and families in a low-dimensional space gives a first overview of the dependencies and relationships among these instances, and helps in detecting their groups and isolating outliers. Furthermore, visual exploration of different feature sets is useful for assessing how well each set provides a valid abstract representation, which can later be used in classification and clustering algorithms to achieve high accuracy. In this paper, we investigate t-SNE, one of the best dimensionality reduction techniques, to reduce the malware representation from a high-dimensional space consisting of thousands of features to a low-dimensional space. We experiment with different feature sets and depict malware clusters in 2-D. Surprisingly, t-SNE not only provides nice 2-D drawings but also dramatically increases the generalization power of SVM classifiers. Moreover, the results show that cross-validation accuracy is much better using the 2-D embedded representation of samples than using the original high-dimensional representation.
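The comparison described in the abstract (SVM cross-validation accuracy on the original features versus on a 2-D t-SNE embedding) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes scikit-learn, uses synthetic data from `make_classification` as a stand-in for malware feature vectors, and the function name `compare_embeddings` and all parameter values are hypothetical choices made for illustration. Because scikit-learn's `TSNE` has no `transform` method for unseen samples, the sketch embeds all samples once before cross-validating, so the 2-D score is transductive and may be optimistic.

```python
from sklearn.datasets import make_classification
from sklearn.manifold import TSNE
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_embeddings(n_samples=300, n_features=500, n_families=4, seed=0):
    """Return the 2-D embedding and mean 5-fold SVM accuracy in both spaces.

    All arguments are illustrative assumptions, not values from the paper.
    """
    # Synthetic stand-in for malware feature vectors labelled by family.
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=20,
        n_redundant=0,
        n_classes=n_families,
        n_clusters_per_class=1,
        random_state=seed,
    )
    X = StandardScaler().fit_transform(X)

    # Reduce the high-dimensional representation to 2-D with t-SNE.
    # The embedding is fitted on all samples at once (transductive).
    X_2d = TSNE(
        n_components=2, perplexity=30.0, init="pca", random_state=seed
    ).fit_transform(X)

    # Same SVM configuration evaluated in both feature spaces.
    svm = SVC(kernel="rbf", gamma="scale")
    acc_high = cross_val_score(svm, X, y, cv=5).mean()
    acc_2d = cross_val_score(svm, X_2d, y, cv=5).mean()
    return X_2d, acc_high, acc_2d


if __name__ == "__main__":
    embedding, acc_high, acc_2d = compare_embeddings()
    print(f"high-dimensional CV accuracy: {acc_high:.3f}")
    print(f"2-D t-SNE CV accuracy:        {acc_2d:.3f}")
```

The returned `embedding` array has one 2-D point per sample and can be passed to any scatter-plot routine, colored by family label, to produce the kind of cluster drawing the abstract mentions. Scores on synthetic data say nothing about the results reported in the paper.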
Repository Citation
Nassar, Mohamed and Safa, Haidar, "Throttling Malware Families in 2D" (2019). Electrical & Computer Engineering and Computer Science Faculty Publications. 123.
https://digitalcommons.newhaven.edu/electricalcomputerengineering-facpubs/123
Comments
Article presented at the 12th International Conference on Autonomous Infrastructure, Management and Security (AIMS 2018) on June 4 - 5, 2019 in Munich, Germany.
Article is made available via license CC0 1.0 Universal (CC0 1.0) Public Domain Dedication.