Malware classification on time series data through machine learning

Bibliographic Details
Main Author:	Diogo Moutinho de Almeida (author)
Format:	masterThesis
Language:	eng
Published:	2016
Subjects:	Engenharia electrotécnica, electrónica e informática; Electrical engineering, Electronic engineering, Information engineering
Online Access:	https://hdl.handle.net/10216/85701
Country:	Portugal
Oai:	oai:repositorio-aberto.up.pt:10216/85701

Description
Summary:	Malware classification is challenging because of the variety and rapid emergence of malware and the number of available classification methods. As a result, it is not unusual for different classifiers to assign the same file to different malware types. Moreover, the label assigned by a single classifier may change over time as methods are refined or new discoveries are made. When multiple independent classifiers are used, a file's past classifications can help decide which classifier to trust. This dissertation aims to facilitate this analysis by collecting historical data on files that have already been assigned a final classification and determining which machine learning algorithm best predicts the classification of a new file from this data. Besides the historical data, other characteristics are taken into account, such as the source of the file, its file type, and its file size. The machine learning algorithms used are C4.5, Random Forests, Multi-Layer Perceptron (MLP), and Long Short-Term Memory (LSTM). This approach yielded an alternative way of finding the correct malware classification of a file, given multiple classifiers, by taking its classification history into account.
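
The idea of using past classifications to decide which classifier to trust can be illustrated with a minimal sketch. This is not the method evaluated in the dissertation (which compares C4.5, Random Forests, MLP, and LSTM); it is a simple accuracy-weighted vote used only as an illustration. The record layout, the classifier names "A" and "B", the labels, and the function names are all hypothetical.

```python
from collections import Counter


def classifier_accuracy(training_files):
    """Fraction of each classifier's past reports that matched the final label.

    Each record is a hypothetical dict with:
      "history": {classifier_name: [labels, oldest first]}
      "final":   the file's settled classification
    """
    hits, total = Counter(), Counter()
    for record in training_files:
        for name, labels in record["history"].items():
            for label in labels:
                total[name] += 1
                hits[name] += int(label == record["final"])
    return {name: hits[name] / total[name] for name in total}


def predict(history, accuracy):
    """Weighted vote over each classifier's latest label.

    Each classifier votes for its most recent label, weighted by its
    historical accuracy. Unknown classifiers get weight 0.
    Returns None when no classifier has reported anything.
    """
    votes = Counter()
    for name, labels in history.items():
        if labels:
            votes[labels[-1]] += accuracy.get(name, 0.0)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Illustrative data only: two files whose final label is already known.
training = [
    {"history": {"A": ["trojan", "trojan"], "B": ["worm", "trojan"]},
     "final": "trojan"},
    {"history": {"A": ["adware", "adware"], "B": ["trojan", "trojan"]},
     "final": "adware"},
]

accuracy = classifier_accuracy(training)
prediction = predict({"A": ["worm"], "B": ["trojan"]}, accuracy)
```

In this toy data, classifier "A" always agreed with the final label while "B" rarely did, so the vote follows "A" for the new file. A learned model, as studied in the dissertation, could additionally use the file's source, type, and size, and the order in which labels changed.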
