Mining malware to detect variants

Ahmad Azab, Robert Layton, Mamoun Alazab, Jonathan Oliver

Research output: Chapter in Book/Report/Conference proceeding; Conference Paper published in Proceedings; Research; peer-review

Abstract

Cybercrime continues to be a growing challenge, and malware is one of the most serious security threats on the Internet today and has been in existence from the very early days. Cyber criminals continue to develop and advance their malicious attacks. Unfortunately, existing techniques for detecting malware and analysing code samples are insufficient and have significant limitations. For example, most malware detection studies have focused only on detection and neglected the variants of the code. Investigating malware variants allows antivirus products and governments to more easily detect these new attacks, attribute them, predict such or similar attacks in the future, and conduct further analysis. The focus of this paper is performing similarity measures between different malware binaries of the same variant, utilizing data mining concepts in conjunction with hashing algorithms. In this paper, we investigate and evaluate using the Trend Locality Sensitive Hashing (TLSH) algorithm to group binaries that belong to the same variant together, utilizing the k-NN algorithm. Two Zeus variants, TSPY-ZBOT and MAL-ZBOT, were tested to assess the effectiveness of the proposed approach. We compare TLSH to related hashing methods (SSDEEP, SDHASH and NILSIMSA) that are currently used for this purpose. Experimental evaluation demonstrates that our method can effectively detect variants of malware and is resilient to common obfuscations used by cyber criminals. Our results show that TLSH and SDHASH provide the highest accuracy, scoring F-measures of 0.989 and 0.999 respectively.
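The general idea in the abstract (label a binary by a k-NN vote over hash distances, then score the result with an F-measure) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the distance function `hex_hamming` is a stand-in that counts differing bits between equal-length hex digests and does not reproduce TLSH, SSDEEP, SDHASH or NILSIMSA scoring, and the function names, the choice of k = 3 and the toy digests are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def hex_hamming(a: str, b: str) -> int:
    """Stand-in distance: number of differing bits between two hex digests.

    Assumption for illustration only; this is NOT the TLSH distance score.
    """
    if len(a) != len(b):
        raise ValueError("digests must have equal length")
    return bin(int(a, 16) ^ int(b, 16)).count("1")


def knn_label(
    query: str,
    labelled: Sequence[Tuple[str, str]],
    distance: Callable[[str, str], int] = hex_hamming,
    k: int = 3,
) -> str:
    """Return the majority label among the k labelled digests nearest to query."""
    nearest = sorted(labelled, key=lambda item: distance(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def f_measure(y_true: List[str], y_pred: List[str], positive: str) -> float:
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy 16-bit "digests" (hypothetical data, far shorter than real fuzzy hashes).
TRAINING = [
    ("00ff", "TSPY-ZBOT"), ("01ff", "TSPY-ZBOT"), ("00fe", "TSPY-ZBOT"),
    ("ff00", "MAL-ZBOT"), ("fe00", "MAL-ZBOT"), ("ff01", "MAL-ZBOT"),
]

if __name__ == "__main__":
    print(knn_label("00fd", TRAINING))
    print(knn_label("ff03", TRAINING))
```

With a real similarity digest, the `distance` parameter would be swapped for that scheme's own comparison function, while the voting and scoring logic would stay the same.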

Original language: English
Title of host publication: Proceedings - 5th Cybercrime and Trustworthy Computing Conference, CTC 2014
Place of Publication: Auckland; New Zealand
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 44-53
Number of pages: 10
ISBN (Electronic): 9781479988259
DOIs: https://doi.org/10.1109/CTC.2014.11
Publication status: Published - 1 Jan 2015
Event: 5th Cybercrime and Trustworthy Computing Conference, CTC 2014 - Auckland, New Zealand
Duration: 24 Nov 2014 - 25 Nov 2014

Conference

Conference: 5th Cybercrime and Trustworthy Computing Conference, CTC 2014
Country: New Zealand
City: Auckland
Period: 24/11/14 - 25/11/14

Fingerprint

Data mining
Malware
Internet

Cite this

Azab, A., Layton, R., Alazab, M., & Oliver, J. (2015). Mining malware to detect variants. In Proceedings - 5th Cybercrime and Trustworthy Computing Conference, CTC 2014 (pp. 44-53). [15059252] Auckland; New Zealand: IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/CTC.2014.11
@inproceedings{2efc3addaa314075bdcf8b9300b907e0,
title = "Mining malware to detect variants",
keywords = "Cyber Security, Cybercrime, Hacking, Malware, Profiling, similarity",
author = "Ahmad Azab and Robert Layton and Mamoun Alazab and Jonathan Oliver",
year = "2015",
month = "1",
day = "1",
doi = "10.1109/CTC.2014.11",
language = "English",
pages = "44--53",
booktitle = "Proceedings - 5th Cybercrime and Trustworthy Computing Conference, CTC 2014",
publisher = "IEEE, Institute of Electrical and Electronics Engineers",
address = "United States",

}


TY - GEN

T1 - Mining malware to detect variants

AU - Azab, Ahmad

AU - Layton, Robert

AU - Alazab, Mamoun

AU - Oliver, Jonathan

PY - 2015/1/1

Y1 - 2015/1/1

KW - Cyber Security

KW - Cybercrime

KW - Hacking

KW - Malware

KW - Profiling

KW - similarity

UR - http://www.scopus.com/inward/record.url?scp=84929223739&partnerID=8YFLogxK

U2 - 10.1109/CTC.2014.11

DO - 10.1109/CTC.2014.11

M3 - Conference Paper published in Proceedings

SP - 44

EP - 53

BT - Proceedings - 5th Cybercrime and Trustworthy Computing Conference, CTC 2014

PB - IEEE, Institute of Electrical and Electronics Engineers

CY - Auckland; New Zealand

ER -
