TY - JOUR
T1 - Critical Feature Selection for Machine Learning Approaches to Detect Ransomware
AU - Malik, Sachin
AU - Shanmugam, Bharanidharan
AU - Kannorpatti, Krishnan
AU - Azam, Sami
N1 - Funding Information:
This research was supported by Charles Darwin University. We thank our colleagues from Charles Darwin University, who provided insight and expertise that greatly assisted this research.
Publisher Copyright:
© 2022 University of Bahrain. All rights reserved.
PY - 2022/3
Y1 - 2022/3
N2 - It has been nearly three decades since the first strain of ransomware surfaced online, yet it remains one of the most destructive forms of malware of all time, costing millions of dollars around the globe each year. Ransomware is a type of malware that encrypts all the data on an infected device using asymmetric encryption algorithms and demands a ransom to decrypt the data. As it is nearly impossible to recover the encrypted data without a backup, victims end up either paying the ransom or losing the data. Therefore, the best approach is to detect the ransomware at its initial stages and remove it before any damage is done. Traditional signature-based detection methods are useless against the newer ransomware families, which use polymorphic techniques and change their signatures frequently. This paper critically reviews some of the existing detection methods that apply machine learning techniques to behavioural analysis. To test the efficiency and accuracy of various machine learning algorithms, logs from an infected Windows machine were analysed using supervised machine learning algorithms to classify them as ransomware or non-ransomware. The datasets were then split into training and testing sets to check the accuracy of the trained models, and finally the behavioural features most crucial in differentiating a log file from a ransomware-infected machine from that of an uninfected machine were determined.
AB - It has been nearly three decades since the first strain of ransomware surfaced online, yet it remains one of the most destructive forms of malware of all time, costing millions of dollars around the globe each year. Ransomware is a type of malware that encrypts all the data on an infected device using asymmetric encryption algorithms and demands a ransom to decrypt the data. As it is nearly impossible to recover the encrypted data without a backup, victims end up either paying the ransom or losing the data. Therefore, the best approach is to detect the ransomware at its initial stages and remove it before any damage is done. Traditional signature-based detection methods are useless against the newer ransomware families, which use polymorphic techniques and change their signatures frequently. This paper critically reviews some of the existing detection methods that apply machine learning techniques to behavioural analysis. To test the efficiency and accuracy of various machine learning algorithms, logs from an infected Windows machine were analysed using supervised machine learning algorithms to classify them as ransomware or non-ransomware. The datasets were then split into training and testing sets to check the accuracy of the trained models, and finally the behavioural features most crucial in differentiating a log file from a ransomware-infected machine from that of an uninfected machine were determined.
UR - http://www.scopus.com/inward/record.url?scp=85128629020&partnerID=8YFLogxK
U2 - 10.12785/ijcds/110195
DO - 10.12785/ijcds/110195
M3 - Article
AN - SCOPUS:85128629020
VL - 11
SP - 1167
EP - 1176
JO - International Journal of Computing and Digital Systems
JF - International Journal of Computing and Digital Systems
SN - 2210-142X
IS - 1
ER -