Malware is a major security threat to computer systems and networks, and it has grown in scale and impact since the early days of ICT. Traditional protection mechanisms are largely incapable of dealing with the diversity and volume of malware variants seen today. This paper examines the evolution of malware, including the nature of its activity and its variants, and the implications for computer security industry practice.
As a first step toward addressing this challenge, I propose a framework that statically and dynamically extracts features reflecting the behavior of malware code, such as Windows Application Programming Interface (API) calls. Similarity-based mining and machine learning methods are employed to profile and classify malware behaviors. The method is based on sequences of API calls and their frequency of appearance.
Experimental results on large datasets show that the proposed method is effective in identifying known malware variants and classifies malware with high accuracy and a low false alarm rate. These encouraging results indicate that classification is a viable approach to similarity detection in support of malware detection. This work advances the detection of zero-day malware and offers researchers another method for understanding its impact.
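
To illustrate the idea of profiling malware by API call sequences and their frequency, the following is a minimal sketch and not the implementation evaluated in this work. It assumes that each sample has already been reduced to an ordered list of API call names, counts contiguous call pairs (bigrams) as features, and assigns an unknown sample the label of its most similar labelled trace under cosine similarity. The function names, the choice of bigrams and cosine similarity, and the toy traces are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def ngram_counts(calls, n=2):
    """Count contiguous n-grams of API call names in one trace."""
    return Counter(tuple(calls[i:i + n]) for i in range(len(calls) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify(trace, labelled, n=2):
    """Return (label, similarity) of the labelled trace most similar to `trace`."""
    profile = ngram_counts(trace, n)
    return max(
        ((label, cosine(profile, ngram_counts(calls, n))) for label, calls in labelled),
        key=lambda pair: pair[1],
    )


# Toy traces invented for illustration; they are not taken from any dataset.
LABELLED = [
    ("dropper", ["CreateFileW", "WriteFile", "CloseHandle", "CreateProcessW"]),
    ("keylogger", ["SetWindowsHookExW", "GetAsyncKeyState", "WriteFile", "CloseHandle"]),
]
unknown = ["CreateFileW", "WriteFile", "CloseHandle", "CreateProcessW", "ExitProcess"]

label, score = classify(unknown, LABELLED)
print(label, round(score, 3))
```

In this toy example the unknown trace shares three of its four call pairs with the "dropper" trace and only one with the "keylogger" trace, so it is assigned the "dropper" label. A realistic system would need far larger labelled sets and a trained classifier rather than a single nearest neighbour.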