TY - JOUR
T1 - A Comprehensive Survey for Intelligent Spam Email Detection
AU - Karim, Asif
AU - Azam, Sami
AU - Shanmugam, Bharanidharan
AU - Krishnan, Kannoorpatti
AU - Alazab, Mamoun
PY - 2019/11/20
Y1 - 2019/11/20
N2 - The tremendously growing problem of phishing e-mail, also known as spam including spear phishing or spam borne malware, has demanded a need for reliable intelligent anti-spam e-mail filters. This survey paper describes a focused literature survey of Artificial Intelligence (AI) and Machine Learning (ML) methods for intelligent spam email detection, which we believe can help in developing appropriate countermeasures. In this paper, we considered 4 parts in the email’s structure that can be used for intelligent analysis: (A) Headers Provide Routing Information, contain mail transfer agents (MTA) that provide information like email and IP address of each sender and recipient of where the email originated and what stopovers, and final destination. (B) The SMTP Envelope, containing mail exchangers’ identification, originating source and destination domains\users. (C) First part of SMTP Data, containing information like from, to, date, subject – appearing in most email clients (D) Second part of SMTP Data, containing email body including text content, and attachment. Based on the number the relevance of an emerging intelligent method, papers representing each method were identified, read, and summarized. Insightful findings, challenges and research problems are disclosed in this paper. This comprehensive survey paves the way for future research endeavors addressing theoretical and empirical aspects related to intelligent spam email detection.
AB - The tremendously growing problem of phishing e-mail, also known as spam including spear phishing or spam borne malware, has demanded a need for reliable intelligent anti-spam e-mail filters. This survey paper describes a focused literature survey of Artificial Intelligence (AI) and Machine Learning (ML) methods for intelligent spam email detection, which we believe can help in developing appropriate countermeasures. In this paper, we considered 4 parts in the email’s structure that can be used for intelligent analysis: (A) Headers Provide Routing Information, contain mail transfer agents (MTA) that provide information like email and IP address of each sender and recipient of where the email originated and what stopovers, and final destination. (B) The SMTP Envelope, containing mail exchangers’ identification, originating source and destination domains\users. (C) First part of SMTP Data, containing information like from, to, date, subject – appearing in most email clients (D) Second part of SMTP Data, containing email body including text content, and attachment. Based on the number the relevance of an emerging intelligent method, papers representing each method were identified, read, and summarized. Insightful findings, challenges and research problems are disclosed in this paper. This comprehensive survey paves the way for future research endeavors addressing theoretical and empirical aspects related to intelligent spam email detection.
KW - Machine learning
KW - phishing attack
KW - spam detection
KW - spam email
KW - spam filtering
KW - spear phishing
UR - http://www.scopus.com/inward/record.url?scp=85077778685&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2019.2954791
DO - 10.1109/ACCESS.2019.2954791
M3 - Article
AN - SCOPUS:85077778685
SN - 2169-3536
VL - 7
SP - 168261
EP - 168295
JO - IEEE Access
JF - IEEE Access
M1 - 08907831
ER -