A Comprehensive Survey for Intelligent Spam Email Detection

Research output: Contribution to journal › Article › Research › peer-review


Abstract

The rapidly growing problem of phishing e-mail, also known as spam and including spear phishing and spam-borne malware, has created a need for reliable, intelligent anti-spam e-mail filters. This paper presents a focused literature survey of Artificial Intelligence (AI) and Machine Learning (ML) methods for intelligent spam email detection, which we believe can help in developing appropriate countermeasures. We consider four parts of an email’s structure that can be used for intelligent analysis: (A) the headers, which provide routing information and contain mail transfer agent (MTA) data such as the email and IP address of each sender and recipient, where the email originated, what stopovers it made, and its final destination; (B) the SMTP envelope, containing the mail exchangers’ identification and the originating source and destination domains/users; (C) the first part of the SMTP data, containing information such as from, to, date, and subject, which appears in most email clients; and (D) the second part of the SMTP data, containing the email body, including text content and attachments. Based on the number and relevance of emerging intelligent methods, papers representing each method were identified, read, and summarized. Insightful findings, challenges and research problems are disclosed in this paper. This comprehensive survey paves the way for future research addressing theoretical and empirical aspects of intelligent spam email detection.
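As an illustration only, the four parts named in the abstract can be separated with Python's standard `email` package. This is a minimal sketch, not the authors' method: the sample message, the `split_parts` helper, and the mapping of fields to parts (A) to (D) are assumptions made for this example. The SMTP envelope is not stored in the message itself, so the sketch takes the envelope sender and recipients as separate arguments.

```python
from email import message_from_string
from email.policy import default

# Hypothetical sample message; addresses use reserved example domains.
RAW = (
    "Received: from mail.example.org (mail.example.org [192.0.2.10])\n"
    "\tby mx.example.com with ESMTP; Wed, 20 Nov 2019 10:00:00 +0000\n"
    "From: Alice <alice@example.org>\n"
    "To: Bob <bob@example.com>\n"
    "Date: Wed, 20 Nov 2019 10:00:00 +0000\n"
    "Subject: Quarterly report\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "Please find the report attached.\n"
)


def split_parts(raw, mail_from, rcpt_to):
    """Group one message into the four parts (A)-(D); the grouping is an assumption."""
    msg = message_from_string(raw, policy=default)
    body = msg.get_body(preferencelist=("plain",))
    return {
        # (A) routing headers added by each mail transfer agent
        "routing_headers": msg.get_all("Received", []),
        # (B) SMTP envelope, supplied by the caller (MAIL FROM / RCPT TO)
        "envelope": {"mail_from": mail_from, "rcpt_to": rcpt_to},
        # (C) headers that most email clients display
        "display_headers": {k: msg[k] for k in ("From", "To", "Date", "Subject")},
        # (D) body text and attachment file names
        "body_text": body.get_content() if body is not None else "",
        "attachments": [p.get_filename() for p in msg.iter_attachments()],
    }


parts = split_parts(RAW, "alice@example.org", ["bob@example.com"])
```

Each entry in `parts` could then feed a separate feature extractor, which is one plausible way to organize the kind of per-part analysis the survey describes.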
Original language: English
Article number: 08907831
Pages (from-to): 168261-168295
Number of pages: 35
Journal: IEEE Access
Volume: 7
DOI: 10.1109/ACCESS.2019.2954791
Publication status: Published - 20 Nov 2019

Fingerprint

Electronic mail
Artificial intelligence
Learning systems

Cite this

@article{18ee50e9c9e1418180a43b9458d214f6,
title = "A Comprehensive Survey for Intelligent Spam Email Detection",
abstract = "The tremendously growing problem of phishing e-mail, also known as spam including spear phishing or spam borne malware, has demanded a need for reliable intelligent anti-spam e-mail filters. This survey paper describes a focused literature survey of Artificial Intelligence (AI) and Machine Learning (ML) methods for intelligent spam email detection, which we believe can help in developing appropriate countermeasures. In this paper, we considered 4 parts in the email’s structure that can be used for intelligent analysis: (A) Headers Provide Routing Information, contain mail transfer agents (MTA) that provide information like email and IP address of each sender and recipient of where the email originated and what stopovers, and final destination. (B) The SMTP Envelope, containing mail exchangers’ identification, originating source and destination domains{\textbackslash}users. (C) First part of SMTP Data, containing information like from, to, date, subject – appearing in most email clients (D) Second part of SMTP Data, containing email body including text content, and attachment. Based on the number the relevance of an emerging intelligent method, papers representing each method were identified, read, and summarized. Insightful findings, challenges and research problems are disclosed in this paper. This comprehensive survey paves the way for future research endeavors addressing theoretical and empirical aspects related to intelligent spam email detection.",
keywords = "Machine learning, spear phishing, Spam detection, Spam filtering, Phishing attack",
author = "Asif Karim and Sami Azam and Bharanidharan Shanmugam and Kannoorpatti Krishnan and Mamoun Alazab",
year = "2019",
month = "11",
day = "20",
doi = "10.1109/ACCESS.2019.2954791",
language = "English",
volume = "7",
pages = "168261--168295",
journal = "IEEE Access",
issn = "2169-3536",
publisher = "IEEE, Institute of Electrical and Electronics Engineers",

}

A Comprehensive Survey for Intelligent Spam Email Detection. / Karim, Asif; Azam, Sami; Shanmugam, Bharanidharan; Krishnan, Kannoorpatti; Alazab, Mamoun.

In: IEEE Access, Vol. 7, 08907831, 20.11.2019, p. 168261-168295.


TY - JOUR

T1 - A Comprehensive Survey for Intelligent Spam Email Detection

AU - Karim, Asif

AU - Azam, Sami

AU - Shanmugam, Bharanidharan

AU - Krishnan, Kannoorpatti

AU - Alazab, Mamoun

PY - 2019/11/20

Y1 - 2019/11/20

AB - The tremendously growing problem of phishing e-mail, also known as spam including spear phishing or spam borne malware, has demanded a need for reliable intelligent anti-spam e-mail filters. This survey paper describes a focused literature survey of Artificial Intelligence (AI) and Machine Learning (ML) methods for intelligent spam email detection, which we believe can help in developing appropriate countermeasures. In this paper, we considered 4 parts in the email’s structure that can be used for intelligent analysis: (A) Headers Provide Routing Information, contain mail transfer agents (MTA) that provide information like email and IP address of each sender and recipient of where the email originated and what stopovers, and final destination. (B) The SMTP Envelope, containing mail exchangers’ identification, originating source and destination domains\users. (C) First part of SMTP Data, containing information like from, to, date, subject – appearing in most email clients (D) Second part of SMTP Data, containing email body including text content, and attachment. Based on the number the relevance of an emerging intelligent method, papers representing each method were identified, read, and summarized. Insightful findings, challenges and research problems are disclosed in this paper. This comprehensive survey paves the way for future research endeavors addressing theoretical and empirical aspects related to intelligent spam email detection.

KW - Machine learning

KW - spear phishing

KW - Spam detection

KW - Spam filtering

KW - Phishing attack

U2 - 10.1109/ACCESS.2019.2954791

DO - 10.1109/ACCESS.2019.2954791

M3 - Article

VL - 7

SP - 168261

EP - 168295

JO - IEEE Access

JF - IEEE Access

SN - 2169-3536

M1 - 08907831

ER -