A graph-theoretic approach for the detection of phishing webpages

Choon Lin Tan, Kang Leng Chiew, Kelvin S. C. Yong, San Nah Sze, Johari Abdullah, Yakub Sebastian

    Research output: Contribution to journal › Article › peer-review


    Over the years, various technical means have been developed to protect Internet users from phishing attacks. To enrich these anti-phishing efforts, we capitalise on concepts from graph theory and propose a set of novel graph features to improve phishing detection accuracy. The initial phase of the proposed technique involves extracting the hyperlinks in the webpage under scrutiny and fetching the corresponding neighbourhood webpages. During this process, the page linking data are collected and used to construct a web graph that models the overall hyperlink and network structure of the webpage. From the web graph, graph measures are computed and extracted as graph features, which are then used to derive a classifier for detecting phishing webpages. Experimental results show that the proposed graph features achieve an improved overall accuracy of 97.8% when C4.5 is used as the classifier, outperforming the existing conventional features derived from the same data samples. Unlike conventional features, the proposed graph features leverage inherent phishing patterns that are only visible at a higher level of abstraction, making them robust and difficult to evade through direct manipulation of the webpage contents. Our proposed graph-based technique also shows promising results when benchmarked against a prominent phishing detection technique. Hence, the proposed technique is an important contribution to existing anti-phishing research towards improving detection performance.
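
The pipeline described in the abstract (extract hyperlinks, fetch neighbourhood pages, build a web graph, compute graph measures) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not list the actual graph features, so the four measures below are illustrative assumptions, as are the names `LinkExtractor`, `build_web_graph`, `graph_features` and the `PAGES` dictionary, which stands in for real page fetching. The resulting feature dictionary is the kind of input that could be passed to a decision-tree classifier such as C4.5; the classification step is not shown.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects absolute hyperlink targets from the <a> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(url, html):
    parser = LinkExtractor(url)
    parser.feed(html)
    return parser.links


def build_web_graph(start_url, pages):
    """Return a directed graph {url: set of linked urls}.

    `pages` maps URL -> HTML and is a hypothetical stand-in for fetching.
    Only the start page and its direct neighbours are expanded (one hop).
    """
    graph = {start_url: set(extract_links(start_url, pages[start_url]))}
    for neighbour in sorted(graph[start_url]):
        html = pages.get(neighbour, "")  # unreachable pages contribute no links
        graph.setdefault(neighbour, set()).update(extract_links(neighbour, html))
    return graph


def graph_features(graph, start_url):
    """Compute a few illustrative graph measures (assumed, not from the paper)."""
    nodes = set(graph)
    for targets in graph.values():
        nodes |= targets
    num_edges = sum(len(targets) for targets in graph.values())
    start_host = urlparse(start_url).netloc
    external = [n for n in nodes if urlparse(n).netloc != start_host]
    back_links = sum(
        1 for src, targets in graph.items()
        if src != start_url and start_url in targets
    )
    return {
        "num_nodes": len(nodes),
        "num_edges": num_edges,
        "avg_out_degree": num_edges / len(nodes),
        "external_node_ratio": len(external) / len(nodes),
        "back_links_to_start": back_links,
    }


# Toy neighbourhood: three pages held in memory instead of being downloaded.
PAGES = {
    "http://a.example/": '<a href="/about">About</a> <a href="http://b.example/">B</a>',
    "http://a.example/about": '<a href="/">Home</a>',
    "http://b.example/": '<a href="http://b.example/login">Login</a>',
}

web_graph = build_web_graph("http://a.example/", PAGES)
features = graph_features(web_graph, "http://a.example/")
print(features)
```

Holding the pages in a dictionary keeps the sketch self-contained and deterministic; a real crawler would replace the `pages` lookup with HTTP requests and would need timeouts and error handling.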

    Original language: English
    Article number: 101793
    Pages (from-to): 1-14
    Number of pages: 14
    Journal: Computers and Security
    Early online date: 8 Aug 2020
    Publication status: Published - Aug 2020

