AmritaDGA: a comprehensive data set for domain generation algorithms (DGAs) based domain name detection systems and application of deep learning

R. Vinayakumar, K. P. Soman, Prabaharan Poornachandran, Mamoun Alazab, Sabu Thampi

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Botnets play an important role in malware distribution and have become a primary means by which attackers spread malicious activity over the internet. To evade blacklisting, recent botnets make use of domain flux or internet protocol (IP) flux. This work focuses on domain flux. Domain flux uses domain generation algorithms (DGAs) to generate a list of domain names from a seed, and these domain names are used to contact the command and control (C&C) server until access to the system is obtained. This work presents a fully labeled domain name data set, AmritaDGA, which can be used for research on detecting domain names generated by DGAs. We evaluate the efficacy of deep learning architectures on AmritaDGA, using Keras embedding as the domain name representation method. AmritaDGA is composed of two data sets: the first is collected from publicly available sources, and the second from an internal real-time network. The model trained on the public data set is evaluated on unseen samples of the public data set and on private corpora. Deep learning architectures performed well in most of the test experiments. The baseline system has been made publicly available, and the data set is distributed for the Detecting Malicious Domain names (DMD 2018) shared task.
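
As an illustration of the two ideas the abstract relies on, the following is a minimal sketch, not the chapter's code. It assumes a hypothetical hash-based generator (`toy_dga`) to show how a seed deterministically yields a list of domain names, and a hypothetical character vocabulary (`VOCAB`, `encode`) to show the kind of fixed-length integer sequence an embedding layer such as Keras `Embedding` takes as input. The names, the hash choice, and the maximum length of 32 are all assumptions made for the example.

```python
import hashlib
import string


def toy_dga(seed, count, length=12, tld=".com"):
    """Derive `count` domain names deterministically from `seed` (illustrative only)."""
    domains = []
    for i in range(count):
        # Hash the seed together with a counter so each name differs.
        digest = hashlib.sha256(f"{seed}:{i}".encode()).digest()
        # Map the first `length` digest bytes to lowercase letters.
        label = "".join(string.ascii_lowercase[b % 26] for b in digest[:length])
        domains.append(label + tld)
    return domains


# Hypothetical character vocabulary; index 0 is reserved for padding
# and is also used here for characters outside the vocabulary.
VOCAB = {ch: i + 1 for i, ch in enumerate(string.ascii_lowercase + string.digits + "-.")}


def encode(domain, max_len=32):
    """Turn a domain name into a fixed-length list of character indices."""
    ids = [VOCAB.get(ch, 0) for ch in domain.lower()][:max_len]
    return ids + [0] * (max_len - len(ids))


if __name__ == "__main__":
    for name in toy_dga("2018-01-01", 3):
        print(name, encode(name)[:16])
```

Because the generator depends only on the seed, any party that knows the seed can reproduce the same list, which is what makes blacklisting individual names ineffective. The integer sequences from `encode` are what a detection model would consume in place of raw strings.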
Original language: English
Title of host publication: Big Data Recommender Systems - Volume 2: Application Paradigms
Editors: Osman Khalid, Samee U. Khan, Albert Y. Zomaya
Publisher: Institution of Engineering and Technology (IET)
Chapter: 22
Pages: 455-485
Number of pages: 31
Volume: 2
ISBN (Electronic): 9781785619786
ISBN (Print): 9781785619779
DOI: https://doi.org/10.1049/PBPC035G_ch22
Publication status: Published - 2019

Cite this

    Vinayakumar, R., Soman, K. P., Poornachandran, PRABAHARAN., Alazab, M., & Thampi, S. (2019). AmritaDGA: a comprehensive data set for domain generation algorithms (DGAs) based domain name detection systems and application of deep learning. In O. Khalid, S. U. Khan, & A. Y. Zomaya (Eds.), Big Data Recommender Systems - Volume 2: Application Paradigms (Vol. 2, pp. 455-485). Institution of Engineering and Technology (IET). https://doi.org/10.1049/PBPC035G_ch22