Botnets now play an important role in malware distribution, and attackers use them as a primary means of spreading malicious activity over the internet. To evade blacklisting, recent botnets make use of domain flux or internet protocol (IP) flux. This work focuses on domain flux. Domain flux uses domain generation algorithms (DGAs) to generate a list of domain names from a seed; an infected host tries these names in turn until one of them reaches the command and control (C&C) server. This work presents AmritaDGA, a fully labeled domain name data set intended for research on detecting DGA-generated domain names. We evaluate the efficacy of deep learning architectures on AmritaDGA, using Keras embedding as the domain name representation method. AmritaDGA is composed of two data sets. The first is collected from publicly available sources. The second is collected from an internal real-time network. Models trained on the public data set are evaluated on unseen samples from the public data set and from the private corpus. The deep learning architectures performed well in most of the test experiments. The baseline system has been made publicly available, and the data set is distributed as part of the Detecting Malicious Domain names (DMD 2018) shared task.
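To make the seed-based generation and the character-level representation described above more concrete, here is a minimal Python sketch. It is not the DGA of any real malware family and not the authors' baseline system: the functions `toy_dga` and `encode`, the SHA-256 hashing scheme, the vocabulary, and the maximum length of 32 are all assumptions chosen for illustration.

```python
import hashlib
import string
from typing import List

# Hypothetical vocabulary: lowercase letters, digits, hyphen and dot.
# Index 0 is reserved for padding (and, in this sketch, unknown characters).
VOCAB = {ch: i + 1 for i, ch in enumerate(string.ascii_lowercase + string.digits + "-.")}


def toy_dga(seed: str, count: int, tld: str = ".com") -> List[str]:
    """Derive `count` pseudo-random domain names from a shared seed (toy example)."""
    domains = []
    for i in range(count):
        digest = hashlib.sha256(f"{seed}:{i}".encode()).hexdigest()
        # Map the first 12 hex digits onto the letters a-p to form the label.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:12])
        domains.append(label + tld)
    return domains


def encode(domain: str, max_len: int = 32) -> List[int]:
    """Map characters to integer indices, truncate, and left-pad with zeros."""
    ids = [VOCAB.get(ch, 0) for ch in domain.lower()][:max_len]
    return [0] * (max_len - len(ids)) + ids


if __name__ == "__main__":
    for name in toy_dga("example-seed", 3):
        print(name, encode(name))
```

Because the output depends only on the seed and the counter, a bot and its operator can compute the same list independently, which is why blacklisting individual names is ineffective. Fixed-length integer sequences of this kind are the usual input to an embedding layer such as Keras `Embedding`, which learns a dense vector for each character index.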
|Title of host publication|Big Data Recommender Systems - Volume 2: Application Paradigms|
|Editors|Osman Khalid, Samee U. Khan, Albert Y. Zomaya|
|Publisher|Institution of Engineering and Technology (IET)|
|Number of pages|31|
|Publication status|Published - 2019|