Design of detection method for malicious URL based on Deep Neural Network

Kwon, Hyun; Park, Sangjun; Kim, Yongchul

doi:10.22156/CS4SMB.2021.11.05.030

Journal of Convergence for Information Technology (융합정보논문지)

Volume 11, Issue 5 / Pages 30-37 / 2021 / eISSN 2586-4440

Convergence Society for SMB (중소기업융합학회)

뉴럴네트워크 기반에 악성 URL 탐지방법 설계

Kwon, Hyun (Department of Electrical Engineering, Korea Military Academy) ;
Park, Sangjun (Department of Electrical Engineering, Korea Military Academy) ;
Kim, Yongchul (Department of Electrical Engineering, Korea Military Academy)

권현 (육군사관학교 전자공학과) ;
박상준 (육군사관학교 전자공학과) ;
김용철 (육군사관학교 전자공학과)

Received : 2021.04.15
Accepted : 2021.05.20
Published : 2021.05.28

https://doi.org/10.22156/CS4SMB.2021.11.05.030

Abstract

A wide range of devices, including Internet of Things devices, are now connected to the Internet, and attacks that exploit this connectivity are occurring. Some of these attacks use malicious URLs to lure users to phishing sites or to distribute viruses. Detecting malicious URLs is therefore an important security issue. Among recent deep learning technologies, neural networks perform well in image recognition, speech recognition, and pattern recognition, and they can be applied to analyzing and detecting the characteristic patterns of malicious URLs. In this paper, we analyze the performance of neural-network-based malicious URL detection while varying the activation function, the learning rate, and the network structure. The experimental dataset was built by crawling the Alexa top 1 million list and Whois, and TensorFlow was used as the machine learning library. In the experiments, a network with 4 layers, 100 nodes per layer, and a learning rate of 0.005 achieved an accuracy of 97.8% and an F1 score of 92.94%.
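The abstract reports both accuracy and an F1 score, and the two can differ noticeably on the same predictions. The following is a minimal sketch of how the two metrics are computed from a confusion matrix. The labels below are hypothetical toy data, not the paper's dataset, and the convention that label 1 means "malicious" is an assumption made for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN, treating label 1 as the positive (malicious) class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all URLs that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy labels: 8 benign URLs (0) and 2 malicious URLs (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# The hypothetical classifier misses one of the two malicious URLs.
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
print(f"accuracy={acc:.4f}  f1={f1:.4f}")
```

In this toy case a single missed malicious URL barely lowers accuracy but lowers the F1 score much more, because F1 ignores the true negatives that dominate when benign URLs are the majority. Whether class imbalance explains the gap between the reported 97.8% and 92.94% is not stated in the abstract.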

Acknowledgement

This work was supported by the 2021 research fund (21-center-2) of the Korea Military Academy (Cyber Warfare Research Center).
