Study on Machine Learning Techniques for Malware Classification and Detection

Moon, Jaewoong; Kim, Subin; Song, Jaeseung; Kim, Kyungshin

doi:10.3837/tiis.2021.12.003

KSII Transactions on Internet and Information Systems (TIIS)

Volume 15 Issue 12 / Pages 4308-4325 / 2021 / 1976-7277 (pISSN) / 1976-7277 (eISSN)

Korean Society for Internet Information (한국인터넷정보학회)

Moon, Jaewoong (Sejong University) ;
Kim, Subin (Sejong University) ;
Song, Jaeseung (Sejong University) ;
Kim, Kyungshin (Convergence Technology Collaboration Directorate, Agency for Defense Development)

Received : 2021.07.08
Accepted : 2021.10.08
Published : 2021.12.31

https://doi.org/10.3837/tiis.2021.12.003

Abstract

The importance of artificial intelligence, and of machine learning in particular, has recently been emphasized. Artificial intelligence is already used in applications such as intelligent surveillance cameras and other security systems, where it solves problems that humans traditionally had to handle manually, one at a time. Information security is a domain where artificial intelligence is especially needed, because the frequency with which malicious code appears and the volume that must be processed exceed human capabilities. This study therefore examines the definitions of artificial intelligence and machine learning, their execution methods, processes, and learning algorithms, and cases of their use in various domains, particularly in the field of information security. On this basis, the study proposes a method for applying machine learning to the classification and detection of malware, which has increased rapidly in recent years. The proposed methodology converts software programs containing malicious code into images and creates training data suitable for machine learning by preparing the data and augmenting the dataset. A model trained on images created in this manner is expected to be effective in classifying and detecting malware.
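
The conversion of a program into an image can be illustrated with a minimal sketch. The abstract does not specify the image layout or the augmentation operations, so everything below is an assumption made for illustration only: one byte is mapped to one grayscale pixel, the image width is fixed at 64, the final row is zero-padded, and the augmentation consists of a horizontal flip and small additive noise. The names `bytes_to_image` and `augment` are hypothetical and are not taken from the paper.

```python
import numpy as np


def bytes_to_image(data: bytes, width: int = 64) -> np.ndarray:
    """Map each byte to one grayscale pixel (0-255), filling rows left to right.

    Assumption: fixed width and zero padding of the last row; the paper's
    actual layout is not given in the abstract.
    """
    buf = np.frombuffer(data, dtype=np.uint8)
    rows = -(-len(buf) // width)  # ceiling division
    padded = np.zeros(rows * width, dtype=np.uint8)
    padded[: len(buf)] = buf
    return padded.reshape(rows, width)


def augment(image: np.ndarray, rng: np.random.Generator) -> list:
    """Return the original image plus two illustrative variants.

    Assumption: a horizontal flip and small additive noise stand in for
    whatever augmentation the study actually applies.
    """
    noise = rng.integers(-8, 9, size=image.shape)
    noisy = np.clip(image.astype(np.int16) + noise, 0, 255).astype(np.uint8)
    return [image, np.fliplr(image), noisy]


# Stand-in bytes instead of a real executable, so the sketch is safe to run.
sample = bytes(range(256)) * 3 + b"\x90" * 10
img = bytes_to_image(sample, width=64)
variants = augment(img, np.random.default_rng(0))
```

In practice the input would be the raw bytes of a program file, and the resulting arrays would be resized to a common shape before being passed to an image classifier.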
