Search | Korea Science

Malware Detection Using Deep Recurrent Neural Networks with no Random Initialization

Amir Namavar Jahromi;Sattar Hashemi
- International Journal of Computer Science & Network Security
- /
- v.23 no.8
- /
- pp.177-189
- /
- 2023
Malware detection is an increasingly important operational focus in cyber security, particularly given the fast pace of such threats (e.g., new malware variants introduced every day). There has been great interest in exploring the use of machine learning techniques in automating and enhancing the effectiveness of malware detection and analysis. In this paper, we present a deep recurrent neural network solution as a stacked Long Short-Term Memory (LSTM) with a pre-training as a regularization method to avoid random network initialization. In our proposal, we use global and short dependencies of the inputs. With pre-training, we avoid random initialization and are able to improve the accuracy and robustness of malware threat hunting. The proposed method speeds up the convergence (in comparison to stacked LSTM) by reducing the length of malware OpCode or bytecode sequences. Hence, the complexity of our final method is reduced. This leads to better accuracy, higher Mattews Correlation Coefficients (MCC), and Area Under the Curve (AUC) in comparison to a standard LSTM with similar detection time. Our proposed method can be applied in real-time malware threat hunting, particularly for safety critical systems such as eHealth or Internet of Military of Things where poor convergence of the model could lead to catastrophic consequences. We evaluate the effectiveness of our proposed method on Windows, Ransomware, Internet of Things (IoT), and Android malware datasets using both static and dynamic analysis. For the IoT malware detection, we also present a comparative summary of the performance on an IoT-specific dataset of our proposed method and the standard stacked LSTM method. More specifically, of our proposed method achieves an accuracy of 99.1% in detecting IoT malware samples, with AUC of 0.985, and MCC of 0.95; thus, outperforming standard LSTM based methods in these key metrics.
https://doi.org/10.22937/IJCSNS.2023.23.8.21 인용 PDF

Research on Malware Classification with Network Activity for Classification and Attack Prediction of Attack Groups (공격그룹 분류 및 예측을 위한 네트워크 행위기반 악성코드 분류에 관한 연구)

Lim, Hyo-young;Kim, Wan-ju;Noh, Hong-jun;Lim, Jae-sung
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.42 no.1
- /
- pp.193-204
- /
- 2017
The security of Internet systems critically depends on the capability to keep anti-virus (AV) software up-to-date and maintain high detection accuracy against new malware. However, malware variants evolve so quickly they cannot be detected by conventional signature-based detection. In this paper, we proposed a malware classification method based on sequence patterns generated from the network flow of malware samples. We evaluated our method with 766 malware samples and obtained a classification accuracy of approximately 40.4%. In this study, malicious codes were classified only by network behavior of malicious codes, excluding codes and other characteristics. Therefore, this study is expected to be further developed in the future. Also, we can predict the attack groups and additional attacks can be prevented.
https://doi.org/10.7840/kics.2017.42.1.193 인용 PDF KSCI

A Study on Robustness Evaluation and Improvement of AI Model for Malware Variation Analysis (악성코드 변종 분석을 위한 AI 모델의 Robust 수준 측정 및 개선 연구)

Lee, Eun-gyu;Jeong, Si-on;Lee, Hyun-woo;Lee, Tea-jin
- Journal of the Korea Institute of Information Security & Cryptology
- /
- v.32 no.5
- /
- pp.997-1008
- /
- 2022
Today, AI(Artificial Intelligence) technology is being extensively researched in various fields, including the field of malware detection. To introduce AI systems into roles that protect important decisions and resources, it must be a reliable AI model. AI model that dependent on training dataset should be verified to be robust against new attacks. Rather than generating new malware detection, attackers find malware detection that succeed in attacking by mass-producing strains of previously detected malware detection. Most of the attacks, such as adversarial attacks, that lead to misclassification of AI models, are made by slightly modifying past attacks. Robust models that can be defended against these variants is needed, and the Robustness level of the model cannot be evaluated with accuracy and recall, which are widely used as AI evaluation indicators. In this paper, we experiment a framework to evaluate robustness level by generating an adversarial sample based on one of the adversarial attacks, C&W attack, and to improve robustness level through adversarial training. Through experiments based on malware dataset in this study, the limitations and possibilities of the proposed method in the field of malware detection were confirmed.
https://doi.org/10.13089/JKIISC.2022.32.5.997 인용 PDF KSCI HTML

Recent Advances in Cryptovirology: State-of-the-Art Crypto Mining and Crypto Ransomware Attacks

Zimba, Aaron;Wang, Zhaoshun;Chen, Hongsong;Mulenga, Mwenge
- KSII Transactions on Internet and Information Systems (TIIS)
- /
- v.13 no.6
- /
- pp.3258-3279
- /
- 2019
Recently, ransomware has earned itself an infamous reputation as a force to reckon with in the cybercrime landscape. However, cybercriminals are adopting other unconventional means to seamlessly attain proceeds of cybercrime with little effort. Cybercriminals are now acquiring cryptocurrencies directly from benign Internet users without the need to extort a ransom from them, as is the case with ransomware. This paper investigates advances in the cryptovirology landscape by examining the state-of-the-art cryptoviral attacks. In our approach, we perform digital autopsy on the malware's source code and execute the different malware variants in a contained sandbox to deduce static and dynamic properties respectively. We examine three cryptoviral attack structures: browser-based crypto mining, memory resident crypto mining and cryptoviral extortion. These attack structures leave a trail of digital forensics evidence when the malware interacts with the file system and generates noise in form of network traffic when communicating with the C2 servers and crypto mining pools. The digital forensics evidence, which essentially are IOCs include network artifacts such as C2 server domains, IPs and cryptographic hash values of the downloaded files apart from the malware hash values. Such evidence can be used as seed into intrusion detection systems for mitigation purposes.
https://doi.org/10.3837/tiis.2019.06.027 인용 PDF KSCI HTML

Generating Call Graph for PE file (PE 파일 분석을 위한 함수 호출 그래프 생성 연구)

Kim, DaeYoub
- Journal of IKEEE
- /
- v.25 no.3
- /
- pp.451-461
- /
- 2021
As various smart devices spread and the damage caused by malicious codes becomes more serious, malicious code detection technology using machine learning technology is attracting attention. However, if the training data of machine learning is constructed based on only the fragmentary characteristics of the code, it is still easy to create variants and new malicious codes that avoid it. To solve such a problem, a research using the function call relationship of malicious code as training data is attracting attention. In particular, it is expected that more advanced malware detection will be possible by measuring the similarity of graphs using GNN. This paper proposes an efficient method to generate a function call graph from binary code to utilize GNN for malware detection.
https://doi.org/10.7471/ikeee.2021.25.3.451 인용 PDF KSCI

Fast k-NN based Malware Analysis in a Massive Malware Environment

Hwang, Jun-ho;Kwak, Jin;Lee, Tae-jin
- KSII Transactions on Internet and Information Systems (TIIS)
- /
- v.13 no.12
- /
- pp.6145-6158
- /
- 2019
It is a challenge for the current security industry to respond to a large number of malicious codes distributed indiscriminately as well as intelligent APT attacks. As a result, studies using machine learning algorithms are being conducted as proactive prevention rather than post processing. The k-NN algorithm is widely used because it is intuitive and suitable for handling malicious code as unstructured data. In addition, in the malicious code analysis domain, the k-NN algorithm is easy to classify malicious codes based on previously analyzed malicious codes. For example, it is possible to classify malicious code families or analyze malicious code variants through similarity analysis with existing malicious codes. However, the main disadvantage of the k-NN algorithm is that the search time increases as the learning data increases. We propose a fast k-NN algorithm which improves the computation speed problem while taking the value of the k-NN algorithm. In the test environment, the k-NN algorithm was able to perform with only the comparison of the average of similarity of 19.71 times for 6.25 million malicious codes. Considering the way the algorithm works, Fast k-NN algorithm can also be used to search all data that can be vectorized as well as malware and SSDEEP. In the future, it is expected that if the k-NN approach is needed, and the central node can be effectively selected for clustering of large amount of data in various environments, it will be possible to design a sophisticated machine learning based system.
https://doi.org/10.3837/tiis.2019.12.019 인용 PDF KSCI HTML

A Study on Classification of CNN-based Linux Malware using Image Processing Techniques (영상처리기법을 이용한 CNN 기반 리눅스 악성코드 분류 연구)

Kim, Se-Jin;Kim, Do-Yeon;Lee, Hoo-Ki;Lee, Tae-Jin
- Journal of the Korea Academia-Industrial cooperation Society
- /
- v.21 no.9
- /
- pp.634-642
- /
- 2020
With the proliferation of Internet of Things (IoT) devices, using the Linux operating system in various architectures has increased. Also, security threats against Linux-based IoT devices are increasing, and malware variants based on existing malware are constantly appearing. In this paper, we propose a system where the binary data of a visualized Executable and Linkable Format (ELF) file is applied to Local Binary Pattern (LBP) image processing techniques and a median filter to classify malware in a Convolutional Neural Network (CNN). As a result, the original image showed the highest accuracy and F1-score at 98.77%, and reproducibility also showed the highest score at 98.55%. For the median filter, the highest precision was 99.19%, and the lowest false positive rate was 0.008%. Using the LBP technique confirmed that the overall result was lower than putting the original ELF file through the median filter. When the results of putting the original file through image processing techniques were classified by majority, it was confirmed that the accuracy, precision, F1-score, and false positive rate were better than putting the original file through the median filter. In the future, the proposed system will be used to classify malware families or add other image processing techniques to improve the accuracy of majority vote classification. Or maybe we mean "the use of Linux O/S distributions for various architectures has increased" instead? If not, please rephrase as intended.
https://doi.org/10.5762/KAIS.2020.21.9.634 인용 PDF KSCI

Research on the Classification Model of Similarity Malware using Fuzzy Hash (퍼지해시를 이용한 유사 악성코드 분류모델에 관한 연구)

Park, Changwook;Chung, Hyunji;Seo, Kwangseok;Lee, Sangjin
- Journal of the Korea Institute of Information Security & Cryptology
- /
- v.22 no.6
- /
- pp.1325-1336
- /
- 2012
In the past about 10 different kinds of malicious code were found in one day on the average. However, the number of malicious codes that are found has rapidly increased reachingover 55,000 during the last 10 year. A large number of malicious codes, however, are not new kinds of malicious codes but most of them are new variants of the existing malicious codes as same functions are newly added into the existing malicious codes, or the existing malicious codes are modified to evade anti-virus detection. To deal with a lot of malicious codes including new malicious codes and variants of the existing malicious codes, we need to compare the malicious codes in the past and the similarity and classify the new malicious codes and the variants of the existing malicious codes. A former calculation method of the similarity on the existing malicious codes compare external factors of IPs, URLs, API, Strings, etc or source code levels. The former calculation method of the similarity takes time due to the number of malicious codes and comparable factors on the increase, and it leads to employing fuzzy hashing to reduce the amount of calculation. The existing fuzzy hashing, however, has some limitations, and it causes come problems to the former calculation of the similarity. Therefore, this research paper has suggested a new comparison method for malicious codes to improve performance of the calculation of the similarity using fuzzy hashing and also a classification method employing the new comparison method.
https://doi.org/10.13089/JKIISC.2012.22.6.1325 인용 PDF KSCI HTML

Collaborative security response by interworking between multiple security solutions (보안 솔루션의 상호 연동을 통한 실시간 협력 대응 방안 연구)

Kim, JiHoon;Lim, Jong In;Kim, Huy Kang
- Journal of the Korea Institute of Information Security & Cryptology
- /
- v.23 no.1
- /
- pp.69-79
- /
- 2013
Recently, many enterprises are suffering from advanced types of malware and their variants including intelligent malware that can evade the current security systems. This addresses the fact that current security systems have limits on protecting advanced and intelligent security threats. To enhance the overall level of security, first of all, it needs to increase detection ratio of each security solution within a security system. In addition, it is also necessary to implement internetworking between multiple security solutions to increase detection ratio and response speed. In this paper, we suggest a collaborative security response method to overcome the limitations of the previous Internet service security solutions. The proposed method can show an enhanced result to respond to intelligent security threats.
https://doi.org/10.13089/JKIISC.2013.23.1.069 인용 PDF KSCI HTML

A Study on the Cerber-Type Ransomware Detection Model Using Opcode and API Frequency and Correlation Coefficient (Opcode와 API의 빈도수와 상관계수를 활용한 Cerber형 랜섬웨어 탐지모델에 관한 연구)

Lee, Gye-Hyeok;Hwang, Min-Chae;Hyun, Dong-Yeop;Ku, Young-In;Yoo, Dong-Young
- KIPS Transactions on Computer and Communication Systems
- /
- v.11 no.10
- /
- pp.363-372
- /
- 2022
Since the recent COVID-19 Pandemic, the ransomware fandom has intensified along with the expansion of remote work. Currently, anti-virus vaccine companies are trying to respond to ransomware, but traditional file signature-based static analysis can be neutralized in the face of diversification, obfuscation, variants, or the emergence of new ransomware. Various studies are being conducted for such ransomware detection, and detection studies using signature-based static analysis and behavior-based dynamic analysis can be seen as the main research type at present. In this paper, the frequency of ".text Section" Opcode and the Native API used in practice was extracted, and the association between feature information selected using K-means Clustering algorithm, Cosine Similarity, and Pearson correlation coefficient was analyzed. In addition, Through experiments to classify and detect worms among other malware types and Cerber-type ransomware, it was verified that the selected feature information was specialized in detecting specific ransomware (Cerber). As a result of combining the finally selected feature information through the above verification and applying it to machine learning and performing hyper parameter optimization, the detection rate was up to 93.3%.
https://doi.org/10.3745/KTCCS.2022.11.10.363 인용 PDF KSCI

Search Result 21, Processing Time 0.02 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)