• Title/Summary/Keyword: Adversarial Detection

Adversarial Detection with Gaussian Process Regression-based Detector

  • Lee, Sangheon; Kim, Noo-ri; Cho, Youngwha; Choi, Jae-Young; Kim, Suntae; Kim, Jeong-Ah; Lee, Jee-Hyong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.8 / pp.4285-4299 / 2019
  • An adversarial attack is a technique that causes classification models to malfunction by adding noise that humans cannot perceive, which poses a threat to deep learning models. In this paper, we propose an efficient method to detect adversarial images using Gaussian process regression. Existing deep learning-based adversarial detection methods require numerous adversarial images for training. The proposed method overcomes this problem by classifying inputs based on statistical features of adversarial and clean images, which Gaussian process regression extracts from a small number of images. The technique determines whether an input image is adversarial by applying Gaussian process regression to the intermediate output values of the classification model. Experimental results show that the proposed method achieves higher detection performance than other deep learning-based adversarial detection methods against powerful attacks. In particular, the Gaussian process regression-based detector outperforms the baseline models on most attacks when fewer adversarial examples are available.

BM3D and Deep Image Prior based Denoising for the Defense against Adversarial Attacks on Malware Detection Networks

  • Sandra, Kumi; Lee, Suk-Ho
    • International journal of advanced smart convergence / v.10 no.3 / pp.163-171 / 2021
  • Recently, machine learning-based visualization approaches have been proposed to address the problem of malware detection. Unfortunately, these techniques are exposed to adversarial examples. Adversarial examples contain noise that can deceive a deep learning-based malware detection network so that the malware goes unrecognized. To address this shortcoming, we present a denoising technique based on the block-matching and 3D filtering (BM3D) algorithm and deep image prior to defend visualization-based malware detection systems against adversarial examples. The BM3D-based denoising step eliminates most of the adversarial noise. The deep image prior-based denoising step then removes the remaining subtle noise. Experimental results on the MS BIG malware dataset and benign samples show that the proposed denoising-based defense recovers, to some extent, the performance of an adversarially attacked CNN model for malware detection.

Keyed learning: An adversarial learning framework-formalization, challenges, and anomaly detection applications

  • Bergadano, Francesco
    • ETRI Journal / v.41 no.5 / pp.608-618 / 2019
  • We propose a general framework for keyed learning, where a secret key is used as an additional input of an adversarial learning system. We also define models and formal challenges for an adversary who knows the learning algorithm and its input data but has no access to the key value. This adversarial learning framework is subsequently applied to a more specific context of anomaly detection, where the secret key finds additional practical uses and guides the entire learning and alarm-generating procedure.

A Study on Robustness Evaluation and Improvement of AI Model for Malware Variation Analysis (악성코드 변종 분석을 위한 AI 모델의 Robust 수준 측정 및 개선 연구)

  • Lee, Eun-gyu; Jeong, Si-on; Lee, Hyun-woo; Lee, Tea-jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.5 / pp.997-1008 / 2022
  • Today, artificial intelligence (AI) technology is being extensively researched in various fields, including malware detection. To introduce AI systems into roles that protect important decisions and resources, the AI model must be reliable. An AI model that depends on its training dataset should be verified to be robust against new attacks. Rather than creating new malware, attackers mass-produce variants of previously detected malware to find ones that succeed in attacking. Most attacks that lead AI models to misclassify, such as adversarial attacks, are made by slightly modifying past attacks. Models that are robust against these variants are needed, but the robustness level of a model cannot be evaluated with accuracy and recall, which are widely used as AI evaluation indicators. In this paper, we experiment with a framework that evaluates the robustness level by generating adversarial samples with the C&W attack, one of the adversarial attack methods, and improves the robustness level through adversarial training. Experiments on a malware dataset confirmed the limitations and possibilities of the proposed method in the field of malware detection.

StarGAN-Based Detection and Purification Studies to Defend against Adversarial Attacks (적대적 공격을 방어하기 위한 StarGAN 기반의 탐지 및 정화 연구)

  • Sungjune Park; Gwonsang Ryu; Daeseon Choi
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.3 / pp.449-458 / 2023
  • Artificial intelligence is providing convenience in various fields using big data and deep learning technologies. However, deep learning technology is highly vulnerable to adversarial examples, which can cause classification models to misclassify. This study proposes a method to detect and purify various adversarial attacks using StarGAN. The proposed method trains a StarGAN model with an added Categorical Entropy loss on adversarial examples generated by various attack methods, so that the Discriminator can detect adversarial examples and the Generator can purify them. Experimental results on the CIFAR-10 dataset showed an average detection performance of approximately 68.77%, an average purification performance of approximately 72.20%, and an average defense performance of approximately 93.11%, derived from the restoration and detection performance.

Adversarial Attacks for Deep Learning-Based Infrared Object Detection (딥러닝 기반 적외선 객체 검출을 위한 적대적 공격 기술 연구)

  • Kim, Hoseong; Hyun, Jaeguk; Yoo, Hyunjung; Kim, Chunho; Jeon, Hyunho
    • Journal of the Korea Institute of Military Science and Technology / v.24 no.6 / pp.591-601 / 2021
  • Recently, infrared object detection (IOD) has been extensively studied due to the rapid growth of deep neural networks (DNN). Adversarial attacks using imperceptible perturbations can dramatically degrade the performance of DNNs. However, most work on adversarial attacks focuses on visible image recognition (VIR), and there are few methods for IOD. We propose deep learning-based adversarial attacks for IOD by extending several state-of-the-art adversarial attacks for VIR. We validate our claim through comprehensive experiments on two challenging IOD datasets, FLIR and MSOD.

Adversarial Example Detection Based on Symbolic Representation of Image (이미지의 Symbolic Representation 기반 적대적 예제 탐지 방법)

  • Park, Sohee; Kim, Seungjoo; Yoon, Hayeon; Choi, Daeseon
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.5 / pp.975-986 / 2022
  • Deep learning is attracting great attention for its excellent performance in image processing, but it is vulnerable to adversarial attacks that cause a model to misclassify by perturbing the input data. Adversarial examples generated by such attacks are perturbed so minimally that the changes are difficult to identify, so the visual features of the images generally do not change. Unlike deep learning models, people are not fooled by adversarial examples, because they classify images based on these visual features. This paper proposes an adversarial attack detection method using Symbolic Representation, a set of visual and symbolic features of the image such as color and shape. We detect adversarial examples by comparing the Symbolic Representation converted from the classification result for the input image with the Symbolic Representation extracted from the input image itself. When performance was measured on adversarial examples from various attack methods, detection rates differed depending on the attack target and method, but reached up to 99.02% for a specific targeted attack.
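
The comparison step described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the class labels, the `CLASS_SYMBOLS` lookup table, the feature names, and the mismatch threshold are hypothetical, and the extraction of symbolic features from the image is assumed to have happened elsewhere.

```python
from typing import Dict

# Hypothetical lookup table: class label -> expected symbolic representation.
# The labels and feature values are illustrative, not from the paper.
CLASS_SYMBOLS: Dict[str, Dict[str, str]] = {
    "stop_sign": {"color": "red", "shape": "octagon"},
    "yield_sign": {"color": "white", "shape": "triangle"},
}


def mismatch_ratio(expected: Dict[str, str], observed: Dict[str, str]) -> float:
    """Fraction of expected symbolic features that the observed ones contradict."""
    if not expected:
        raise ValueError("expected representation must not be empty")
    mismatches = sum(1 for key, value in expected.items() if observed.get(key) != value)
    return mismatches / len(expected)


def is_adversarial(predicted_class: str,
                   observed: Dict[str, str],
                   threshold: float = 0.5) -> bool:
    """Flag the input when the symbols implied by the predicted class
    disagree with the symbols extracted from the image itself."""
    expected = CLASS_SYMBOLS[predicted_class]
    return mismatch_ratio(expected, observed) >= threshold


# Symbols assumed to have been extracted from an image of a stop sign.
extracted = {"color": "red", "shape": "octagon"}
print(is_adversarial("stop_sign", extracted))   # prediction agrees with the image
print(is_adversarial("yield_sign", extracted))  # prediction contradicts the image
```

In this sketch a clean prediction produces matching representations, while a prediction flipped by an attack yields a representation that contradicts what is visible in the image.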

Generative Model of Acceleration Data for Deep Learning-based Damage Detection for Bridges Using Generative Adversarial Network (딥러닝 기반 교량 손상추정을 위한 Generative Adversarial Network를 이용한 가속도 데이터 생성 모델)

  • Lee, Kanghyeok; Shin, Do Hyoung
    • Journal of KIBIM / v.9 no.1 / pp.42-51 / 2019
  • Maintenance of aging structures has attracted societal attention, and it can be performed efficiently with a digital twin. Maintaining a structure based on a digital twin requires accurately detecting damage to the structure. Deep learning-based damage detection approaches have shown good performance in detecting structural damage. However, developing such approaches requires a large amount of data from both before and after damage, and in reality the amounts of pre-damage and post-damage data are unbalanced. To solve this problem, this study proposes a method based on the generative adversarial network (GAN), a type of generative model, for generating the acceleration data commonly used in damage detection approaches. The results confirm that the acceleration data generated by the GAN has a pattern very similar to that of the acceleration data produced by simulation with structural analysis software. These results show that not only the macroscopic pattern of the data but also its frequency-domain characteristics can be reproduced. These findings therefore show that the GAN model can analyze complex acceleration data on its own, and the generated data is expected to help train deep learning-based damage detection approaches.

Adversarial Example Detection and Classification Model Based on the Class Predicted by Deep Learning Model (데이터 예측 클래스 기반 적대적 공격 탐지 및 분류 모델)

  • Ko, Eun-na-rae; Moon, Jong-sub
    • Journal of the Korea Institute of Information Security & Cryptology / v.31 no.6 / pp.1227-1236 / 2021
  • An adversarial attack, one of the attacks on deep learning classification models, adds indistinguishable perturbations to input data and causes the model to misclassify that data. There are various adversarial attack algorithms. Accordingly, many studies have been conducted to detect adversarial attacks, but few have classified which adversarial attack algorithm generated a given adversarial input. If adversarial attacks can be classified, a more robust deep learning classification model can be built by analyzing the differences between attacks. In this paper, we propose a model that detects and classifies adversarial attacks by constructing a random forest classification model with input features extracted from a target deep learning model. For feature extraction, features are taken from the output values of a hidden layer based on the class predicted by the target deep learning model. In experiments, the proposed model achieved accuracy 3.02% higher on clean data and 0.80% higher on adversarial data than previous studies, and it classified a new adversarial attack that previous studies had not classified.

Defending and Detecting Audio Adversarial Example using Frame Offsets

  • Gong, Yongkang; Yan, Diqun; Mao, Terui; Wang, Donghua; Wang, Rangding
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.4 / pp.1538-1552 / 2021
  • Machine learning models are vulnerable to adversarial examples generated by adding a deliberately designed perturbation to a benign sample. In particular, for automatic speech recognition (ASR) systems, benign audio that sounds normal could be decoded as a harmful command because of adversarial attacks. In this paper, we focus on countermeasures against audio adversarial examples. By analyzing the characteristics of ASR systems, we find that frame offsets, created by appending a silence clip at the beginning of an audio clip, can degrade adversarial perturbations into normal noise. For various scenarios, we exploit frame offsets with different strategies: defending, detecting, and a hybrid strategy. Compared with previous methods, the proposed method can defend against audio adversarial examples in a simpler, more generic, and more efficient way. Evaluated on three state-of-the-art adversarial attacks against different ASR systems, the experimental results demonstrate that the proposed method can effectively improve the robustness of ASR systems.
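
A minimal sketch of the detecting strategy mentioned in this abstract is given below. It rests on assumptions that are not stated in the abstract: the `transcribe` callable stands in for an arbitrary ASR system, the offset length and decision threshold are hypothetical, and the comparison of transcriptions by normalized edit distance is one plausible choice rather than the authors' confirmed procedure.

```python
from typing import Callable, List, Sequence


def prepend_silence(samples: Sequence[float], offset: int) -> List[float]:
    """Return the audio with `offset` zero-valued samples prepended,
    which shifts how the ASR front end cuts the signal into frames."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return [0.0] * offset + list(samples)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def looks_adversarial(samples: Sequence[float],
                      transcribe: Callable[[Sequence[float]], str],
                      offset: int = 160,        # hypothetical offset in samples
                      threshold: float = 0.5,   # hypothetical decision threshold
                      ) -> bool:
    """Flag audio whose transcription changes sharply once a silence clip is prepended."""
    original = transcribe(samples)
    shifted = transcribe(prepend_silence(samples, offset))
    change = edit_distance(original, shifted) / max(len(original), 1)
    return change > threshold
```

The intuition in this sketch is that benign speech should decode to roughly the same text after the offset, whereas a perturbation tuned to one frame alignment may stop working. The defending strategy would instead return the transcription of the shifted audio directly.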