• Title/Summary/Keyword: Attention network

Indoor Localization based on Multiple Neural Networks (다중 인공신경망 기반의 실내 위치 추정 기법)

  • Sohn, Insoo
    • Journal of Institute of Control, Robotics and Systems / v.21 no.4 / pp.378-384 / 2015
  • Indoor localization is becoming one of the most important technologies for smart mobile applications, with requirements that differ from those of conventional outdoor location estimation algorithms. Fingerprinting location estimation techniques based on neural networks have gained increasing attention from academia due to their good generalization properties. In this paper, we propose a novel location estimation algorithm based on an ensemble of multiple neural networks. Neural network ensembles have drawn much attention in areas where a single neural network fails to resolve and classify the given data due to its inaccuracy, incompleteness, and ambiguity. To the best of our knowledge, this work is the first to enhance location estimation accuracy in indoor wireless environments using a neural network ensemble trained on fingerprinting data. To evaluate the effectiveness of the proposed method, we conduct numerical experiments using the TGn channel model, developed by the 802.11n task group for evaluating high-capacity WLAN technologies in indoor environments with multiple transmit and receive antennas. The results show that the proposed NNE-based method outperforms conventional methods and achieves very accurate estimates even in environments with few APs.
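
A rough sketch of the ensemble idea, assuming synthetic RSSI fingerprints and a toy PyTorch setup: several small position-regression networks are trained independently and their estimates are averaged at query time. The layer sizes, training loop, and data are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

N_APS, N_NETS = 4, 5  # RSSI readings from 4 access points, ensemble of 5 networks

class LocNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_APS, 32), nn.ReLU(),
            nn.Linear(32, 2))          # output: estimated (x, y) position

    def forward(self, rssi):
        return self.net(rssi)

# Synthetic fingerprint database: RSSI vectors with their surveyed positions.
rssi = torch.randn(256, N_APS)
pos = torch.randn(256, 2)

ensemble = [LocNet() for _ in range(N_NETS)]
for net in ensemble:                   # each member is trained independently
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(200):
        opt.zero_grad()
        nn.functional.mse_loss(net(rssi), pos).backward()
        opt.step()

# At query time, the ensemble estimate is the mean of the member estimates.
with torch.no_grad():
    query = torch.randn(1, N_APS)
    print(torch.stack([net(query) for net in ensemble]).mean(dim=0))
```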

A Novel Transfer Learning-Based Algorithm for Detecting Violence Images

  • Meng, Yuyan;Yuan, Deyu;Su, Shaofan;Ming, Yang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.6 / pp.1818-1832 / 2022
  • Violence in the Internet era poses a new challenge to counter-violence work, and research shows that most violent incidents are related to the dissemination of violence images. Using popular deep neural networks to automatically analyze the massive number of images on the Internet has become one of the important tools in this work. This paper applies transfer learning techniques and introduces an attention mechanism into the residual network (ResNet) model for the classification and identification of violence images. First, the feature elements of violence images are identified and a targeted dataset is constructed; second, because positive samples of violence images are scarce, pre-training and attention mechanisms are introduced to improve the traditional residual network; finally, the improved model is trained and tested on the constructed dedicated dataset. The results show that the improved network model can quickly and accurately identify violence images with an average accuracy of 92.20%, effectively reducing the cost of manual identification and providing decision support for combating rebel organization activities.
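
The general recipe the abstract describes can be sketched as follows: a pretrained ResNet backbone is kept frozen (motivated by the scarce positive samples) and a channel-attention block gates its feature map before a new classifier head. The SE-style block, its placement, and the resnet18 choice are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torchvision

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style gate: pooled features reweight channels."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                            # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))              # squeeze: global average pool
        return x * w[:, :, None, None]               # excite: gate each channel

backbone = torchvision.models.resnet18(weights="DEFAULT")   # ImageNet pre-training
features = nn.Sequential(*list(backbone.children())[:-2])   # drop pool + classifier

for p in features.parameters():      # freeze the backbone: few positive samples
    p.requires_grad = False

model = nn.Sequential(
    features,
    ChannelAttention(512),           # attention over the backbone's feature map
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(512, 2))               # violence / non-violence

print(model(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 2])
```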

Improving Adversarial Robustness via Attention (Attention 기법에 기반한 적대적 공격의 강건성 향상 연구)

  • Jaeuk Kim;Myung Gyo Oh;Leo Hyun Park;Taekyoung Kwon
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.4 / pp.621-631 / 2023
  • Adversarial training improves the robustness of deep neural networks against adversarial examples. However, previous adversarial training methods focus only on the adversarial loss function, ignoring that even a small perturbation of the input layer causes a significant change in the hidden-layer features. Consequently, the accuracy of a defended model drops in untrained situations such as clean samples or other attack techniques. An architectural perspective on improving feature representation power is therefore needed. In this paper, we attach an attention module that generates an attention map of the input image to a general model and perform PGD adversarial training on the augmented model. In our experiments on the CIFAR-10 dataset, the attention-augmented model showed higher accuracy than the general model regardless of the network structure. In particular, its robust accuracy was consistently higher under various attacks such as PGD, FGSM, and BIM, as well as against more powerful adversaries. By visualizing the attention maps, we further confirmed that the attention module extracts features of the correct class even for adversarial examples.
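
PGD adversarial training itself is a standard procedure; the sketch below shows one training step on a stand-in model, with epsilon, step size, and iteration count set to common CIFAR-10 values. The paper's attention augmentation of the model is not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=7):
    """Maximize the loss inside an L-infinity ball of radius eps around x."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        grad = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()                     # ascend the loss
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)   # project to the ball
        x_adv = x_adv.clamp(0, 1)                               # stay a valid image
    return x_adv.detach()

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in model
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.rand(8, 3, 32, 32)            # a CIFAR-10-shaped batch
y = torch.randint(0, 10, (8,))
x_adv = pgd_attack(model, x, y)         # craft adversarial examples...
opt.zero_grad()
F.cross_entropy(model(x_adv), y).backward()  # ...and train on them
opt.step()
```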

Modified YOLOv4S based on Deep learning with Feature Fusion and Spatial Attention (특징 융합과 공간 강조를 적용한 딥러닝 기반의 개선된 YOLOv4S)

  • Hwang, Beom-Yeon;Lee, Sang-Hun;Lee, Seung-Hyun
    • Journal of the Korea Convergence Society / v.12 no.12 / pp.31-37 / 2021
  • This paper proposes a modified YOLOv4S based on feature fusion and spatial attention for detecting small and occluded objects. The conventional YOLOv4S is a lightweight network and lacks feature extraction capability compared to deeper networks. The proposed method first combines feature maps of different scales through feature fusion to enhance semantic and low-level information, and expands the receptive field with dilated convolution, improving detection accuracy for small and occluded objects. Second, spatial attention strengthens the conventional spatial information, improving the detection accuracy for objects that are occluded by or confused with other objects. The PASCAL VOC and COCO datasets were used for quantitative evaluation. Compared to the conventional YOLOv4S, the proposed method improved mAP by 2.7% on PASCAL VOC and 1.8% on COCO.
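
The two ingredients can be sketched independently: a CBAM-style spatial attention map and a fusion block that upsamples a deep feature map, concatenates it with a shallow one, and refines the result with a dilated convolution. Channel counts and the dilation rate are illustrative assumptions, not the modified YOLOv4S layout.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        # Pool along the channel axis, then learn a 2-D attention map.
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn

class FuseBlock(nn.Module):
    """Upsample the deep map, concat with the shallow map, refine with a dilated conv."""
    def __init__(self, c_deep, c_shallow, c_out):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.conv = nn.Conv2d(c_deep + c_shallow, c_out, 3,
                              padding=2, dilation=2)   # enlarged receptive field
        self.attn = SpatialAttention()

    def forward(self, deep, shallow):
        x = torch.cat([self.up(deep), shallow], dim=1)
        return self.attn(torch.relu(self.conv(x)))

deep = torch.randn(1, 256, 13, 13)     # low-resolution, semantic features
shallow = torch.randn(1, 128, 26, 26)  # high-resolution, low-level features
print(FuseBlock(256, 128, 128)(deep, shallow).shape)  # (1, 128, 26, 26)
```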

Using similarity based image caption to aid visual question answering (유사도 기반 이미지 캡션을 이용한 시각질의응답 연구)

  • Kang, Joonseo;Lim, Changwon
    • The Korean Journal of Applied Statistics / v.34 no.2 / pp.191-204 / 2021
  • Visual question answering (VQA) and image captioning are tasks that require understanding both the visual features of images and the linguistic features of text, so co-attention, which connects image and text, may be the key to both. In this paper, we propose a model that improves VQA performance by using image captions generated with a standard transformer model pretrained on the MSCOCO dataset. Since captions unrelated to the question can interfere with answering, only the captions most similar to the question were selected for use. In addition, since stopwords in the captions neither help nor hinder answering, the experiments were conducted after removing them. Experiments were conducted on the VQA-v2 data to compare the proposed model with the deep modular co-attention network (MCAN) model, which has shown good performance by using co-attention between images and text. The proposed model outperformed the MCAN model.
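
The caption-filtering step might look like the following sketch: candidate captions are scored against the question by TF-IDF cosine similarity with stopwords removed, and only the top-scoring ones are kept. The toy captions, the scoring scheme, and the value of k are assumptions, not the paper's exact pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

question = "what color is the dog on the sofa"
captions = [
    "a brown dog sleeping on a sofa",
    "a bowl of fruit on a table",
    "two people walking in the rain",
]

# Stopwords are removed so they cannot dilute the similarity scores.
vec = TfidfVectorizer(stop_words="english")
mat = vec.fit_transform([question] + captions)
scores = cosine_similarity(mat[0], mat[1:]).ravel()

k = 1  # keep only the caption(s) most similar to the question
selected = [captions[i] for i in scores.argsort()[::-1][:k]]
print(selected)  # ['a brown dog sleeping on a sofa']
```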

Semantic Role Labeling using Biaffine Average Attention Model (Biaffine Average Attention 모델을 이용한 의미역 결정)

  • Nam, Chung-Hyeon;Jang, Kyung-Sik
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.5 / pp.662-667 / 2022
  • The semantic role labeling (SRL) task extracts a predicate and its arguments, such as agent, patient, place, and time. Earlier SRL studies proposed pipeline methods that extract linguistic features of a sentence, but in such methods the errors of each extraction stage in the pipeline degrade labeling performance. Therefore, end-to-end neural network models have recently been proposed. In this paper, we propose a neural network model using biaffine average attention for the SRL task. Unlike the LSTM models of previous studies, which predict a specific token from its surrounding information, the proposed model can attend to the information of the entire sentence regardless of the distance between the predicate and its arguments. For evaluation, we compared the proposed model against the BERT-based models of existing studies using F1 score, achieving 76.21%, higher than the comparison models.
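
A biaffine scorer of the kind such models build on can be sketched as below: every token, as a candidate predicate, is scored against every token as a candidate argument, yielding a label-by-predicate-by-argument score tensor. The dimensions and bias handling are illustrative assumptions, not the paper's biaffine average attention layer.

```python
import torch
import torch.nn as nn

class Biaffine(nn.Module):
    def __init__(self, dim, n_labels):
        super().__init__()
        self.U = nn.Parameter(torch.randn(n_labels, dim + 1, dim + 1) * 0.01)

    def forward(self, pred, arg):             # (B, T, D) each
        ones = torch.ones(*pred.shape[:2], 1)
        p = torch.cat([pred, ones], dim=-1)   # append a bias term
        a = torch.cat([arg, ones], dim=-1)
        # scores[b, l, i, j]: label-l score for predicate i with argument j
        return torch.einsum("bid,ldk,bjk->blij", p, self.U, a)

tokens = torch.randn(2, 10, 64)               # e.g. BERT token encodings
scorer = Biaffine(dim=64, n_labels=5)         # 5 semantic-role labels
print(scorer(tokens, tokens).shape)           # torch.Size([2, 5, 10, 10])
```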

Occlusion Robust Military Vehicle Detection using Two-Stage Part Attention Networks (2단계 부분 어텐션 네트워크를 이용한 가려짐에 강인한 군용 차량 검출)

  • Cho, Sunyoung
    • Journal of the Korea Institute of Military Science and Technology / v.25 no.4 / pp.381-389 / 2022
  • Detecting partially occluded objects is difficult because the appearances and shapes of occluders are highly variable. This variability makes it challenging to localize accurate bounding boxes or to classify objects from their visible parts. To address these problems, we propose a two-stage part-based attention approach for robust object detection under partial occlusion. First, our part attention network (PAN) captures the important object parts and generates weighted object features. Our reinforced PAN (RPAN) then produces re-weighted object features from the weighted ones. Experiments are performed on our collected military vehicle dataset and a synthetic occlusion dataset. Our method outperforms the baselines and demonstrates robust detection of objects under partial occlusion.
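
The two-stage re-weighting idea can be sketched as follows, with per-part features weighted twice by small attention networks and then aggregated. The part pooling and both scoring networks are illustrative assumptions, not the paper's PAN/RPAN internals.

```python
import torch
import torch.nn as nn

class PartAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, parts):                        # parts: (B, P, D)
        w = torch.softmax(self.score(parts), dim=1)  # weight visible parts higher
        return parts * w, w

B, P, D = 2, 6, 128                    # batch, part regions, feature dim
parts = torch.randn(B, P, D)           # pooled per-part object features

pan = PartAttention(D)
rpan = PartAttention(D)                # second stage re-weights stage-one output

weighted, w1 = pan(parts)              # stage 1: initial part weighting
reweighted, w2 = rpan(weighted)        # stage 2: reinforced re-weighting
obj_feature = reweighted.sum(dim=1)    # aggregate into one object descriptor
print(obj_feature.shape)               # torch.Size([2, 128])
```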

Tobacco Retail License Recognition Based on Dual Attention Mechanism

  • Shan, Yuxiang;Ren, Qin;Wang, Cheng;Wang, Xiuhui
    • Journal of Information Processing Systems / v.18 no.4 / pp.480-488 / 2022
  • Images of tobacco retail licenses have complex, unstructured characteristics, which poses an urgent technical problem for robotic process automation in tobacco marketing. In this paper, a novel recognition approach using a dual attention mechanism is presented to automatically recognize and extract information from such images. First, we utilized a DenseNet network to extract license features from the input tobacco retail license data. Second, a bidirectional long short-term memory network was used for encoding and decoding, with a continuous decoder integrating dual attention, to recognize and extract information from tobacco retail license images without segmentation. Finally, performance experiments were conducted on a large-scale dataset of tobacco retail licenses. The results show that the proposed approach achieves a recognition accuracy of 98.36% on the ZY-LQ dataset, outperforming most existing methods.
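
The encode-attend-decode pipeline the abstract outlines might be sketched like this: CNN features are flattened into a sequence, a bidirectional LSTM encodes them, and a decoder attends over the encoding at each character step. The sizes, the single additive attention head, and the toy charset are assumptions rather than the paper's dual-attention design.

```python
import torch
import torch.nn as nn

feat = torch.randn(1, 256, 4, 32)          # CNN (e.g. DenseNet) feature map
seq = feat.flatten(2).transpose(1, 2)      # (1, 128, 256): positions as a sequence

encoder = nn.LSTM(256, 128, bidirectional=True, batch_first=True)
enc, _ = encoder(seq)                      # (1, 128, 256)

score_enc = nn.Linear(256, 64)             # additive (Bahdanau-style) attention
score_dec = nn.Linear(256, 64)
score_out = nn.Linear(64, 1)
dec_cell = nn.LSTMCell(256, 256)
classify = nn.Linear(256, 40)              # hypothetical 40-character charset

h = torch.zeros(1, 256)
c = torch.zeros(1, 256)
chars = []
for _ in range(12):                        # decode up to 12 characters
    # Attention conditioned on the decoder state picks where to read next.
    s = score_out(torch.tanh(score_enc(enc) + score_dec(h).unsqueeze(1)))
    a = torch.softmax(s, dim=1)            # (1, 128, 1) over spatial positions
    context = (a * enc).sum(dim=1)         # (1, 256)
    h, c = dec_cell(context, (h, c))
    chars.append(classify(h).argmax(dim=-1))
print(torch.stack(chars, dim=1))           # predicted character ids
```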

Single Image-based Depth Estimation Network using Attention Model (Attention Model 을 이용한 단안 영상 기반 깊이 추정 네트워크)

  • Jung, Geunho;Yoon, Sang Min
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.07a / pp.14-17 / 2020
  • Monocular depth estimation infers the 3D distance to objects from a 2D image captured at a given viewpoint. Recently, deep learning models that extract feature maps useful for depth estimation from a single RGB image and use them to estimate depth have surpassed the performance of previous methods, and related research is being actively pursued. In addition, studies have introduced techniques such as the attention model, which emphasizes particular channels or spatial regions of a feature map to improve the overall performance of a network. In this paper, we propose an autoencoder-based depth estimation network with an added attention model to emphasize the feature maps used for depth estimation, and we evaluate and analyze the network's depth estimation performance according to where the attention model is applied.
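
A minimal sketch of the idea, assuming a tiny encoder-decoder and a channel-attention block at the bottleneck (one of several possible placements of the kind the paper evaluates); the layer stack is illustrative, not the paper's network.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(c, c // 4), nn.ReLU(),
                                nn.Linear(c // 4, c), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))[:, :, None, None]
        return x * w                      # emphasize informative channels

depth_net = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),            # encoder
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    ChannelAttention(64),                                            # attention at the bottleneck
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),   # decoder
    nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1))               # 1-channel depth map

rgb = torch.randn(1, 3, 128, 128)
print(depth_net(rgb).shape)               # torch.Size([1, 1, 128, 128])
```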

DANet-CAM for Pest & Disease Classification (병해충 분류를 위한 DANet-CAM)

  • Hung, Nguyen Tri Chan;Kim, Young Un;Lee, Hyo Jong
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.295-296 / 2022
  • Pests and diseases in crop cultivation have long been a major concern. Traditional methods for detecting them in agriculture no longer provide high efficiency. Today, with the explosive development of science and artificial intelligence, researchers in agriculture are applying deep learning to pest and disease detection. Although numerous models have recently been published to solve problems in various fields, many deep learning-based pest and disease diagnosis methods waste hardware resources and are difficult to use on real farms. Therefore, in this paper, to classify crop pests and diseases with minimal hardware resource usage, we modify the Decoupling-and-Attention network (DANet) by replacing its Selective Kernel Attention (SK Attention) with a Channel Attention Module.
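
To make the substitution concrete, the sketch below contrasts a lightweight channel-attention module with a heavier selective-kernel-style block of the sort it replaces; both modules are illustrative assumptions, shown mainly to compare parameter costs.

```python
import torch
import torch.nn as nn

class ChannelAttentionModule(nn.Module):
    """Single-branch channel gate: one pooled vector reweights the channels."""
    def __init__(self, c, r=8):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(c, c // r), nn.ReLU(),
                                nn.Linear(c // r, c), nn.Sigmoid())

    def forward(self, x):
        return x * self.fc(x.mean(dim=(2, 3)))[:, :, None, None]

class SKAttention(nn.Module):
    """Selective-kernel style: parallel conv branches that a learned gate mixes."""
    def __init__(self, c):
        super().__init__()
        self.b3 = nn.Conv2d(c, c, 3, padding=1)
        self.b5 = nn.Conv2d(c, c, 5, padding=2)
        self.gate = nn.Linear(c, 2)

    def forward(self, x):
        u3, u5 = self.b3(x), self.b5(x)
        w = torch.softmax(self.gate((u3 + u5).mean(dim=(2, 3))), dim=-1)
        return u3 * w[:, :1, None, None] + u5 * w[:, 1:, None, None]

params = lambda m: sum(p.numel() for p in m.parameters())
x = torch.randn(1, 64, 16, 16)
print(SKAttention(64)(x).shape, ChannelAttentionModule(64)(x).shape)
print(params(SKAttention(64)), "vs", params(ChannelAttentionModule(64)))  # CAM is far lighter
```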