• Title/Summary/Keyword: Attention Network

An Attention Method-based Deep Learning Encoder for the Sentiment Classification of Documents (문서의 감정 분류를 위한 주목 방법 기반의 딥러닝 인코더)

  • Kwon, Sunjae;Kim, Juae;Kang, Sangwoo;Seo, Jungyun
    • KIISE Transactions on Computing Practices / v.23 no.4 / pp.268-273 / 2017
  • Recently, deep learning encoder-based approaches have been actively applied to sentiment classification. However, the commonly used Long Short-Term Memory (LSTM) network encoder produces lower-quality vector representations as documents grow longer. In this study, for effective classification of sentiment in documents, we propose an attention-based deep learning encoder that generates a document vector representation as an importance-weighted sum of the LSTM outputs. In addition, we propose two modifications that adapt the attention-based encoder to sentiment classification: a window attention part and an attention weight adjustment part. In the window attention part, weights are computed over windows of time steps so that sentiment features spanning more than one word are recognized effectively. In the attention weight adjustment part, the learned weights are smoothed. Experimental results show that the proposed method outperforms the LSTM encoder, achieving an accuracy of 89.67%.
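
A minimal sketch of the general idea the abstract describes, not the authors' code: attention pooling that forms a document vector as a weighted sum of LSTM outputs, assuming PyTorch and made-up sizes (vocab_size, hidden_dim, class names are illustrative).

import torch
import torch.nn as nn

class AttentionLSTMEncoder(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=256, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.attn = nn.Linear(hidden_dim, 1)          # scores each time step
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, tokens):                        # tokens: (batch, seq_len)
        h, _ = self.lstm(self.embed(tokens))          # (batch, seq_len, hidden)
        scores = self.attn(h).squeeze(-1)             # (batch, seq_len)
        weights = torch.softmax(scores, dim=1)        # attention weight per step
        doc_vec = (weights.unsqueeze(-1) * h).sum(dim=1)  # weighted sum -> document vector
        return self.classifier(doc_vec)

model = AttentionLSTMEncoder()
logits = model(torch.randint(0, 10000, (4, 50)))      # 4 documents, 50 tokens each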

Application of YOLOv5 Neural Network Based on Improved Attention Mechanism in Recognition of Thangka Image Defects

  • Fan, Yao;Li, Yubo;Shi, Yingnan;Wang, Shuaishuai
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.245-265 / 2022
  • In response to problems such as insufficient feature extraction, low detection accuracy, and frequent misdetection in Thangka image defect inspection, this paper proposes a YOLOv5 prediction algorithm fused with an attention mechanism. First, the Backbone network is used for feature extraction, and the attention mechanism is fused into it so that the network can fully extract the texture and semantic features of the defect area. The extracted features are then weighted and fused to reduce the loss of information. Next, the weighted fused features are passed to the Neck network, where the semantic and texture features of different layers are fused by FPN and the defect target is located more accurately by PAN. In the detection network, the CIOU loss function replaces the GIOU loss function to locate the defect area quickly and accurately, generate the bounding box, and predict the defect category. The results show that, compared with the original network, YOLOv5-SE and YOLOv5-CBAM improve detection accuracy by 8.95% and 12.87%, respectively. The improved networks identify the location and category of defects more accurately and greatly improve the accuracy of defect detection in Thangka images.
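
A minimal sketch, not the paper's implementation: a squeeze-and-excitation (SE) channel-attention block of the general kind referenced by "YOLOv5-SE", which could be inserted after a backbone convolution. PyTorch is assumed; the class name and reduction ratio are illustrative.

import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                             # x: (batch, channels, H, W)
        w = x.mean(dim=(2, 3))                        # squeeze: global average pooling
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)    # excitation: per-channel weights
        return x * w                                  # reweight feature channels

features = torch.randn(2, 64, 40, 40)                 # e.g. a backbone feature map
attended = SEBlock(64)(features)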

Object Detection Model Using Attention Mechanism (주의 집중 기법을 활용한 객체 검출 모델)

  • Kim, Geun-Sik;Bae, Jung-Soo;Cha, Eui-Young
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.12 / pp.1581-1587 / 2020
  • With the emergence of convolutional neural networks in machine learning, models for solving image processing problems have developed rapidly. However, the required computing resources have also risen, making it difficult to train such models in a typical environment. The attention mechanism was originally proposed to mitigate the vanishing-gradient problem of recurrent neural networks, but it can also be used to aid the training of convolutional neural networks. In this paper, the attention mechanism is applied to a convolutional neural network, and the merit of the proposed method is demonstrated through comparisons of training time and performance. In YOLO-based object detection, the proposed model was superior in both training time and performance to models without the attention mechanism, demonstrating experimentally that training time can be significantly reduced. This is also expected to increase the accessibility of machine learning to end users.
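
A minimal sketch under stated assumptions, not the authors' module: a simple spatial-attention gate that can be dropped between convolutional layers to reweight locations, one common way attention is added to CNN detectors. PyTorch assumed; names and kernel size are illustrative.

import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                             # x: (batch, C, H, W)
        avg = x.mean(dim=1, keepdim=True)             # channel-wise average map
        mx, _ = x.max(dim=1, keepdim=True)            # channel-wise max map
        mask = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * mask                               # emphasize informative locations

x = torch.randn(1, 32, 52, 52)
y = SpatialAttention()(x)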

Unsupervised Monocular Depth Estimation Using Self-Attention for Autonomous Driving (자율주행을 위한 Self-Attention 기반 비지도 단안 카메라 영상 깊이 추정)

  • Seung-Jun Hwang;Sung-Jun Park;Joong-Hwan Baek
    • Journal of Advanced Navigation Technology / v.27 no.2 / pp.182-189 / 2023
  • Depth estimation is a key technology in 3D map generation for the autonomous driving of vehicles, robots, and drones. Existing sensor-based methods have high accuracy but are expensive and have low resolution, while camera-based methods are more affordable and offer higher resolution. In this study, we propose self-attention-based unsupervised monocular depth estimation for a UAV camera system. A self-attention operation is applied to the network to improve global feature extraction, and its weight size is reduced to keep the computational cost low. The estimated depth and camera pose are converted into a point cloud, which is mapped into a 3D map using an octree-structured occupancy grid. The proposed network is evaluated on synthesized images and depth sequences from the Mid-Air dataset and demonstrates a 7.69% reduction in error compared to prior studies.
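
A minimal sketch, assuming PyTorch and not the authors' network: a non-local-style 2D self-attention layer whose internal channel width is reduced, illustrating how the weight size and computation of self-attention can be lowered. Class name and reduction factor are assumptions.

import torch
import torch.nn as nn

class LightSelfAttention2d(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        inner = channels // reduction                 # reduced width cuts weights and FLOPs
        self.query = nn.Conv2d(channels, inner, 1)
        self.key = nn.Conv2d(channels, inner, 1)
        self.value = nn.Conv2d(channels, channels, 1)

    def forward(self, x):                             # x: (B, C, H, W)
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # (B, HW, inner)
        k = self.key(x).flatten(2)                    # (B, inner, HW)
        v = self.value(x).flatten(2).transpose(1, 2)  # (B, HW, C)
        attn = torch.softmax(q @ k / (k.shape[1] ** 0.5), dim=-1)  # (B, HW, HW)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + out                                # residual connection

feat = torch.randn(1, 64, 24, 32)
out = LightSelfAttention2d(64)(feat)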

Attention Deep Neural Networks Learning based on Multiple Loss functions for Video Face Recognition (비디오 얼굴인식을 위한 다중 손실 함수 기반 어텐션 심층신경망 학습 제안)

  • Kim, Kyeong Tae;You, Wonsang;Choi, Jae Young
    • Journal of Korea Multimedia Society / v.24 no.10 / pp.1380-1390 / 2021
  • Video face recognition (FR) is one of the most popular research topics in computer vision because of its variety of applications, and research using the attention mechanism is particularly active. In video FR, attention indicates where to focus within the whole input or a specific region, or which frames to focus on when many frames are available. In this paper, we propose a novel attention-based deep learning method. Its main novelties are (1) the combination of two loss functions, a weighted softmax loss and a triplet loss, and (2) end-to-end learning that covers both the feature embedding network and the attention weight computation. With the combined loss function and end-to-end learning, the feature embedding network has a positive effect on the attention weight computation. To demonstrate the effectiveness of the proposed method, extensive comparative experiments were carried out on the IJB-A dataset with its standard evaluation protocols. The proposed method achieved recognition rates better than or comparable to other state-of-the-art video FR methods.
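
A minimal sketch of the loss-combination idea only, assuming PyTorch; the weighting scheme, alpha value, and tensor sizes are assumptions, not the paper's training code.

import torch
import torch.nn as nn

ce_loss = nn.CrossEntropyLoss()
triplet_loss = nn.TripletMarginLoss(margin=0.2)

def combined_loss(logits, labels, anchor, positive, negative, alpha=0.5):
    """Weighted sum of a softmax (cross-entropy) loss and a triplet loss."""
    return alpha * ce_loss(logits, labels) + (1 - alpha) * triplet_loss(anchor, positive, negative)

# toy example: 4 samples, 10 identities, 128-d embeddings
logits = torch.randn(4, 10, requires_grad=True)
labels = torch.randint(0, 10, (4,))
anchor = torch.randn(4, 128, requires_grad=True)
positive, negative = torch.randn(4, 128), torch.randn(4, 128)
loss = combined_loss(logits, labels, anchor, positive, negative)
loss.backward()                                       # gradients flow to logits and anchor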

A Study on Lightweight Model with Attention Process for Efficient Object Detection (효율적인 객체 검출을 위해 Attention Process를 적용한 경량화 모델에 대한 연구)

  • Park, Chan-Soo;Lee, Sang-Hun;Han, Hyun-Ho
    • Journal of Digital Convergence / v.19 no.5 / pp.307-313 / 2021
  • In this paper, a lightweight network with fewer parameters than existing object detection methods is proposed. Current detection models have greatly increased network complexity to improve accuracy. The proposed network therefore uses EfficientNet as the feature extraction network, and the subsequent layers are arranged in a pyramid structure to exploit both low-level detailed features and high-level semantic features. An attention process is applied between pyramid levels to suppress noise that is unnecessary for prediction, and all convolution operations in the network are replaced with depth-wise and point-wise convolutions to minimize computation. The proposed network was trained and evaluated on the PASCAL VOC dataset. In the experiments, the fused features proved robust for various objects after the refinement process, and detection accuracy improved over a CNN-based detection model with a small amount of computation. Adjusting the anchor ratios according to object size is left as future work.
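
A minimal sketch, not the authors' model: a depth-wise plus point-wise (separable) convolution block of the kind used to cut parameter count and computation. PyTorch assumed; channel counts are illustrative.

import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        # depth-wise: one filter per input channel (groups=in_ch)
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        # point-wise: 1x1 convolution mixes channels
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

x = torch.randn(1, 32, 56, 56)
y = DepthwiseSeparableConv(32, 64)(x)                 # (1, 64, 56, 56)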

Attention-LSTM based Lane Change Possibility Decision Algorithm for Urban Autonomous Driving (도심 자율주행을 위한 어텐션-장단기 기억 신경망 기반 차선 변경 가능성 판단 알고리즘 개발)

  • Lee, Heeseong;Yi, Kyongsu
    • Journal of Auto-vehicle Safety Association / v.14 no.3 / pp.65-70 / 2022
  • Lane changes in urban environments are a challenge for both human and automated driving because of their complexity and non-linearity. With the recent development of deep learning, RNNs, which operate on time-series data, have become mainstream in this field. Many RNN-based studies show high accuracy in highway environments, but not yet in urban environments, where the surrounding situation is complex and changes rapidly. This paper therefore proposes a lane change possibility decision network that adopts an attention layer, the state of the art in sequence-to-sequence modeling. By weighting each time step within a given time horizon, the network captures the context of the road situation in a more human-like way. The input is a 7-dimensional vector consisting of the x and y distances and longitudinal relative speeds of the side front and rear vehicles, plus the longitudinal speed of the ego vehicle. A total of 5,614 expert samples (4,098 yield cases and 1,516 non-yield cases) were used for training, and the network was tested on 1,817 samples. The network achieves 99.641% test accuracy, about 4% higher than a network using only LSTM in an urban environment, and shows robust behavior with respect to false-positive or true-negative objects.
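
A minimal sketch of the overall shape only, assuming PyTorch and not the authors' network: an LSTM over a horizon of 7-D feature vectors, attention weights over time steps, and a binary yield / non-yield output. The horizon length and hidden size are assumptions.

import torch
import torch.nn as nn

class AttentionLSTMDecision(nn.Module):
    def __init__(self, in_dim=7, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)
        self.out = nn.Linear(hidden, 2)               # yield / non-yield logits

    def forward(self, x):                             # x: (batch, horizon, 7)
        h, _ = self.lstm(x)
        w = torch.softmax(self.score(h).squeeze(-1), dim=1)   # weight per time step
        context = (w.unsqueeze(-1) * h).sum(dim=1)
        return self.out(context)

pred = AttentionLSTMDecision()(torch.randn(8, 20, 7)) # 8 samples, 20-step horizon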

A Dual-scale Network with Spatial-temporal Attention for 12-lead ECG Classification

  • Shuo Xiao;Yiting Xu;Chaogang Tang;Zhenzhen Huang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.9 / pp.2361-2376 / 2023
  • The electrocardiogram (ECG) signal is commonly used to screen for and diagnose cardiovascular diseases, and deep neural networks have come to be regarded as an effective tool for automatic ECG diagnosis. Convolutional neural networks are widely used for ECG feature extraction because they can capture different levels of information. However, most previous studies adopt single-scale convolution filters, ignoring the complementarity between ECG features of different scales. In this paper, we propose a dual-scale network with convolution filters of different sizes for 12-lead ECG classification, which can extract and fuse ECG features of different scales. In addition, different spatial and temporal regions of the feature map obtained from the 12-lead ECG may contribute differently to classification, so we add a spatial-temporal attention module to each scale sub-network to emphasize representative local spatial and temporal features. Evaluated on the PTB-XL dataset, the approach achieves a macro-averaged ROC-AUC of 0.9307, a maximum F1 score of 0.8152, and a mean accuracy of 89.11. The experimental results show that the approach outperforms the baselines.
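
A minimal sketch of the dual-scale idea only, not the paper's model: two 1-D convolution branches with different kernel sizes whose outputs are fused, with the 12 ECG leads treated as input channels. PyTorch, kernel sizes, and channel counts are assumptions.

import torch
import torch.nn as nn

class DualScaleECGBlock(nn.Module):
    def __init__(self, in_ch=12, out_ch=32):
        super().__init__()
        self.fine = nn.Conv1d(in_ch, out_ch, kernel_size=7, padding=3)      # short-range detail
        self.coarse = nn.Conv1d(in_ch, out_ch, kernel_size=31, padding=15)  # long-range morphology
        self.fuse = nn.Conv1d(2 * out_ch, out_ch, kernel_size=1)

    def forward(self, x):                             # x: (batch, 12, samples)
        f = torch.relu(self.fine(x))
        c = torch.relu(self.coarse(x))
        return torch.relu(self.fuse(torch.cat([f, c], dim=1)))

ecg = torch.randn(2, 12, 1000)                        # 2 recordings, 1000 samples per lead
features = DualScaleECGBlock()(ecg)                   # (2, 32, 1000)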

A Deep Learning-Based Image Semantic Segmentation Algorithm

  • Chaoqun, Shen;Zhongliang, Sun
    • Journal of Information Processing Systems / v.19 no.1 / pp.98-108 / 2023
  • This paper designs a semantic segmentation method based on fully convolutional networks (FCN) and an attention mechanism. The first five layers of the Visual Geometry Group (VGG) 16 network serve as the encoder of the segmentation network, with convolutional layers used in place of pooling to reduce the loss of feature information. The up-sampling and deconvolution units of the FCN then serve as the decoder. During deconvolution, a skip structure fuses information from different levels and the attention mechanism is incorporated to reduce accuracy loss. Finally, segmentation results are obtained through pixel-level classification. The results show that the method outperforms the comparison methods in mean pixel accuracy (MPA) and mean intersection over union (MIOU).
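
A minimal sketch under stated assumptions, not the paper's network: an attention-gated skip connection that reweights encoder features before fusing them with upsampled (deconvolved) decoder features. PyTorch is assumed; the gating design, class name, and channel sizes are illustrative.

import torch
import torch.nn as nn

class AttentionSkipFusion(nn.Module):
    def __init__(self, enc_ch, dec_ch):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(enc_ch + dec_ch, enc_ch, 1),
            nn.Sigmoid(),
        )
        self.up = nn.ConvTranspose2d(dec_ch, dec_ch, 2, stride=2)   # deconvolution upsampling

    def forward(self, enc_feat, dec_feat):
        dec_up = self.up(dec_feat)                    # match the encoder resolution
        attn = self.gate(torch.cat([enc_feat, dec_up], dim=1))
        return torch.cat([enc_feat * attn, dec_up], dim=1)

enc = torch.randn(1, 64, 56, 56)                      # encoder (skip) feature map
dec = torch.randn(1, 128, 28, 28)                     # coarser decoder feature map
fused = AttentionSkipFusion(64, 128)(enc, dec)        # (1, 192, 56, 56)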

Spatial-temporal attention network-based POI recommendation through graph learning (그래프 학습을 통한 시공간 Attention Network 기반 POI 추천)

  • Cao, Gang;Joe, Inwhee
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.399-401 / 2022
  • POI (Point-of-Interest) recommendation plays an important role in various location-based services. Existing studies extract the spatial-temporal relations of past check-ins to model users' mobility preferences, but the structured features that can reflect the personal visiting tendencies hidden in user trajectories are rarely exploited. This paper proposes a spatial-temporal aware attention network combined with a trajectory graph. Considering that personal preferences can change over time, a Dynamic GCN (Graph Convolution Network) module dynamically aggregates the spatial correlations among POIs. Validated on LBSN (Location-Based Social Networks) datasets, the new model outperforms existing models by about 9.0%.
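
A minimal sketch of a single graph-convolution aggregation step, not the paper's model: neighboring POI embeddings are averaged through a normalized adjacency matrix with self-loops and then transformed. PyTorch is assumed; the class name, dimensions, and random adjacency are illustrative.

import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):                        # x: (num_poi, in_dim), adj: (num_poi, num_poi)
        adj_hat = adj + torch.eye(adj.size(0))        # add self-loops
        deg = adj_hat.sum(dim=1, keepdim=True)        # node degrees
        agg = (adj_hat / deg) @ x                     # mean aggregation over neighbors
        return torch.relu(self.linear(agg))

num_poi, dim = 5, 16
poi_emb = torch.randn(num_poi, dim)
adjacency = (torch.rand(num_poi, num_poi) > 0.5).float()
updated = SimpleGCNLayer(dim, dim)(poi_emb, adjacency)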