• Title/Summary/Keyword: Attention Network

'Neonadeuri' of 'Unripe' and 'Ripe': Science Learning as Heterogeneous Network ('설다'와 '익다'의 너나들이 -이종네트워크로서 과학학습-)

  • Joung, Yong Jae
    • Journal of The Korean Association For Science Education / v.40 no.6 / pp.631-648 / 2020
  • To consider what deserves attention in science learning, this study discusses the meaning of science learning as a heterogeneous network. A theoretical investigation characterizes the heterogeneous network in three aspects: heterogeneous composition, existence by relations, and construction and change by translation. The study argues that science learning shares these characteristics. It further argues that science learning as a heterogeneous network requires attention to the elevation of things, to the concept as a punctualized heterogeneous network, and to the construction and expansion of the heterogeneous network through the neonadeuri of 'unripe' and 'ripe'. Finally, several suggestions for science learning are given.

CAttNet: A Compound Attention Network for Depth Estimation of Light Field Images

  • Dingkang Hua;Qian Zhang;Wan Liao;Bin Wang;Tao Yan
    • Journal of Information Processing Systems / v.19 no.4 / pp.483-497 / 2023
  • Depth estimation is one of the most difficult problems in light field processing. In this paper, a compound attention convolutional neural network (CAttNet) is proposed to extract depth maps from light field images. To use the sub-aperture images (SAIs) of the light field more effectively and reduce their redundancy, we apply a compound attention mechanism that weights the channel and spatial dimensions of the feature map after the primary features are extracted, so that the network can more efficiently select the required views and the important areas within each view. We modified various feature-extraction layers to extract features more efficiently without adding parameters. By exploring the characteristics of the light field, we increased the network depth and optimized the network structure to reduce the adverse impact of this change. CAttNet efficiently exploits the correlations and features of different SAIs to generate a high-quality light field depth map. The experimental results show that CAttNet has advantages in both accuracy and time.
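
The channel-then-spatial weighting described in this abstract can be illustrated with a minimal sketch. It is not CAttNet itself: the gates here are plain sigmoids of pooled activations with no learned layers, and the function names (`channel_attention`, `spatial_attention`, `compound_attention`) are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by a gate computed from its global average.

    fmap[k][i][j] is the activation of channel k at row i, column j.
    Assumption: the gate is sigmoid(mean); a real network would learn it.
    """
    out = []
    for channel in fmap:
        mean = sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        gate = sigmoid(mean)
        out.append([[value * gate for value in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Scale each spatial position by a gate computed from the channel mean."""
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    gate = [
        [sigmoid(sum(fmap[k][i][j] for k in range(channels)) / channels)
         for j in range(width)]
        for i in range(height)
    ]
    return [
        [[fmap[k][i][j] * gate[i][j] for j in range(width)] for i in range(height)]
        for k in range(channels)
    ]


def compound_attention(fmap):
    """Apply channel weighting first, then spatial weighting."""
    return spatial_attention(channel_attention(fmap))
```

The output keeps the shape of the input feature map; only the magnitudes change, so the block can sit between existing layers.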

A Study on Lane Detection Based on Split-Attention Backbone Network (Split-Attention 백본 네트워크를 활용한 차선 인식에 관한 연구)

  • Song, In seo;Lee, Seon woo;Kwon, Jang woo;Won, Jong hoon
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.19 no.5 / pp.178-188 / 2020
  • This paper proposes a CNN for lane recognition that uses a split-attention network as the backbone for feature extraction. Split-attention assigns a weight to each channel of a feature map during CNN feature extraction; it can reliably extract image features in the rapidly changing driving environment of a vehicle. The proposed deep neural networks were trained and evaluated on the TuSimple dataset, and the change in performance with the number of backbone layers was compared and analyzed. The results are comparable to the latest research, with an accuracy of up to 96.26, and FN showed the best result. Therefore, the proposed model enables stable lane recognition without misrecognition even in the driving environment of an actual vehicle.
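
As a rough illustration of per-channel weighting across feature splits, the sketch below fuses several splits of a feature map with a softmax taken over the splits for each channel. This is a simplification and an assumption on our part: the per-split logits are plain global averages, whereas a split-attention backbone would compute them with learned layers. The name `split_attention` and the flat-list data layout are hypothetical.

```python
import math


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def split_attention(splits):
    """Fuse feature splits channel by channel.

    splits[r][c] is a flat list of activations for channel c in split r.
    For each channel, one logit per split is obtained by global average
    pooling (assumption: no learned layers), a softmax over the splits
    turns the logits into weights, and the splits are summed with them.
    """
    radix = len(splits)
    channels = len(splits[0])
    fused = []
    for c in range(channels):
        logits = [sum(splits[r][c]) / len(splits[r][c]) for r in range(radix)]
        weights = softmax(logits)
        size = len(splits[0][c])
        fused.append([
            sum(weights[r] * splits[r][c][p] for r in range(radix))
            for p in range(size)
        ])
    return fused
```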

Attention-based CNN-BiGRU for Bengali Music Emotion Classification

  • Subhasish Ghosh;Omar Faruk Riad
    • International Journal of Computer Science & Network Security / v.23 no.9 / pp.47-54 / 2023
  • For Bengali music emotion classification, deep learning models, particularly CNNs and RNNs, are frequently used, but previous research suffered from low accuracy and overfitting. In this research, an attention-based Conv1D and BiGRU model is designed for emotion classification on our Bengali music dataset, and comparative experiments show that the proposed model classifies emotions more accurately. WAV preprocessing makes use of MFCCs. To reduce the dimensionality of the feature space, contextual features are extracted by two Conv1D layers. Dropout is used to address overfitting. Two bidirectional GRU networks update the past and future emotion representations of the output from the Conv1D layers. The two BiGRU layers are connected to an attention mechanism that assigns different amounts of attention to the various MFCC feature vectors; the attention mechanism increased the accuracy of the proposed classification model. The resulting vector is finally classified into four emotion classes (Angry, Happy, Relax, Sad) by a dense, fully connected layer with softmax activation. The proposed Conv1D+BiGRU+Attention model classifies emotions in the Bengali music dataset more effectively than the baseline methods. On our Bengali music dataset, the performance of the proposed model is 95%.
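
The last two stages described here, attention over the recurrent outputs followed by a softmax classifier over the four emotion classes, can be sketched as follows. This is an illustration under assumptions, not the authors' model: the hidden states stand in for BiGRU outputs, and `query`, `dense_weights`, and `dense_bias` are hypothetical parameters that would normally be learned.

```python
import math

EMOTIONS = ["Angry", "Happy", "Relax", "Sad"]


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse a sequence of hidden vectors into one context vector.

    Each time step is scored by its dot product with `query`; the softmax
    of the scores gives the attention weights used for the weighted sum.
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return context, weights


def classify(context, dense_weights, dense_bias):
    """Dense layer with softmax activation over the four emotion classes."""
    logits = [
        sum(w * c for w, c in zip(row, context)) + b
        for row, b in zip(dense_weights, dense_bias)
    ]
    probs = softmax(logits)
    return EMOTIONS[probs.index(max(probs))], probs
```

Returning the attention weights alongside the context vector makes it possible to inspect which time steps the classifier relied on.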

Multi-level Cross-attention Siamese Network For Visual Object Tracking

  • Zhang, Jianwei;Wang, Jingchao;Zhang, Huanlong;Miao, Mengen;Cai, Zengyu;Chen, Fuguo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.12 / pp.3976-3990 / 2022
  • Cross-attention is now widely used in Siamese trackers to replace traditional correlation operations for feature fusion between the template and the search region. Compared with correlation, cross-attention better establishes the similarity relationship between the target and the search region, which supports robust visual object tracking. However, existing trackers that use cross-attention focus only on the rich semantic information of high-level features and ignore the appearance information contained in low-level features, which makes them vulnerable to interference from similar objects. In this paper, we propose a Multi-level Cross-attention Siamese network (MCSiam) that aggregates semantic and appearance information at the same time. Specifically, a multi-level cross-attention module is designed to fuse the multi-layer features extracted from the backbone; it integrates different levels of the template and search region features, so that rich appearance and semantic information can be used simultaneously for tracking. In addition, a target-aware module is introduced before cross-attention to enhance the target feature and alleviate interference, which lets the multi-level cross-attention module fuse the information of the target and the search region more efficiently. We test MCSiam on four tracking benchmarks, and the results show that the proposed tracker achieves performance comparable to state-of-the-art trackers.
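
To make the fusion idea concrete, here is a minimal sketch of scaled dot-product cross-attention in which search-region features act as queries and template features act as keys and values, applied once per feature level. It omits learned projections and the target-aware module, and concatenating the per-level results is our assumption rather than the fusion scheme of the paper. Both function names are hypothetical.

```python
import math


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(search_feats, template_feats):
    """Attend from each search-region vector to all template vectors.

    Queries come from the search region; keys and values are the template
    features themselves (assumption: no learned projection matrices).
    """
    dim = len(template_feats[0])
    fused = []
    for query in search_feats:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in template_feats
        ]
        weights = softmax(scores)
        fused.append([
            sum(w * value[d] for w, value in zip(weights, template_feats))
            for d in range(dim)
        ])
    return fused


def multi_level_cross_attention(search_levels, template_levels):
    """Run cross-attention per feature level and concatenate the results.

    Assumption: every level has the same number of search positions.
    """
    per_level = [
        cross_attention(search, template)
        for search, template in zip(search_levels, template_levels)
    ]
    positions = len(per_level[0])
    return [
        [x for level in per_level for x in level[pos]]
        for pos in range(positions)
    ]
```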

The Latest Trends in Attention Mechanisms and Their Application in Medical Imaging (어텐션 기법 및 의료 영상에의 적용에 관한 최신 동향)

  • Hyungseob Shin;Jeongryong Lee;Taejoon Eo;Yohan Jun;Sewon Kim;Dosik Hwang
    • Journal of the Korean Society of Radiology / v.81 no.6 / pp.1305-1333 / 2020
  • Deep learning has recently achieved remarkable results in the field of medical imaging. However, as a deep learning network becomes deeper to improve its performance, it becomes more difficult to interpret the processes within. This can especially be a critical problem in medical fields where diagnostic decisions are directly related to a patient's survival. In order to solve this, explainable artificial intelligence techniques are being widely studied, and an attention mechanism was developed as part of this approach. In this paper, attention techniques are divided into two types: post hoc attention, which aims to analyze a network that has already been trained, and trainable attention, which further improves network performance. Detailed comparisons of each method, examples of applications in medical imaging, and future perspectives will be covered.

A Study on the Classification of Fault Motors using Sound Data (소리 데이터를 이용한 불량 모터 분류에 관한 연구)

  • Il-Sik, Chang;Gooman, Park
    • Journal of Broadcast Engineering / v.27 no.6 / pp.885-896 / 2022
  • Motor failure in manufacturing has a significant effect on later after-sales service (A/S) and product reliability. Motor failure is detected by measuring sound, current, and vibration. The data used in this paper are sound recordings of the gear box of a car's side mirror motor, divided into three classes. The sound data are converted to Mel spectrograms before being input to the network model. To improve fault motor classification, data augmentation was applied, along with several methods for handling class imbalance: resampling, reweighting, changing the loss function, and a two-stage scheme that separates representation learning from classification. In addition, curriculum learning and self-paced learning were compared across five network models (Bidirectional LSTM Attention, Convolutional Recurrent Neural Network, Multi-Head Attention, Bidirectional Temporal Convolution Network, and Convolutional Neural Network), and the optimal configuration for motor sound classification was found.
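
Reweighting for class imbalance can take many forms, and the abstract does not say which one was used. The sketch below shows one common choice, inverse-frequency class weights combined with a weighted cross-entropy loss; the function names and the integer labels for the three classes are assumptions made for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes receive weights above 1 and frequent classes below 1,
    so each class contributes equally to the total loss on average.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_cross_entropy(probs, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class.

    probs is a list of predicted class probabilities indexed by label.
    """
    return -weights[label] * math.log(probs[label])
```

With eight samples of class 0 and one each of classes 1 and 2, a misclassified sample from a rare class is penalized eight times as heavily as one from the majority class.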

Extraction and classification of tempo stimuli from electroencephalography recordings using convolutional recurrent attention model

  • Lee, Gi Yong;Kim, Min-Soo;Kim, Hyoung-Gook
    • ETRI Journal / v.43 no.6 / pp.1081-1092 / 2021
  • Electroencephalography (EEG) recordings taken during the perception of music tempo contain information from which the tempo of a music piece can be estimated. If information about this tempo stimulus can be extracted from EEG recordings and classified, it can be used effectively to construct a music-based brain-computer interface. This study proposes a novel convolutional recurrent attention model (CRAM) to extract and classify features corresponding to tempo stimuli from EEG recordings of listeners who concentrated on the tempo of the music. The proposed CRAM is composed of six modules, namely, network inputs, a two-dimensional convolutional bidirectional gated recurrent unit-based sample encoder, sample-level intuitive attention, a segment encoder, segment-level intuitive attention, and a softmax layer, to effectively model spatiotemporal features and improve the classification accuracy of tempo stimuli. To evaluate the proposed method's performance, we conducted experiments on two benchmark datasets. The proposed method achieves promising results, outperforming recent methods.

In-depth Recommendation Model Based on Self-Attention Factorization

  • Hongshuang Ma;Qicheng Liu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.3 / pp.721-739 / 2023
  • Rating prediction is an important issue in recommender systems, and its accuracy affects both the experience of the user and the revenue of the company. Traditional recommender systems use Factorization Machines for rating prediction, with every feature given the same weight, which leads to inaccurate ratings and limited data representation. This study proposes a deep recommendation model based on self-attention Factorization (SAFMR) to solve these problems. The model uses Convolutional Neural Networks to extract features from user and item reviews. The obtained features are fed into self-attention Factorization Machines, where the self-attention network automatically learns the dependencies among the features and assigns different weights to different features, thereby reducing the prediction error. The model was experimentally evaluated on six classes of datasets, comparing MSE, NDCG, and time on several real datasets. The experiments demonstrated that the SAFMR model achieved excellent rating prediction results and recommendation correlations, thereby verifying the effectiveness of the model.
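
The contrast between equal and learned feature weights can be shown with a second-order factorization machine whose pairwise interactions are optionally rescaled. This is a sketch under assumptions, not SAFMR: the weights are passed in through a hypothetical `pair_weights` argument instead of being produced by a self-attention network.

```python
def fm_predict(x, bias, linear, factors, pair_weights=None):
    """Second-order factorization machine prediction.

    y = bias + sum_i linear[i] * x[i]
             + sum_{i<j} a_ij * <factors[i], factors[j]> * x[i] * x[j]

    With pair_weights=None every a_ij is 1, which is the plain FM where
    all interactions count equally. Passing a dict keyed by (i, j) lets
    an external attention module rescale individual interactions.
    """
    n = len(x)
    y = bias + sum(linear[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            interaction = sum(a * b for a, b in zip(factors[i], factors[j]))
            weight = 1.0 if pair_weights is None else pair_weights[(i, j)]
            y += weight * interaction * x[i] * x[j]
    return y
```

For example, with `x = [1.0, 1.0]`, `bias = 0.5`, `linear = [1.0, 2.0]`, and `factors = [[1.0, 0.0], [1.0, 1.0]]`, the plain prediction is 4.5; halving the single interaction through `pair_weights={(0, 1): 0.5}` lowers it to 4.0.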

3D Dual-Fusion Attention Network for Brain Tumor Segmentation (뇌종양 분할을 위한 3D 이중 융합 주의 네트워크)

  • Hoang-Son Vo-Thanh;Tram-Tran Nguyen Quynh;Nhu-Tai Do;Soo-Hyung Kim
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.496-498 / 2023
  • Brain tumor segmentation is challenging because of the diversity of tumors in location, imbalance, and morphology. Attention mechanisms have recently been widely used to tackle medical segmentation problems efficiently by focusing on essential regions. Fusion approaches, in turn, enhance performance by merging the mutual benefits of multiple models. In this study, we propose a 3D dual fusion attention network that combines the advantages of fusion approaches and attention mechanisms through residual self-attention and local blocks. Compared to fusion approaches and related works, our proposed method shows promising results on the BraTS 2018 dataset.