• 제목/요약/키워드: Spatial attention module

검색결과 19건 처리시간 0.028초

Crack detection based on ResNet with spatial attention

  • Yang, Qiaoning;Jiang, Si;Chen, Juan;Lin, Weiguo
    • Computers and Concrete
    • /
    • 제26권5호
    • /
    • pp.411-420
    • /
    • 2020
  • Deep Convolution neural network (DCNN) has been widely used in the healthy maintenance of civil infrastructure. Using DCNN to improve crack detection performance has attracted many researchers' attention. In this paper, a light-weight spatial attention network module is proposed to strengthen the representation capability of ResNet and improve the crack detection performance. It utilizes attention mechanism to strengthen the interested objects in global receptive field of ResNet convolution layers. Global average spatial information over all channels are used to construct an attention scalar. The scalar is combined with adaptive weighted sigmoid function to activate the output of each channel's feature maps. Salient objects in feature maps are refined by the attention scalar. The proposed spatial attention module is stacked in ResNet50 to detect crack. Experiments results show that the proposed module can got significant performance improvement in crack detection.

초고해상도 복원에서 성능 향상을 위한 다양한 Attention 연구 (A Study on Various Attention for Improving Performance in Single Image Super Resolution)

  • 문환복;윤상민
    • 방송공학회논문지
    • /
    • 제25권6호
    • /
    • pp.898-910
    • /
    • 2020
  • 컴퓨터 비전에서 단일 영상 기반의 초고해상도 영상 복원의 중요성과 확장성으로 관련 분야에서 많은 연구가 진행되어 왔으며, 최근 딥러닝에 대한 관심이 증가하면서 딥러닝을 활용한 단안 영상 기반 초고해상도 연구가 활발히 진행되고 있다. 대부분의 딥러닝을 기반으로 하는 단안 영상 기반 초고해상도 복원 연구는 복원 성능을 향상시키기 위해 네트워크의 구조, 손실 함수, 학습 방법에 초점이 맞추어 연구가 진행되었다. 한편, 딥러닝 네트워크를 깊게 쌓지 않고 초고해상도 영상 복원 성능을 향상시키기 위해 추출된 특징 맵을 강조하는 Attention Module에 대한 연구가 다양한 분야에 적용되어 왔다. Attention Module은 다양한 관점에서 네트워크의 목적에 맞는 특징 정보를 강조 및 스케일링 한다. 본 논문에서는 초고해상도 복원 네트워크를 기반으로 다양한 구조의 Channel Attention과 Spatial Attention을 설계하고, 다양한 관점에서 특징 맵을 강조하기 위해 다중 Attention Module 구조를 설계하여 성능을 분석 및 비교한다.

Boundary-Aware Dual Attention Guided Liver Segment Segmentation Model

  • Jia, Xibin;Qian, Chen;Yang, Zhenghan;Xu, Hui;Han, Xianjun;Ren, Hao;Wu, Xinru;Ma, Boyang;Yang, Dawei;Min, Hong
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제16권1호
    • /
    • pp.16-37
    • /
    • 2022
  • Accurate liver segment segmentation based on radiological images is indispensable for the preoperative analysis of liver tumor resection surgery. However, most of the existing segmentation methods are not feasible to be used directly for this task due to the challenge of exact edge prediction with some tiny and slender vessels as its clinical segmentation criterion. To address this problem, we propose a novel deep learning based segmentation model, called Boundary-Aware Dual Attention Liver Segment Segmentation Model (BADA). This model can improve the segmentation accuracy of liver segments with enhancing the edges including the vessels serving as segment boundaries. In our model, the dual gated attention is proposed, which composes of a spatial attention module and a semantic attention module. The spatial attention module enhances the weights of key edge regions by concerning about the salient intensity changes, while the semantic attention amplifies the contribution of filters that can extract more discriminative feature information by weighting the significant convolution channels. Simultaneously, we build a dataset of liver segments including 59 clinic cases with dynamically contrast enhanced MRI(Magnetic Resonance Imaging) of portal vein stage, which annotated by several professional radiologists. Comparing with several state-of-the-art methods and baseline segmentation methods, we achieve the best results on this clinic liver segment segmentation dataset, where Mean Dice, Mean Sensitivity and Mean Positive Predicted Value reach 89.01%, 87.71% and 90.67%, respectively.

Skin Lesion Segmentation with Codec Structure Based Upper and Lower Layer Feature Fusion Mechanism

  • Yang, Cheng;Lu, GuanMing
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제16권1호
    • /
    • pp.60-79
    • /
    • 2022
  • The U-Net architecture-based segmentation models attained remarkable performance in numerous medical image segmentation missions like skin lesion segmentation. Nevertheless, the resolution gradually decreases and the loss of spatial information increases with deeper network. The fusion of adjacent layers is not enough to make up for the lost spatial information, thus resulting in errors of segmentation boundary so as to decline the accuracy of segmentation. To tackle the issue, we propose a new deep learning-based segmentation model. In the decoding stage, the feature channels of each decoding unit are concatenated with all the feature channels of the upper coding unit. Which is done in order to ensure the segmentation effect by integrating spatial and semantic information, and promotes the robustness and generalization of our model by combining the atrous spatial pyramid pooling (ASPP) module and channel attention module (CAM). Extensive experiments on ISIC2016 and ISIC2017 common datasets proved that our model implements well and outperforms compared segmentation models for skin lesion segmentation.

Dual Attention Based Image Pyramid Network for Object Detection

  • Dong, Xiang;Li, Feng;Bai, Huihui;Zhao, Yao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제15권12호
    • /
    • pp.4439-4455
    • /
    • 2021
  • Compared with two-stage object detection algorithms, one-stage algorithms provide a better trade-off between real-time performance and accuracy. However, these methods treat the intermediate features equally, which lacks the flexibility to emphasize meaningful information for classification and location. Besides, they ignore the interaction of contextual information from different scales, which is important for medium and small objects detection. To tackle these problems, we propose an image pyramid network based on dual attention mechanism (DAIPNet), which builds an image pyramid to enrich the spatial information while emphasizing multi-scale informative features based on dual attention mechanisms for one-stage object detection. Our framework utilizes a pre-trained backbone as standard detection network, where the designed image pyramid network (IPN) is used as auxiliary network to provide complementary information. Here, the dual attention mechanism is composed of the adaptive feature fusion module (AFFM) and the progressive attention fusion module (PAFM). AFFM is designed to automatically pay attention to the feature maps with different importance from the backbone and auxiliary network, while PAFM is utilized to adaptively learn the channel attentive information in the context transfer process. Furthermore, in the IPN, we build an image pyramid to extract scale-wise features from downsampled images of different scales, where the features are further fused at different states to enrich scale-wise information and learn more comprehensive feature representations. Experimental results are shown on MS COCO dataset. Our proposed detector with a 300 × 300 input achieves superior performance of 32.6% mAP on the MS COCO test-dev compared with state-of-the-art methods.

딥러닝을 이용한 실시간 말벌 분류 시스템 (Real Time Hornet Classification System Based on Deep Learning)

  • 정윤주;이영학;이스라필 안사리;이철희
    • 전기전자학회논문지
    • /
    • 제24권4호
    • /
    • pp.1141-1147
    • /
    • 2020
  • 말벌 종은 모양이 매우 유사하기 때문에 비전문가가 분류하기 어렵고, 객체의 크기가 작고 빠르게 움직이기 때문에 실시간으로 탐지하여 종을 분류하는 것은 더욱 어렵다. 본 논문에서는 바운딩 박스를 이용한 딥러닝 알고리즘을 기반으로 말벌 종을 실시간으로 분류하는 시스템을 개발하였다. 훈련 영상의 레이블링 작업 시 바운딩 박스 안에 포함되는 배경 영역을 최소화하기 위하여 말벌의 머리와 몸통 부분만을 선택하는 방법을 제안한다. 또한 실시간으로 말벌을 탐지하고 그 종을 분류할 수 있는 최선의 알고리즘을 찾기 위하여 기존의 바운딩 박스 기반 객체 인식 알고리즘들을 실험을 통하여 비교한다. 실험 결과 컨볼루션 레이어의 활성함수로 mish 함수를 적용하고, 객체 검출 블록 전에 공간집중모듈(Spatial Attention Module, SAM)을 적용한 YOLOv4 모델을 사용하여 말벌 영상을 테스트한 경우 평균 97.89%의 정밀도(Precision)와 98.69%의 재현율(Recall)을 나타내었다.

EDMFEN: Edge detection-based multi-scale feature enhancement Network for low-light image enhancement

  • Canlin Li;Shun Song;Pengcheng Gao;Wei Huang;Lihua Bi
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제18권4호
    • /
    • pp.980-997
    • /
    • 2024
  • To improve the brightness of images and reveal hidden information in dark areas is the main objective of low-light image enhancement (LLIE). LLIE methods based on deep learning show good performance. However, there are some limitations to these methods, such as the complex network model requires highly configurable environments, and deficient enhancement of edge details leads to blurring of the target content. Single-scale feature extraction results in the insufficient recovery of the hidden content of the enhanced images. This paper proposed an edge detection-based multi-scale feature enhancement network for LLIE (EDMFEN). To reduce the loss of edge details in the enhanced images, an edge extraction module consisting of a Sobel operator is introduced to obtain edge information by computing gradients of images. In addition, a multi-scale feature enhancement module (MSFEM) consisting of multi-scale feature extraction block (MSFEB) and a spatial attention mechanism is proposed to thoroughly recover the hidden content of the enhanced images and obtain richer features. Since the fused features may contain some useless information, the MSFEB is introduced so as to obtain the image features with different perceptual fields. To use the multi-scale features more effectively, a spatial attention mechanism module is used to retain the key features and improve the model performance after fusing multi-scale features. Experimental results on two datasets and five baseline datasets show that EDMFEN has good performance when compared with the stateof-the-art LLIE methods.

CT 영상에서 폐 결절 분할을 위한 경계 및 역 어텐션 기법 (Boundary and Reverse Attention Module for Lung Nodule Segmentation in CT Images)

  • 황경연;지예원;윤학영;이상준
    • 대한임베디드공학회논문지
    • /
    • 제17권5호
    • /
    • pp.265-272
    • /
    • 2022
  • As the risk of lung cancer has increased, early-stage detection and treatment of cancers have received a lot of attention. Among various medical imaging approaches, computer tomography (CT) has been widely utilized to examine the size and growth rate of lung nodules. However, the process of manual examination is a time-consuming task, and it causes physical and mental fatigue for medical professionals. Recently, many computer-aided diagnostic methods have been proposed to reduce the workload of medical professionals. In recent studies, encoder-decoder architectures have shown reliable performances in medical image segmentation, and it is adopted to predict lesion candidates. However, localizing nodules in lung CT images is a challenging problem due to the extremely small sizes and unstructured shapes of nodules. To solve these problems, we utilize atrous spatial pyramid pooling (ASPP) to minimize the loss of information for a general U-Net baseline model to extract rich representations from various receptive fields. Moreover, we propose mixed-up attention mechanism of reverse, boundary and convolutional block attention module (CBAM) to improve the accuracy of segmentation small scale of various shapes. The performance of the proposed model is compared with several previous attention mechanisms on the LIDC-IDRI dataset, and experimental results demonstrate that reverse, boundary, and CBAM (RB-CBAM) are effective in the segmentation of small nodules.

GAN-Based Local Lightness-Aware Enhancement Network for Underexposed Images

  • Chen, Yong;Huang, Meiyong;Liu, Huanlin;Zhang, Jinliang;Shao, Kaixin
    • Journal of Information Processing Systems
    • /
    • 제18권4호
    • /
    • pp.575-586
    • /
    • 2022
  • Uneven light in real-world causes visual degradation for underexposed regions. For these regions, insufficient consideration during enhancement procedure will result in over-/under-exposure, loss of details and color distortion. Confronting such challenges, an unsupervised low-light image enhancement network is proposed in this paper based on the guidance of the unpaired low-/normal-light images. The key components in our network include super-resolution module (SRM), a GAN-based low-light image enhancement network (LLIEN), and denoising-scaling module (DSM). The SRM improves the resolution of the low-light input images before illumination enhancement. Such design philosophy improves the effectiveness of texture details preservation by operating in high-resolution space. Subsequently, local lightness attention module in LLIEN effectively distinguishes unevenly illuminated areas and puts emphasis on low-light areas, ensuring the spatial consistency of illumination for locally underexposed images. Then, multiple discriminators, i.e., global discriminator, local region discriminator, and color discriminator performs assessment from different perspectives to avoid over-/under-exposure and color distortion, which guides the network to generate images that in line with human aesthetic perception. Finally, the DSM performs noise removal and obtains high-quality enhanced images. Both qualitative and quantitative experiments demonstrate that our approach achieves favorable results, which indicates its superior capacity on illumination and texture details restoration.

AANet: Adjacency auxiliary network for salient object detection

  • Li, Xialu;Cui, Ziguan;Gan, Zongliang;Tang, Guijin;Liu, Feng
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제15권10호
    • /
    • pp.3729-3749
    • /
    • 2021
  • At present, deep convolution network-based salient object detection (SOD) has achieved impressive performance. However, it is still a challenging problem to make full use of the multi-scale information of the extracted features and which appropriate feature fusion method is adopted to process feature mapping. In this paper, we propose a new adjacency auxiliary network (AANet) based on multi-scale feature fusion for SOD. Firstly, we design the parallel connection feature enhancement module (PFEM) for each layer of feature extraction, which improves the feature density by connecting different dilated convolution branches in parallel, and add channel attention flow to fully extract the context information of features. Then the adjacent layer features with close degree of abstraction but different characteristic properties are fused through the adjacent auxiliary module (AAM) to eliminate the ambiguity and noise of the features. Besides, in order to refine the features effectively to get more accurate object boundaries, we design adjacency decoder (AAM_D) based on adjacency auxiliary module (AAM), which concatenates the features of adjacent layers, extracts their spatial attention, and then combines them with the output of AAM. The outputs of AAM_D features with semantic information and spatial detail obtained from each feature are used as salient prediction maps for multi-level feature joint supervising. Experiment results on six benchmark SOD datasets demonstrate that the proposed method outperforms similar previous methods.