• Title/Abstract/Keywords: U-Net Architecture

42 search results (processing time 0.021 s)

U-net with vision transformer encoder for polyp segmentation in colonoscopy images

  • 겔란 아야나;최세운
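    • Proceedings of the Korea Institute of Information and Communication Engineering Conference / Korea Institute of Information and Communication Engineering 2022 Fall Conference / pp.97-99 / 2022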
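  • Accurate polyp segmentation is important for the early detection and treatment of colorectal cancer, but it faces the following constraints. Individual polyps differ in location, size, and shape, and under certain conditions such as motion blur and light reflection, polyps closely resemble their surroundings. U-net, which consists of Convolutional Neural Networks acting as an encoder and a decoder, is widely used to overcome these limitations. This study proposes a U-net architecture with a vision transformer for more accurate polyp segmentation, and the proposed method was confirmed to outperform the standard U-net architecture.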


Multi-scale U-SegNet architecture with cascaded dilated convolutions for brain MRI Segmentation

  • 챠이트라 다야난다;이범식
    • Proceedings of the Korean Institute of Broadcast and Media Engineers Conference / Korean Institute of Broadcast and Media Engineers 2020 Fall Conference / pp.25-28 / 2020
  • Automatic segmentation of brain tissues such as white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF) from brain MRI scans is helpful for the diagnosis of many neurological disorders. Accurate segmentation of these brain structures is very challenging because of low tissue contrast, bias field, and partial volume effects. To improve brain MRI segmentation accuracy, we propose an end-to-end convolution-based U-SegNet architecture designed with multi-scale kernels, which includes cascaded dilated convolutions. The multi-scale convolution kernels are designed to extract abundant semantic features and capture context information at different scales. Further, the cascaded dilated convolution scheme helps to alleviate the vanishing gradient problem in the proposed model. Experimental results indicate that the proposed architecture is superior to traditional deep-learning methods such as SegNet, U-net, and U-SegNet, achieving an average Dice similarity coefficient (DSC) of 93% and a Jaccard index (JI) of 86% for brain MRI segmentation.
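The DSC and JI figures quoted in this abstract are overlap metrics between a predicted mask and a reference mask. The sketch below shows the standard definitions on toy binary masks; the function name `dice_and_jaccard`, the nested-list mask format, and the empty-mask convention are illustrative assumptions, not taken from the paper.

```python
def dice_and_jaccard(pred, target):
    """Return (Dice, Jaccard) for two binary masks given as nested lists of 0/1."""
    # Flatten both masks so they can be compared pixel by pixel.
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    pred_area = sum(1 for a in p if a)
    target_area = sum(1 for b in t if b)
    union = pred_area + target_area - intersection
    # Convention (an assumption): two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + target_area)
    jaccard = intersection / union
    return dice, jaccard


# Toy 2x3 masks: 3 predicted pixels, 3 reference pixels, 2 shared.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, jaccard = dice_and_jaccard(pred, target)
print(f"DSC = {dice:.3f}, JI = {jaccard:.3f}")
```

For a single pair of masks the two metrics are tied by JI = DSC / (2 - DSC), so they always rank one prediction against another the same way; averages over many cases need not satisfy the identity exactly.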


Performance Evaluation of U-net Deep Learning Model for Noise Reduction according to Various Hyper Parameters in Lung CT Images

  • 이민관;박찬록
    • Journal of the Korean Society of Radiology / Vol. 17, No. 5 / pp.709-715 / 2023
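  • The purpose of this study was to evaluate the noise reduction effect of the U-net deep learning model on CT images under various hyperparameters. Gaussian noise was applied to generate the noisy input images, and the U-net model was trained on a total of 1300 CT images split into train, validation, and test sets at a ratio of 8:1:1. The hyperparameters examined were the optimizer (Adagrad, Adam, and AdamW), the number of training epochs (10, 50, and 100), and the learning rate (0.01, 0.001, and 0.0001). The results were analyzed quantitatively by calculating the peak signal-to-noise ratio and the coefficient of variation of the images. In conclusion, noise reduction with the U-net deep learning model can improve image quality, and its usefulness for noise reduction was demonstrated.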
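The 8:1:1 split and the two quantitative metrics named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the helper names, the fixed shuffle seed, the 8-bit peak value of 255, and the use of the population standard deviation are all assumptions.

```python
import math
import random


def split_indices(n_images, ratios=(8, 1, 1), seed=0):
    """Shuffle image indices and split them into train/validation/test lists."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    total = sum(ratios)
    n_train = n_images * ratios[0] // total
    n_val = n_images * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # any remainder goes to the test set
    return train, val, test


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized flat pixel lists."""
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def coefficient_of_variation(pixels):
    """Population standard deviation divided by the mean of a region of interest."""
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return math.sqrt(variance) / mean


train, val, test = split_indices(1300)
print(len(train), len(val), len(test))  # 1040 130 130

clean = [100.0, 100.0, 100.0, 100.0]
noisy = [110.0, 90.0, 110.0, 90.0]
print(f"PSNR = {psnr(clean, noisy):.2f} dB")
print(f"COV  = {coefficient_of_variation(noisy):.3f}")
```

A higher PSNR and a lower coefficient of variation both indicate less residual noise, which is why the two are commonly reported together for denoising.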

Study on the Improvement of Lung CT Image Quality using 2D Deep Learning Network according to Various Noise Types

  • 이민관;박찬록
    • Journal of the Korean Society of Radiology / Vol. 18, No. 2 / pp.93-99 / 2024
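  • Digital images, and computed tomography (CT) images in particular, inevitably contain noise introduced when X-ray signals are converted into digital image signals, so noise reduction must be considered. Recently, studies on deep-learning-based noise reduction have been carried out. The purpose of this study was therefore to evaluate the noise reduction performance of the U-net deep learning model for various noise types in lung CT images. A total of 800 lung CT images were used, with a U-net model trained using the Adam optimizer, 100 training epochs, and a learning rate of 0.0001. Gaussian, Poisson, salt & pepper, and speckle noise were applied to generate the noisy input images. Mean squared error, peak signal-to-noise ratio, and the coefficient of variation were used as the quantitative metrics. In conclusion, the U-net network performed well under the various noise conditions, demonstrating its usefulness.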
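The four noise types listed in this abstract follow different models: additive (Gaussian), signal-dependent (Poisson), impulsive (salt & pepper), and multiplicative (speckle). The sketch below applies textbook versions of each to a tiny flattened image with intensities in [0, 1]. The function names and the parameters `sigma`, `amount`, and `peak` are illustrative assumptions; the abstract does not state which noise levels the study used.

```python
import math
import random


def _clip(value):
    """Keep a pixel inside the [0, 1] intensity range."""
    return min(1.0, max(0.0, value))


def add_gaussian(image, sigma, rng):
    """Additive zero-mean Gaussian noise."""
    return [_clip(p + rng.gauss(0.0, sigma)) for p in image]


def add_speckle(image, sigma, rng):
    """Multiplicative noise: each pixel is scaled by (1 + n), n ~ N(0, sigma^2)."""
    return [_clip(p * (1.0 + rng.gauss(0.0, sigma))) for p in image]


def add_salt_and_pepper(image, amount, rng):
    """Replace roughly a fraction `amount` of pixels with 0 (pepper) or 1 (salt)."""
    noisy = []
    for p in image:
        if rng.random() < amount:
            noisy.append(1.0 if rng.random() < 0.5 else 0.0)
        else:
            noisy.append(p)
    return noisy


def _poisson_sample(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's method (fine for small lam only)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def add_poisson(image, peak, rng):
    """Signal-dependent noise: treat p * peak as the expected photon count."""
    return [_clip(_poisson_sample(p * peak, rng) / peak) for p in image]


rng = random.Random(42)  # fixed seed so the run is repeatable
image = [0.2, 0.5, 0.8, 0.5]
for name, noisy in [
    ("gaussian", add_gaussian(image, 0.05, rng)),
    ("poisson", add_poisson(image, 50, rng)),
    ("salt & pepper", add_salt_and_pepper(image, 0.25, rng)),
    ("speckle", add_speckle(image, 0.1, rng)),
]:
    print(name, [round(v, 3) for v in noisy])
```

In a denoising experiment of this kind, each noisy image would be paired with its clean original as the training target; an array library would replace the per-pixel loops for real image sizes.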

A Deep Neural Network Architecture for Real-Time Semantic Segmentation on Embedded Board

  • 이준엽;이영완
    • Journal of KIISE / Vol. 45, No. 1 / pp.94-98 / 2018
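  • This paper proposes Wide Inception ResNet (WIR Net), a deep neural network architecture optimized for real-time semantic segmentation in autonomous driving. The network consists of an encoder that extracts features using residual connections and Inception modules, and a decoder that increases resolution using transposed convolutions and feature maps from lower layers; applying the ELU activation function further improved performance. Performance was also optimized by reducing the total number of layers and increasing the number of filters. The evaluation measured class and category IoU on the Cityscapes driving dataset using an NVIDIA GeForce GTX 1080 and a TX1 board. The experiments showed an accuracy of 53.4 class IoU and 81.8 category IoU, and running speeds of 17.8 fps and 13.0 fps on the TX1 board for $640{\times}360$ and $720{\times}480$ resolution images, respectively.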
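The class IoU reported here is the intersection-over-union metric, TP / (TP + FP + FN), computed per label from a confusion matrix. The following sketch works on flat lists of integer labels; the function name `per_class_iou` and the toy labels are illustrative assumptions, and the paper's own evaluation code may differ.

```python
import math


def per_class_iou(y_true, y_pred, num_classes):
    """Return a list with the IoU of each class from flat integer label lists."""
    # confusion[t][p] counts pixels with ground-truth label t predicted as p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        fn = sum(confusion[c]) - tp
        denom = tp + fp + fn
        # A class absent from both lists has no defined IoU.
        ious.append(tp / denom if denom else math.nan)
    return ious


def mean_iou(ious):
    """Average over the classes that have a defined IoU."""
    valid = [v for v in ious if not math.isnan(v)]
    return sum(valid) / len(valid)


# Six pixels, three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(y_true, y_pred, num_classes=3)
print([round(v, 3) for v in ious])
print(f"mean IoU = {mean_iou(ious):.3f}")
```

A category-level score uses the same formula after the fine-grained labels are mapped to coarser groups, which is why it is typically higher than the class-level score.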

Comparing U-Net convolutional network with mask R-CNN in Nuclei Segmentation

  • Zanaty, E.A.;Abdel-Aty, Mahmoud M.;Ali, Khalid Abdel-Wahab
    • International Journal of Computer Science & Network Security / Vol. 22, No. 3 / pp.273-275 / 2022
  • Deep learning is now widely used in nuclei segmentation. While recent developments in theory and open-source software have made these tools easier to implement, expert knowledge is still required to choose a suitable model architecture and training setup. We compare two popular segmentation frameworks, U-Net and Mask R-CNN, on the nuclei segmentation task and find that they have different strengths and failure modes. We compared both models aiming for the best nuclei segmentation performance. In the experiments on nuclei medical image segmentation, the U-Net algorithm outperformed the Mask R-CNN algorithm.

Automatic Extraction of Liver Region from Medical Images by Using an MFUnet

  • Vi, Vo Thi Tuong;Oh, A-Ran;Lee, Guee-Sang;Yang, Hyung-Jeong;Kim, Soo-Hyung
    • Smart Media Journal / Vol. 9, No. 3 / pp.59-70 / 2020
  • This paper presents a fully automatic tool for recognizing the liver region in CT images based on a deep learning model, the Multiple Filter U-net (MFUnet). The advantages of both U-net and Multiple Filters were combined to construct an autoencoder model, MFUnet, for segmenting the liver region from computed tomography images. The MFUnet architecture includes an autoencoding model used for regenerating the liver region, a backbone model trained on ImageNet for extracting features, and a predicting model used for liver segmentation. The LiTS and CHAOS datasets were used for evaluation. The results show that integrating Multiple Filters into U-net improves liver segmentation performance and opens up many research directions in the field of medical image processing.

Enhanced CNN Model for Brain Tumor Classification

  • Kasukurthi, Aravinda;Paleti, Lakshmikanth;Brahmaiah, Madamanchi;Sree, Ch.Sudha
    • International Journal of Computer Science & Network Security / Vol. 22, No. 5 / pp.143-148 / 2022
  • Brain tumor classification is an important process that allows doctors to plan treatment for patients based on the stages of the tumor. To improve classification performance, various CNN-based architectures are used for brain tumor classification. Existing methods for brain tumor segmentation suffer from overfitting and poor efficiency when dealing with large datasets. The enhanced CNN architecture proposed in this study is based on U-Net for brain tumor segmentation, RefineNet for pattern analysis, and SegNet architecture for brain tumor classification. The brain tumor benchmark dataset was used to evaluate the enhanced CNN model's efficiency. Based on the local and context information of the MRI image, the U-Net provides good segmentation. SegNet selects the most important features for classification while also reducing the trainable parameters. In the classification of brain tumors, the enhanced CNN method outperforms the existing methods. The enhanced CNN model has an accuracy of 96.85 percent, while the existing CNN with transfer learning has an accuracy of 94.82 percent.

Fast and Accurate Single Image Super-Resolution via Enhanced U-Net

  • Chang, Le;Zhang, Fan;Li, Biao
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 4 / pp.1246-1262 / 2021
  • Recent studies have demonstrated the strong ability of deep convolutional neural networks (CNNs) to significantly boost performance in single image super-resolution (SISR). The key concern is how to efficiently recover and utilize diverse information frequencies across multiple network layers, which is crucial to satisfactory super-resolution image reconstruction. Hence, previous work made great efforts to incorporate hierarchical frequencies through various sophisticated architectures. Nevertheless, economical SISR also requires a structure design that balances restoration accuracy against computational complexity, which is still a challenge for existing techniques. In this paper, we tackle this problem by proposing an architecture called Enhanced U-Net Network (EUN), which can yield ready-to-use features at various frequencies and combine them comprehensively. In particular, the proposed building block for EUN is enhanced from U-Net and can extract abundant information via multiple skip concatenations. The network configuration allows the pipeline to propagate information from lower layers to higher ones. Meanwhile, the block itself is designed to grow quite deep, which allows different types of information to emerge from a single block. Furthermore, because the block distills effective information well, promising results are obtained with comparatively fewer filters. Comprehensive experiments demonstrate that our model achieves favorable performance compared with state-of-the-art methods, especially in terms of computational efficiency.

DP-LinkNet: A convolutional network for historical document image binarization

  • Xiong, Wei;Jia, Xiuhong;Yang, Dichun;Ai, Meihui;Li, Lirong;Wang, Song
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 5 / pp.1778-1797 / 2021
  • Document image binarization is an important pre-processing step in document analysis and archiving. The state-of-the-art models for document image binarization are variants of encoder-decoder architectures, such as FCN (fully convolutional network) and U-Net. Despite their success, they still suffer from three limitations: (1) reduced feature map resolution due to consecutive strided pooling or convolutions, (2) multiple scales of target objects, and (3) reduced localization accuracy due to the built-in invariance of deep convolutional neural networks (DCNNs). To overcome these three challenges, we propose an improved semantic segmentation model, referred to as DP-LinkNet, which adopts the D-LinkNet architecture as its backbone, with the proposed hybrid dilated convolution (HDC) and spatial pyramid pooling (SPP) modules between the encoder and the decoder. Extensive experiments are conducted on recent document image binarization competition (DIBCO) and handwritten document image binarization competition (H-DIBCO) benchmark datasets. Results show that our proposed DP-LinkNet outperforms other state-of-the-art techniques by a large margin. Our implementation and the pre-trained models are available at https://github.com/beargolden/DP-LinkNet.