• Title/Summary/Keyword: modified U-net

Search Result 32, Processing Time 0.024 seconds

Design of Speech Enhancement U-Net for Embedded Computing (임베디드 연산을 위한 잡음에서 음성추출 U-Net 설계)

  • Kim, Hyun-Don
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.15 no.5
    • /
    • pp.227-234
    • /
    • 2020
  • In this paper, we propose wav-U-Net to improve speech enhancement in heavy noisy environments, and it has implemented three principal techniques. First, as input data, we use 128 modified Mel-scale filter banks which can reduce computational burden instead of 512 frequency bins. Mel-scale aims to mimic the non-linear human ear perception of sound by being more discriminative at lower frequencies and less discriminative at higher frequencies. Therefore, Mel-scale is the suitable feature considering both performance and computing power because our proposed network focuses on speech signals. Second, we add a simple ResNet as pre-processing that helps our proposed network make estimated speech signals clear and suppress high-frequency noises. Finally, the proposed U-Net model shows significant performance regardless of the kinds of noise. Especially, despite using a single channel, we confirmed that it can well deal with non-stationary noises whose frequency properties are dynamically changed, and it is possible to estimate speech signals from noisy speech signals even in extremely noisy environments where noises are much lauder than speech (less than SNR 0dB). The performance on our proposed wav-U-Net was improved by about 200% on SDR and 460% on NSDR compared to the conventional Jansson's wav-U-Net. Also, it was confirmed that the processing time of out wav-U-Net with 128 modified Mel-scale filter banks was about 2.7 times faster than the common wav-U-Net with 512 frequency bins as input values.

A modified U-net for crack segmentation by Self-Attention-Self-Adaption neuron and random elastic deformation

  • Zhao, Jin;Hu, Fangqiao;Qiao, Weidong;Zhai, Weida;Xu, Yang;Bao, Yuequan;Li, Hui
    • Smart Structures and Systems
    • /
    • v.29 no.1
    • /
    • pp.1-16
    • /
    • 2022
  • Despite recent breakthroughs in deep learning and computer vision fields, the pixel-wise identification of tiny objects in high-resolution images with complex disturbances remains challenging. This study proposes a modified U-net for tiny crack segmentation in real-world steel-box-girder bridges. The modified U-net adopts the common U-net framework and a novel Self-Attention-Self-Adaption (SASA) neuron as the fundamental computing element. The Self-Attention module applies softmax and gate operations to obtain the attention vector. It enables the neuron to focus on the most significant receptive fields when processing large-scale feature maps. The Self-Adaption module consists of a multiplayer perceptron subnet and achieves deeper feature extraction inside a single neuron. For data augmentation, a grid-based crack random elastic deformation (CRED) algorithm is designed to enrich the diversities and irregular shapes of distributed cracks. Grid-based uniform control nodes are first set on both input images and binary labels, random offsets are then employed on these control nodes, and bilinear interpolation is performed for the rest pixels. The proposed SASA neuron and CRED algorithm are simultaneously deployed to train the modified U-net. 200 raw images with a high resolution of 4928 × 3264 are collected, 160 for training and the rest 40 for the test. 512 × 512 patches are generated from the original images by a sliding window with an overlap of 256 as inputs. Results show that the average IoU between the recognized and ground-truth cracks reaches 0.409, which is 29.8% higher than the regular U-net. A five-fold cross-validation study is performed to verify that the proposed method is robust to different training and test images. Ablation experiments further demonstrate the effectiveness of the proposed SASA neuron and CRED algorithm. Promotions of the average IoU individually utilizing the SASA and CRED module add up to the final promotion of the full model, indicating that the SASA and CRED modules contribute to the different stages of model and data in the training process.

Substitutability of Noise Reduction Algorithm based Conventional Thresholding Technique to U-Net Model for Pancreas Segmentation (이자 분할을 위한 노이즈 제거 알고리즘 기반 기존 임계값 기법 대비 U-Net 모델의 대체 가능성)

  • Sewon Lim;Youngjin Lee
    • Journal of the Korean Society of Radiology
    • /
    • v.17 no.5
    • /
    • pp.663-670
    • /
    • 2023
  • In this study, we aimed to perform a comparative evaluation using quantitative factors between a region-growing based segmentation with noise reduction algorithms and a U-Net based segmentation. Initially, we applied median filter, median modified Wiener filter, and fast non-local means algorithm to computed tomography (CT) images, followed by region-growing based segmentation. Additionally, we trained a U-Net based segmentation model to perform segmentation. Subsequently, to compare and evaluate the segmentation performance of cases with noise reduction algorithms and cases with U-Net, we measured root mean square error (RMSE) and peak signal to noise ratio (PSNR), universal quality image index (UQI), and dice similarity coefficient (DSC). The results showed that using U-Net for segmentation yielded the most improved performance. The values of RMSE, PSNR, UQI, and DSC were measured as 0.063, 72.11, 0.841, and 0.982 respectively, which indicated improvements of 1.97, 1.09, 5.30, and 1.99 times compared to noisy images. In conclusion, U-Net proved to be effective in enhancing segmentation performance compared to noise reduction algorithms in CT images.

Multi-level Skip Connection for Nested U-Net-based Speech Enhancement (중첩 U-Net 기반 음성 향상을 위한 다중 레벨 Skip Connection)

  • Seorim, Hwang;Joon, Byun;Junyeong, Heo;Jaebin, Cha;Youngcheol, Park
    • Journal of Broadcast Engineering
    • /
    • v.27 no.6
    • /
    • pp.840-847
    • /
    • 2022
  • In a deep neural network (DNN)-based speech enhancement, using global and local input speech information is closely related to model performance. Recently, a nested U-Net structure that utilizes global and local input data information using multi-scale has bee n proposed. This nested U-Net was also applied to speech enhancement and showed outstanding performance. However, a single skip connection used in nested U-Nets must be modified for the nested structure. In this paper, we propose a multi-level skip connection (MLS) to optimize the performance of the nested U-Net-based speech enhancement algorithm. As a result, the proposed MLS showed excellent performance improvement in various objective evaluation metrics compared to the standard skip connection, which means th at the MLS can optimize the performance of the nested U-Net-based speech enhancement algorithm. In addition, the final proposed m odel showed superior performance compared to other DNN-based speech enhancement models.

Flood Mapping Using Modified U-NET from TerraSAR-X Images (TerraSAR-X 영상으로부터 Modified U-NET을 이용한 홍수 매핑)

  • Yu, Jin-Woo;Yoon, Young-Woong;Lee, Eu-Ru;Baek, Won-Kyung;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.6_2
    • /
    • pp.1709-1722
    • /
    • 2022
  • The rise in temperature induced by global warming caused in El Nino and La Nina, and abnormally changed the temperature of seawater. Rainfall concentrates in some locations due to abnormal variations in seawater temperature, causing frequent abnormal floods. It is important to rapidly detect flooded regions to recover and prevent human and property damage caused by floods. This is possible with synthetic aperture radar. This study aims to generate a model that directly derives flood-damaged areas by using modified U-NET and TerraSAR-X images based on Multi Kernel to reduce the effect of speckle noise through various characteristic map extraction and using two images before and after flooding as input data. To that purpose, two synthetic aperture radar (SAR) images were preprocessed to generate the model's input data, which was then applied to the modified U-NET structure to train the flood detection deep learning model. Through this method, the flood area could be detected at a high level with an average F1 score value of 0.966. This result is expected to contribute to the rapid recovery of flood-stricken areas and the derivation of flood-prevention measures.

Deep Learning-based Spine Segmentation Technique Using the Center Point of the Spine and Modified U-Net (척추의 중심점과 Modified U-Net을 활용한 딥러닝 기반 척추 자동 분할)

  • Sungjoo Lim;Hwiyoung Kim
    • Journal of Biomedical Engineering Research
    • /
    • v.44 no.2
    • /
    • pp.139-146
    • /
    • 2023
  • Osteoporosis is a disease in which the risk of bone fractures increases due to a decrease in bone density caused by aging. Osteoporosis is diagnosed by measuring bone density in the total hip, femoral neck, and lumbar spine. To accurately measure bone density in the lumbar spine, the vertebral region must be segmented from the lumbar X-ray image. Deep learning-based automatic spinal segmentation methods can provide fast and precise information about the vertebral region. In this study, we used 695 lumbar spine images as training and test datasets for a deep learning segmentation model. We proposed a lumbar automatic segmentation model, CM-Net, which combines the center point of the spine and the modified U-Net network. As a result, the average Dice Similarity Coefficient(DSC) was 0.974, precision was 0.916, recall was 0.906, accuracy was 0.998, and Area under the Precision-Recall Curve (AUPRC) was 0.912. This study demonstrates a high-performance automatic segmentation model for lumbar X-ray images, which overcomes noise such as spinal fractures and implants. Furthermore, we can perform accurate measurement of bone density on lumbar X-ray images using an automatic segmentation methodology for the spine, which can prevent the risk of compression fractures at an early stage and improve the accuracy and efficiency of osteoporosis diagnosis.

Crack segmentation in high-resolution images using cascaded deep convolutional neural networks and Bayesian data fusion

  • Tang, Wen;Wu, Rih-Teng;Jahanshahi, Mohammad R.
    • Smart Structures and Systems
    • /
    • v.29 no.1
    • /
    • pp.221-235
    • /
    • 2022
  • Manual inspection of steel box girders on long span bridges is time-consuming and labor-intensive. The quality of inspection relies on the subjective judgements of the inspectors. This study proposes an automated approach to detect and segment cracks in high-resolution images. An end-to-end cascaded framework is proposed to first detect the existence of cracks using a deep convolutional neural network (CNN) and then segment the crack using a modified U-Net encoder-decoder architecture. A Naïve Bayes data fusion scheme is proposed to reduce the false positives and false negatives effectively. To generate the binary crack mask, first, the original images are divided into 448 × 448 overlapping image patches where these image patches are classified as cracks versus non-cracks using a deep CNN. Next, a modified U-Net is trained from scratch using only the crack patches for segmentation. A customized loss function that consists of binary cross entropy loss and the Dice loss is introduced to enhance the segmentation performance. Additionally, a Naïve Bayes fusion strategy is employed to integrate the crack score maps from different overlapping crack patches and to decide whether a pixel is crack or not. Comprehensive experiments have demonstrated that the proposed approach achieves an 81.71% mean intersection over union (mIoU) score across 5 different training/test splits, which is 7.29% higher than the baseline reference implemented with the original U-Net.

Observation of Ice Gradient in Cheonji, Baekdu Mountain Using Modified U-Net from Landsat -5/-7/-8 Images (Landsat 위성 영상으로부터 Modified U-Net을 이용한 백두산 천지 얼음변화도 관측)

  • Lee, Eu-Ru;Lee, Ha-Seong;Park, Sun-Cheon;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.6_2
    • /
    • pp.1691-1707
    • /
    • 2022
  • Cheonji Lake, the caldera of Baekdu Mountain, located on the border of the Korean Peninsula and China, alternates between melting and freezing seasonally. There is a magma chamber beneath Cheonji, and variations in the magma chamber cause volcanic antecedents such as changes in the temperature and water pressure of hot spring water. Consequently, there is an abnormal region in Cheonji where ice melts quicker than in other areas, freezes late even during the freezing period, and has a high-temperature water surface. The abnormal area is a discharge region for hot spring water, and its ice gradient may be used to monitor volcanic activity. However, due to geographical, political and spatial issues, periodic observation of abnormal regions of Cheonji is limited. In this study, the degree of ice change in the optimal region was quantified using a Landsat -5/-7/-8 optical satellite image and a Modified U-Net regression model. From January 22, 1985 to December 8, 2020, the Visible and Near Infrared (VNIR) band of 83 Landsat images including anomalous regions was utilized. Using the relative spectral reflectance of water and ice in the VNIR band, unique data were generated for quantitative ice variability monitoring. To preserve as much information as possible from the visible and near-infrared bands, ice gradient was noticed by applying it to U-Net with two encoders, achieving good prediction accuracy with a Root Mean Square Error (RMSE) of 140 and a correlation value of 0.9968. Since the ice change value can be seen with high precision from Landsat images using Modified U-Net in the future may be utilized as one of the methods to monitor Baekdu Mountain's volcanic activity, and a more specific volcano monitoring system can be built.

Automatic Building Extraction Using SpaceNet Building Dataset and Context-based ResU-Net (SpaceNet 건물 데이터셋과 Context-based ResU-Net을 이용한 건물 자동 추출)

  • Yoo, Suhong;Kim, Cheol Hwan;Kwon, Youngmok;Choi, Wonjun;Sohn, Hong-Gyoo
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.5_2
    • /
    • pp.685-694
    • /
    • 2022
  • Building information is essential for various urban spatial analyses. For this reason, continuous building monitoring is required, but it is a subject with many practical difficulties. To this end, research is being conducted to extract buildings from satellite images that can be continuously observed over a wide area. Recently, deep learning-based semantic segmentation techniques have been used. In this study, a part of the structure of the context-based ResU-Net was modified, and training was conducted to automatically extract a building from a 30 cm Worldview-3 RGB image using SpaceNet's building v2 free open data. As a result of the classification accuracy evaluation, the f1-score, which was higher than the classification accuracy of the 2nd SpaceNet competition winners. Therefore, if Worldview-3 satellite imagery can be continuously provided, it will be possible to use the building extraction results of this study to generate an automatic model of building around the world.

Improved Performance of Image Semantic Segmentation using NASNet (NASNet을 이용한 이미지 시맨틱 분할 성능 개선)

  • Kim, Hyoung Seok;Yoo, Kee-Youn;Kim, Lae Hyun
    • Korean Chemical Engineering Research
    • /
    • v.57 no.2
    • /
    • pp.274-282
    • /
    • 2019
  • In recent years, big data analysis has been expanded to include automatic control through reinforcement learning as well as prediction through modeling. Research on the utilization of image data is actively carried out in various industrial fields such as chemical, manufacturing, agriculture, and bio-industry. In this paper, we applied NASNet, which is an AutoML reinforced learning algorithm, to DeepU-Net neural network that modified U-Net to improve image semantic segmentation performance. We used BRATS2015 MRI data for performance verification. Simulation results show that DeepU-Net has more performance than the U-Net neural network. In order to improve the image segmentation performance, remove dropouts that are typically applied to neural networks, when the number of kernels and filters obtained through reinforcement learning in DeepU-Net was selected as a hyperparameter of neural network. The results show that the training accuracy is 0.5% and the verification accuracy is 0.3% better than DeepU-Net. The results of this study can be applied to various fields such as MRI brain imaging diagnosis, thermal imaging camera abnormality diagnosis, Nondestructive inspection diagnosis, chemical leakage monitoring, and monitoring forest fire through CCTV.