• Title/Summary/Keyword: ResU-Net

Search Result 25, Processing Time 0.017 seconds

Development of Robust Semantic Segmentation Modeling on Various Wall Cracks (다양한 외벽에 강인한 균열 구획화 모델 개발)

  • Lee, Soo Min;Kim, Gyeong-Yeong;Kim, Dong-Ju
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2022.07a
    • /
    • pp.49-52
    • /
    • 2022
  • 건물 외벽에 발생하는 균열은 시설물 구조 안전에 영향을 미치며 그 크기에 따라 위험도가 달라진다. 이에 따라 전문검사관의 현장 점검을 통해 발생 균열 두께를 정밀하게 측정할 필요가 있고 최근에는 이러한 현장 안전점검에 인공지능을 도입하려는 추세다. 그러나 기존의 균열 데이터셋은 주로 콘크리트에만 한정되어 다양한 외벽에 강인한 모델을 구축하기 어렵고 균열 두께를 측정하기 위해 정확한 마스크(Mask) 정보가 필요하나 이를 만족하는 데이터셋이 부재하다. 본 논문에서는 다양한 외벽에 강인한 균열 구획화 모델을 목적으로 2,744장의 이미지를 촬영하고 매직 완드 기법으로 라벨링을 진행해 데이터셋을 구축 후, 이를 바탕으로 딥러닝 기반 균열 구획화 모델을 개발했다. UNet-ResNet50을 최종모델로 선정 및 개발 결과, 테스트 데이터셋에 대해 81.22%의 class IoU 성능을 보였다. 본 연구의 기술을 바탕으로 균열 두께를 측정하여 건축물 안전점검에 활용될 수 있기를 기대한다.

  • PDF

Personalized Speech Classification Scheme for the Smart Speaker Accessibility Improvement of the Speech-Impaired people (언어장애인의 스마트스피커 접근성 향상을 위한 개인화된 음성 분류 기법)

  • SeungKwon Lee;U-Jin Choe;Gwangil Jeon
    • Smart Media Journal
    • /
    • v.11 no.11
    • /
    • pp.17-24
    • /
    • 2022
  • With the spread of smart speakers based on voice recognition technology and deep learning technology, not only non-disabled people, but also the blind or physically handicapped can easily control home appliances such as lights and TVs through voice by linking home network services. This has greatly improved the quality of life. However, in the case of speech-impaired people, it is impossible to use the useful services of the smart speaker because they have inaccurate pronunciation due to articulation or speech disorders. In this paper, we propose a personalized voice classification technique for the speech-impaired to use for some of the functions provided by the smart speaker. The goal of this paper is to increase the recognition rate and accuracy of sentences spoken by speech-impaired people even with a small amount of data and a short learning time so that the service provided by the smart speaker can be actually used. In this paper, data augmentation and one cycle learning rate optimization technique were applied while fine-tuning ResNet18 model. Through an experiment, after recording 10 times for each 30 smart speaker commands, and learning within 3 minutes, the speech classification recognition rate was about 95.2%.

Classification of Anteroposterior/Lateral Images and Segmentation of the Radius Using Deep Learning in Wrist X-rays Images (손목 관절 단순 방사선 영상에서 딥 러닝을 이용한 전후방 및 측면 영상 분류와 요골 영역 분할)

  • Lee, Gi Pyo;Kim, Young Jae;Lee, Sanglim;Kim, Kwang Gi
    • Journal of Biomedical Engineering Research
    • /
    • v.41 no.2
    • /
    • pp.94-100
    • /
    • 2020
  • The purpose of this study was to present the models for classifying the wrist X-ray images by types and for segmenting the radius automatically in each image using deep learning and to verify the learned models. The data were a total of 904 wrist X-rays with the distal radius fracture, consisting of 472 anteroposterior (AP) and 432 lateral images. The learning model was the ResNet50 model for AP/lateral image classification, and the U-Net model for segmentation of the radius. In the model for AP/lateral image classification, 100.0% was showed in precision, recall, and F1 score and area under curve (AUC) was 1.0. The model for segmentation of the radius showed an accuracy of 99.46%, a sensitivity of 89.68%, a specificity of 99.72%, and a Dice similarity coefficient of 90.05% in AP images and an accuracy of 99.37%, a sensitivity of 88.65%, a specificity of 99.69%, and a Dice similarity coefficient of 86.05% in lateral images. The model for AP/lateral classification and the segmentation model of the radius learned through deep learning showed favorable performances to expect clinical application.

Application of CCTV Image and Semantic Segmentation Model for Water Level Estimation of Irrigation Channel (관개용수로 CCTV 이미지를 이용한 CNN 딥러닝 이미지 모델 적용)

  • Kim, Kwi-Hoon;Kim, Ma-Ga;Yoon, Pu-Reun;Bang, Je-Hong;Myoung, Woo-Ho;Choi, Jin-Yong;Choi, Gyu-Hoon
    • Journal of The Korean Society of Agricultural Engineers
    • /
    • v.64 no.3
    • /
    • pp.63-73
    • /
    • 2022
  • A more accurate understanding of the irrigation water supply is necessary for efficient agricultural water management. Although we measure water levels in an irrigation canal using ultrasonic water level gauges, some errors occur due to malfunctions or the surrounding environment. This study aims to apply CNN (Convolutional Neural Network) Deep-learning-based image classification and segmentation models to the irrigation canal's CCTV (Closed-Circuit Television) images. The CCTV images were acquired from the irrigation canal of the agricultural reservoir in Cheorwon-gun, Gangwon-do. We used the ResNet-50 model for the image classification model and the U-Net model for the image segmentation model. Using the Natural Breaks algorithm, we divided water level data into 2, 4, and 8 groups for image classification models. The classification models of 2, 4, and 8 groups showed the accuracy of 1.000, 0.987, and 0.634, respectively. The image segmentation model showed a Dice score of 0.998 and predicted water levels showed R2 of 0.97 and MAE (Mean Absolute Error) of 0.02 m. The image classification models can be applied to the automatic gate-controller at four divisions of water levels. Also, the image segmentation model results can be applied to the alternative measurement for ultrasonic water gauges. We expect that the results of this study can provide a more scientific and efficient approach for agricultural water management.

Evaluating Usefulness of Deep Learning Based Left Ventricle Segmentation in Cardiac Gated Blood Pool Scan (게이트심장혈액풀검사에서 딥러닝 기반 좌심실 영역 분할방법의 유용성 평가)

  • Oh, Joo-Young;Jeong, Eui-Hwan;Lee, Joo-Young;Park, Hoon-Hee
    • Journal of radiological science and technology
    • /
    • v.45 no.2
    • /
    • pp.151-158
    • /
    • 2022
  • The Cardiac Gated Blood Pool (GBP) scintigram, a nuclear medicine imaging, calculates the left ventricular Ejection Fraction (EF) by segmenting the left ventricle from the heart. However, in order to accurately segment the substructure of the heart, specialized knowledge of cardiac anatomy is required, and depending on the expert's processing, there may be a problem in which the left ventricular EF is calculated differently. In this study, using the DeepLabV3 architecture, GBP images were trained on 93 training data with a ResNet-50 backbone. Afterwards, the trained model was applied to 23 separate test sets of GBP to evaluate the reproducibility of the region of interest and left ventricular EF. Pixel accuracy, dice coefficient, and IoU for the region of interest were 99.32±0.20, 94.65±1.45, 89.89±2.62(%) at the diastolic phase, and 99.26±0.34, 90.16±4.19, and 82.33±6.69(%) at the systolic phase, respectively. Left ventricular EF was calculated to be an average of 60.37±7.32% in the ROI set by humans and 58.68±7.22% in the ROI set by the deep learning segmentation model. (p<0.05) The automated segmentation method using deep learning presented in this study similarly predicts the average human-set ROI and left ventricular EF when a random GBP image is an input. If the automatic segmentation method is developed and applied to the functional examination method that needs to set ROI in the field of cardiac scintigram in nuclear medicine in the future, it is expected to greatly contribute to improving the efficiency and accuracy of processing and analysis by nuclear medicine specialists.