• Title/Summary/Keyword: mask R-CNN

Deep Learning Based Digital Staining Method in Fourier Ptychographic Microscopy Image (Fourier Ptychographic Microscopy 영상에서의 딥러닝 기반 디지털 염색 방법 연구)

  • Seok-Min Hwang;Dong-Bum Kim;Yu-Jeong Kim;Yeo-Rin Kim;Jong-Ha Lee
    • Journal of the Institute of Convergence Signal Processing / v.23 no.2 / pp.97-106 / 2022
  • H&E staining is necessary to distinguish cells, but staining directly requires considerable money and time. The purpose of this study is to convert the phase image of unstained cells into the amplitude image of stained cells. Phase and amplitude images were created from image data taken with FPM using Matlab's parameters. Through normalization, a visually distinguishable image was obtained. Using the GAN algorithm, a Fake Amplitude image similar to the Real Amplitude image was created from the Phase image, and cells were distinguished by objectification using Mask R-CNN on the Fake Amplitude image. As a result of the study, D loss max is 3.3e-1 and min is 6.8e-2, G loss max is 6.9e-2 and min is 2.9e-2, A loss max is 5.8e-1 and min is 1.2e-1, and Mask R-CNN max is 1.9e0 and min is 3.2e-1.

Analysis of the Effect of Learned Image Scale and Season on Accuracy in Vehicle Detection by Mask R-CNN (Mask R-CNN에 의한 자동차 탐지에서 학습 영상 화면 축척과 촬영계절이 정확도에 미치는 영향 분석)

  • Choi, Jooyoung;Won, Taeyeon;Eo, Yang Dam
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.40 no.1 / pp.15-22 / 2022
  • To improve the accuracy of deep learning object detection, the effect of magnification and seasonal factors on detection accuracy in aerial photographs and drone images was analyzed through experiments. Among the deep learning object detection techniques, Mask R-CNN, which shows fast learning speed and high accuracy, was used to detect vehicles in pixel units. Training images were captured at different screen magnifications through Seoul's aerial photo service, and the accuracy was analyzed after training on each. According to the experimental results, the average mAP rose with the magnification level, to 60%, 67%, and 75%. When the training and test data were taken at different magnifications, training on low-magnification data and testing on high-magnification data showed a difference of more than 20% compared with the opposite arrangement. For drone images taken 4 months apart in different seasons, training on image data from the same period showed high accuracy, with an average of 93%, confirming that seasonal differences also affect learning.

A Comparative Study on Performance of Deep Learning Models for Vision-based Concrete Crack Detection according to Model Types (영상기반 콘크리트 균열 탐지 딥러닝 모델의 유형별 성능 비교)

  • Kim, Byunghyun;Kim, Geonsoon;Jin, Soomin;Cho, Soojin
    • Journal of the Korean Society of Safety / v.34 no.6 / pp.50-57 / 2019
  • In this study, recently proposed deep learning models are classified according to their data input/output types and analyzed to find the model type best suited to building a crack detection model. First, the deep learning models are classified into image classification, object segmentation, object detection, and instance segmentation models. ResNet-101, DeepLab V2, Faster R-CNN, and Mask R-CNN were selected as the representative deep learning models of each type. For the comparison, ResNet-101 was used in all four types of model as the backbone network, which serves as the main feature extractor. The four types of deep learning models were trained with 500 crack images taken from real concrete structures and collected from the Internet. All four showed high accuracy, above 94%, during training. Comparative evaluation was conducted using 40 images taken from real concrete structures. The performance of each type of deep learning model was measured using precision and recall. In the experimental results, Mask R-CNN, an instance segmentation deep learning model, showed the highest precision and recall on crack detection. Qualitative analysis also shows that Mask R-CNN detected crack shapes that were closest to the real crack shapes.

Implementation of CNN-based Masking Algorithm for Post Processing of Aerial Image

  • CHOI, Eunsoo;QUAN, Zhixuan;JUNG, Sangwoo
    • Korean Journal of Artificial Intelligence / v.9 no.2 / pp.7-14 / 2021
  • Purpose: To solve urban problems, empirical research is being actively conducted to implement smart cities based on various ICT technologies, and digital twin technology is essential for implementing a smart city effectively. A digital twin is a virtual environment that intuitively visualizes multidimensional real-world data in 3D. A digital twin is implemented on the premise of the convergence of GIS and BIM, and a lot of time is invested in data pre-processing and labeling during data construction in particular. In a digital twin, data quality is prioritized for consistency with reality, but there is a limit to inspecting data with the naked eye. Therefore, to reduce the time required for digital twin construction and improve its quality, this study attempted to detect buildings in aerial images using Mask R-CNN, a deep learning-based masking algorithm. If the results of this study are advanced and used to build digital twin data, it is thought that a high-quality smart city can be realized.

Performance Improvement of Object Segmentation Using ESRGAN and Semantic Soft Segmentation (ESRGAN과 Semantic Soft Segmentation을 이용한 객체 분할의 성능 개선)

  • Yoon, DongSik;Kwak, Noyoon
    • Proceedings of the Korea Information Processing Society Conference / 2020.05a / pp.468-471 / 2020
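  • This paper concerns improving object segmentation performance using ESRGAN (Enhanced Super Resolution GAN) and Semantic Soft Segmentation. The object segmentation method previously proposed by the authors, which uses Mask R-CNN and Semantic Soft Segmentation, performs well overall, but its segmentation performance degrades when objects are relatively small. To address this problem, this paper aims to improve the segmentation of small objects: when the size of an object detected by Mask R-CNN is below a certain threshold, super-resolution is performed with ESRGAN before Semantic Soft Segmentation is applied. The results confirm that the proposed method improves the segmentation of small objects more effectively than the existing method.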

The Detection of Multi-class Vehicles using Swin Transformer (Swin Transformer를 이용한 항공사진에서 다중클래스 차량 검출)

  • Lee, Ki-chun;Jeong, Yu-seok;Lee, Chang-woo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.112-114 / 2021
  • To understand urban conditions, the number of vehicles and the traffic flow are essential factors to identify. Previous studies learned various vehicle types using Mask R-CNN. This paper improves on the detection capability shown in those studies by introducing the now widely used transformer approach, specifically the Swin Transformer model, which has shown higher performance than existing convolutional neural networks, to detect certain types of vehicles in urban aerial images.

Abnormal Behavior Detection and Localization Using Aspect Ratio Based on Mask R-CNN (Mask R-CNN 기반 Aspect Ratio를 활용한 이상행동 검출 및 영역화 방법)

  • Lim, Hyunseok;Hu, Xufeng;Gwak, Jeonghwan
    • Proceedings of the Korean Society of Computer Information Conference / 2022.01a / pp.99-101 / 2022
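  • Deep learning-based systems for detecting abnormal behavior track moving objects in video data, analyze their behavior, and flag regions whose patterns fall outside the normal range of behavior as anomalies. In particular, generative adversarial networks (GAN) and optical flow estimation are used to extract motion features, which are then learned to model behavior patterns. When the dataset used for training and testing has low resolution, or lacks feature information that expresses abnormal behavior, the final model performance suffers. In particular, abnormal objects whose displacement, as expressed by optical flow, differs little from normal are not detected accurately. This study proposes a post-processing module that computes the average aspect ratio of objects appearing in video frames and treats objects that deviate from the normal ratio as samples exhibiting abnormal behavior, thereby improving the final model performance.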
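
The aspect-ratio post-processing idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the box format `(x1, y1, x2, y2)`, the function names, and the relative `tolerance` threshold are all assumptions made for illustration, and in practice the boxes would come from a detector such as Mask R-CNN.

```python
from statistics import mean


def aspect_ratio(box):
    """Width divided by height for an (x1, y1, x2, y2) box (assumed format)."""
    x1, y1, x2, y2 = box
    return (x2 - x1) / (y2 - y1)


def flag_abnormal(normal_boxes, candidate_boxes, tolerance=0.5):
    """Return indices of candidates whose aspect ratio deviates from the
    mean ratio of known-normal boxes by more than `tolerance` (relative).

    `tolerance` is a hypothetical parameter; the abstract does not state
    how the normal range is defined.
    """
    reference = mean(aspect_ratio(b) for b in normal_boxes)
    flagged = []
    for i, box in enumerate(candidate_boxes):
        deviation = abs(aspect_ratio(box) - reference) / reference
        if deviation > tolerance:
            flagged.append(i)
    return flagged


# Upright pedestrians are tall and narrow; a fallen person is wide and short.
normal = [(0, 0, 40, 100), (10, 0, 52, 100), (0, 0, 38, 100)]
candidates = [(0, 0, 41, 100), (0, 0, 120, 40)]
print(flag_abnormal(normal, candidates))
```

The reference ratio is computed from boxes assumed to be normal rather than from all boxes, so that a single extreme box does not skew the average it is compared against.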

Development of Deep Learning-based Land Monitoring Web Service (딥러닝 기반의 국토모니터링 웹 서비스 개발)

  • In-Hak Kong;Dong-Hoon Jeong;Gu-Ha Jeong
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.3 / pp.275-284 / 2023
  • Land monitoring involves systematically understanding changes in land use, leveraging spatial information such as satellite imagery and aerial photographs. Recently, the integration of deep learning technologies, notably object detection and semantic segmentation, into land monitoring has spurred active research. This study developed a web service to facilitate such integrations, allowing users to analyze aerial and drone images using CNN models. The web service architecture comprises AI, WEB/WAS, and DB servers and employs three primary deep learning models: DeepLab V3, YOLO, and Rotated Mask R-CNN. Specifically, YOLO offers rapid detection capabilities, Rotated Mask R-CNN excels in detecting rotated objects, while DeepLab V3 provides pixel-wise image classification. The performance of these models fluctuates depending on the quantity and quality of the training data. Anticipated to be integrated into the LX Corporation's operational network and the Land-XI system, this service is expected to enhance the accuracy and efficiency of land monitoring.

Realtime Theft Detection of Registered and Unregistered Objects in Surveillance Video (감시 비디오에서 등록 및 미등록 물체의 실시간 도난 탐지)

  • Park, Hyeseung;Park, Seungchul;Joo, Youngbok
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.10 / pp.1262-1270 / 2020
  • Smart video surveillance research, which has been receiving increasing attention, has mainly focused on intruder detection and tracking and on abandoned object detection. Research on real-time detection of stolen objects, by contrast, remains insufficient relative to its importance. Considering various smart surveillance video application environments, this paper presents two different types of stolen object detection algorithms. We first propose an algorithm that detects theft of statically and dynamically registered surveillance objects using a dual background subtraction model. In addition, we propose another algorithm that detects theft of general surveillance objects by applying the dual background subtraction model and Mask R-CNN-based object segmentation technology. The former algorithm can provide economical theft detection service for pre-registered surveillance objects in low computational power environments, and the latter algorithm can be applied to the theft detection of a wider range of general surveillance objects in environments capable of providing sufficient computational power.

Crack Inspection and Mapping of Concrete Bridges using Integrated Image Processing Techniques (통합 이미지 처리 기술을 이용한 콘크리트 교량 균열 탐지 및 매핑)

  • Kim, Byunghyun;Cho, Soojin
    • Journal of the Korean Society of Safety / v.36 no.1 / pp.18-25 / 2021
  • In many developed countries, such as South Korea, efficiently maintaining aging infrastructure is an important issue. Currently, inspectors visually inspect the infrastructure for maintenance needs, but this method is inefficient due to its high costs, long logistic times, and hazards to the inspectors. Thus, in this paper, a novel crack inspection approach for concrete bridges is proposed using integrated image processing techniques. The proposed approach consists of four steps: (1) training a deep learning model to automatically detect cracks on concrete bridges, (2) acquiring in-situ images using a drone, (3) generating orthomosaic images based on 3D modeling, and (4) detecting cracks on the orthomosaic image using the trained deep learning model. Cascade Mask R-CNN, a state-of-the-art instance segmentation deep learning model, was trained with 3235 crack images that included 2415 hard negative images. We selected the Tancheon overpass, located in Seoul, South Korea, as a testbed for the proposed approach, and we captured images of piers 34-37 and slabs 34-36 using a commercial drone. Agisoft Metashape was utilized as a 3D model generation program to generate an orthomosaic of the captured images. We applied the proposed approach to four orthomosaic images that displayed the front, back, left, and right sides of pier 37. Using pixel-level precision referencing visual inspection of the captured images, we evaluated the trained Cascade Mask R-CNN's crack detection performance. At the coping of the front side of pier 37, the model obtained its best precision: 94.34%. It achieved an average precision of 72.93% for the orthomosaics of the four sides of the pier. The test results show that this proposed approach for crack detection can be a suitable alternative to the conventional visual inspection method.
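
As a rough illustration of the pixel-level precision mentioned in this abstract, the sketch below compares a predicted binary crack mask against a reference mask. The function name and the toy masks are hypothetical, and the masks are assumed to be equally sized nested lists of 0/1 values; the paper's actual evaluation procedure is not described in enough detail here to reproduce.

```python
def pixel_precision_recall(pred_mask, true_mask):
    """Compute pixel-level precision and recall for two binary masks.

    Both masks are assumed to be equally sized nested lists of 0/1.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1  # crack pixel correctly predicted
            elif p and not t:
                fp += 1  # predicted crack where there is none
            elif t and not p:
                fn += 1  # missed crack pixel
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy 3x4 masks: a vertical crack with one spurious and one missed pixel.
pred = [[0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 1, 0, 0]]
truth = [[0, 1, 0, 0],
         [0, 1, 0, 0],
         [0, 1, 1, 0]]
print(pixel_precision_recall(pred, truth))  # (0.75, 0.75)
```

Precision here is the share of predicted crack pixels that are real cracks, which is why hard negative images (crack-like textures with no crack) matter during training: they reduce false positives.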