• Title/Summary/Keyword: Segmentation model

Improvement of Mask-RCNN Performance Using Deep-Learning-Based Arbitrary-Scale Super-Resolution Module (딥러닝 기반 임의적 스케일 초해상도 모듈을 이용한 Mask-RCNN 성능 향상)

  • Ahn, Young-Pill;Park, Hyun-Jun
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.3 / pp.381-388 / 2022
  • In instance segmentation, Mask-RCNN is widely used as a base model. Improving Mask-RCNN is therefore meaningful, because its performance carries over to the models derived from it. Mask-RCNN has a transform module that unifies the size of input images. In this paper, to improve Mask-RCNN, we apply deep-learning-based arbitrary-scale super-resolution (ASSR) to the resizing step of the transform module and inject the calculated scale information into the model through an Integration Module (IM). The proposed IM improves instance segmentation performance by 2.5 AP over Mask-RCNN on the COCO dataset. In the experiment on optimizing the IM location, the best performance was obtained at the 'Top' position, before the FPN and backbone are combined. The proposed method can therefore improve the performance of models that use Mask-RCNN as a base model.

Lip-Synch System Optimization Using Class Dependent SCHMM (클래스 종속 반연속 HMM을 이용한 립싱크 시스템 최적화)

  • Lee, Sung-Hee;Park, Jun-Ho;Ko, Han-Seok
    • The Journal of the Acoustical Society of Korea / v.25 no.7 / pp.312-318 / 2006
  • The conventional lip-synch system uses a two-step process of speech segmentation followed by recognition. However, the difficulty of the speech segmentation procedure and the resulting inaccuracy of the training data set lead to significant performance degradation. To cope with this, a connected vowel recognition method using the Head-Body-Tail (HBT) model is proposed. The HBT model, which is appropriate for relatively small-vocabulary tasks, reflects the co-articulation effect efficiently. Moreover, the 7 vowels are merged into 3 classes with similar lip shapes, and the system is optimized by employing a class-dependent SCHMM structure. Additionally, at both ends of each word, where variation is large, an 8-component Gaussian mixture model is used directly to improve representational ability. The proposed method performs similarly to the CHMM based on the HBT structure, while the number of parameters is reduced by 33.92%. This reduction makes the method computationally efficient and enables real-time operation.

Adaptive Skin Color Segmentation in a Single Image using Image Feedback (영상 피드백을 이용한 단일 영상에서의 적응적 피부색 검출)

  • Do, Jun-Hyeong;Kim, Keun-Ho;Kim, Jong-Yeol
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.3 / pp.112-118 / 2009
  • Skin color segmentation techniques have been widely used for face and hand detection and tracking in applications such as diagnosis systems using facial information, human-robot interaction, and image retrieval systems. For video, the skin color model of a target is commonly updated every frame to keep tracking robust against illumination changes. For a single image, however, most studies employ a fixed skin color model, which may result in a low detection rate or a high false positive rate. In this paper, we propose a novel method for effective skin color segmentation in a single image, which iteratively modifies the segmentation conditions using feedback from the skin color region segmented in the given image.
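
The abstract above does not say how the segmentation conditions are updated, so the following is only a generic sketch of a feedback loop of that kind, not the authors' method. It assumes a single chroma channel (for example Cr), an initial fixed range, and an update rule of mean ± k standard deviations of the currently selected pixels; the function name `adapt_skin_range` and all parameters are hypothetical.

```python
from statistics import mean, pstdev

def adapt_skin_range(values, low, high, k=2.0, max_iter=20, tol=1e-6):
    """Iteratively refit a [low, high] chroma range to the pixels it selects.

    values: flat list of chroma values for one image (assumed input).
    low, high: initial fixed skin range (assumed starting condition).
    k: range half-width in standard deviations (assumed update rule).
    """
    for _ in range(max_iter):
        # Segment with the current condition.
        selected = [v for v in values if low <= v <= high]
        if len(selected) < 2:
            break  # too few pixels to estimate statistics
        # Feed the segmented region back into the condition.
        m, s = mean(selected), pstdev(selected)
        new_low, new_high = m - k * s, m + k * s
        converged = abs(new_low - low) < tol and abs(new_high - high) < tol
        low, high = new_low, new_high
        if converged:
            break
    return low, high

# Toy data: a tight cluster near 150 plus dark background pixels.
chroma = [148, 150, 152, 149, 151, 20, 22, 18]
low, high = adapt_skin_range(chroma, low=100, high=200)
```

Starting from a deliberately wide range, the loop narrows the condition toward the statistics of the pixels actually present in the image, which is the general idea of adapting a fixed model to one image.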

Adaptive Optimal Thresholding for the Segmentation of Individual Tooth from CT Images (CT영상에서 개별 치아 분리를 위한 적응 최적 임계화 방안)

  • Heo, Hoon;Chae, Ok-Sam
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.3 / pp.163-174 / 2004
  • A 3D tooth model in which each tooth can be manipulated individually is an essential component of orthodontic and implant simulation in the dental field. To reconstruct such a model, we need an image segmentation algorithm capable of separating an individual tooth from neighboring teeth and alveolar bone. In this paper, we propose a CT image normalization method and an adaptive optimal thresholding algorithm for segmenting the tooth region in CT image slices. The proposed segmentation algorithm is based on the fact that the shape and intensity of a tooth change gradually across CT image slices. It generates a temporary boundary of a tooth using the threshold value estimated in the previous image slice, and computes histograms for the inner and outer regions separated by the temporary boundary. The optimal threshold value that generates the final tooth region is computed from these two histograms.
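
One way to derive a threshold from an inner-region histogram and an outer-region histogram is to pick the value that misclassifies the fewest pixels. The sketch below illustrates that idea only; the abstract does not state the actual optimality criterion, so the minimum-error rule, the 8-bit intensity range, and the names `histogram` and `optimal_threshold` are all assumptions.

```python
def histogram(values, levels=256):
    """Count how many pixels fall on each integer intensity level."""
    counts = [0] * levels
    for v in values:
        counts[v] += 1
    return counts

def optimal_threshold(inner, outer, levels=256):
    """Return t such that labelling pixels with value >= t as tooth
    misclassifies the fewest pixels (assumed criterion).

    inner: intensities inside the temporary boundary (expected tooth).
    outer: intensities outside the temporary boundary (expected background).
    """
    h_in = histogram(inner, levels)
    h_out = histogram(outer, levels)
    # At t = 0 every pixel is labelled tooth, so all outer pixels are errors.
    err = sum(h_out)
    best_t, best_err = 0, err
    for t in range(1, levels + 1):
        # Raising the threshold past level t-1 turns inner pixels at that
        # level into errors and outer pixels at that level into correct labels.
        err += h_in[t - 1] - h_out[t - 1]
        if err < best_err:
            best_t, best_err = t, err
    return best_t

# Toy regions: bright tooth pixels inside, darker bone and tissue outside.
t = optimal_threshold(inner=[200, 210, 220, 190], outer=[50, 60, 100, 120])
```

In a slice-by-slice pipeline, the returned value would serve as the starting threshold for the next slice, following the gradual-change assumption described in the abstract.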

Deep Learning-based Rice Seed Segmentation for Phynotyping (표현체 연구를 위한 심화학습 기반 벼 종자 분할)

  • Jeong, Yu Seok;Lee, Hong Ro;Baek, Jeong Ho;Kim, Kyung Hwan;Chung, Young Suk;Lee, Chang Woo
    • Journal of Korea Society of Industrial Information Systems / v.25 no.5 / pp.23-29 / 2020
  • The National Institute of Agricultural Sciences of the Rural Development Administration (NAS, RDA) is conducting studies on a variety of crops, such as monitoring the cultivation environment and analyzing harvested seeds for high-throughput phenotyping. In this paper, we propose a deep learning-based rice seed segmentation method to analyze the seeds of various crops owned by the NAS. Using the Mask-RCNN deep learning model, we segment rice seeds in images taken manually under a specific environment (constant lighting, white background) in order to analyze seed characteristics. For this purpose, we tune the parameters of the Mask-RCNN model. In tests on seed object detection, the proposed method achieved an accuracy of 82% for rice stem images and 97% for rice grain images. As future work, we plan to study more reliable deep learning-based seed extraction from cluttered seed images and the selection of high-throughput phenotypes through precise analysis of data such as the length, width, and thickness of the detected seed objects.

A Hippocampus Segmentation in Brain MR Images using Level-Set Method (레벨 셋 방법을 이용한 뇌 MR 영상에서 해마영역 분할)

  • Lee, Young-Seung;Choi, Heung-Kook
    • Journal of Korea Multimedia Society / v.15 no.9 / pp.1075-1085 / 2012
  • In clinical research using medical images, image segmentation is one of the most important processes. In particular, hippocampal atrophy is helpful for the clinical diagnosis of Alzheimer's disease as a specific marker of its progress. To measure hippocampus volume accurately, segmentation of the hippocampus is essential. However, the hippocampus shows relatively low contrast, a low signal-to-noise ratio, and discontinuous boundaries in MR images, and these features make it difficult to segment. To solve this problem, we first selected a region of interest from an experimental image, subtracted the original image from its negative image, enhanced the contrast, and applied anisotropic diffusion filtering and Gaussian filtering as preprocessing. We then performed image segmentation using two level set methods. Through a variety of validation approaches, we confirmed that the proposed method improved the rate and accuracy of segmentation. Consequently, the proposed method is suitable for segmenting areas with features similar to those of the hippocampus. We believe that our method has great potential if successfully combined with other research findings.
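
The anisotropic diffusion step mentioned above smooths noise while leaving strong edges intact. The abstract does not name the variant used, so the sketch below assumes the classic Perona-Malik scheme with an exponential conductance, and it works on a 1-D signal to stay short. The function name and the parameter values (`kappa`, `lam`, `n_iter`) are illustrative choices, not values from the paper.

```python
import math

def perona_malik_1d(signal, n_iter=10, kappa=10.0, lam=0.2):
    """Edge-preserving smoothing of a 1-D signal (Perona-Malik, assumed variant).

    kappa: gradient scale; differences much larger than kappa are treated
           as edges and barely diffused.
    lam:   step size; values up to 0.5 keep the 1-D update stable.
    """
    u = [float(v) for v in signal]
    for _ in range(n_iter):
        new = u[:]
        for i in range(len(u)):
            # Differences to the two neighbours (zero flux at the borders).
            east = u[i + 1] - u[i] if i + 1 < len(u) else 0.0
            west = u[i - 1] - u[i] if i > 0 else 0.0
            # Conductance drops towards zero across strong edges.
            g_east = math.exp(-((east / kappa) ** 2))
            g_west = math.exp(-((west / kappa) ** 2))
            new[i] = u[i] + lam * (g_east * east + g_west * west)
        u = new
    return u

# Noisy step: small ripples on both sides of one strong edge.
noisy = [0, 2, 0, 2, 0, 100, 102, 100, 102, 100]
smoothed = perona_malik_1d(noisy)
```

A 2-D version applies the same update with four (or eight) neighbours per pixel, with a correspondingly smaller step size.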

Modified Pyramid Scene Parsing Network with Deep Learning based Multi Scale Attention (딥러닝 기반의 Multi Scale Attention을 적용한 개선된 Pyramid Scene Parsing Network)

  • Kim, Jun-Hyeok;Lee, Sang-Hun;Han, Hyun-Ho
    • Journal of the Korea Convergence Society / v.12 no.11 / pp.45-51 / 2021
  • With the development of deep learning, semantic segmentation methods are being studied in various fields. However, segmentation accuracy drops in fields that require high accuracy, such as medical image analysis. In this paper, we improved PSPNet, a deep learning-based semantic segmentation model, to minimize the loss of features during semantic segmentation. Conventional deep learning-based segmentation methods lose resolution and object features during feature extraction and compression. Because of these losses, the edge and internal information of objects is lost, and object segmentation accuracy is lowered. To prevent this feature loss, the proposed multi-scale attention was added to the conventional PSPNet. Features were refined by applying the attention method to the conventional PPM module. By suppressing unnecessary feature information, edge and texture information was improved. The proposed method was trained on the Cityscapes dataset, and the segmentation metric MIoU was used for quantitative evaluation. In the experiment, segmentation accuracy improved by about 1.5% compared with the conventional PSPNet.

The Market Segmentation by the Mixture Model and Characteristics of the Segmented Home-Shoppers Market (Mixture model을 이용한 홈쇼핑 이용자의 시장세분화와 세분시장의 특성: 인구통계학적변수와 구매행동변수의 통합적 사용)

  • Seo, Jeong-Ah;Lee, Jin-Hwa;Kwak, Young-Sik
    • Fashion & Textile Research Journal / v.10 no.5 / pp.589-600 / 2008
  • The purpose of the study was to segment home-shoppers using the Mixture model and to examine the characteristics of the segmented markets. A total of 700 questionnaires were distributed to home-shoppers aged over 19 in Seoul and Busan, and 638 of them were analyzed with the Mixture model using the LatentGold program. The results were as follows. In segmented market 1, women in their forties and housewives with a lower level of education purchased mostly from 10 A.M. to 5 P.M.; the study named them the average home shopping purchaser group. In segmented market 2, men in their twenties and students with a higher level of education purchased often, spending small amounts, at 6, 7, and 12 P.M.; the study named them the high-satisfaction frequent group purchasing a few goods. In segmented market 3, professional men in their forties with a higher level of education purchased rarely, spending large amounts, from 8 P.M. to 11 P.M.; the study named them the low-satisfaction rare group purchasing not a few goods. Marketing strategies and discussion were suggested in detail.

Pixel-based crack image segmentation in steel structures using atrous separable convolution neural network

  • Ta, Quoc-Bao;Pham, Quang-Quang;Kim, Yoon-Chul;Kam, Hyeon-Dong;Kim, Jeong-Tae
    • Structural Monitoring and Maintenance / v.9 no.3 / pp.289-303 / 2022
  • In this study, the impact of assigned pixel labels on the accuracy of crack image identification of steel structures is examined by using an atrous separable convolution neural network (ASCNN). Firstly, images containing fatigue cracks collected from steel structures are classified into four datasets by assigning different pixel labels based on image features. Secondly, the DeepLab v3+ algorithm is used to determine optimal parameters of the ASCNN model by maximizing the average mean-intersection-over-union (mIoU) metric of the datasets. Thirdly, the ASCNN model is trained for various image sizes and hyper-parameters, such as the learning rule, learning rate, and epoch. The optimal parameters of the ASCNN model are determined based on the average mIoU metric. Finally, the trained ASCNN model is evaluated by using 10% untrained images. The result shows that the ASCNN model can segment cracks and other objects in the captured images with an average mIoU of 0.716.
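
For readers unfamiliar with the mIoU metric reported above, the following minimal sketch shows how it is commonly computed from per-pixel class labels. It assumes flattened label lists and averages only over classes that appear in the prediction or the ground truth; the abstract does not state which averaging convention the authors used, and the function name `mean_iou` is hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target.

    pred, target: equal-length flat lists of integer class labels per pixel.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both agree on class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either one says class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # skip classes absent from both label maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Toy example with two classes (0 = background, 1 = crack).
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)
```

In the toy example the background class has IoU 1/2 and the crack class has IoU 2/3, so the mean is 7/12.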

Isolated Word Recognition Using Segment Probability Model (분할확률 모델을 이용한 한국어 고립단어 인식)

  • 김진영;성경모
    • Journal of the Korean Institute of Telematics and Electronics / v.25 no.12 / pp.1541-1547 / 1988
  • In this paper, a new model for isolated word recognition, called the segment probability model, is proposed. The proposed model consists of two procedures: segmentation and modelling of each segment. The spoken word is divided into arbitrary segments, and the observation probability in each segment is obtained using vector quantization. The proposed model is compared with the pattern matching method and the hidden Markov model in recognition experiments. The experimental results show that the proposed model is better than existing methods in terms of recognition rate and amount of computation.
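
The two procedures described above (segmentation, then a per-segment observation probability over vector-quantized frames) can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: the equal-length segmentation, the additive smoothing, the fixed codebook, and all function names are hypothetical choices.

```python
import math

def quantize(vec, codebook):
    """Index of the nearest codeword (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, codebook[i])))

def split_segments(frames, n_segments):
    """Split a frame sequence into contiguous, near-equal segments (assumed rule)."""
    n = len(frames)
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    return [frames[bounds[i]:bounds[i + 1]] for i in range(n_segments)]

def train_word_model(utterances, codebook, n_segments, smooth=1.0):
    """Estimate codeword probabilities per segment, with additive smoothing."""
    counts = [[smooth] * len(codebook) for _ in range(n_segments)]
    for frames in utterances:
        for s, seg in enumerate(split_segments(frames, n_segments)):
            for vec in seg:
                counts[s][quantize(vec, codebook)] += 1
    return [[c / sum(row) for c in row] for row in counts]

def log_likelihood(frames, model, codebook):
    """Sum of log observation probabilities of the frames under a word model."""
    total = 0.0
    for s, seg in enumerate(split_segments(frames, len(model))):
        for vec in seg:
            total += math.log(model[s][quantize(vec, codebook)])
    return total

# Toy 1-D features: word A goes low then high, word B goes high then low.
codebook = [(0.0,), (1.0,)]
model_a = train_word_model([[(0.1,), (0.0,), (0.9,), (1.0,)]], codebook, 2)
model_b = train_word_model([[(1.0,), (0.9,), (0.0,), (0.1,)]], codebook, 2)
test_word = [(0.0,), (0.2,), (0.8,), (1.1,)]
best = max([("A", model_a), ("B", model_b)],
           key=lambda m: log_likelihood(test_word, m[1], codebook))[0]
```

Recognition picks the word model with the highest log-likelihood. Because each segment keeps its own codeword distribution, the model captures coarse temporal order without the state-transition machinery of a hidden Markov model.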