• Title/Summary/Keyword: Segmentation model

Automatic Building Extraction Using SpaceNet Building Dataset and Context-based ResU-Net (SpaceNet 건물 데이터셋과 Context-based ResU-Net을 이용한 건물 자동 추출)

  • Yoo, Suhong;Kim, Cheol Hwan;Kwon, Youngmok;Choi, Wonjun;Sohn, Hong-Gyoo
    • Korean Journal of Remote Sensing / v.38 no.5_2 / pp.685-694 / 2022
  • Building information is essential for various urban spatial analyses. Continuous building monitoring is therefore required, but it poses many practical difficulties. To address this, research is being conducted on extracting buildings from satellite images, which can be acquired continuously over wide areas. Recently, deep learning-based semantic segmentation techniques have been used for this purpose. In this study, part of the structure of the context-based ResU-Net was modified, and the network was trained to automatically extract buildings from 30 cm WorldView-3 RGB images using SpaceNet's free, open Building v2 dataset. In the classification accuracy evaluation, the F1-score was higher than that of the winners of the 2nd SpaceNet competition. Therefore, if WorldView-3 satellite imagery can be provided continuously, the building extraction results of this study could be used to automatically generate building models around the world.

Real-time Tooth Region Detection in Intraoral Scanner Images with Deep Learning (딥러닝을 이용한 구강 스캐너 이미지 내 치아 영역 실시간 검출)

  • Na-Yun Park;Ji-Hoon Kim;Tae-Min Kim;Kyeong-Jin Song;Yu-Jin Byun;Min-Ju Kang;Kyungkoo Jun;Jae-Gon Kim
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.3 / pp.1-6 / 2023
  • In the realm of dental prosthesis fabrication, obtaining accurate impressions has historically been a challenging and inefficient process, often hindered by hygiene concerns and patient discomfort. Addressing these limitations, Company D recently introduced a cutting-edge solution by harnessing the potential of intraoral scan images to create 3D dental models. However, the complexity of these scan images, encompassing not only teeth and gums but also the palate, tongue, and other structures, posed a new set of challenges. In response, we propose a sophisticated real-time image segmentation algorithm that selectively extracts pertinent data, specifically focusing on teeth and gums, from oral scan images obtained through Company D's oral scanner for 3D model generation. A key challenge we tackled was the detection of the intricate molar regions, common in dental imaging, which we effectively addressed through intelligent data augmentation for enhanced training. By placing significant emphasis on both accuracy and speed, critical factors for real-time intraoral scanning, our proposed algorithm demonstrated exceptional performance, boasting an impressive accuracy rate of 0.91 and an unrivaled FPS of 92.4. Compared to existing algorithms, our solution exhibited superior outcomes when integrated into Company D's oral scanner. This algorithm is scheduled for deployment and commercialization within Company D's intraoral scanner.

Convolutional neural networks for automated tooth numbering on panoramic radiographs: A scoping review

  • Ramadhan Hardani Putra;Eha Renwi Astuti;Aga Satria Nurrachman;Dina Karimah Putri;Ahmad Badruddin Ghazali;Tjio Andrinanti Pradini;Dhinda Tiara Prabaningtyas
    • Imaging Science in Dentistry / v.53 no.4 / pp.271-281 / 2023
  • Purpose: The objective of this scoping review was to investigate the applicability and performance of various convolutional neural network (CNN) models in tooth numbering on panoramic radiographs, achieved through classification, detection, and segmentation tasks. Materials and Methods: An online search was performed of the PubMed, Science Direct, and Scopus databases. Based on the selection process, 12 studies were included in this review. Results: Eleven studies utilized a CNN model for detection tasks, 5 for classification tasks, and 3 for segmentation tasks in the context of tooth numbering on panoramic radiographs. Most of these studies revealed high performance of various CNN models in automating tooth numbering. However, several studies also highlighted limitations of CNNs, such as the presence of false positives and false negatives in identifying decayed teeth, teeth with crown prosthetics, teeth adjacent to edentulous areas, dental implants, root remnants, wisdom teeth, and root canal-treated teeth. These limitations can be overcome by ensuring both the quality and quantity of datasets, as well as optimizing the CNN architecture. Conclusion: CNNs have demonstrated high performance in automated tooth numbering on panoramic radiographs. Future development of CNN-based models for this purpose should also consider different stages of dentition, such as the primary and mixed dentition stages, as well as the presence of various tooth conditions. Ultimately, an optimized CNN architecture can serve as the foundation for an automated tooth numbering system and for further artificial intelligence research on panoramic radiographs for a variety of purposes.

Development of wound segmentation deep learning algorithm (딥러닝을 이용한 창상 분할 알고리즘)

  • Hyunyoung Kang;Yeon-Woo Heo;Jae Joon Jeon;Seung-Won Jung;Jiye Kim;Sung Bin Park
    • Journal of Biomedical Engineering Research / v.45 no.2 / pp.90-94 / 2024
  • Diagnosing wounds presents a significant challenge in clinical settings because of its complexity and its dependence on clinicians' subjective assessments. Deep learning algorithms for wounds can assess them quantitatively, overcoming these challenges. However, existing research is limited by its reliance on specific datasets. To address this limitation, we created a comprehensive dataset by combining an open dataset with a self-produced dataset to enhance clinical applicability. In the annotation process, machine learning based on Gradient Vector Flow (GVF) was used to improve objectivity and efficiency over time. Furthermore, the deep learning model was a U-Net equipped with residual blocks. Significant improvements were observed when the input images were cropped to contain only the wound region of interest (ROI), as opposed to using the original-sized images. The Dice score increased from 0.80 with the original dataset to 0.89 with the wound ROI crop dataset. This study highlights the need for diverse research using comprehensive datasets. In future work, we aim to further enhance and diversify our dataset to encompass different environments and ethnicities.
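
The Dice score reported in this abstract measures the overlap between a predicted mask and a reference mask. The following is a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), for binary masks; the function name `dice_score`, the nested-list mask format, and the handling of two empty masks are assumptions made for illustration, not details taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels marked as wound in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    # Total wound pixels in the prediction plus total in the reference.
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 2x3 masks: 3 predicted pixels, 3 reference pixels, 2 shared.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
print(round(dice_score(pred, target), 4))
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no pixels.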

Image-based Soft Drink Type Classification and Dietary Assessment System Using Deep Convolutional Neural Network with Transfer Learning

  • Rubaiya Hafiz;Mohammad Reduanul Haque;Aniruddha Rakshit;Amina Khatun;Mohammad Shorif Uddin
    • International Journal of Computer Science & Network Security / v.24 no.2 / pp.158-168 / 2024
  • Hardly anyone in modern times has not drunk soft drinks in place of water. Because consumption of soft drinks is surprisingly high, researchers around the world have repeatedly cautioned that these drinks lead to weight gain, raise the risk of non-communicable diseases, and so on. Therefore, in this work, an image-based tool is developed to monitor the nutritional information of soft drinks using a deep convolutional neural network with transfer learning. First, visual saliency, mean shift segmentation, thresholding, and noise reduction techniques, collectively referred to as 'pre-processing', are adopted to locate the drink region. After the background is removed and only the desired area is segmented from the image, a Discrete Wavelet Transform (DWT) based resolution enhancement technique is applied to improve image quality. After that, a transfer learning model is employed to classify the drinks. Finally, the nutritional value of each drink is estimated using Bag-of-Features (BoF) based classification and a Euclidean distance-based ratio calculation technique. To achieve this, a dataset was built with the ten most consumed soft drinks in Bangladesh. The images were collected from the ImageNet dataset as well as the internet, and the proposed method is shown to detect and recognize different types of drinks with an accuracy of 98.51%.

Skin Color Region Segmentation using classified 3D skin (계층화된 3차원 피부색 모델을 이용한 피부색 분할)

  • Park, Gyeong-Mi;Yoon, Ga-Rim;Kim, Young-Bong
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.8 / pp.1809-1818 / 2010
  • To detect skin color areas in input images, many prior studies have divided an image into pixels that have a skin color and all other pixels. In still images or videos, it is very difficult to extract skin pixels exactly because lighting conditions and makeup produce wide variations in skin color. In this paper, we propose a method that improves performance on such difficult images by hierarchically merging a 3D skin color model with context information. We first build 3D color histogram distributions from skin color pixels in many YCbCr color images and then divide the color space into three layers: a skin color region (Skin), a non-skin color region (Non-skin), and a skin color candidate region (Skinness). When we segment the skin color region of an image, pixels in the Skin and Non-skin color regions are assigned to the skin region and the non-skin region, respectively. A pixel that belongs to the Skinness color region is assigned to the skin region or the non-skin region according to the context information of its neighbors. The proposed method helps to efficiently segment skin color regions in images that contain many distorted skin colors and colors similar to skin.
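
The neighbor-based step described in this abstract can be illustrated with a small sketch. It assumes every pixel has already been labeled Skin, Non-skin, or Skinness by a color-space lookup, and it resolves each Skinness pixel by a majority vote over its 8 neighbors. The majority rule, the tie-breaking toward non-skin, and all names below are hypothetical choices for illustration; the abstract does not specify how the context information is used.

```python
SKIN, NON_SKIN, SKINNESS = "skin", "non-skin", "skinness"


def resolve_skinness(labels):
    """Assign each candidate (Skinness) pixel to skin or non-skin.

    Assumed rule: a candidate becomes skin only if more of its definite
    8-neighbors are skin than non-skin; ties fall back to non-skin.
    """
    rows, cols = len(labels), len(labels[0])
    result = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != SKINNESS:
                continue
            skin = non_skin = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if (dr or dc) and inside:
                        # Votes come from the original labels only.
                        if labels[rr][cc] == SKIN:
                            skin += 1
                        elif labels[rr][cc] == NON_SKIN:
                            non_skin += 1
            result[r][c] = SKIN if skin > non_skin else NON_SKIN
    return result


# Hypothetical 3x3 label map with one candidate pixel in the center.
grid = [
    [SKIN, SKIN, SKIN],
    [SKIN, SKINNESS, NON_SKIN],
    [SKIN, NON_SKIN, NON_SKIN],
]
resolved = resolve_skinness(grid)
print(resolved[1][1])
```

Reading votes from the original label map rather than the partially updated one keeps the result independent of the scan order.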

A Study on the Air Pollution Monitoring Network Algorithm Using Deep Learning (심층신경망 모델을 이용한 대기오염망 자료확정 알고리즘 연구)

  • Lee, Seon-Woo;Yang, Ho-Jun;Lee, Mun-Hyung;Choi, Jung-Moo;Yun, Se-Hwan;Kwon, Jang-Woo;Park, Ji-Hoon;Jung, Dong-Hee;Shin, Hye-Jung
    • Journal of Convergence for Information Technology / v.11 no.11 / pp.57-65 / 2021
  • We propose a novel method that uses deep learning to detect abnormal data showing specific symptoms in an air pollution measurement system. Existing methods generally detect abnormal data by identifying data whose patterns differ from those of the existing time series. However, these approaches have limitations in detecting specific symptoms. In this paper, we use the DeepLab V3+ model, which is mainly used for foreground segmentation of images, with its structure changed to handle one-dimensional data. Instead of images, the model receives time-series data from multiple sensors and can detect data showing specific symptoms. In addition, we improve the model's performance by using 'piecewise aggregation approximation' to reduce the complexity of noisy time-series data. The experimental results confirm that anomalous data can be detected successfully.
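
The 'piecewise aggregation approximation' mentioned in this abstract is commonly known as Piecewise Aggregate Approximation (PAA): a series is split into consecutive segments and each segment is replaced by its mean. The sketch below shows that general technique under the assumption of roughly equal-length segments; the function name `paa` and the sample readings are hypothetical, and the paper's exact segmentation settings are not given in the abstract.

```python
def paa(series, n_segments):
    """Reduce a 1-D series to n_segments values by averaging each segment."""
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    means = []
    for i in range(n_segments):
        # Integer boundaries keep segments contiguous and non-empty,
        # even when n is not divisible by n_segments.
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        segment = series[start:end]
        means.append(sum(segment) / len(segment))
    return means


# Hypothetical sensor readings reduced from six samples to three.
readings = [1, 2, 3, 4, 5, 6]
print(paa(readings, 3))  # [1.5, 3.5, 5.5]
```

Averaging within segments smooths short-lived noise while shortening the input, which is the complexity reduction the abstract refers to.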

Comparison of Semantic Segmentation Performance of U-Net according to the Ratio of Small Objects for Nuclear Activity Monitoring (핵활동 모니터링을 위한 소형객체 비율에 따른 U-Net의 의미론적 분할 성능 비교)

  • Lee, Jinmin;Kim, Taeheon;Lee, Changhui;Lee, Hyunjin;Song, Ahram;Han, Youkyung
    • Korean Journal of Remote Sensing / v.38 no.6_4 / pp.1925-1934 / 2022
  • Monitoring nuclear activity in inaccessible areas using remote sensing technology is essential for nuclear non-proliferation. In recent years, deep learning has been actively used to detect small objects related to nuclear activity. However, high-resolution satellite imagery containing small objects can suffer from class imbalance, which degrades performance in detecting small objects. Therefore, this study aims to improve detection accuracy by analyzing how the ratio of small objects related to nuclear activity in the input data affects the performance of the deep learning model. To this end, six case datasets with different ratios of small-object pixels were generated, and a U-Net model was trained for each case. Each trained model was then evaluated quantitatively and qualitatively using a test dataset containing various types of small object classes. The results confirm that small objects related to nuclear activity can be detected efficiently when the ratio of object pixels in the input image is adjusted. This study suggests that the performance of deep learning can be improved by adjusting the object pixel ratio of the input data in the training dataset.
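
The object pixel ratio discussed in this abstract can be computed directly from a label mask. The sketch below measures the fraction of non-background pixels in a mask and keeps only patches that reach a minimum ratio, as one plausible way to build datasets with different ratios. The function names, the background label 0, and the threshold-based selection are assumptions for illustration; the abstract does not describe how the six case datasets were constructed.

```python
def object_pixel_ratio(mask, background=0):
    """Fraction of pixels in a label mask that are not background."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask must contain at least one pixel")
    objects = sum(1 for row in mask for value in row if value != background)
    return objects / total


def filter_patches(masks, min_ratio):
    """Keep only the masks whose object pixel ratio reaches min_ratio."""
    return [m for m in masks if object_pixel_ratio(m) >= min_ratio]


# Hypothetical 2x4 patches; nonzero values are object class labels.
sparse_patch = [[0, 0, 0, 0], [0, 1, 0, 0]]
dense_patch = [[0, 0, 1, 0], [0, 2, 0, 0]]
print(object_pixel_ratio(sparse_patch), object_pixel_ratio(dense_patch))
print(len(filter_patches([sparse_patch, dense_patch], min_ratio=0.2)))
```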

Makeup transfer by applying a loss function based on facial segmentation combining edge with color information (에지와 컬러 정보를 결합한 안면 분할 기반의 손실 함수를 적용한 메이크업 변환)

  • Lim, So-hyun;Chun, Jun-chul
    • Journal of Internet Computing and Services / v.23 no.4 / pp.35-43 / 2022
  • Makeup is the most common way to improve a person's appearance. However, because makeup styles are very diverse, applying makeup oneself involves considerable time and cost. Accordingly, the need for makeup automation is increasing, and makeup transfer is being studied for this purpose. Makeup transfer is the task of applying a makeup style to a face image without makeup. Makeup transfer methods can be divided into traditional image processing-based methods and deep learning-based methods. Among deep learning-based methods in particular, many studies based on Generative Adversarial Networks have been performed. However, both kinds of methods have the disadvantages that the resulting image is unnatural, the result of the makeup conversion is unclear, and the result is smeared or heavily influenced by the makeup style face image. To express clear makeup boundaries and to reduce the influence of the makeup style face image, this study segments the makeup area and calculates the loss function using HoG (Histogram of Gradient). HoG is a method of extracting image features from the magnitude and direction of the edges present in an image. Through this, we propose a makeup transfer network that learns robustly on edges. A comparison of images generated by the proposed model with images generated by BeautyGAN, which was used as the base model, confirmed that the proposed model performs better, and a method of using additional facial information is presented as future work.
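
The description of HoG as a feature built from edge magnitude and direction can be made concrete with a simplified sketch. It computes a magnitude-weighted histogram of unsigned gradient orientations over a single grayscale region, using central differences on interior pixels. This is a reduced illustration of the general idea only: the function name, the 9-bin layout, and the omission of cell and block normalization are assumptions, and it is not the loss function used in the paper.

```python
import math


def orientation_histogram(image, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    image is a grayscale region given as nested lists of numbers.
    Border pixels are skipped because central differences need both neighbors.
    """
    rows, cols = len(image), len(image[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = image[r][c + 1] - image[r][c - 1]  # horizontal change
            gy = image[r + 1][c] - image[r - 1][c]  # vertical change
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle // bin_width), n_bins - 1)
            hist[index] += magnitude
    return hist


# Hypothetical 4x4 region with a vertical edge between columns 1 and 2.
region = [[0, 0, 10, 10] for _ in range(4)]
print(orientation_histogram(region))
```

For this region every interior gradient points horizontally, so all of the weight falls into the first orientation bin.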

Sign Language Dataset Built from S. Korean Government Briefing on COVID-19 (대한민국 정부의 코로나 19 브리핑을 기반으로 구축된 수어 데이터셋 연구)

  • Sim, Hohyun;Sung, Horyeol;Lee, Seungjae;Cho, Hyeonjoong
    • KIPS Transactions on Software and Data Engineering / v.11 no.8 / pp.325-330 / 2022
  • This paper describes the collection of, and experiments on, a dataset for deep learning research on Korean sign language, including sign language recognition, translation, and segmentation. Deep learning research on sign language faces two difficulties. First, sign languages are hard to recognize because they involve multiple modalities, including hand movements, hand directions, and facial expressions. Second, there is a lack of training data for deep learning research. Currently, the KETI dataset is the only known Korean sign language dataset for deep learning. Sign language datasets for deep learning research are classified into two categories: isolated sign language and continuous sign language. Although several foreign sign language datasets have been collected over time, they are also insufficient for deep learning research on sign language. Therefore, we attempted to collect a large-scale Korean sign language dataset and evaluate it using a baseline model named TSPNet, which has state-of-the-art (SOTA) performance in the field of sign language translation. The collected dataset consists of a total of 11,402 images and texts. Our experiment with the baseline model on this dataset yields a BLEU-4 score of 3.63, which can serve as a reference performance for baseline models on Korean sign language datasets. We hope that our experience of collecting a Korean sign language dataset helps facilitate further research on Korean sign language.