• Title/Summary/Keyword: Deep CNN

Construction Method of ECVAM using Land Cover Map and KOMPSAT-3A Image (토지피복지도와 KOMPSAT-3A위성영상을 활용한 환경성평가지도의 구축)

  • Kwon, Hee Sung;Song, Ah Ram;Jung, Se Jung;Lee, Won Hee
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.40 no.5 / pp.367-380 / 2022
  • This study presents a simplified way to periodically update and produce the ECVAM (Environmental Conservation Value Assessment Map) by classifying environmental values using KOMPSAT-3A satellite imagery and a land cover map. ECVAM is a map that evaluates the environmental value of the country in five grades based on 62 legal evaluation items and 8 environmental and ecological evaluation items, and it is provided at two scales: 1:25000 and 1:5000. However, the 1:5000 scale environmental assessment map is produced and serviced with a slow renewal cycle of one year due to various constraints, such as the absence of reference materials and differing production years. Therefore, this study used a deep learning technique together with KOMPSAT-3A satellite imagery, SI (Spectral Indices), and a land cover map to examine the feasibility of constructing an environmental assessment map. The resulting accuracies were 87.25% and 85.88%, respectively. The results confirm that an environmental assessment map can be constructed using satellite imagery, spectral indices, and land cover classification.

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.1-25 / 2020
  • In this paper, we propose an application system architecture that provides an accurate, fast, and efficient automatic gasometer reading function. The system captures a gasometer image using a mobile device camera, transmits the image to a cloud server over a private LTE network, and analyzes the image to extract the device ID and gas usage amount by selective optical character recognition based on deep learning. In general, an image contains many types of characters, and optical character recognition extracts all of the character information in the image. Some applications, however, need to ignore characters that are not of interest and focus only on specific types of characters. For example, an automatic gasometer reading system only needs to extract the device ID and gas usage amount from gasometer images in order to bill users. Character strings that are not of interest, such as device type, manufacturer, manufacturing date, and specification, are not valuable to the application. Thus, the application has to analyze only the point-of-interest region and specific types of characters to extract the valuable information. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition, which analyzes only the point-of-interest region for selective character information extraction. We built three neural networks for the application system.
The first is a convolutional neural network that detects the point-of-interest regions containing the gas usage amount and device ID character strings, the second is another convolutional neural network that transforms the spatial information of the point-of-interest region into spatial sequential feature vectors, and the third is a bi-directional long short-term memory network that converts the spatial sequential information to character strings through a time-series mapping from feature vectors to character strings. In this research, the point-of-interest character strings are the device ID and the gas usage amount. The device ID consists of 12 Arabic numerals and the gas usage amount consists of 4 to 5 Arabic numerals. All system components are implemented in the Amazon Web Services cloud with an Intel Xeon E5-2686 v4 CPU and an NVIDIA Tesla V100 GPU. The system architecture adopts a master-slave processing structure for efficient and fast parallel processing, coping with about 700,000 requests per day. The mobile device captures a gasometer image and transmits it to the master process in the AWS cloud. The master process runs on the Intel Xeon CPU and pushes each reading request from the mobile device to an input queue with a FIFO (First In First Out) structure. The slave process consists of the three types of deep neural networks that perform character recognition and runs on the NVIDIA GPU module. The slave process continuously polls the input queue for recognition requests. When the input queue contains requests from the master process, the slave process converts the image into the device ID character string, the gas usage amount character string, and the position information of the strings, returns this information to the output queue, and switches back to idle mode to poll the input queue. The master process gets the final information from the output queue and delivers it to the mobile device. We used a total of 27,120 gasometer images for training, validation, and testing of the three types of deep neural networks.
Of these, 22,985 images were used for training and validation and 4,135 images were used for testing. We randomly split the 22,985 images at an 8:2 ratio into training and validation sets for each training epoch. The 4,135 test images were categorized into 5 types (normal, noise, reflex, scale, and slant). Normal data are clean images, noise means images with a noise signal, reflex means images with light reflection in the gasometer region, scale means images with a small object size due to long-distance capture, and slant means images that are not horizontally level. The final character string recognition accuracies for device ID and gas usage amount on normal data are 0.960 and 0.864, respectively.
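The master-slave queue flow described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single worker thread in place of the GPU process, and `recognize`, `worker`, and `run_master` are hypothetical names, with `recognize` a stub standing in for the three neural networks.

```python
import queue
import threading


def recognize(image):
    # Hypothetical stub for the three networks (region detector, CNN encoder,
    # bi-directional LSTM). It returns placeholder strings, not real readings.
    return {"image": image, "device_id": "0" * 12, "usage": "0000"}


def worker(in_q, out_q):
    # The worker polls the FIFO input queue and stops on the None sentinel.
    while True:
        image = in_q.get()
        if image is None:
            break
        out_q.put(recognize(image))


def run_master(images):
    # The master pushes requests to the input queue and collects the results
    # from the output queue once the worker has finished.
    in_q, out_q = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(in_q, out_q))
    thread.start()
    for image in images:
        in_q.put(image)
    in_q.put(None)  # tell the worker that no more requests will arrive
    thread.join()
    return [out_q.get() for _ in images]


results = run_master(["meter_a.jpg", "meter_b.jpg"])
```

With one worker, results come back in request order because both queues are FIFO. With several workers, each result would need a request identifier to be matched to its request.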

Development of Intelligent Severity of Atopic Dermatitis Diagnosis Model using Convolutional Neural Network (합성곱 신경망(Convolutional Neural Network)을 활용한 지능형 아토피피부염 중증도 진단 모델 개발)

  • Yoon, Jae-Woong;Chun, Jae-Heon;Bang, Chul-Hwan;Park, Young-Min;Kim, Young-Joo;Oh, Sung-Min;Jung, Joon-Ho;Lee, Suk-Jun;Lee, Ji-Hyun
    • Management & Information Systems Review / v.36 no.4 / pp.33-51 / 2017
  • With the advent of 'The Fourth Industrial Revolution' and the growing demand for quality of life due to economic growth, the need for high-quality medical services is increasing. Artificial intelligence has been introduced in the medical field, but it is rarely used for chronic skin diseases that directly affect quality of life. In addition, for atopic dermatitis, a representative chronic skin disease, it is difficult to make an objective diagnosis of the severity of lesions. The aim of this study is to establish an intelligent severity recognition model for atopic dermatitis in order to improve patients' quality of life. For this, the following steps were performed. First, image data of patients with atopic dermatitis were collected from the Catholic University of Korea Seoul Saint Mary's Hospital. Refinement and labeling were performed on the collected image data to obtain training and verification data suitable for an objective intelligent atopic dermatitis severity recognition model. Second, various CNN algorithms were trained and verified to select an image recognition algorithm suitable for the model. Experimental results showed that 'ResNet V1 101' and 'ResNet V2 50' achieved the highest performance, with over 90% accuracy for Erythema and Excoriation, while 'VGG-NET' reached a lower accuracy of 89%, which is attributed to a lack of training data. The proposed methodology demonstrates that image recognition algorithms perform well not only in object recognition but also in medical fields requiring expert knowledge. In addition, this study is expected to be highly applicable to atopic dermatitis because it uses image data from actual atopic dermatitis patients.

Automatic Sagittal Plane Detection for the Identification of the Mandibular Canal (치아 신경관 식별을 위한 자동 시상면 검출법)

  • Pak, Hyunji;Kim, Dongjoon;Shin, Yeong-Gil
    • Journal of the Korea Computer Graphics Society / v.26 no.3 / pp.31-37 / 2020
  • Identification of the mandibular canal path in Computed Tomography (CT) scans is important in dental implantology. Typically, prior to implant planning, dentists find a sagittal plane where the mandibular canal path is maximally observed in order to identify the mandibular canal manually. However, this is time-consuming and requires extensive experience. In this paper, we propose a deep-learning-based framework to detect the desired sagittal plane automatically. This is accomplished by two main techniques: 1) a modified version of the iterative transformation network (ITN) method for obtaining initial planes, and 2) a fine searching method based on a convolutional neural network (CNN) classifier for detecting the desired sagittal plane. This combination of techniques enables accurate plane detection, which the stand-alone ITN method struggles to achieve. We tested the method on a number of CT datasets to demonstrate that it achieves more satisfactory results than the ITN method. This allows dentists to identify the mandibular canal path efficiently, providing a foundation for future research into more efficient, automatic mandibular canal detection methods.

A Study on the Improvement of Source Code Static Analysis Using Machine Learning (기계학습을 이용한 소스코드 정적 분석 개선에 관한 연구)

  • Park, Yang-Hwan;Choi, Jin-Young
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.6 / pp.1131-1139 / 2020
  • Static analysis of source code is used to find remaining security weaknesses across a wide range of source code. A static analysis tool produces the findings, and a static analysis expert then reviews each finding to decide whether it is a true positive or a false positive. In this process, the volume of findings is large and the false positive rate is high, so a great deal of time and effort is required and a more efficient method of analysis is needed. In addition, experts rarely analyze only the line of source code where the defect was reported when performing true/false positive analysis. Depending on the type of defect, the surrounding source code is analyzed together before the final analysis result is delivered. To ease the difficulty experts face in discriminating true positives from false positives in static analysis tool output, this paper proposes a method in which artificial intelligence, rather than an expert, determines whether a security weakness found by a static analysis tool is a true positive. In addition, an experiment was conducted to see how the size of the training data (the source code around the defect) used for machine learning affects performance, and the optimal size was identified. This result is expected to help static analysis experts classify true positives and false positives after static analysis.
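The idea of feeding the source code around a reported defect to a classifier can be sketched as follows. This is an illustration only: `context_window` and its `radius` parameter are hypothetical names, and the abstract does not state how the authors extract or size the surrounding code.

```python
def context_window(source_lines, defect_line, radius):
    """Return the lines within `radius` lines of the 1-based `defect_line`."""
    if not 1 <= defect_line <= len(source_lines):
        raise ValueError("defect_line is outside the file")
    # Clamp the window to the file boundaries.
    start = max(0, defect_line - 1 - radius)
    end = min(len(source_lines), defect_line + radius)
    return source_lines[start:end]


lines = [f"line {i}" for i in range(1, 11)]
snippet = context_window(lines, defect_line=5, radius=2)  # lines 3 to 7
```

Varying `radius` corresponds to the experiment described above, in which the amount of surrounding code is changed to see how it affects classification performance.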

Object Detection Based on Hellinger Distance IoU and Objectron Application (Hellinger 거리 IoU와 Objectron 적용을 기반으로 하는 객체 감지)

  • Kim, Yong-Gil;Moon, Kyung-Il
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.2 / pp.63-70 / 2022
  • Although 2D object detection has improved greatly in recent years with the advance of deep learning methods and the use of large labeled image datasets, 3D object detection from 2D imagery remains a challenging problem in applications such as robotics, due to the lack of data and the diversity of appearances and shapes of objects within a category. Google recently announced Objectron, which has a novel data pipeline using mobile augmented reality session data. However, it is also a 2D-driven 3D object detection technique. This study explores a more mature 2D object detection method and applies its 2D projection to the Objectron 3D lifting system. Most object detection methods use bounding boxes to encode and represent object shape and location. In this work, we explore a stochastic representation of object regions using Gaussian distributions. We also present a similarity measure for the Gaussian distributions based on the Hellinger distance, which can be viewed as a stochastic Intersection-over-Union. Our experimental results show that the proposed Gaussian representations are closer to the annotated segmentation masks in available datasets. Thus, the low-accuracy problem, one of several limitations of Objectron, can be alleviated.
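A Hellinger-distance similarity between Gaussian box representations can be sketched as follows, using the closed form for Gaussians with diagonal covariance. The box-to-Gaussian mapping (standard deviation equal to half the box side) and the final `1 - H` score are assumptions made for illustration, and the paper's exact definitions may differ. `box_to_gaussian` and `hellinger_similarity` are hypothetical names.

```python
import math


def box_to_gaussian(cx, cy, w, h):
    # Assumption: an axis-aligned box maps to a Gaussian centred on the box,
    # with a standard deviation of half the box side along each axis.
    return (cx, cy), (w / 2.0, h / 2.0)


def hellinger_similarity(box_a, box_b):
    """Return 1 - H for two (cx, cy, w, h) boxes, where H is the Hellinger distance."""
    mu_a, sd_a = box_to_gaussian(*box_a)
    mu_b, sd_b = box_to_gaussian(*box_b)
    # With diagonal covariance, the Bhattacharyya coefficient is the product
    # of the per-axis 1D coefficients.
    bc = 1.0
    for m1, s1, m2, s2 in zip(mu_a, sd_a, mu_b, sd_b):
        var_sum = s1 * s1 + s2 * s2
        bc *= math.sqrt(2.0 * s1 * s2 / var_sum) * math.exp(-((m1 - m2) ** 2) / (4.0 * var_sum))
    hellinger = math.sqrt(max(0.0, 1.0 - bc))  # guard against rounding below zero
    return 1.0 - hellinger


same = hellinger_similarity((10, 10, 4, 4), (10, 10, 4, 4))
shifted = hellinger_similarity((10, 10, 4, 4), (12, 10, 4, 4))
```

Unlike the ordinary IoU, this score stays non-zero and changes smoothly when two boxes do not overlap, which is one reason a distribution-based measure can be attractive as a training signal.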

Comparative study of data augmentation methods for fake audio detection (음성위조 탐지에 있어서 데이터 증강 기법의 성능에 관한 비교 연구)

  • KwanYeol Park;Il-Youp Kwak
    • The Korean Journal of Applied Statistics / v.36 no.2 / pp.101-114 / 2023
  • Data augmentation is an effective way to reduce model overfitting by letting the model see the training dataset from various perspectives. In addition to image augmentation techniques such as rotation, cropping, horizontal flip, and vertical flip, occlusion-based data augmentation methods such as Cutmix and Cutout have been proposed. For models based on speech data, an occlusion-based augmentation technique can be applied after converting the 1D speech signal into a 2D spectrogram. In particular, SpecAugment is an occlusion-based augmentation technique for speech spectrograms. In this study, we compare data augmentation techniques that can be used for fake audio detection. Using data from the ASVspoof2017 and ASVspoof2019 competitions held to detect fake audio, datasets augmented with the occlusion-based methods Cutout, Cutmix, and SpecAugment were used to train an LCNN model. All three augmentation techniques, Cutout, Cutmix, and SpecAugment, generally improved the performance of the model. Cutmix showed the best performance in ASVspoof2017, Mixup in ASVspoof2019 LA, and SpecAugment in ASVspoof2019 PA. In addition, increasing the number of masks for SpecAugment helps to improve performance. In conclusion, the appropriate augmentation technique differs depending on the situation and data.
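Occlusion-based masking of a spectrogram, in the style of SpecAugment's frequency and time masks, can be sketched as follows. This is a simplified illustration on a nested-list spectrogram, without time warping. The function name, parameter names, and defaults are assumptions, not settings from the study.

```python
import random


def mask_spectrogram(spec, num_freq_masks=1, num_time_masks=1, max_width=2, rng=None):
    """Return a copy of `spec` (rows = frequency bins, columns = time frames)
    with randomly placed frequency bands and time spans set to zero."""
    if rng is None:
        rng = random.Random()
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # copy so the input is left untouched
    for _ in range(num_freq_masks):
        width = rng.randint(0, min(max_width, n_freq))
        start = rng.randint(0, n_freq - width)
        for f in range(start, start + width):
            out[f] = [0.0] * n_time  # zero a whole frequency bin
    for _ in range(num_time_masks):
        width = rng.randint(0, min(max_width, n_time))
        start = rng.randint(0, n_time - width)
        for row in out:
            for t in range(start, start + width):
                row[t] = 0.0  # zero the same time frames in every bin
    return out


spec = [[1.0] * 6 for _ in range(4)]
augmented = mask_spectrogram(spec, num_freq_masks=2, rng=random.Random(0))
```

Raising `num_freq_masks` or `num_time_masks` corresponds to the setting the abstract reports as helpful, namely using more masks.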

A study on discharge estimation for the event using a deep learning algorithm (딥러닝 알고리즘을 이용한 강우 발생시의 유량 추정에 관한 연구)

  • Song, Chul Min
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.246-246 / 2021
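  • This study aims to estimate discharge during rainfall events. To this end, departing from the model development methodology of previous studies, it estimates discharge during rainfall using a convolutional neural network (CNN), one of the deep learning algorithms, together with hydrological images. Because CNNs were generally developed to solve classification problems, they are not well suited to simulating discharge, which is an unbounded continuous variable. This study therefore modified the fully connected layers of the CNN so that it can simulate a continuous variable. Most CNNs take RGB (red, green, blue) photographs as input and predict what the photograph shows, but in this study, predicting runoff from ordinary RGB photographs could undermine the premise of an empirical model (the relationship between independent and dependent variables). The study therefore used hydrological images as input, defined for an arbitrary watershed as a set of grid cells in two-dimensional space that hold dimensionless hydrological attributes. The CNN was designed as a convolution layer and a pooling layer repeated five times, followed by a flatten layer, two dense layers, one batch normalization layer, and one further dense layer. The activation function of the last dense layer was set to the linear function commonly used in regression models, instead of the softmax or sigmoid functions used in classification models. The activation function of the other layers was the rectified linear unit (ReLU), and MSE and MAE were used to assess training and validation. NSE and RMSE were used for model evaluation. As a result, for training, MSE decreased from 11,629.8 m3/s to 118.6 m3/s and MAE decreased from 25.4 m3/s to 4.7 m3/s, and for validation, MSE decreased from 1,997.9 m3/s to 527.9 m3/s and MAE decreased from 21.5 m3/s to 9.4 m3/s. For model evaluation, NSE was 0.7 and RMSE was 27.0 m3/s, so the model was judged to be moderate. Accordingly, extending the CNN structure and improving the hydrological images or developing new images on the basis of the proposed methodology leaves room for better predictive performance, and the approach is expected to be applicable to remote sensing and to real-time discharge simulation at global or regional scales using satellite imagery.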
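The four metrics named in this abstract have standard definitions, sketched below in plain Python. The function names and sample values are illustrative only and are not data from the study. NSE here is the Nash-Sutcliffe efficiency, defined as 1 minus the ratio of the squared error to the variance of the observations around their mean.

```python
import math


def mse(obs, sim):
    # Mean squared error between observed and simulated discharge.
    return sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs)


def mae(obs, sim):
    # Mean absolute error.
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs, sim):
    # Root mean squared error, in the same unit as the discharge itself.
    return math.sqrt(mse(obs, sim))


def nse(obs, sim):
    # Nash-Sutcliffe efficiency: 1 is a perfect fit, and 0 means the model
    # is no better than always predicting the mean of the observations.
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


observed = [1.0, 2.0, 3.0, 4.0]   # illustrative values only
simulated = [2.0, 2.0, 2.0, 2.0]
scores = (mse(observed, simulated), mae(observed, simulated), nse(observed, simulated))
```

`nse` divides by the spread of the observations, so it is undefined when every observed value is identical.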

AI-Based Object Recognition Research for Augmented Reality Character Implementation (증강현실 캐릭터 구현을 위한 AI기반 객체인식 연구)

  • Seok-Hwan Lee;Jung-Keum Lee;Hyun Sim
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.6 / pp.1321-1330 / 2023
  • This study addresses the problem of 3D pose estimation for multiple people from a single image, as encountered when developing characters for use in augmented reality. In the existing top-down method, all objects in the image are first detected and then each is reconstructed independently. The problem is that inconsistent results may occur due to overlap or depth order mismatch between the reconstructed objects. The goal of this study is to solve these problems and develop a single network that provides a consistent 3D reconstruction of all humans in a scene. An important design choice was to integrate a human body model based on the SMPL parametric system into a top-down framework. On this basis, two losses were introduced: a collision loss based on a distance field and a loss that considers depth order. The first loss prevents overlap between reconstructed people, and the second adjusts the depth ordering of people so that the rendered occlusion is consistent with the annotated instance segmentation. This method allows depth information to be provided to the network without explicit 3D annotation of the image. Experimental results show that the proposed methodology performs better than existing methods on standard 3D pose benchmarks, and that the proposed losses enable more consistent reconstruction from natural images.

A Study on Leakage Detection Technique Using Transfer Learning-Based Feature Fusion (전이학습 기반 특징융합을 이용한 누출판별 기법 연구)

  • YuJin Han;Tae-Jin Park;Jonghyuk Lee;Ji-Hoon Bae
    • The Transactions of the Korea Information Processing Society / v.13 no.2 / pp.41-47 / 2024
  • When models trained in the time and frequency domains differed in performance, we observed that an ensemble of them was still compromised by the imbalance between the individual models. Therefore, this paper proposes a leakage detection technique that enhances the accuracy of pipeline leakage detection through a step-wise learning approach that extracts features from both the time and frequency domains and integrates them. The method involves a two-stage learning process. In Stage 1, independent models are trained in the time and frequency domains to extract the crucial features from the data in each domain. In Stage 2, the classifiers of the pre-trained models are removed, the features from both domains are fused, and a new classifier is added for retraining. The proposed transfer learning-based feature fusion technique thus trains the model on integrated features extracted from the time and frequency domains. This integration exploits the complementary nature of the features from the two domains, allowing the model to leverage diverse information. As a result, it achieved an accuracy of 99.88%, demonstrating strong performance in pipeline leakage detection.