• Title/Summary/Keyword: Deep-Neural-Network

Super High-Resolution Image Style Transfer (초-고해상도 영상 스타일 전이)

  • Kim, Yong-Goo
    • Journal of Broadcast Engineering / v.27 no.1 / pp.104-123 / 2022
  • Neural style transfer produces very high-quality results by reflecting the high-level structural characteristics of images, and it has therefore attracted great attention recently. This paper addresses the resolution limit that GPU memory imposes on neural style transfer. Because the receptive field has a fixed size, the gradient computed for style transfer on a partial image can be expected to match the gradient computed on the entire image. Based on this idea, the paper analyzes each component of the style transfer loss function to derive the necessary conditions for partitioning and padding, and to identify which of the information required for gradient calculation depends on the entire input. By structuring that information so it can serve as an auxiliary constant input for partition-based gradient calculation, the paper develops a recursive algorithm for super high-resolution image style transfer. Since the proposed method partitions the input image into pieces that a GPU can handle, it can perform style transfer without the input-resolution limit imposed by GPU memory size. With this super high-resolution support, the proposed method can render the distinctive style characteristics of detailed areas that can only be appreciated in super high-resolution style transfer.
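
The idea of a quantity that depends on the entire input can be illustrated with a small sketch. It assumes the standard Gram-matrix style loss, which the abstract does not name, and it is not the paper's algorithm. Under that assumption the Gram matrix is a sum over all spatial positions, so per-tile Gram matrices add up to the whole-image one and can be handed to each partition as a constant. The helper names `gram` and `add` and the feature values are made up for illustration.

```python
# Toy illustration (assumed Gram-matrix style loss, made-up feature values).

def gram(features):
    """Map a list of per-position channel vectors to a C x C Gram matrix."""
    c = len(features[0])
    g = [[0.0] * c for _ in range(c)]
    for vec in features:
        for i in range(c):
            for j in range(c):
                g[i][j] += vec[i] * vec[j]
    return g

def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Six spatial positions with two channels each.
feats = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0],
         [2.0, 2.0], [-1.0, 1.0], [0.0, 4.0]]
left, right = feats[:3], feats[3:]  # two "tiles" of the same feature map

whole = gram(feats)                     # computed on the entire input
tiled = add(gram(left), gram(right))    # accumulated tile by tile
print(whole == tiled)
```

Because the per-tile matrices sum to the global one, a partition-based method could compute the global statistic in a first pass and reuse it as a fixed input when processing each tile.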

A Study about Learning Graph Representation on Farmhouse Apple Quality Images with Graph Transformer (그래프 트랜스포머 기반 농가 사과 품질 이미지의 그래프 표현 학습 연구)

  • Ji Hun Bae;Ju Hwan Lee;Gwang Hyun Yu;Gyeong Ju Kwon;Jin Young Kim
    • Smart Media Journal / v.12 no.1 / pp.9-16 / 2023
  • Recently, convolutional neural network (CNN) based systems have been developed to overcome the limits of human resources in classifying apple quality at farmhouses. However, since CNNs accept only images of the same size, preprocessing such as sampling may be required, and oversampling causes loss of original image information, such as image quality degradation and blurring. To minimize this problem, this paper generates an image-patch-based graph of the original image and proposes a random walk-based positional encoding method for applying the graph transformer model. The method continuously learns position embeddings for patches that have no positional information, based on the random walk algorithm, and finds the optimal graph structure by aggregating useful node information through the self-attention of the graph transformer model. It is therefore robust and performs well even on new graph structures with random node order and on arbitrary graph structures that depend on the location of an object in the image. In experiments on 5 apple quality datasets, the learning accuracy was higher than that of other GNN models by 1.3% to 4.7%, and the number of parameters was 3.59M, which is about 15% of the 23.52M of the ResNet18 model. The reduced computation therefore yields fast inference, demonstrating the method's effectiveness.
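
The abstract does not give the formula for its positional encoding. As a rough sketch, one common form of random-walk positional encoding assigns each node its probability of returning to itself after 1, 2, ..., k random-walk steps. The function name `random_walk_pe` and the three-node path graph are illustrative assumptions, not the authors' implementation.

```python
# Sketch of a common random-walk positional encoding (assumed form).

def random_walk_pe(adj, steps):
    """Return, for each node, its return probabilities after 1..steps steps."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    # Row-normalised transition matrix P = D^-1 A.
    p = [[adj[i][j] / deg[i] if deg[i] else 0.0 for j in range(n)]
         for i in range(n)]
    power = [row[:] for row in p]       # holds P^t, starting at t = 1
    enc = [[] for _ in range(n)]
    for _ in range(steps):
        for i in range(n):
            enc[i].append(power[i][i])  # diagonal entry = return probability
        power = [[sum(power[i][k] * p[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
    return enc

# Path graph 0 - 1 - 2, standing in for a tiny patch graph.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
encoding = random_walk_pe(adjacency, steps=2)
print(encoding)
```

The encoding depends only on graph structure, not on node order, which fits the abstract's claim of robustness to random node ordering.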

A design of Optimized Vehicle Routing System (OVRS) based on RSU communication and deep learning (RSU 통신 및 딥러닝 기반 최적화 차량 라우팅 시스템 설계)

  • Son, Su-Rak;Lee, Byung-Kwan;Sim, Son-Kweon;Jeong, Yi-Na
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.13 no.2 / pp.129-137 / 2020
  • The autonomous vehicle market is currently researching and developing level 4 autonomous vehicles, moving beyond the commercialization of level 3 vehicles. Because a level 4 autonomous vehicle, unlike a level 3 vehicle, has to deal with emergencies directly, its most important aspect is stability. In this paper, we propose an Optimized Vehicle Routing System (OVRS) that determines the route to the vehicle's destination with the lowest probability of an accident, instead of relying on an immediate response in an emergency. The OVRS analyzes road and surrounding-vehicle information collected through roadside unit (RSU) communication to predict road hazards, and sets a route along safer and faster roads. By guiding routes according to road conditions through the RSUs on the road, much as network routing does, the OVRS can improve vehicle stability. In the results, the RPNN of the ASICM, one of the OVRS modules, performed about 17% better than the CNN and 40% better than the LSTM. However, because the study was conducted in a virtual environment on a PC, the VPDM's accident probability was not verified in practice. Future work should therefore collect accident data and run experiments with real vehicles and RSUs on actual roads to verify the VPDM with high accuracy.

Automated Vehicle Research by Recognizing Maneuvering Modes using LSTM Model (LSTM 모델 기반 주행 모드 인식을 통한 자율 주행에 관한 연구)

  • Kim, Eunhui;Oh, Alice
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.16 no.4 / pp.153-163 / 2017
  • This research builds on previous work showing that personally preferred safe distance, turning angle, and speed differ among drivers. We therefore use machine learning models, trained per person or per group of similar driving patterns, to recognize maneuvering modes, and we evaluate automatic driving according to those modes. Drawing on driving knowledge, we defined 8 longitudinal modes and 4 lateral modes, and by combining them we built 21 maneuvering modes. We train RNN, LSTM, and Bi-LSTM models, which are supervised deep learning models, on data labeled per time stamp from drivers' trips, and evaluate the maneuvering modes of automatic driving on the test dataset. The evaluation dataset is drawn from the everyday trips of 3,000 people collected by VTTI in the USA over 3 years; we use 1,500 trips of 22 people, split into training, validation, and test sets at 80%, 10%, and 10%, respectively. For recognizing the 8 longitudinal maneuvering modes, RNN achieves better accuracy than LSTM and Bi-LSTM. However, for recognizing the 21 combined longitudinal and lateral maneuvering modes, Bi-LSTM improves accuracy over RNN and LSTM by 1.54% and 0.47%, respectively.

Semi-supervised domain adaptation using unlabeled data for end-to-end speech recognition (라벨이 없는 데이터를 사용한 종단간 음성인식기의 준교사 방식 도메인 적응)

  • Jeong, Hyeonjae;Goo, Jahyun;Kim, Hoirin
    • Phonetics and Speech Sciences / v.12 no.2 / pp.29-37 / 2020
  • Recently, neural network-based deep learning algorithms have dramatically improved on the performance of classical automatic speech recognition (ASR) systems based on the Gaussian mixture model-hidden Markov model (GMM-HMM). In addition, research on end-to-end (E2E) speech recognition systems that integrate language modeling and decoding has been actively conducted to better exploit the advantages of deep learning. In general, E2E ASR systems consist of multiple layers in an encoder-decoder structure with attention. E2E ASR systems therefore require a large amount of speech-text paired data to achieve good performance. Obtaining such paired data takes a lot of human labor and time, and is a high barrier to building an E2E ASR system. Previous studies have improved E2E ASR performance with a relatively small amount of speech-text paired data, but most of them used either speech-only data or text-only data alone. In this study, we propose a semi-supervised training method that uses both speech-only and text-only data so that an E2E ASR system performs well on corpora from different domains. The proposed method adapts effectively to different domains, performing well in the target domain without degrading much in the source domain.

Motion Monitoring using Mask R-CNN for Articulation Disease Management (관절질환 관리를 위한 Mask R-CNN을 이용한 모션 모니터링)

  • Park, Sung-Soo;Baek, Ji-Won;Jo, Sun-Moon;Chung, Kyungyong
    • Journal of the Korea Convergence Society / v.10 no.3 / pp.1-6 / 2019
  • In modern society, lifestyle and individuality are valued, and personalized lifestyles and patterns are emerging. The number of people with articulation diseases is increasing due to poor living habits. In addition, as the number of households increases, there are cases in which emergency care is not received in time. For health and disease management, we need information that lets individuals manage themselves through accurate analysis of their own condition, along with care appropriate to emergency situations. In deep learning, CNNs are used effectively for classifying and predicting data. A CNN's accuracy and processing time vary with the features of the data. It is therefore necessary to improve processing speed and accuracy for real-time healthcare. In this paper, we propose motion monitoring using Mask R-CNN for articulation disease management. The proposed method uses Mask R-CNN, which is superior to CNN in accuracy and processing time. After the user's motion is learned by the neural network, if the user's motion differs from the learned data, a control method can be fed back to the user, the guardian can be informed of an emergency, and appropriate measures can be taken according to the situation.

Estimation of KOSPI200 Index option volatility using Artificial Intelligence (이기종 머신러닝기법을 활용한 KOSPI200 옵션변동성 예측)

  • Shin, Sohee;Oh, Hayoung;Kim, Jang Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.10 / pp.1423-1431 / 2022
  • Volatility is one of the variables that the Black-Scholes model requires for option pricing. It is unknown at the present time; however, since the option price can be observed in the market, implied volatility can be derived from the option price at any given point in time and can represent the market's expectation of future volatility. Although volatility is constant in the Black-Scholes model, calculating implied volatility commonly reveals a volatility smile, in which implied volatility differs across strike prices. We apply supervised learning with implied volatility as the target, adding V-KOSPI to mitigate the volatility smile. We examine how well various machine learning algorithms, such as Linear Regression, Tree, Support Vector Machine, KNN, and Deep Neural Network, estimate the implied volatility of KOSPI200 index options. Training accuracy was highest (99.9%) for the Decision Tree model, and test accuracy was highest (96.9%) for the Random Forest model.
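
Deriving implied volatility from an observed price can be sketched as a root-finding problem. The example below is a minimal illustration, not the paper's procedure: it assumes a European call with no dividends, and it uses bisection, which works because the Black-Scholes call price increases with volatility. The function names and all numeric inputs are made up.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(spot, strike, rate, maturity, vol):
    """Black-Scholes price of a European call without dividends."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)

def implied_vol(price, spot, strike, rate, maturity,
                lo=1e-4, hi=5.0, tol=1e-10):
    """Find the volatility whose model price matches the observed price."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, rate, maturity, mid) > price:
            hi = mid   # model price too high, so volatility is too high
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Round trip with made-up inputs: price an option at 20% volatility,
# then recover that volatility from the price alone.
true_vol = 0.2
price = bs_call(100.0, 105.0, 0.02, 0.5, true_vol)
recovered = implied_vol(price, 100.0, 105.0, 0.02, 0.5)
print(round(recovered, 6))
```

Repeating this inversion across strike prices for market quotes is what produces the volatility smile the abstract describes.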

Construction Method of ECVAM using Land Cover Map and KOMPSAT-3A Image (토지피복지도와 KOMPSAT-3A위성영상을 활용한 환경성평가지도의 구축)

  • Kwon, Hee Sung;Song, Ah Ram;Jung, Se Jung;Lee, Won Hee
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.40 no.5 / pp.367-380 / 2022
  • This study presents a periodic, simplified way to update and produce the ECVAM (Environmental Conservation Value Assessment Map) by classifying environmental values using KOMPSAT-3A satellite imagery and a land cover map. ECVAM is a map that evaluates the environmental value of the country in five levels based on 62 legal evaluation items and 8 environmental and ecological evaluation items, and it is provided at two scales: 1:25000 and 1:5000. However, the 1:5000 scale environmental assessment map is produced and serviced with a slow renewal cycle of one year, owing to constraints such as the absence of reference materials and differing production years. Therefore, this study used a deep learning technique together with KOMPSAT-3A satellite imagery, SI (Spectral Indices), and a land cover map to examine the feasibility of constructing an environmental assessment map. As a result, the accuracy was calculated to be 87.25% and 85.88%, respectively. The results confirm the possibility of constructing an environmental assessment map using satellite imagery, spectral indices, and land cover classification.

Assessment of the Object Detection Ability of Interproximal Caries on Primary Teeth in Periapical Radiographs Using Deep Learning Algorithms (유치의 치근단 방사선 사진에서 딥 러닝 알고리즘을 이용한 모델의 인접면 우식증 객체 탐지 능력의 평가)

  • Hongju Jeon;Seonmi Kim;Namki Choi
    • Journal of the Korean Academy of Pediatric Dentistry / v.50 no.3 / pp.263-276 / 2023
  • The purpose of this study was to evaluate the performance of a You Only Look Once (YOLO) model for object detection of proximal caries in periapical radiographs of children. A total of 2016 periapical radiographs of primary dentition were selected from the M6 database as the learning material group, of which 1143 were labeled as proximal caries by an experienced dentist using an annotation tool. After the annotations were converted into a training dataset, YOLO was trained on it using a single convolutional neural network (CNN) model. Accuracy, recall, specificity, precision, negative predictive value (NPV), F1-score, the Precision-Recall curve, and average precision (AP, the area under the curve) were calculated to evaluate the object detection model's performance on the 187 test samples. The results showed that the CNN-based object detection model performed well in detecting proximal caries, with a diagnostic accuracy of 0.95, a recall of 0.94, a specificity of 0.97, a precision of 0.82, an NPV of 0.96, and an F1-score of 0.81. The AP was 0.83. This model could be a valuable tool for dentists in detecting carious lesions in periapical radiographs.
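
The evaluation metrics listed above all follow from the four confusion-matrix counts. The sketch below uses the standard definitions with made-up counts; they are not the study's data, and `detection_metrics` is a hypothetical helper name.

```python
# Standard confusion-matrix metrics, computed from illustrative counts.

def detection_metrics(tp, fp, fn, tn):
    """Return the common binary-classification metrics as a dict."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "recall": tp / (tp + fn),            # also called sensitivity
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "npv": tn / (tn + fn),               # negative predictive value
        "f1": 2 * tp / (2 * tp + fp + fn),   # harmonic mean of precision and recall
    }

# Made-up counts: 8 true positives, 2 false positives,
# 2 false negatives, 88 true negatives.
metrics = detection_metrics(tp=8, fp=2, fn=2, tn=88)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Writing F1 as 2TP / (2TP + FP + FN) is algebraically the same as the harmonic mean of precision and recall, and it avoids dividing by zero when only one of the two is undefined.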

Semantic Segmentation of Hazardous Facilities in Rural Area Using U-Net from KOMPSAT Ortho Mosaic Imagery (KOMPSAT 정사모자이크 영상으로부터 U-Net 모델을 활용한 농촌위해시설 분류)

  • Sung-Hyun Gong;Hyung-Sup Jung;Moung-Jin Lee;Kwang-Jae Lee;Kwan-Young Oh;Jae-Young Chang
    • Korean Journal of Remote Sensing / v.39 no.6_3 / pp.1693-1705 / 2023
  • Rural areas, which account for about 90% of the country's land area, are growing in importance and value as spaces that perform various public functions. However, facilities that adversely affect residents' lives, such as livestock facilities, factories, and solar panels, are being built indiscriminately near residential areas, damaging the rural environment and landscape and lowering residents' quality of life. To prevent disorderly development in rural areas and manage rural space in a planned manner, hazardous facilities in rural areas need to be detected and monitored. The necessary data can come from satellite imagery, which is acquired periodically and provides information on the entire region. Image-based deep learning techniques using convolutional neural networks make effective detection possible. Therefore, the U-Net model, which shows high performance in semantic segmentation, was used to classify potentially hazardous facilities in rural areas. This study used KOMPSAT ortho-mosaic optical imagery with a spatial resolution of 0.7 meters, provided by the Korea Aerospace Research Institute in 2020, and AI training data for livestock facilities, factories, and solar panels were produced by hand for training and inference. After training with U-Net, a pixel accuracy of 0.9739 and a mean Intersection over Union (mIoU) of 0.7025 were achieved. The results of this study can be used for monitoring hazardous facilities in rural areas and are expected to serve as a basis for rural planning.
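
Pixel accuracy and mIoU can both be read off a per-class confusion matrix, which also shows why the two numbers can differ widely when one class dominates the image. The sketch below uses the standard definitions with a made-up two-class matrix; it is not the study's data, and the function names are illustrative.

```python
# Pixel accuracy and mean IoU from a confusion matrix whose rows are
# true classes and whose columns are predicted classes.

def pixel_accuracy(conf):
    """Fraction of all pixels whose predicted class equals the true class."""
    correct = sum(conf[i][i] for i in range(len(conf)))
    total = sum(sum(row) for row in conf)
    return correct / total

def mean_iou(conf):
    """Average over classes of intersection / union."""
    n = len(conf)
    ious = []
    for c in range(n):
        intersection = conf[c][c]
        true_pixels = sum(conf[c])                       # row sum
        pred_pixels = sum(conf[r][c] for r in range(n))  # column sum
        union = true_pixels + pred_pixels - intersection
        ious.append(intersection / union)
    return sum(ious) / n

# Made-up counts for two classes (e.g. background and facility).
confusion = [[50, 10],
             [5, 35]]
acc = pixel_accuracy(confusion)
miou = mean_iou(confusion)
print(f"pixel accuracy = {acc:.4f}, mIoU = {miou:.4f}")
```

Because IoU penalizes both missed and spurious pixels per class and then averages classes equally, mIoU is usually lower than pixel accuracy when small classes are segmented less well than the background.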