• Title/Summary/Keyword: deep learning models

Search Result 1,393, Processing Time 0.024 seconds

An Artificial Intelligence Method for the Prediction of Near- and Off-Shore Fish Catch Using Satellite and Numerical Model Data

  • Yoon, You-Jeong;Cho, Subin;Kim, Seoyeon;Kim, Nari;Lee, Soo-Jin;Ahn, Jihye;Lee, Eunjeong;Joh, Seongeok;Lee, Yang-Won
    • Korean Journal of Remote Sensing
    • /
    • v.36 no.1
    • /
    • pp.41-53
    • /
    • 2020
  • The production of near- and off-shore fisheries in South Korea is decreasing due to rapid changes in the fishing environment, particularly including higher sea temperature in recent years. To improve the competitiveness of the fisheries, it is necessary to provide fish catch information that changes spatiotemporally according to the sea state. In this study, artificial intelligence models that predict the CPUE (catch per unit effort) of mackerel, anchovies, and squid (Todarodes pacificus), which are three major fish species in the near- and off-shore areas of South Korea, on a 15-km grid and daily basis were developed. The models were trained and validated using the sea surface temperature, rainfall, relative humidity, pressure,sea surface wind velocity, significant wave height, and salinity as input data, and the fish catch statistics of Suhyup (National Federation of Fisheries Cooperatives) as observed data. The 10-fold blind test results showed that the developed artificial intelligence models exhibited accuracy with a corresponding correlation coefficient of 0.86. It is expected that the fish catch models can be actually operated with high accuracy under various sea conditions if high-quality large-volume data are available.

Comparison Analysis of Four Face Swapping Models for Interactive Media Platform COX (인터랙티브 미디어 플랫폼 콕스에 제공될 4가지 얼굴 변형 기술의 비교분석)

  • Jeon, Ho-Beom;Ko, Hyun-kwan;Lee, Seon-Gyeong;Song, Bok-Deuk;Kim, Chae-Kyu;Kwon, Ki-Ryong
    • Journal of Korea Multimedia Society
    • /
    • v.22 no.5
    • /
    • pp.535-546
    • /
    • 2019
  • Recently, there have been a lot of researches on the whole face replacement system, but it is not easy to obtain stable results due to various attitudes, angles and facial diversity. To produce a natural synthesis result when replacing the face shown in the video image, technologies such as face area detection, feature extraction, face alignment, face area segmentation, 3D attitude adjustment and facial transposition should all operate at a precise level. And each technology must be able to be interdependently combined. The results of our analysis show that the difficulty of implementing the technology and contribution to the system in facial replacement technology has increased in facial feature point extraction and facial alignment technology. On the other hand, the difficulty of the facial transposition technique and the three-dimensional posture adjustment technique were low, but showed the need for development. In this paper, we propose four facial replacement models such as 2-D Faceswap, OpenPose, Deekfake, and Cycle GAN, which are suitable for the Cox platform. These models have the following features; i.e. these models include a suitable model for front face pose image conversion, face pose image with active body movement, and face movement with right and left side by 15 degrees, Generative Adversarial Network.

Recent Automatic Post Editing Research (최신 기계번역 사후 교정 연구)

  • Moon, Hyeonseok;Park, Chanjun;Eo, Sugyeong;Seo, Jaehyung;Lim, Heuiseok
    • Journal of Digital Convergence
    • /
    • v.19 no.7
    • /
    • pp.199-208
    • /
    • 2021
  • Automatic Post Editing(APE) is the study that automatically correcting errors included in the machine translated sentences. The goal of APE task is to generate error correcting models that improve translation quality, regardless of the translation system. For training these models, source sentence, machine translation, and post edit, which is manually edited by human translator, are utilized. Especially in the recent APE research, multilingual pretrained language models are being adopted, prior to the training by APE data. This study deals with multilingual pretrained language models adopted to the latest APE researches, and the specific application method for each APE study. Furthermore, based on the current research trend, we propose future research directions utilizing translation model or mBART model.

Compression of DNN Integer Weight using Video Encoder (비디오 인코더를 통한 딥러닝 모델의 정수 가중치 압축)

  • Kim, Seunghwan;Ryu, Eun-Seok
    • Journal of Broadcast Engineering
    • /
    • v.26 no.6
    • /
    • pp.778-789
    • /
    • 2021
  • Recently, various lightweight methods for using Convolutional Neural Network(CNN) models in mobile devices have emerged. Weight quantization, which lowers bit precision of weights, is a lightweight method that enables a model to be used through integer calculation in a mobile environment where GPU acceleration is unable. Weight quantization has already been used in various models as a lightweight method to reduce computational complexity and model size with a small loss of accuracy. Considering the size of memory and computing speed as well as the storage size of the device and the limited network environment, this paper proposes a method of compressing integer weights after quantization using a video codec as a method. To verify the performance of the proposed method, experiments were conducted on VGG16, Resnet50, and Resnet18 models trained with ImageNet and Places365 datasets. As a result, loss of accuracy less than 2% and high compression efficiency were achieved in various models. In addition, as a result of comparison with similar compression methods, it was verified that the compression efficiency was more than doubled.

Recurrent Neural Network based Prediction System of Agricultural Photovoltaic Power Generation (영농형 태양광 발전소에서 순환신경망 기반 발전량 예측 시스템)

  • Jung, Seol-Ryung;Koh, Jin-Gwang;Lee, Sung-Keun
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.17 no.5
    • /
    • pp.825-832
    • /
    • 2022
  • In this paper, we discuss the design and implementation of predictive and diagnostic models for realizing intelligent predictive models by collecting and storing the power output of agricultural photovoltaic power generation systems. Our model predicts the amount of photovoltaic power generation using RNN, LSTM, and GRU models, which are recurrent neural network techniques specialized for time series data, and compares and analyzes each model with different hyperparameters, and evaluates the performance. As a result, the MSE and RMSE indicators of all three models were very close to 0, and the R2 indicator showed performance close to 1. Through this, it can be seen that the proposed prediction model is a suitable model for predicting the amount of photovoltaic power generation, and using this prediction, it was shown that it can be utilized as an intelligent and efficient O&M function in an agricultural photovoltaic system.

Application of CCTV Image and Semantic Segmentation Model for Water Level Estimation of Irrigation Channel (관개용수로 CCTV 이미지를 이용한 CNN 딥러닝 이미지 모델 적용)

  • Kim, Kwi-Hoon;Kim, Ma-Ga;Yoon, Pu-Reun;Bang, Je-Hong;Myoung, Woo-Ho;Choi, Jin-Yong;Choi, Gyu-Hoon
    • Journal of The Korean Society of Agricultural Engineers
    • /
    • v.64 no.3
    • /
    • pp.63-73
    • /
    • 2022
  • A more accurate understanding of the irrigation water supply is necessary for efficient agricultural water management. Although we measure water levels in an irrigation canal using ultrasonic water level gauges, some errors occur due to malfunctions or the surrounding environment. This study aims to apply CNN (Convolutional Neural Network) Deep-learning-based image classification and segmentation models to the irrigation canal's CCTV (Closed-Circuit Television) images. The CCTV images were acquired from the irrigation canal of the agricultural reservoir in Cheorwon-gun, Gangwon-do. We used the ResNet-50 model for the image classification model and the U-Net model for the image segmentation model. Using the Natural Breaks algorithm, we divided water level data into 2, 4, and 8 groups for image classification models. The classification models of 2, 4, and 8 groups showed the accuracy of 1.000, 0.987, and 0.634, respectively. The image segmentation model showed a Dice score of 0.998 and predicted water levels showed R2 of 0.97 and MAE (Mean Absolute Error) of 0.02 m. The image classification models can be applied to the automatic gate-controller at four divisions of water levels. Also, the image segmentation model results can be applied to the alternative measurement for ultrasonic water gauges. We expect that the results of this study can provide a more scientific and efficient approach for agricultural water management.

Deep Learning Approaches for Accurate Weed Area Assessment in Maize Fields (딥러닝 기반 옥수수 포장의 잡초 면적 평가)

  • Hyeok-jin Bak;Dongwon Kwon;Wan-Gyu Sang;Ho-young Ban;Sungyul Chang;Jae-Kyeong Baek;Yun-Ho Lee;Woo-jin Im;Myung-chul Seo;Jung-Il Cho
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.25 no.1
    • /
    • pp.17-27
    • /
    • 2023
  • Weeds are one of the factors that reduce crop yield through nutrient and photosynthetic competition. Quantification of weed density are an important part of making accurate decisions for precision weeding. In this study, we tried to quantify the density of weeds in images of maize fields taken by unmanned aerial vehicle (UAV). UAV image data collection took place in maize fields from May 17 to June 4, 2021, when maize was in its early growth stage. UAV images were labeled with pixels from maize and those without and the cropped to be used as the input data of the semantic segmentation network for the maize detection model. We trained a model to separate maize from background using the deep learning segmentation networks DeepLabV3+, U-Net, Linknet, and FPN. All four models showed pixel accuracy of 0.97, and the mIOU score was 0.76 and 0.74 in DeepLabV3+ and U-Net, higher than 0.69 for Linknet and FPN. Weed density was calculated as the difference between the green area classified as ExGR (Excess green-Excess red) and the maize area predicted by the model. Each image evaluated for weed density was recombined to quantify and visualize the distribution and density of weeds in a wide range of maize fields. We propose a method to quantify weed density for accurate weeding by effectively separating weeds, maize, and background from UAV images of maize fields.

A comparative study on the performance of Transformer-based models for Korean speech recognition (트랜스포머 기반 모델의 한국어 음성인식 성능 비교 연구)

  • Changhan Oh;Minseo Kim;Kiyoung Park;Hwajeon Song
    • Phonetics and Speech Sciences
    • /
    • v.16 no.3
    • /
    • pp.79-86
    • /
    • 2024
  • Transformer models have shown remarkable performance in extracting meaningful information from sequential input data such as text and images, and are gaining attention as end-to-end models for speech recognition. This study compared the performances of the Transformer speech recognition model and its enhanced versions, the Conformer and E-Branchformer, when applied to Korean speech recognition. Using Korean speech data from AIHub, we prepared a training set of approximately 7,500 hours and evaluated the models using the ESPnet toolkit. Additionally, we compared syllables and subwords as recognition units and analyzed the performance differences with changes in the number of tokens using Byte Pair Encoding. The results showed that the E-Branchformer achieved the best performance in Korean speech recognition and Conformer outperformed Transformer but degraded in performance for long utterances owing to cross-attention alignment errors. We aimed to determine the optimal settings by analyzing the performance changes with subword token adjustments. This study comprehensively evaluated model accuracy and processing speed to maximize the efficiency of Korean speech recognition. This is expected to contribute to the training of large-scale Korean speech recognition models and improve Conformer recognition errors. Future research should include additional experiments with diverse Korean speech datasets and enhance the recognition performance through structural improvements in the Conformer.

A Research on Network Intrusion Detection based on Discrete Preprocessing Method and Convolution Neural Network (이산화 전처리 방식 및 컨볼루션 신경망을 활용한 네트워크 침입 탐지에 대한 연구)

  • Yoo, JiHoon;Min, Byeongjun;Kim, Sangsoo;Shin, Dongil;Shin, Dongkyoo
    • Journal of Internet Computing and Services
    • /
    • v.22 no.2
    • /
    • pp.29-39
    • /
    • 2021
  • As damages to individuals, private sectors, and businesses increase due to newly occurring cyber attacks, the underlying network security problem has emerged as a major problem in computer systems. Therefore, NIDS using machine learning and deep learning is being studied to improve the limitations that occur in the existing Network Intrusion Detection System. In this study, a deep learning-based NIDS model study is conducted using the Convolution Neural Network (CNN) algorithm. For the image classification-based CNN algorithm learning, a discrete algorithm for continuity variables was added in the preprocessing stage used previously, and the predicted variables were expressed in a linear relationship and converted into easy-to-interpret data. Finally, the network packet processed through the above process is mapped to a square matrix structure and converted into a pixel image. For the performance evaluation of the proposed model, NSL-KDD, a representative network packet data, was used, and accuracy, precision, recall, and f1-score were used as performance indicators. As a result of the experiment, the proposed model showed the highest performance with an accuracy of 85%, and the harmonic mean (F1-Score) of the R2L class with a small number of training samples was 71%, showing very good performance compared to other models.

Building robust Korean speech recognition model by fine-tuning large pretrained model (대형 사전훈련 모델의 파인튜닝을 통한 강건한 한국어 음성인식 모델 구축)

  • Changhan Oh;Cheongbin Kim;Kiyoung Park
    • Phonetics and Speech Sciences
    • /
    • v.15 no.3
    • /
    • pp.75-82
    • /
    • 2023
  • Automatic speech recognition (ASR) has been revolutionized with deep learning-based approaches, among which self-supervised learning methods have proven to be particularly effective. In this study, we aim to enhance the performance of OpenAI's Whisper model, a multilingual ASR system on the Korean language. Whisper was pretrained on a large corpus (around 680,000 hours) of web speech data and has demonstrated strong recognition performance for major languages. However, it faces challenges in recognizing languages such as Korean, which is not major language while training. We address this issue by fine-tuning the Whisper model with an additional dataset comprising about 1,000 hours of Korean speech. We also compare its performance against a Transformer model that was trained from scratch using the same dataset. Our results indicate that fine-tuning the Whisper model significantly improved its Korean speech recognition capabilities in terms of character error rate (CER). Specifically, the performance improved with increasing model size. However, the Whisper model's performance on English deteriorated post fine-tuning, emphasizing the need for further research to develop robust multilingual models. Our study demonstrates the potential of utilizing a fine-tuned Whisper model for Korean ASR applications. Future work will focus on multilingual recognition and optimization for real-time inference.