• Title/Summary/Keyword: end-to-end learning

Character Recognition Algorithm in Low-Quality Legacy Contents Based on Alternative End-to-End Learning (대안적 통째학습 기반 저품질 레거시 콘텐츠에서의 문자 인식 알고리즘)

  • Lee, Sung-Jin;Yun, Jun-Seok;Park, Seon-hoo;Yoo, Seok Bong
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.11 / pp.1486-1494 / 2021
  • Character recognition is a technology required in various platforms, such as smart parking and text-to-speech, and many studies are being conducted to improve its performance. However, when low-quality images are used, the resolution of the training images differs from that of the test images, resulting in poor accuracy. To solve this problem, this paper designs an end-to-end learning neural network that combines image super-resolution and character recognition so that recognition performance is robust to data of varying quality, and implements an alternative end-to-end learning algorithm to train the whole network. Alternative end-to-end learning and a recognition performance test were conducted using license plate images, chosen from among various kinds of text images, and the test verified the effectiveness of the proposed algorithm.

On the Reward Function of Latent SAC Reinforcement Learning to Improve Longitudinal Driving Performance (종방향 주행성능향상을 위한 Latent SAC 강화학습 보상함수 설계)

  • Jo, Sung-Bean;Jeong, Han-You
    • Journal of IKEEE / v.25 no.4 / pp.728-734 / 2021
  • In recent years, there has been strong interest in end-to-end autonomous driving based on deep reinforcement learning. In this paper, we present a reward function for latent SAC deep reinforcement learning that improves the longitudinal driving performance of an agent vehicle. While the existing reward function significantly degrades driving safety and efficiency, the proposed reward function is shown to maintain an appropriate headway distance while avoiding collision with the front vehicle.

Development of a Low-cost Industrial OCR System with an End-to-end Deep Learning Technology

  • Subedi, Bharat;Yunusov, Jahongir;Gaybulayev, Abdulaziz;Kim, Tae-Hyong
    • IEMEK Journal of Embedded Systems and Applications / v.15 no.2 / pp.51-60 / 2020
  • Optical character recognition (OCR) has been studied for decades because it is useful in a wide range of applications. OCR performance has recently improved significantly thanks to deep learning technology, and demand for commercial-grade but affordable OCR systems is increasing accordingly. We have developed a low-cost, high-performance OCR system for industry on the cheapest embedded developer kit that supports GPU acceleration. To achieve high accuracy for industrial use on limited computing resources, we chose as a baseline a state-of-the-art text recognition algorithm that uses an end-to-end deep learning network. The model was then improved by replacing its feature extraction network with the one best suited to our conditions. Among the candidate networks, EfficientNet-B3 showed the best performance: excellent recognition accuracy with relatively low memory consumption. In addition, we optimized the model, written with TensorFlow's Python API, using TensorFlow-TensorRT integration and TensorFlow's C++ API.

Trends in Deep-neural-network-based Dialogue Systems (심층 신경망 기반 대화처리 기술 동향)

  • Kwon, O.W.;Hong, T.G.;Huang, J.X.;Roh, Y.H.;Choi, S.K.;Kim, H.Y.;Kim, Y.K.;Lee, Y.K.
    • Electronics and Telecommunications Trends / v.34 no.4 / pp.55-64 / 2019
  • In this study, we introduce trends in neural-network-based deep learning research applied to dialogue systems. Recently, end-to-end trainable goal-oriented dialogue systems using long short-term memory and sequence-to-sequence models, among others, have been studied to overcome the difficulties of domain adaptation and of error recognition and recovery in traditional pipeline goal-oriented dialogue systems. In addition, some research has applied reinforcement learning to end-to-end trainable goal-oriented dialogue systems to learn dialogue strategies that do not appear in training corpora. Recent neural network models for end-to-end trainable chit-chat systems have been improved by using dialogue context as well as personal and topic information to produce more natural, human-like conversation. Unlike previous studies, which applied different approaches to goal-oriented dialogue systems and chit-chat systems, recent studies have attempted to apply end-to-end trainable approaches based on deep neural networks to both. Acquiring dialogue corpora for training is now necessary. Therefore, future research will focus on acquiring dialogue corpora easily and cheaply and on training with small annotated dialogue corpora and/or large raw dialogues.

Korean speech recognition using deep learning (딥러닝 모형을 사용한 한국어 음성인식)

  • Lee, Suji;Han, Seokjin;Park, Sewon;Lee, Kyeongwon;Lee, Jaeyong
    • The Korean Journal of Applied Statistics / v.32 no.2 / pp.213-227 / 2019
  • In this paper, we propose an end-to-end deep learning model for Korean speech recognition that incorporates a Bayesian neural network. In the past, Korean speech recognition was a complicated task because of the large number of parameters in its many intermediate steps and the need for expert knowledge of Korean. Recent breakthroughs in end-to-end models have made Korean speech recognition more manageable. An end-to-end model decodes mel-frequency cepstral coefficients directly into text without any intermediate processes. In particular, models trained with Connectionist Temporal Classification loss and attention-based models are kinds of end-to-end models. In addition, we combine a Bayesian neural network with the end-to-end model and obtain Monte Carlo estimates. Finally, we carry out our experiments on the "WorimalSam" online dictionary dataset. We obtain a Word Error Rate of 4.58%, an improvement over the Google and Naver APIs.
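
The Word Error Rate reported in this abstract is conventionally defined as the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration, and the abstract does not say how the Korean text was tokenized.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words.

    Tokenization by whitespace is an assumption of this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, comparing the reference "a b c d" with the hypothesis "a x c" needs one substitution and one deletion, so the rate is 2 / 4.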

End-to-end speech recognition models using limited training data (제한된 학습 데이터를 사용하는 End-to-End 음성 인식 모델)

  • Kim, June-Woo;Jung, Ho-Young
    • Phonetics and Speech Sciences / v.12 no.4 / pp.63-71 / 2020
  • Speech recognition is one of the areas being actively commercialized using deep learning and machine learning techniques. However, the majority of speech recognition systems on the market are developed on data with limited speaker diversity and tend to perform well only on typical adult speakers. This is because most speech recognition models are trained on speech databases obtained from adult males and females, which tends to cause problems in recognizing the speech of the elderly, children, and people who speak dialects. Solving these problems may require maintaining a large database or collecting data for speaker adaptation. Instead, this paper proposes a new end-to-end speech recognition method consisting of an acoustic augmented recurrent encoder and a transformer decoder with linguistic prediction. The proposed method can deliver reliable acoustic and language model performance under limited data conditions. It was evaluated on recognizing the speech of Korean elderly people and children with a limited amount of training data and showed better performance than a conventional method.

A Study on the Effect of Learning Activities and Feedback Seeking Behavior toward the End Users' Faithful Appropriation of Information Security System (조직내 최종사용자의 합목적적인 정보보호 시스템 사용 내재화와 학습, 피드백 추구 행동 연구)

  • Kim, Min Woong;Cheong, Ki Ju
    • The Journal of Information Systems / v.25 no.3 / pp.117-146 / 2016
  • Purpose: The purpose of this paper is to examine the factors and mechanisms that induce end users' faithful appropriation of information security behavior through the information security system. This study also seeks to identify the role of employees' adaptive activities, such as learning and feedback seeking behavior, in information security in organizations. Design/methodology/approach: An empirical study was carried out with a sample of employees working in a financial service company. Employees (n = 268) completed a written questionnaire. Structural equation modeling was used to analyze the data. Findings: Results indicated that employees' learning activities and feedback seeking behavior fully mediated the effect of major information security factors on end users' faithful appropriation of information security systems. To increase the level of employees' information security behavior in accordance with security guidelines, organizations should facilitate interactions that support the feedback seeking process between employees on information security awareness and behavior. Additionally, organizations may reinforce these behaviors through periodic training and by adopting bounty hunter systems.

End-to-end-based Wi-Fi RTT network structure design for positioning stabilization (측위 안정화를 위한 End to End 기반의 Wi-Fi RTT 네트워크 구조 설계)

  • Seong, Ju-Hyeon
    • Journal of Korea Multimedia Society / v.24 no.5 / pp.676-683 / 2021
  • Wi-Fi Round-trip timing (RTT) based location estimation technology estimates the distance between the user and the AP from the transmission and reception times of the signal. However, in an indoor NLOS environment its reception instability and signal distortion are greater than those of a Received Signal Strength Indicator (RSSI) based fingerprint, resulting in a large position error due to multipath fading. To solve this problem, in this paper, we propose an end-to-end based WiFi Trilateration Net (WTN) that combines neural network-based RTT correction with a trilateral positioning network. The proposed WTN is composed of an RNN-based correction network that improves the RTT distance accuracy and a neural network-based trilateral positioning network for real-time positioning, implemented in an end-to-end structure. The proposed network improves learning efficiency by replacing the trilateral positioning algorithm, which cannot be learned through differentiation because of its mathematical operations, with a neural network. In addition, to increase the stability of the TOA based RTT, a correction network is applied in the scanning step to collect reliable distance estimates from each RTT AP.
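
For context, the classical trilateral positioning step that the abstract says WTN replaces with a neural network can be sketched as follows. This is a textbook closed-form 2D solution under the assumption of exactly three non-collinear APs and noise-free distances; it is not the paper's network, and the function name `trilaterate` and its argument layout are hypothetical.

```python
def trilaterate(anchors, distances):
    """Closed-form 2D position from three anchor points and their distances.

    Subtracting the first circle equation from the other two removes the
    quadratic terms and leaves a 2x2 linear system, solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances

    # Coefficients of the linearized system A @ [x, y] = b.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; the position is not unique")

    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

With measured RTT distances the three circles rarely intersect in a single point, which is one reason a correction stage ahead of the positioning step matters.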

Interference Cancellation Scheme of End-to-End Method in Power Line Communication System for Smart Grid (스마트 그리드 시스템을 위한 전력선 통신 시스템의 종단 간 방식의 간섭 제거 기법)

  • Seo, Sung-Il
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.2 / pp.41-45 / 2019
  • In this paper, we propose an end-to-end interference cancellation scheme for power line communication (PLC) systems in the smart grid. The proposed scheme estimates the channel noise at the receiver by applying a deep learning model there. The estimated channel noise is then stored in a database. In the modulator, the channel noise that degrades power line communication performance is effectively removed through an interference cancellation technique. The Middleton Class A interference model was employed as the impulsive noise model. Performance is evaluated in terms of bit error rate (BER). The simulation results confirm that the proposed scheme has better BER performance than the theoretical model based on additive white Gaussian noise. As a result, the proposed deep learning based interference cancellation improves the signal quality of PLC systems by effectively removing the channel noise. The results of the paper can be applied to PLC for the smart grid and to general communication systems.
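
The BER metric used for evaluation is the fraction of received bits that differ from the transmitted bits. The sketch below shows that measurement alongside a theoretical additive white Gaussian noise curve; because the abstract does not name the modulation, BPSK is assumed here purely for illustration, and both function names are hypothetical.

```python
import math


def bit_error_rate(sent, received):
    """Fraction of positions where the received bit differs from the sent bit."""
    if len(sent) != len(received) or len(sent) == 0:
        raise ValueError("bit sequences must be non-empty and of equal length")
    errors = sum(s != r for s, r in zip(sent, received))
    return errors / len(sent)


def bpsk_awgn_ber(ebn0_db):
    """Theoretical BER of BPSK over AWGN: 0.5 * erfc(sqrt(Eb/N0)).

    BPSK is an assumption of this sketch, not something stated in the abstract.
    """
    ebn0 = 10 ** (ebn0_db / 10)  # convert dB to a linear ratio
    return 0.5 * math.erfc(math.sqrt(ebn0))
```

A simulated scheme is typically compared against such a curve by plotting both BER values over a range of Eb/N0 values.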

A Study on Combine Artificial Intelligence Models for multi-classification for an Abnormal Behaviors in CCTV images (CCTV 영상의 이상행동 다중 분류를 위한 결합 인공지능 모델에 관한 연구)

  • Lee, Hongrae;Kim, Youngtae;Seo, Byung-suk
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.498-500 / 2022
  • CCTV helps protect people and assets by making it possible to identify dangerous situations and respond promptly. However, it is difficult to continuously monitor the increasing number of CCTV images. For this reason, there is a need for a device that continuously monitors CCTV images and issues a notification when abnormal behavior occurs. Recently, many studies have used artificial intelligence models for image data analysis. This study learns spatial and temporal features of the image data simultaneously in order to classify the various abnormal behaviors that can be observed in CCTV images. As the artificial intelligence model used for learning, we propose a multi-classification deep learning model that combines an end-to-end 3D convolutional neural network (CNN) and ResNet.
