• Title/Summary/Keyword: Recognition Improvement

Effective Recognition of Velopharyngeal Insufficiency (VPI) Patient's Speech Using DNN-HMM-based System (DNN-HMM 기반 시스템을 이용한 효과적인 구개인두부전증 환자 음성 인식)

  • Yoon, Ki-mu; Kim, Wooil
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.1 / pp.33-38 / 2019
  • This paper proposes an effective method for recognizing the speech of velopharyngeal insufficiency (VPI) patients using a DNN-HMM-based speech recognition system, and evaluates its recognition performance against a GMM-HMM-based system. The proposed method employs a speaker adaptation technique to improve VPI speech recognition. To make effective use of the small amount of VPI speech available for model adaptation, the paper proposes using simulated VPI speech to generate a prior model for speaker adaptation, together with selective learning of the DNN weight matrices. A Linear Input Network (LIN)-based model adaptation technique is also applied to the DNN model. The proposed speaker adaptation method brings a 2.35% improvement in average accuracy compared to the GMM-HMM-based ASR system. The experimental results demonstrate that the proposed DNN-HMM-based speech recognition system is effective for VPI speech with a small amount of speech data, compared to the conventional GMM-HMM system.
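
The LIN idea mentioned in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: it assumes a LIN is an affine layer placed in front of a frozen model, initialised to the identity, and that only this layer is updated on the small adaptation set. The frozen "model" here is a single fixed linear layer with softmax, and every name and number (`LinearInputNetwork`, `FROZEN_LAYER`, `ADAPTATION_SAMPLES`) is hypothetical.

```python
import math


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class LinearInputNetwork:
    """Affine layer placed in front of a frozen model; starts as the identity."""

    def __init__(self, dim):
        self.weight = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.bias = [0.0] * dim

    def transform(self, features):
        return [h + b for h, b in zip(matvec(self.weight, features), self.bias)]


def mean_loss(lin, frozen_layer, samples):
    """Average cross-entropy of the frozen model fed with LIN-transformed input."""
    total = 0.0
    for features, label in samples:
        probs = softmax(matvec(frozen_layer, lin.transform(features)))
        total -= math.log(probs[label])
    return total / len(samples)


def adapt(lin, frozen_layer, samples, lr=0.1, epochs=200):
    """Update only the LIN parameters by per-sample gradient descent."""
    for _ in range(epochs):
        for features, label in samples:
            hidden = lin.transform(features)
            probs = softmax(matvec(frozen_layer, hidden))
            # Gradient of the cross-entropy with respect to the logits.
            d_logits = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
            # Back-propagate through the frozen layer without changing it.
            d_hidden = [sum(frozen_layer[k][i] * d_logits[k] for k in range(len(d_logits)))
                        for i in range(len(hidden))]
            for i, grad in enumerate(d_hidden):
                for j, x in enumerate(features):
                    lin.weight[i][j] -= lr * grad * x
                lin.bias[i] -= lr * grad
    return lin


# Hypothetical toy data: a frozen 2-class layer and four adaptation samples
# from a "new speaker" whose features the frozen layer handles poorly.
FROZEN_LAYER = [[2.0, 0.0], [0.0, 2.0]]
ADAPTATION_SAMPLES = [([0.2, 0.6], 0), ([0.3, 0.8], 0), ([-0.6, -0.1], 1), ([-0.8, -0.3], 1)]
```

Calling `adapt(LinearInputNetwork(2), FROZEN_LAYER, ADAPTATION_SAMPLES)` changes only the input transform; the frozen layer stays untouched, which is what makes this kind of adaptation attractive when very little target speech is available.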

A Study on Overcoming Disturbance Light using Polarization Filter and Performance Improvement of Face Recognition System

  • Yoon, Andy Kyung-yong; Park, Ki-cheul; Lee, Byeong-cheol; Jang, Jung-hyuk
    • Journal of Multimedia Information System / v.7 no.4 / pp.239-248 / 2020
  • The performance of a face recognition system is determined by many technical factors, most of which have already been realized or remain under active research. The recognition rate, however, is influenced not only by technical factors but also by other factors, yet researchers try to improve it by focusing on technical factors alone. Recognition works by comparing a photographed face image with an already stored face image and computing a similarity score. If the photographed image is damaged by external light, even a system with a good algorithm will fail to recognize it. Disturbance light entering from outside should therefore be blocked, but the camera cannot work without light. Thus, a method is proposed that admits external light while blocking the disturbance light that affects photography. One way to block disturbance light is to use a polarization filter. There are three types of polarization: circular, linear, and elliptical. In this paper, an experiment was performed to overcome disturbance light using only a circularly polarized filter. A lighting system that reproduces disturbance light was built for the experiment, with lights of varying intensity and angle installed so as to affect the face recognition camera. In actual application, recognition performance was found to improve considerably in various disturbance light environments.

Implementation of Pre-Post Process for Accuracy Improvement of OCR Recognition Engine Based on Deep-Learning Technology (딥러닝 기반 OCR 인식 엔진의 정확도 향상을 위한 전/후처리기 기술 구현)

  • Jang, Chang-Bok; Kim, Ki-Bong
    • Journal of Convergence for Information Technology / v.12 no.1 / pp.163-170 / 2022
  • With the advent of the 4th Industrial Revolution, solutions that apply AI technology are being actively developed. Since 2017, business automation solutions using AI-based Robotic Process Automation (RPA) have been introduced in the financial sector and at insurance companies, and RPA is now moving past the introduction stage into wider adoption. For business automation that handles various documents with these RPA solutions, how accurately the textual information in a document is recognized is very important. The accuracy of such character recognition has recently increased with the introduction of deep learning technology, but there is still no recognition model with perfect accuracy. Therefore, in this paper, we measured how much accuracy improves when pre- and post-processing technologies are applied to a deep learning-based character recognition engine, and implemented an RPA recognition engine and its linkage technologies.

High-Frequency Interchange Network for Multispectral Object Detection (다중 스펙트럼 객체 감지를 위한 고주파 교환 네트워크)

  • Park, Seon-Hoo; Yun, Jun-Seok; Yoo, Seok Bong; Han, Seunghwoi
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.8 / pp.1121-1129 / 2022
  • Many object recognition studies rely on RGB images. However, RGB images captured under dark illumination, or in scenes where target objects are occluded by other objects, lead to poor object recognition performance. IR images, on the other hand, provide strong object recognition performance in these environments because they capture infrared waves rather than visible illumination. In this paper, we propose an RGB-IR fusion model, the high-frequency interchange network (HINet), which improves object recognition performance by combining the strengths of RGB-IR image pairs. HINet connects two object detection models through a mutual high-frequency transfer (MHT) module that interchanges the advantages of the RGB and IR images. MHT converts each pair of RGB-IR images into the discrete cosine transform (DCT) spectrum domain to extract high-frequency information. The extracted high-frequency information is transmitted to the other modality's network and used to improve object recognition performance. Experimental results show the superiority of the proposed network and its performance improvement on the multispectral object recognition task.
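
The DCT-based high-frequency extraction described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the HINet code: it works on a small square single-channel block, uses an orthonormal DCT-II, and treats every coefficient whose indices satisfy `u + v >= cutoff` as "high frequency". The function names and the cutoff rule are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (rows are basis vectors)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    """2D DCT of a square block: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2D DCT: C^T * Y * C (valid because C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def high_frequency(block, cutoff):
    """Keep only DCT coefficients with u + v >= cutoff and transform back."""
    coeffs = dct2(block)
    n = len(block)
    kept = [[coeffs[u][v] if u + v >= cutoff else 0.0 for v in range(n)]
            for u in range(n)]
    return idct2(kept)
```

A flat block has no high-frequency content under this definition, while a checkerboard keeps its alternating pattern once the DC term is removed. In a fusion setting, the output computed from one modality would be passed to the other modality's network; how that transfer is done in HINet is not specified in the abstract.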

Re-Education Situation and Problem Point of Beauty Artist (미용종사자의 재교육 실태조사 및 문제점)

  • Jang, Young-Hye; Yoo, Tai-Soon
    • Fashion & Textile Research Journal / v.7 no.2 / pp.231-236 / 2005
  • The purpose of this study was to find a more systematic and desirable way to improve retraining programs for cultivating field beauticians, by examining how the environment of beauty shops has changed over time and by assessing the current state of retraining. 1) Regarding awareness of the need for beautician retraining, the responses showed that beauticians have a high demand for retraining. 2) The main problem with reeducation was that the reeducation curricula of the individual organizations had not been systematically programmed. 3) Presence education was the main item to be supplemented in order to improve reeducation programs. We also found that improving and supplementing the work environment, extending education beyond technical training, and rapid acquisition of information were regarded as important improvements.

A Genetic Algorithm-based Classifier Ensemble Optimization for Activity Recognition in Smart Homes

  • Fatima, Iram; Fahim, Muhammad; Lee, Young-Koo; Lee, Sungyoung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.11 / pp.2853-2873 / 2013
  • Over the last few years, one of the most common purposes of smart homes has been to provide human-centric services in the domain of u-healthcare by analyzing inhabitants' daily living. Currently, a major challenge in activity recognition is the reliability of each classifier's predictions, which differs according to the characteristics of the smart home. Smart homes vary in terms of performed activities, deployed sensors, environment settings, and inhabitants' characteristics. No single classifier can always perform better than all other classifiers in every possible situation. This observation motivates combining multiple classifiers to take advantage of their complementary performance and achieve high accuracy. Therefore, in this paper, a method for activity recognition is proposed that optimizes the output of multiple classifiers with a Genetic Algorithm (GA). The proposed method combines the measurement-level output of different classifiers for each activity class to make up the ensemble. To evaluate the proposed method, experiments are performed on three real datasets from the CASAS smart home. The results show that the method systematically outperforms single classifiers and traditional multiclass models. The F-measure of recognized activities improves significantly, from 0.82 to 0.90, compared to existing methods.
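
A GA-weighted ensemble of the kind this abstract describes can be sketched in a few lines. This is not the authors' algorithm: it assumes a chromosome holds one weight per classifier per class, uses plain accuracy as the fitness (the paper reports F-measures), and uses tournament selection, uniform crossover, clipped Gaussian mutation, and two-individual elitism. All names and hyperparameters are hypothetical.

```python
import random


def ensemble_predict(weights, outputs):
    """weights[c][k] scales classifier c's score for class k; outputs[c][k] is that score."""
    n_classes = len(outputs[0])
    fused = [sum(w[k] * o[k] for w, o in zip(weights, outputs)) for k in range(n_classes)]
    return max(range(n_classes), key=fused.__getitem__)


def fitness(weights, samples):
    """Fraction of samples whose fused prediction matches the label."""
    hits = sum(ensemble_predict(weights, outputs) == label for outputs, label in samples)
    return hits / len(samples)


def evolve(samples, n_classifiers, n_classes, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    # Seed the population with "single classifier" chromosomes so that, with
    # elitism, the result is never worse than the best single classifier.
    population = [[[1.0 if c == i else 0.0] * n_classes for c in range(n_classifiers)]
                  for i in range(n_classifiers)]
    while len(population) < pop_size:
        population.append([[rng.random() for _ in range(n_classes)]
                           for _ in range(n_classifiers)])

    def score(individual):
        return fitness(individual, samples)

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = population[:2]  # elitism: carry the two best over unchanged
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            parent_a = max(rng.sample(population, 3), key=score)
            parent_b = max(rng.sample(population, 3), key=score)
            # Uniform crossover builds a fresh child, so parents are never modified.
            child = [[a if rng.random() < 0.5 else b for a, b in zip(row_a, row_b)]
                     for row_a, row_b in zip(parent_a, parent_b)]
            # Gaussian mutation, clipped to [0, 1].
            for row in child:
                for k in range(n_classes):
                    if rng.random() < 0.1:
                        row[k] = min(1.0, max(0.0, row[k] + rng.gauss(0.0, 0.2)))
            next_gen.append(child)
        population = next_gen
    return max(population, key=score)
```

Here `samples` is a list of `(outputs, label)` pairs, where `outputs[c]` is classifier `c`'s per-class score vector for one observation. In practice the fitness would be computed on held-out validation data rather than on the data used for the final evaluation.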

Musical Instrument Recognition for the Categorization of UCC Music Source (UCC 음원분류를 위한 연주악기 분류에 대한 연구)

  • Kwon, Soon-Il; Park, Wan-Joo
    • The KIPS Transactions:PartB / v.17B no.2 / pp.107-114 / 2010
  • The guitar, the piano, and the violin are popular musical instruments in User Created Content (UCC). However, the audio signal patterns generated by a guitar and a piano are too similar to differentiate easily. The difference between the two instruments can be found by analyzing the frequency variation in each band near signal peaks. The probability distribution of the existence of signal peaks, based on a cumulative histogram, was applied to musical instrument recognition. Experiments with statistical models of the frequency variation in each band near signal peaks showed a 14% improvement in musical instrument recognition.

Robust Multi-Layer Hierarchical Model for Digit Character Recognition

  • Yang, Jie; Sun, Yadong; Zhang, Liangjun; Zhang, Qingnian
    • Journal of Electrical Engineering and Technology / v.10 no.2 / pp.699-707 / 2015
  • Although digit character recognition has improved significantly in recent years, it is still challenging to achieve satisfactory results when the data contains a number of distracting factors. This paper proposes a novel digit character recognition approach using a multi-layer hierarchical model, Hybrid Restricted Boltzmann Machines (HRBMs), which allows the learning architecture to be robust to distracting background factors. The insight behind the proposed model is that useful high-level features appear more frequently than distracting factors during learning, so the high-level features can be decomposed into hybrid hierarchical structures using only a small amount of label information. To extract robust and compact features, a stochastic 0-1 layer is employed, which enables the model's hidden nodes to independently capture the useful character features during training. Experiments on variations of the Modified National Institute of Standards and Technology (MNIST) dataset show that the proposed method improves the multi-layer hierarchical model. Finally, the paper demonstrates the proposed technique in a real-world application, where it identifies digit characters against various complex background images.

Spatial-temporal Ensemble Method for Action Recognition (행동 인식을 위한 시공간 앙상블 기법)

  • Seo, Minseok; Lee, Sangwoo; Choi, Dong-Geol
    • The Journal of Korea Robotics Society / v.15 no.4 / pp.385-391 / 2020
  • As deep learning technology has developed and been applied to various fields, applications for recognizing human behavior are gradually shifting from single-image-based approaches to video-based approaches that include a time axis. However, unlike a 2D CNN on a single image, a 3D CNN on video requires far more computation and parameters because of the added time axis, so improving accuracy in action recognition is more difficult than for single images. To solve this problem, we investigate and analyze various techniques for improving performance in 3D CNN-based recognition without additional training time or additional parameters. We propose a time-based ensemble that uses the time axis, which exists only in videos, and an ensemble over the input frames. With a combination of techniques, we achieve an accuracy improvement of up to 7.1% over the existing performance. We also reveal the trade-off between computational cost and accuracy.
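
One common way to build an ensemble along the time axis at inference time is to score several clips taken at different temporal offsets of the same video and average the scores. The sketch below shows that idea only; it is an assumption about what such an ensemble could look like, not the method of the paper. `model` is a hypothetical callable that maps a list of frames to a list of class scores, and both function names are made up for illustration.

```python
def sample_clip_starts(num_frames, clip_len, num_clips):
    """Evenly spaced start indices so the clips cover the whole video."""
    if num_frames < clip_len:
        raise ValueError("video is shorter than one clip")
    if num_clips == 1:
        return [0]
    last_start = num_frames - clip_len
    return [round(i * last_start / (num_clips - 1)) for i in range(num_clips)]


def temporal_ensemble(frames, model, clip_len, num_clips):
    """Average the class scores the model assigns to several clips of one video."""
    starts = sample_clip_starts(len(frames), clip_len, num_clips)
    scores = [model(frames[start:start + clip_len]) for start in starts]
    return [sum(column) / len(scores) for column in zip(*scores)]
```

No retraining and no extra parameters are involved: the same trained model is simply evaluated `num_clips` times, which makes the trade-off between computation and accuracy explicit.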

Improvement of Accuracy for Human Action Recognition by Histogram of Changing Points and Average Speed Descriptors

  • Vu, Thi Ly; Do, Trung Dung; Jin, Cheng-Bin; Li, Shengzhe; Nguyen, Van Huan; Kim, Hakil; Lee, Chongho
    • Journal of Computing Science and Engineering / v.9 no.1 / pp.29-38 / 2015
  • Human action recognition has recently become an important research topic in computer vision owing to its many real-world applications, such as video surveillance, video retrieval, video analysis, and human-computer interaction. The goal of this paper is to evaluate descriptors that have recently been used in action recognition, namely Histogram of Oriented Gradient (HOG) and Histogram of Optical Flow (HOF). The paper also proposes two new descriptors: Histogram of Changing Points (HCP), which represents the change of points within each part of the human body caused by actions, and Average Speed (AS), which measures the average speed of actions. The descriptors are combined to build a strong descriptor that represents human actions by modeling appearance, local motion, and changes in each part of the body, as well as motion speed. The effectiveness of these new descriptors is evaluated in experiments on the KTH and Hollywood datasets.