• Title/Summary/Keyword: recognition-rate

Lip Reading Method Using CNN for Utterance Period Detection (발화구간 검출을 위해 학습된 CNN 기반 입 모양 인식 방법)

  • Kim, Yong-Ki;Lim, Jong Gwan;Kim, Mi-Hye
    • Journal of Digital Convergence
    • /
    • v.14 no.8
    • /
    • pp.233-243
    • /
    • 2016
  • Because speech recognition degrades in noisy environments, Audio-Visual Speech Recognition (AVSR) systems, which combine speech and visual information, have been proposed since the mid-1990s, and lip reading has played a significant role in them. This study aims to improve the recognition rate of uttered words using only lip shape detection, for an efficient AVSR system. After preprocessing for lip region detection, Convolutional Neural Network (CNN) techniques are applied for utterance period detection and lip shape feature vector extraction, and Hidden Markov Models (HMMs) are then used for recognition. Utterance period detection achieved a 91% success rate, higher than general threshold methods. In lip reading recognition, the user-dependent experiment recorded a recognition rate of 88.5% and the user-independent experiment 80.2%, both improvements over previous studies.

Development of Context Awareness and Service Reasoning Technique for Handicapped People (멀티 모달 감정인식 시스템 기반 상황인식 서비스 추론 기술 개발)

  • Ko, Kwang-Eun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.19 no.1
    • /
    • pp.34-39
    • /
    • 2009
  • As a subjective cognitive effect, human emotion is impulsive in character and unconsciously expresses intentions and needs. It therefore carries contextual information about the users of ubiquitous computing environments or intelligent robot systems. Indicators from which a user's emotion can be recognized include facial images, voice signals, and biological signal spectra. In this paper, we produce separate facial and voice emotion recognition results from facial images and voice, to make emotion recognition more convenient and efficient. We also extract the best-fitting features from image and sound to improve the emotion recognition rate, and implement a multi-modal emotion recognition system based on feature fusion. Finally, using the emotion recognition results, we propose a possible service reasoning method for ubiquitous computing environments based on a Bayesian network and a ubiquitous context scenario.

Robust Speech Recognition using Vocal Tract Normalization for Emotional Variation (성도 정규화를 이용한 감정 변화에 강인한 음성 인식)

  • Kim, Weon-Goo;Bang, Hyun-Jin
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.19 no.6
    • /
    • pp.773-778
    • /
    • 2009
  • This paper studies training methods that are less affected by emotional variation, for the development of a robust speech recognition system. To this end, the effect of emotional variation on the speech signal was studied using a speech database containing various emotions. The performance of a speech recognition system trained on emotion-free speech deteriorates when the test speech contains emotion, because of the emotional mismatch between the test and training data. The study observes that the speaker's vocal tract length is affected by emotional variation, and that this effect is one of the causes of the degraded recognition performance. A vocal tract normalization method is therefore used to make the speech recognition system robust to emotional variation. Experimental results from isolated word recognition using HMMs showed that vocal tract normalization reduced the error rate of the conventional recognition system by 41.9% when emotional test data was used.
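
As a small worked illustration of what a 41.9% reduction in error rate means, the sketch below assumes the figure is a relative reduction (the abstract does not say whether it is relative or absolute). The function name `relative_error_reduction` and the baseline error rate of 20% are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the relative reduction in error rate, as a percentage."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical baseline error rate (percent); NOT a value from the paper.
baseline = 20.0
# Error rate after a 41.9% relative reduction.
reduced = baseline * (1 - 0.419)

print(round(reduced, 2))
print(round(relative_error_reduction(baseline, reduced), 1))
```

Under this reading, the absolute improvement depends on the baseline: the same relative reduction removes fewer percentage points when the conventional system's error rate is already low.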

Long Distance Face Recognition System using the Automatic Face Image Creation by Distance (거리별 얼굴영상 자동 생성 방법을 이용한 원거리 얼굴인식 시스템)

  • Moon, Hae Min;Pan, Sung Bum
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.51 no.11
    • /
    • pp.137-145
    • /
    • 2014
  • This paper proposes an LDA-based long-distance face recognition algorithm for intelligent surveillance systems. Existing face recognition algorithms that use face images from a single distance as training images suffer from a recognition rate that decreases as distance increases. Algorithms trained on face images captured at each actual distance perform well, but they inconvenience users, who must move from one to five meters in person so that face images can be acquired for initial registration. The proposed method automatically creates face images for various distances from a single-distance face image and uses them as training images. In tests, the proposed technique outperformed the existing technique trained on single-distance face images by an average of 16.3% at short distance and 18.0% at long distance. Compared with the technique trained on face images captured at each distance, its performance was 4.3% lower on average at close distance and the same at long distance.

Fast Algorithm for Recognition of Korean Isolated Words (한국어 고립단어인식을 위한 고속 알고리즘)

  • 남명우;박규홍;정상국;노승용
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.1
    • /
    • pp.50-55
    • /
    • 2001
  • This paper presents a Korean isolated word recognition algorithm that uses a new endpoint detection method, an auditory model, the 2D-DCT, and a new distance measure. The advantages of the proposed algorithm over conventional algorithms are simpler hardware construction and faster recognition time. The DTW method was used as the conventional algorithm for comparison. As a result, the proposed algorithm achieved a similar recognition rate for speaker-dependent Korean isolated words and a better one for speaker-independent Korean isolated words. Its recognition time was 200 times faster than that of the DTW algorithm. The proposed algorithm also performed well in noisy environments.

The Effect of the Number of Phoneme Clusters on Speech Recognition (음성 인식에서 음소 클러스터 수의 효과)

  • Lee, Chang-Young
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.9 no.11
    • /
    • pp.1221-1226
    • /
    • 2014
  • In an effort to improve the efficiency of speech recognition, we investigate the effect of the number of phoneme clusters. For this purpose, codebooks with varying numbers of phoneme clusters are prepared using a modified k-means clustering algorithm. The subsequent processing consists of fuzzy vector quantization (FVQ) and hidden Markov models (HMMs) for the speech recognition test. The results show two distinct regimes. For a large number of phoneme clusters, recognition performance is roughly independent of that number. For a small number of phoneme clusters, however, the recognition error rate increases nonlinearly as the number decreases. Numerical calculation shows that this nonlinear regime might be modeled by a power-law function. The results also show that about 166 phoneme clusters would be optimal for recognizing 300 isolated words, which amounts to roughly 3 variations per phoneme.
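
One common way to check whether an error-rate curve follows a power law is a straight-line fit in log-log space. The sketch below is a minimal illustration of that idea, not the paper's procedure: the function name `fit_power_law` is hypothetical, and the data points are synthetic values generated from a known power law rather than results from the study.

```python
import math


def fit_power_law(xs, ys):
    """Fit y ≈ a * x**b by ordinary least squares on (log x, log y)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                          # slope = power-law exponent
    a = math.exp(mean_y - b * mean_x)      # intercept = log of the prefactor
    return a, b


# Synthetic data (NOT from the paper): error = 50 * clusters**-0.8
clusters = [8, 16, 32, 64, 128]
errors = [50.0 * c ** -0.8 for c in clusters]

a, b = fit_power_law(clusters, errors)
print(f"prefactor a = {a:.3f}, exponent b = {b:.3f}")
```

A negative exponent `b` corresponds to the behaviour described in the abstract, where the error rate rises as the number of clusters shrinks; on real data, the fit would apply only to the small-cluster regime.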

Performance Improvement of the Face Recognition Using the Properties of Wavelet Transform (웨이블릿 변환의 특성을 이용한 얼굴 인식 성능 개선)

  • Park, Kyung-Jun;Seo, Seok-Yong;Koh, Hyung-Hwa
    • Journal of Advanced Navigation Technology
    • /
    • v.17 no.6
    • /
    • pp.726-735
    • /
    • 2013
  • This paper proposes methods that improve face recognition performance using the properties of the wavelet transform. The discrete wavelet transform uses the Daubechies D4 filter as its mother wavelet. Using only the LL subband of the discrete wavelet transform reduces the processing time and the amount of memory needed for recognition. To improve the recognition rate without the further loss caused by converting the two-dimensional data, we apply 2D LDA. An SVM training algorithm is then applied to the feature vectors obtained by 2D LDA. Experiments were performed in Matlab on the ORL and Yale database sets. The test results show that the proposed method is superior to existing methods in recognition rate and processing time.

An Efficient Hand Gesture Recognition Method using Two-Stream 3D Convolutional Neural Network Structure (이중흐름 3차원 합성곱 신경망 구조를 이용한 효율적인 손 제스처 인식 방법)

  • Choi, Hyeon-Jong;Noh, Dae-Cheol;Kim, Tae-Young
    • The Journal of Korean Institute of Next Generation Computing
    • /
    • v.14 no.6
    • /
    • pp.66-74
    • /
    • 2018
  • Recently, hand gesture recognition has been actively studied as a way to increase immersion and provide user-friendly interaction in virtual reality environments. However, most studies require specialized sensors or equipment, or show low recognition rates. This paper proposes a deep learning method that recognizes static and dynamic hand gestures without any sensor or equipment other than a camera. First, a series of hand gesture input images is converted into high-frequency images; then the RGB images of each hand gesture and their high-frequency images are each learned through a DenseNet three-dimensional Convolutional Neural Network. Experimental results on 6 static and 9 dynamic hand gestures showed an average recognition rate of 92.6%, an increase of 4.6% over the previous DenseNet. A 3D defense game was implemented to verify the results, and the average gesture recognition time of 30 ms was found to be fast enough for use as a real-time user interface in virtual reality applications.

Salivary Flow According to Elderly's Whole Health and Oral Health Status: According to Application of Oral exercise and Salivary Gland Massage

  • Oh, Ji-Young;Noh, Eun-Mi;Park, Hye-Young;Lee, Min-Kyung;Kim, Hye-Jin
    • Biomedical Science Letters
    • /
    • v.25 no.3
    • /
    • pp.218-226
    • /
    • 2019
  • Coping with the natural process of aging and with the various diseases that accompany the decline of physical function in old age is a challenge for society. While interest in systemic health is increasing, awareness of and interest in oral diseases remain relatively low. This study aims to present basic data needed to improve the quality of life of senior citizens aged 65 or older by relieving oral dryness caused by systemic health conditions. The intervention sought to relieve oral dryness and promote oral health in elderly people over 65 by inducing an increase in the salivary flow rate through oral health care education, oral exercise, and salivary gland massage. First, an independent-samples t-test and one-way ANOVA were conducted on the DMSQ according to the general characteristics of the elderly and their recognition of whole-body and oral health status. Second, a paired-samples t-test was conducted on changes in the salivary flow rate and saliva pH according to general characteristics, recognition of oral and whole-body health status, and whole-body health. The results showed that the salivary flow rate increased significantly after oral exercise and salivary gland massage. The salivary flow rate increased after the intervention for all variables of recognized oral health status. With respect to whole-body health, it increased regardless of hypertension, diabetes, cardiovascular disorders, and osteoporosis, and it increased in subjects who reported no thyroid abnormality, anemia, breathing abnormalities, hypotension, gastrointestinal disturbance, or kidney disease.
Taken together, many subjects felt oral dryness when they had a problem with whole-body health, and many felt oral dryness when they had a problem with oral health cognition. After oral exercise and salivary gland massage were applied as intervention methods in oral health care for the elderly, the salivary flow rate increased significantly, and the methods are judged to be very effective for controlling oral dryness. Identifying the factors that affect oral health, whole-body health, and oral dryness is expected to help promote whole-body and oral health. Continued research is needed so that oral care programs and systems for the elderly can be put in place in the future.

A Biological Fuzzy Multilayer Perceptron Algorithm

  • Kim, Kwang-Baek;Seo, Chang-Jin;Yang, Hwang-Kyu
    • Journal of information and communication convergence engineering
    • /
    • v.1 no.3
    • /
    • pp.104-108
    • /
    • 2003
  • This paper proposes a biologically inspired fuzzy multilayer perceptron. The proposed algorithm is built with consideration of biological neuronal structure as well as fuzzy logic operations. We applied the proposed learning algorithm to neural network benchmark problems such as exclusive OR and 3-bit parity, and to digit image recognition problems. To compare the existing and proposed neural networks, convergence speed was measured. The simulation results indicate that the proposed learning algorithm converges much faster than the conventional backpropagation algorithm. Furthermore, in the image recognition task, the recognition rate of the proposed learning algorithm is higher than that of the conventional backpropagation algorithm.
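
For readers unfamiliar with the exclusive OR and 3-bit parity benchmarks named in this abstract, the sketch below generates their truth tables. It covers only the benchmark data, not the proposed fuzzy multilayer perceptron, and the helper name `parity_dataset` is hypothetical. Exclusive OR is treated as the 2-bit case of the parity problem.

```python
from itertools import product


def parity_dataset(n_bits):
    """Return (inputs, target) pairs; target is 1 when the inputs hold an odd number of 1s."""
    return [(bits, sum(bits) % 2) for bits in product((0, 1), repeat=n_bits)]


xor_data = parity_dataset(2)       # exclusive OR: 4 input patterns
parity3_data = parity_dataset(3)   # 3-bit parity: 8 input patterns

for inputs, target in parity3_data:
    print(inputs, "->", target)
```

These problems are popular convergence benchmarks because no single-layer perceptron can separate their classes, so a network needs at least one hidden layer to learn them.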