• Title/Summary/Keyword: recognition-rate


Error Correction Methode Improve System using Out-of Vocabulary Rejection (미등록어 거절을 이용한 오류 보정 방법 개선 시스템)

  • Ahn, Chan-Shik;Oh, Sang-Yeob
    • Journal of Digital Convergence / v.10 no.8 / pp.173-178 / 2012
  • When a model is generated for the recognition vocabulary, tri-phones that were not prepared in advance are produced. For these tri-phones the model cannot generate initial parameter estimates, so the system cannot configure the model. As a result, the precision of the Gaussian model falls and recognition degrades. We propose an error correction system that uses an out-of-vocabulary rejection algorithm. When the system creates a vocabulary recognition model, the recognition rate is improved by rejecting vocabulary that is not registered. In addition, the system carries out lexical and semantic analysis using probability distributions, and it deactivates the string before the phoneme change is applied. The error correction rate is determined using the phoneme similarity rate and reliability. The performance comparison, which involved the method using error patterns, fault patterns, and meaning patterns, showed a 2.8% improvement in the error correction rate.
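
The rejection idea in this abstract can be illustrated with a minimal sketch. It assumes that a recognized phoneme sequence is compared with each registered word and rejected when the best similarity falls below a threshold. The function names, the threshold of 0.7, the toy lexicon, and the use of `difflib.SequenceMatcher` as a stand-in similarity measure are all assumptions for illustration, not the measure or algorithm used in the paper.

```python
from difflib import SequenceMatcher


def phoneme_similarity(a, b):
    """Similarity in [0, 1] between two phoneme sequences (stand-in measure)."""
    return SequenceMatcher(None, a, b).ratio()


def correct_or_reject(recognized, lexicon, threshold=0.7):
    """Return the closest registered word, or None to reject as out-of-vocabulary."""
    best = max(lexicon, key=lambda word: phoneme_similarity(recognized, word))
    if phoneme_similarity(recognized, best) >= threshold:
        return best  # close enough: correct the hypothesis to the registered word
    return None      # too far from every registered word: reject


# Hypothetical lexicon of registered words as phoneme tuples.
lexicon = [("s", "eo", "u", "l"), ("b", "u", "s", "a", "n")]

corrected = correct_or_reject(("s", "eo", "l"), lexicon)
rejected = correct_or_reject(("t", "o", "k", "y", "o"), lexicon)
```

Raising the threshold rejects more input and lowers the risk of correcting an unregistered word into a registered one; where to set it would depend on data not given here.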

Korean Word Recognition Using Vector Quantization Speaker Adaptation (벡터 양자화 화자적응기법을 사용한 한국어 단어 인식)

  • Choi, Kap-Seok
    • The Journal of the Acoustical Society of Korea / v.10 no.4 / pp.27-37 / 1991
  • This paper proposes ESFVQ (energy subspace fuzzy vector quantization), which employs energy subspaces to reduce the quantizing distortion below that of fuzzy vector quantization. The ESFVQ is applied to a speaker adaptation method by which Korean words spoken by unknown speakers are recognized. By generating mapped codebooks with fuzzy histograms for each energy subspace in the training procedure, and by decoding a spoken word through the ESFVQ in the recognition procedure, we attempt to improve the recognition rate. The performance of the ESFVQ is evaluated by measuring the quantizing distortion and the speaker-adaptive recognition rate for DDD telephone area names uttered by 2 males and 1 female. The quantizing distortion of the ESFVQ is 22% lower than that of vector quantization and 5% lower than that of fuzzy vector quantization, and the speaker-adaptive recognition rate of the ESFVQ is 26% higher than that without speaker adaptation and 11% higher than that of vector quantization.
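
For readers unfamiliar with fuzzy vector quantization, the sketch below shows the usual fuzzy-c-means style membership computation, in which an input vector belongs to every codeword to a degree that decreases with distance. It does not implement the energy-subspace partitioning of the ESFVQ; the function name, the fuzziness value of 2.0, and the toy codebook are assumptions.

```python
def fuzzy_memberships(vector, codebook, fuzziness=2.0):
    """Membership of `vector` in each codeword; the memberships sum to 1."""
    # Squared Euclidean distance to every codeword.
    dists = [sum((x - c) ** 2 for x, c in zip(vector, code)) for code in codebook]
    # An exact match gets full membership (avoids division by zero).
    for i, d in enumerate(dists):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(codebook))]
    exponent = 1.0 / (fuzziness - 1.0)
    weights = [(1.0 / d) ** exponent for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical two-codeword codebook in a 2-D feature space.
codebook = [(0.0, 0.0), (2.0, 0.0)]
memberships = fuzzy_memberships((0.5, 0.0), codebook)
```

A hard vector quantizer would assign the vector entirely to the first codeword; the fuzzy version keeps a small membership in the second.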


A Study on the Spoken KOrean-Digit Recognition Using the Neural Netwok (神經網을 利用한 韓國語 數字音 認識에 관한 硏究)

  • Park, Hyun-Hwa;Gahang, Hae Dong;Bae, Keun Sung
    • The Journal of the Acoustical Society of Korea / v.11 no.3 / pp.5-13 / 1992
  • Taking advantage of the property that a Korean digit is a monosyllabic word, we propose a spoken Korean-digit recognition scheme using the multi-layer perceptron. The spoken Korean digit is divided into three segments (initial sound, medial vowel, and final consonant) based on the voice starting and ending points and a peak point in the middle of the vowel sound. Feature vectors such as cepstrum, reflection coefficients, $\Delta$cepstrum, and $\Delta$energy are extracted from each segment. Cepstrum, as an input vector to the neural network, is shown to give a higher recognition rate than reflection coefficients. Regression coefficients of cepstrum did not affect the recognition rate as much as we expected. We believe this is because the features were extracted from selected stationary segments of the input speech signal. With 150 cepstral coefficients obtained from each spoken digit, we achieved a correct recognition rate of 97.8%.
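
The regression (delta) coefficients mentioned above are commonly computed as a weighted slope over neighboring frames. The sketch below uses the standard regression formula with edge frames repeated at the boundaries; the window size of 2 and the function name are assumptions, since the abstract does not state how the coefficients were computed.

```python
def delta_features(frames, window=2):
    """Regression (delta) coefficients for a list of per-frame feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    denom = 2 * sum(k * k for k in range(1, window + 1))
    deltas = []
    for t in range(n):
        row = []
        for d in range(dim):
            num = 0.0
            for k in range(1, window + 1):
                nxt = frames[min(t + k, n - 1)][d]  # repeat last frame at the end
                prv = frames[max(t - k, 0)][d]      # repeat first frame at the start
                num += k * (nxt - prv)
            row.append(num / denom)
        deltas.append(row)
    return deltas


# A feature that rises by 1 per frame has a delta of 1 away from the edges.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
ramp_deltas = delta_features(ramp)
```

On a stationary segment the frames barely change, so the deltas are close to zero, which is consistent with the abstract's explanation of why they contributed little.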


A study on performance improvement of neural network using output probability of HMM (HMM의 출력확률을 이용한 신경회로망의 성능향상에 관한 연구)

  • Pyo Chang Soo;Kim Chang Keun;Hur Kang In
    • Journal of the Institute of Convergence Signal Processing / v.1 no.1 / pp.1-6 / 2000
  • In this paper, a hybrid system of an HMM (Hidden Markov Model) and a neural network is proposed. With a post-processing procedure that minimizes recognition errors, it shows a better recognition rate than the HMM alone. After the HMM is trained on the training data, testing data that did not take part in the training are sent to the HMM. The output probabilities that the HMM produces for the testing data are used as training data for the neural network, which acts as the post-processor. After the neural network is trained, the hybrid system is complete. This hybrid system improves the recognition rate by about $4.5\%$ with an MLP and about $2\%$ with an RBFN, and it addresses the training time of conventional hybrid systems and the drop in recognition rate caused by a lack of training data in real-time speech recognition systems.


A Study on Face Recognition using DCT/LDA (DCT/LDA 기반 얼굴 인식에 관한 연구)

  • Kim Hyoung-Joon;Jung Byunghee;Kim Whoi-Yul
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.6 / pp.55-62 / 2005
  • This paper proposes a face recognition method using DCT/LDA, in which LDA is applied to the DCT coefficients of an input face image. In the proposed method, the SSS problem of LDA, which arises when the number of training samples is smaller than the dimension of the feature space, is avoided by expressing the input image in a low-dimensional space using DCT coefficients. In terms of recognition rate, the proposed method and the PCA/LDA method show almost equal performance, while the training time of the proposed method is much shorter. This is because DCT has a fixed number of basis vectors while its energy compaction property is similar to that of PCA. Although the results depend on the number of coefficients employed for recognition, the experiments show that the recognition rate of the proposed method is comparable to that of the PCA/LDA method and other DCT/LDA methods, and that it can be trained 13,000 times faster than the PCA/LDA method.

Implementation of a Falls Recognition System Using Acceleration and Angular Velocity Signals (가속도 및 각속도 신호를 이용한 낙상 인지 시스템 구현)

  • Park, Geun-Chul;Jeon, A-Young;Lee, Sang-Hoon;Son, Jung-Man;Kim, Myoung-Chul;Jeon, Gye-Rok
    • Journal of Sensor Science and Technology / v.22 no.1 / pp.54-64 / 2013
  • In this study, we developed a fall recognition system that transmits SMS data through CDMA communication, using a three-axis acceleration sensor and a two-axis gyro sensor. Five healthy men were selected as a control group, and the fall recognition system was used to conduct an experiment. The system was attached to the upper part of the sternum. According to the experiment protocol, the experiment was repeated 3 times for each of 3 specific protocols: falling during gait, falling in a stopped state, and falling in everyday life. Data obtained from the fall recognition system, together with LabVIEW 8.5, were used to decide whether a fall matched the criteria defined in an analysis program applying the algorithm proposed in this study. In addition, the fall recognition results were transmitted to a designated cellular phone in SMS (Short Message Service) form. The results show that the erroneous detection rate for falls was 19% when using the acceleration signal only, 6% when using the angular velocity, and 2% when using the proposed algorithm. This finding suggests that the erroneous detection rate improves when the proposed algorithm, which incorporates both acceleration and angular velocity, is applied. We therefore suggest that the fall recognition system implemented in this study can contribute to recognizing falls of the aged or the disabled.
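
The abstract does not give the decision rule, so the following is only a sketch of one plausible way to combine the two signals: a fall is flagged when both the acceleration magnitude and the angular velocity magnitude exceed a threshold. The function name, both threshold values, and the units are assumptions, not the algorithm proposed in the paper.

```python
import math


def is_fall(accel_g, gyro_dps, accel_threshold=2.5, gyro_threshold=200.0):
    """Flag a fall only when both signals exceed their (hypothetical) thresholds.

    accel_g:  three-axis acceleration sample, in g
    gyro_dps: two-axis angular velocity sample, in degrees per second
    """
    accel_mag = math.sqrt(sum(a * a for a in accel_g))
    gyro_mag = math.sqrt(sum(w * w for w in gyro_dps))
    return accel_mag >= accel_threshold and gyro_mag >= gyro_threshold


standing = is_fall((0.0, 0.0, 1.0), (0.0, 0.0))
sitting_down_hard = is_fall((2.0, 2.0, 1.0), (10.0, 10.0))
falling = is_fall((2.0, 2.0, 1.0), (150.0, 150.0))
```

Requiring both conditions is one way a combined rule can reject events, such as sitting down abruptly, that trip an acceleration-only detector.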

A Study on Speech Recognition using Recurrent Neural Networks (회귀신경망을 이용한 음성인식에 관한 연구)

  • 한학용;김주성;허강인
    • The Journal of the Acoustical Society of Korea / v.18 no.3 / pp.62-67 / 1999
  • In this paper, we investigate a reliable model of the Predictive Recurrent Neural Network for speech recognition. Predictive Neural Networks are modeled in syllable units. For a given input syllable, the model that gives the minimum prediction error is taken as the recognition result. The Predictive Neural Network, which has a recurrent structure, was composed to bring the dynamic features of the speech pattern into the network. We compared the recognition ability of the recurrent networks proposed by Elman and Jordan. ETRI's SAMDORI was used as the speech DB. In order to find a reliable neural network model, the changes in the two recognition rates were compared under two conditions: (1) changing the prediction order and the number of hidden units; and (2) accumulating previous values with a self-loop coefficient in the context. The results show that the optimum prediction order, number of hidden units, and self-loop coefficient respond differently depending on the structure of the neural network used. In general, however, Jordan's recurrent network shows a relatively higher recognition rate than Elman's. The effect of the self-loop coefficient on the recognition rate varied with the structure of the neural network and the coefficient values.
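
The decision rule described here, one predictive model per syllable with the smallest prediction error winning, can be sketched as follows. The predictors are trivial stand-in functions rather than recurrent networks, and the model names, scalar frames, and prediction order of 1 are assumptions chosen to keep the example small.

```python
def prediction_error(predict, frames, order=1):
    """Sum of squared errors when predicting each frame from the `order` previous ones."""
    error = 0.0
    for t in range(order, len(frames)):
        predicted = predict(frames[t - order:t])
        error += (predicted - frames[t]) ** 2
    return error


def recognize(models, frames, order=1):
    """Return the name of the model with the minimum prediction error."""
    return min(models, key=lambda name: prediction_error(models[name], frames, order))


# Hypothetical per-syllable predictors (stand-ins for trained networks).
models = {
    "ga": lambda history: history[-1],        # predicts no change
    "na": lambda history: history[-1] + 1.0,  # predicts a rise of 1 per frame
}

result = recognize(models, [1.0, 2.0, 3.0, 4.0])
```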


Continuous Digit Recognition Using the Weight Initialization and LR Parser

  • Choi, Ki-Hoon;Lee, Seong-Kwon;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.15 no.2E / pp.14-23 / 1996
  • This paper is a study on a neural network that recognizes phonemes, weight initialization that reduces learning time, and an LR parser for continuous speech recognition. The neural network spots the phonemes in continuous speech, and the LR parser parses the output of the neural network. The phonemes recognized by the neural network are divided into several groups by phoneme similarity, and each group is handled by its own neural network. Each group's network consists of a network that recognizes the phonemes of its own group and a VGNN (Verify Group Neural Network), which judges whether the inputs belong to its group. The weights of the neural network are not initialized with random values but from the learning data, to reduce learning time. The LR parsing method applied in this paper does not trace a unique path but traces several possible paths, because the output of the neural network is not accurate. The parser processes the continuous speech frame by frame, accumulating the output of the neural network along several possible paths. If the accumulated path value drops below the threshold value, the path is deleted from the possible parsing paths. This paper applies the continuous speech recognition system to continuous Korean digit recognition. The recognition rate of isolated digits is 97% in speaker dependent, and 75% in speaker dependent. The recognition rate of continuous digits is 74% in speaker dependent.
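
The path handling described in this abstract, accumulating network outputs along several candidate paths and deleting a path once its accumulated value drops below a threshold, can be sketched as below. The sketch leaves out the LR grammar entirely; the function name, the use of log scores, and the threshold of -2.0 are assumptions.

```python
def extend_and_prune(paths, frame_scores, threshold):
    """Extend every surviving path by one frame and drop paths below the threshold.

    paths:        dict mapping a path (tuple of symbols) to its accumulated score
    frame_scores: dict mapping each symbol to its score for the current frame
    """
    survivors = {}
    for path, score in paths.items():
        for symbol, frame_score in frame_scores.items():
            accumulated = score + frame_score
            if accumulated >= threshold:
                survivors[path + (symbol,)] = accumulated
    return survivors


# Hypothetical per-frame log scores for two symbols over two frames.
paths = {(): 0.0}
for frame_scores in [{"a": -0.1, "b": -3.0}, {"a": -0.5, "b": -0.2}]:
    paths = extend_and_prune(paths, frame_scores, threshold=-2.0)
```

After the first frame only the path through "a" survives, so the second frame extends one path instead of two; in a real parser the grammar would restrict which symbols may follow each path.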


Development of vision system for the recognition of character image which was included at the slab image (슬라브 영상에 포함된 문자영상의 인식을 위한 비전시스템의 개발)

  • Park, Sang-Gug
    • Journal of Korea Society of Industrial Information Systems / v.12 no.1 / pp.95-100 / 2007
  • In steel and iron processing lines, characters are marked on the surface of the material for material management. This paper describes a vision system developed to recognize the material management characters included in the slab image. The vision system consists of a CCD camera system that acquires the slab image, an optical transmission system that transmits the captured image over a long distance, an input and output system that interfaces with the existing system, and a monitoring system for checking the recognition results. We installed and tested the vision system at a continuous casting line, and we inspected its durability, reliability, and recognition rate. Through the testing, we confirmed that the system has a high recognition rate of 97.4%.


A Study on the Korean Broadcasting Speech Recognition (한국어 방송 음성 인식에 관한 연구)

  • 김석동;송도선;이행세
    • The Journal of the Acoustical Society of Korea / v.18 no.1 / pp.53-60 / 1999
  • This paper is a study on Korean broadcasting speech recognition. We present methods for large vocabulary continuous speech recognition. Our main concerns are the language modeling and the search algorithm. The acoustic model is a uni-phone semi-continuous hidden Markov model, and the language model is an N-gram model. The search algorithm consists of three phases in order to utilize all available acoustic and linguistic information. First, we use a forward Viterbi beam search to find word end frames and to estimate the related scores. Second, we use a backward Viterbi beam search to find word begin frames and to estimate the related scores. Finally, we use an A* search to combine the two results with the N-gram language model and obtain the recognition results. Using these methods, a maximum word recognition rate of 96.0% and a syllable recognition rate of 99.2% are achieved for speaker-independent continuous speech recognition with a vocabulary of about 12,000 words.
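
As a rough illustration of the first search phase, the sketch below runs a forward Viterbi pass over a small discrete HMM and prunes states whose score falls more than a beam width below the best one. It works at the state level only and omits word boundaries, the backward pass, the N-gram model, and the A* combination; the function name, the beam width, and the toy two-state model are assumptions.

```python
import math


def viterbi_beam(observations, states, log_start, log_trans, log_emit, beam=10.0):
    """Best state path and its log score, pruning states outside the beam."""
    scores = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        best = max(scores.values())
        # Keep only states within `beam` of the current best score.
        active = {s: v for s, v in scores.items() if v >= best - beam}
        new_scores, pointers = {}, {}
        for s in states:
            prev = max(active, key=lambda p: active[p] + log_trans[p][s])
            new_scores[s] = active[prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(scores, key=scores.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, scores[path[-1]]


# Hypothetical two-state model: A mostly emits "x", B mostly emits "y".
states = ["A", "B"]
log_start = {s: math.log(0.5) for s in states}
log_trans = {p: {s: math.log(0.5) for s in states} for p in states}
log_emit = {
    "A": {"x": math.log(0.9), "y": math.log(0.1)},
    "B": {"x": math.log(0.1), "y": math.log(0.9)},
}

best_path, best_score = viterbi_beam(["x", "y", "x"], states, log_start, log_trans, log_emit)
```

A narrower beam reduces the work per frame at the risk of pruning the state that would have led to the best overall path.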
