• Title/Summary/Keyword: recognition-rate

A Minimum-Error-Rate Training Algorithm for Pattern Classifiers and Its Application to the Predictive Neural Network Models (패턴분류기를 위한 최소오차율 학습알고리즘과 예측신경회로망모델에의 적용)

  • 나경민;임재열;안수길
    • Journal of the Korean Institute of Telematics and Electronics B / v.31B no.12 / pp.108-115 / 1994
  • Most pattern classifiers have been designed around the ML (Maximum Likelihood) training algorithm, which is simple and relatively powerful. ML training efficiently estimates the model parameters of each class individually, under the assumption that all class models in a classifier are statistically independent. That assumption, however, does not hold in many real situations, which degrades the performance of the classifier. In this paper, we propose a minimum-error-rate training algorithm based on the MAP (Maximum a Posteriori) approach. The algorithm regards the normalized outputs of the classifier as estimates of the a posteriori probability and tries to maximize those estimates. According to Bayes decision theory, the proposed algorithm satisfies the condition for minimum-error-rate classification. We apply this algorithm to the NPM (Neural Prediction Model) for speech recognition and derive new discriminative training algorithms. Experimental results on recognition of the ten Korean digits show a 37.5% reduction in the number of recognition errors.
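
The idea of treating normalized classifier outputs as a posteriori estimates can be illustrated with a minimal sketch. It is not the authors' algorithm: the softmax normalization, the function names, and the log-posterior objective are all assumptions chosen for illustration.

```python
import math

def posterior_estimates(scores):
    """Normalize raw class scores so they sum to 1.

    Softmax is an assumed normalization; the abstract does not say
    which normalization the paper uses.
    """
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores):
    """Bayes decision: pick the class with the largest posterior estimate."""
    post = posterior_estimates(scores)
    return max(range(len(post)), key=post.__getitem__)

def training_objective(scores, true_class):
    """Log of the posterior estimate for the correct class.

    A discriminative trainer would adjust model parameters to increase
    this value (hypothetical objective, for illustration only).
    """
    return math.log(posterior_estimates(scores)[true_class])
```

Unlike per-class ML estimation, this objective depends on the scores of all classes at once, so raising it for the correct class necessarily lowers the posterior estimates of the competing classes.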

Scalable High-quality Speech Reconstruction in Distributed Speech Recognition Environments (분산음성인식 환경에서 서버에서의 스케일러블 고품질 음성복원)

  • Yoon, Jae-Sam;Kim, Hong-Kook;Kang, Byung-Ok
    • Proceedings of the IEEK Conference / 2007.07a / pp.423-424 / 2007
  • In this paper, we propose a scalable high-quality speech reconstruction method for distributed speech recognition (DSR). It is difficult to reconstruct high-quality speech from MFCCs at the DSR server. Depending on the bit-rate available to the DSR system, we can send additional information associated with speech coding to the DSR server, where the bit-rate varies from 4.8 kbit/s to 11.4 kbit/s. The experimental results show that the speech quality reproduced by the proposed method at 11.4 kbit/s is comparable to that of ITU-T G.729 under both ideal-channel and frame-error-channel conditions, while DSR performance is maintained at the level of wireline speech recognition.

A Study Of Handwritten Digit Recognition By Neural Network Trained With The Back-Propagation Algorithm Using Generalized Delta Rule (신경망 회로를 이용한 필기체 숫자 인식에 관할 연구)

  • Lee, Kye-Han;Chung, Chin-Hyun
    • Proceedings of the KIEE Conference / 1999.07g / pp.2932-2934 / 1999
  • In this paper, we propose a scheme for recognizing handwritten digits using a multilayer neural network trained with the back-propagation algorithm and the generalized delta rule. The neural network is trained on handwritten digit data from different writers and in different styles. One purpose of work with neural networks is to minimize the mean square error (MSE) between the actual output and the desired one. The back-propagation algorithm is an efficient, classical method for this: to train the weights of a multilayer network, it uses the steepest-descent minimization procedure and the sigmoid threshold function. As the error rate is reduced, the recognition rate improves. We therefore propose a method that reduces the error rate.
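
As a rough illustration of the steepest-descent update mentioned above, the sketch below applies the delta rule to a single sigmoid unit. It is a simplified assumption rather than the paper's network: the function names, the learning rate, and the single-unit setup are hypothetical, and a full multilayer network would additionally propagate the delta terms backward through its hidden layers.

```python
import math

def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))

def delta_rule_step(weights, bias, inputs, target, lr=0.5):
    """One steepest-descent update of a single sigmoid unit.

    Minimizes the squared error 0.5 * (target - out)**2 and returns the
    updated weights, the updated bias, and the error before the update.
    """
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    out = sigmoid(net)
    # Error signal: output error times the sigmoid derivative out * (1 - out)
    delta = (target - out) * out * (1.0 - out)
    new_weights = [w + lr * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias + lr * delta
    return new_weights, new_bias, 0.5 * (target - out) ** 2
```

Calling `delta_rule_step` repeatedly on the same training pair moves the unit's output toward the target, so the returned squared error shrinks from one call to the next.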

Enhanced Fuzzy Single Layer Perceptron

  • Chae, Gyoo-Yong;Eom, Sang-Hee;Kim, Kwang-Baek
    • Journal of information and communication convergence engineering / v.2 no.1 / pp.36-39 / 2004
  • In this paper, a method of improving the learning speed and convergence rate is proposed to exploit the advantages of artificial neural networks and neuro-fuzzy systems. The method is applied to the XOR problem and the n-bit parity problem, which are used as benchmarks in the field of pattern recognition. It is also applied to the recognition of digital images as a practical image application. The experiments show that the method does not always guarantee convergence. However, the network showed considerable improvement in learning time and has a high convergence rate. The proposed network can be extended to any number of layers. In the single-layer case, the network learned quickly and processed huge images rapidly.

A Study on the Improvement of Tesseract-based OCR Model Recognition Rate using Ontology (온톨로지를 이용한 tesseract 기반의 OCR 모델 인식률 향상에 관한 연구)

  • Hwang, Chi-gon;Yun, Dai Yeol;Yoon, Chang-Pyo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.438-440 / 2021
  • With the development of machine learning, artificial intelligence techniques are being applied in various fields. One of these is OCR, which converts characters in images into text. Tesseract, developed by HP, is one such technique. However, its recognition rate for characters in images is still low. To address this, we try to improve the image-to-text conversion rate through a post-processing step that uses an ontology to recognize context.

A Study on Recognition Units for Korean Speech Recognition (한국어 분절음 인식을 위한 인식 단위에 대한 연구)

  • ;;Michael W. Macon
    • The Journal of the Acoustical Society of Korea / v.19 no.6 / pp.47-52 / 2000
  • When building a large-vocabulary speech recognition system, it is better to use the segment than the syllable or the word as the recognition unit. In this paper, we study suitable recognition units for Korean speech recognition. For the experiments, we use the OGI speech toolkit from the U.S.A. The results show that the recognition rate is higher when the diphthong is treated as a single unit than when it is treated as two units, i.e. a glide plus a vowel. The recognition rate is also better when the biphone is used as the recognition unit than when the mono-phoneme is used.

Development of Feature Extraction Algorithm for Finger Vein Recognition (지정맥 인식을 위한 특징 검출 알고리즘 개발)

  • Kim, Taehoon;Lee, Sangjoon
    • KIPS Transactions on Software and Data Engineering / v.7 no.9 / pp.345-350 / 2018
  • This study presents an algorithm for detecting the vein pattern features that are important for finger vein recognition. The feature detection algorithm matters because it greatly affects the results in pattern recognition. The recognition rate degrades because the reference changes as the finger position changes. In addition, in an image obtained by irradiating the finger with infrared light, it is difficult to separate the background from the blood vessel pattern, and detection time increases because image preprocessing is performed. To address this, the presented algorithm runs without image preprocessing, which reduces the detection time. The SWDA (Down Slope Trace Waveform) algorithm is applied to the finger vein images to detect the fingertip position and the vein pattern. Because of the low infrared transmittance, relatively dark vein images can be detected with minimal detection error. In addition, the fingertip position can be used as a reference in the classification stage to compensate for the decrease in the recognition rate. If the proposed algorithm is applied to other recognition fields such as the palm and wrist, it is expected to improve biometric feature detection accuracy and reduce recognition time.

Development an Android based OCR Application for Hangul Food Menu (한글 음식 메뉴 인식을 위한 OCR 기반 어플리케이션 개발)

  • Lee, Gyu-Cheol;Yoo, Jisang
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.5 / pp.951-959 / 2017
  • In this paper, we design and implement an Android-based Hangul food menu recognition application that recognizes characters in images captured by a smartphone. Optical Character Recognition (OCR) is divided into preprocessing, recognition, and post-processing. In preprocessing, characters are extracted using Maximally Stable Extremal Regions (MSER). In recognition, Tesseract-OCR, a free OCR engine, is used to recognize the characters. In post-processing, wrong results are corrected using a dictionary DB for the food menu. To evaluate the proposed method, experiments were conducted comparing recognition performance using actual menu boards as the DB. Recognition rates were measured against OCR Instantly Free, Text Scanner, and Text Fairy, character recognition applications in the Google Play Store. The experimental results show that the proposed method achieves an average recognition rate 14.1% higher than the other techniques.
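
The dictionary-based post-processing step can be sketched as follows. This is only an assumed approach: the abstract does not describe how the dictionary lookup works, so the similarity matching via Python's standard `difflib`, the function name `correct_with_dictionary`, the cutoff value, and the sample menu entries are all hypothetical.

```python
from difflib import get_close_matches

def correct_with_dictionary(ocr_word, menu_dictionary, cutoff=0.6):
    """Replace an OCR result with the most similar dictionary entry.

    Returns the OCR word unchanged when no entry is similar enough.
    Similarity matching is an assumption; the paper's correction rule
    is not given in the abstract.
    """
    matches = get_close_matches(ocr_word, menu_dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else ocr_word

# Hypothetical menu dictionary and a misrecognized final syllable
menu = ["김치찌개", "된장찌개", "비빔밥"]
corrected = correct_with_dictionary("김치찌게", menu)
```

Restricting the candidates to a small domain dictionary is what makes this kind of correction practical: a one-syllable OCR error still leaves the intended menu item as the closest match.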

A Basic Performance Evaluation of the Speech Recognition APP of Standard Language and Dialect using Google, Naver, and Daum KAKAO APIs (구글, 네이버, 다음 카카오 API 활용앱의 표준어 및 방언 음성인식 기초 성능평가)

  • Roh, Hee-Kyung;Lee, Kang-Hee
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.12 / pp.819-829 / 2017
  • In this paper, we first describe the current state of speech recognition technology and identify its basic techniques and algorithms, and then explain the API code flow needed for speech recognition. We use the application programming interfaces (APIs) of Google, Naver, and Daum Kakao, which operate the best-known search engines among speech recognition API providers, to create a voice recognition app in Android Studio. We then run a speech recognition experiment on standard language and dialects by gender, age, and region, and organize the recognition rates into a table. Experiments were conducted for Gyeongsang-do, Chungcheong-do, and Jeolla-do, provinces where the dialects are strong. Comparative experiments were also conducted on the standard language. The accuracy of each resulting sentence is checked in terms of word spacing, final consonants, postpositions, and words, and the number of errors of each type is counted. Finally, we aim to present the advantages of each API according to its speech recognition rate and to establish a basic framework for its most efficient use.

A Study on Neural Networks for Korean Phoneme Recognition (한국어 음소 인식을 위한 신경회로망에 관한 연구)

  • 최영배
    • Proceedings of the Acoustical Society of Korea Conference / 1992.06a / pp.61-65 / 1992
  • This paper studies neural networks for phoneme recognition and performs phoneme recognition using a TDNN (Time Delay Neural Network). It also proposes a new training algorithm for speech recognition with neural networks that is suited to large-scale TDNNs. Because phoneme recognition is indispensable for continuous speech recognition, this paper uses a TDNN to obtain accurate phoneme recognition results. The proposed training algorithm can bring the TDNN to convergence at an optimal state regardless of the number of phonemes to be recognized. Recognition on three phoneme classes shows a recognition rate of 9.1%. The paper also shows that the proposed algorithm is an efficient method for achieving high performance and reducing convergence time.
