Search | Korea Science

Weighted Disassemble-based Correction Method to Improve Recognition Rates of Korean Text in Signboard Images (간판영상에서 한글 인식 성능향상을 위한 가중치 기반 음소 단위 분할 교정)

Lee, Myung-Hun;Yang, Hyung-Jeong;Kim, Soo-Hyung;Lee, Guee-Sang;Kim, Sun-Hee
- The Journal of the Korea Contents Association / v.12 no.2 / pp.105-115 / 2012
In this paper, we propose a correction method based on phoneme-unit segmentation that uses a weighted Disassemble Levenshtein Distance to correct misrecognized Korean text in signboard images. The proposed method segments the recognized text into phoneme units, computes distances over those units, and retrieves the best-matching text from a signboard text database. To verify the method, a dictionary database was built from 1.3 million nationwide signboard words with duplicates removed. We compared the proposed method with Levenshtein Distance and Disassemble Levenshtein Distance, two representative string comparison algorithms. The proposed method improved recognition rates by 29.85% and 6% on average over these two conventional methods, respectively.
https://doi.org/10.5392/JKCA.2012.12.02.105
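
The abstract above describes comparing recognized text with dictionary entries at the phoneme (jamo) level. The following is a minimal Python sketch of that idea, not the authors' implementation: the syllable decomposition uses standard Unicode Hangul arithmetic, while the `SIMILAR` weight table and the function names `decompose`, `sub_cost`, `weighted_distance`, and `best_match` are hypothetical, since the abstract does not state the actual weights.

```python
# Hangul jamo tables in Unicode composition order.
CHO = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"          # 19 initial consonants
JUNG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"     # 21 vowels
JONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals

# Hypothetical weights (an assumption, not taken from the paper):
# substitutions between these jamo pairs cost less than 1.0.
SIMILAR = {frozenset("ㄱㅋ"): 0.5, frozenset("ㅏㅑ"): 0.5}


def decompose(text):
    """Split precomposed Hangul syllables into jamo; keep other characters."""
    jamo = []
    for ch in text:
        code = ord(ch) - 0xAC00
        if 0 <= code < 11172:                      # precomposed syllable range
            jamo.append(CHO[code // 588])
            jamo.append(JUNG[(code % 588) // 28])
            if code % 28:                          # final consonant is optional
                jamo.append(JONG[code % 28])
        else:
            jamo.append(ch)
    return jamo


def sub_cost(a, b):
    """Substitution cost: 0 if equal, a reduced weight for listed pairs, else 1."""
    if a == b:
        return 0.0
    return SIMILAR.get(frozenset((a, b)), 1.0)


def weighted_distance(s, t):
    """Levenshtein distance over jamo sequences with weighted substitutions."""
    a, b = decompose(s), decompose(t)
    prev = [float(j) for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        cur = [float(i)]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1.0,               # deletion
                           cur[j - 1] + 1.0,            # insertion
                           prev[j - 1] + sub_cost(x, y)))  # substitution
        prev = cur
    return prev[-1]


def best_match(recognized, dictionary):
    """Return the dictionary entry closest to the recognized text."""
    return min(dictionary, key=lambda w: weighted_distance(recognized, w))
```

For example, `best_match("힌글", ["한국", "한글"])` selects the entry that differs by a single vowel jamo. A real system would likely index the dictionary rather than scan all entries linearly.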

The Structure of Korean Consonants as Perceived by the Japanese (일본인이 지각하는 한국어 자음의 구조)

Bae, Moon-Jung;Kim, Jung-Oh
- Korean Journal of Cognitive Science / v.19 no.2 / pp.163-175 / 2008
Twelve Japanese students living in South Korea were examined for their perceptual identification of the initial consonant in Korean syllables presented with or without white noise. The resulting confusion matrix was then subjected to additive clustering, individual difference scaling, and information transmission probability analyses, and the results were compared with those of South Koreans. The Japanese participants confused /다/ and /타/ most frequently, followed by /가/ and /카/, /자, 차, 짜/, /타/ and /따/, and so on. The additive clustering results for the Japanese differed significantly from those for the South Koreans. Individual difference scaling revealed the dimensions of sonorant, aspiration, and coronal. While South Koreans showed binary values on the aspiration and tenseness dimensions, the Japanese showed continuous values on these dimensions. The information transmission probability analysis revealed that the Japanese participants perceived laryngeal features such as tenseness and aspiration less well than the South Korean participants. The Japanese group, however, perceived place-of-articulation features such as labial and coronal very well. The present results suggest that an approach dealing with the structure of base representations is important for understanding the phonological categories of languages.

Improvement of Keyword Spotting Performance Using Normalized Confidence Measure (정규화 신뢰도를 이용한 핵심어 검출 성능향상)

Kim, Cheol;Lee, Kyoung-Rok;Kim, Jin-Young;Choi, Seung-Ho;Choi, Seung-Ho
- The Journal of the Acoustical Society of Korea / v.21 no.4 / pp.380-386 / 2002
Conventional post-processing, such as the confidence measure (CM) proposed by Rahim, computes phone-level CMs from the likelihoods of a phoneme model and its anti-model, and then obtains the word-level CM by averaging the phone-level CMs [1]. In the conventional method, the CMs of some specific keywords are very low, so those keywords are usually rejected. The reason is that the statistics of phone-level CMs are not consistent. In other words, phone-level CMs have different probability density functions (pdfs) for each phone, especially each tri-phone. To overcome this problem, we propose a normalized confidence measure (NCM). Our approach transforms the CM pdf of each tri-phone to the same pdf under the assumption that CM pdfs are Gaussian. To evaluate our method we use a common keyword spotting system, in which context-dependent HMM models are used for keyword utterances and context-independent HMM models are applied to non-keyword utterances. The experimental results show that the proposed NCM reduced the FAR (false alarm rate) from 0.44 to 0.33 FA/KW/HR (false alarms/keyword/hour) when the MDR is about 8%, a 25% improvement in FAR.
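
The normalization described in the abstract above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' implementation: it assumes each tri-phone's CM distribution is summarized by a mean and standard deviation estimated beforehand, and that the common target pdf is the standard normal (the abstract only says "the same pdf"). The tri-phone labels, statistics, and function names `normalize_cm` and `word_ncm` are hypothetical.

```python
from statistics import fmean


def normalize_cm(cm, mean, std):
    """Map a phone-level CM to a z-score using its tri-phone's Gaussian statistics."""
    return (cm - mean) / std


def word_ncm(phone_cms, stats):
    """Average the normalized phone-level CMs to get a word-level score.

    phone_cms: list of (triphone_label, raw_cm) pairs for one keyword hypothesis.
    stats:     dict mapping triphone_label -> (mean, std) of its raw CM.
    """
    return fmean([normalize_cm(cm, *stats[tp]) for tp, cm in phone_cms])


# Hypothetical statistics for two tri-phones with very different CM ranges.
stats = {"k-a+n": (2.0, 1.0), "a-n+g": (-1.0, 0.5)}
hypothesis = [("k-a+n", 3.0), ("a-n+g", -1.5)]

raw_word_cm = fmean([cm for _, cm in hypothesis])   # plain averaging of raw CMs
normalized_word_cm = word_ncm(hypothesis, stats)    # averaging after normalization
```

After normalization, every tri-phone contributes on the same scale, so a single rejection threshold can be applied to the word-level score regardless of which tri-phones the keyword contains.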

A Study on the Development of Korea Telecom Automatic Voice Recognition System (음성인식에 의한 연구센타 부서안내 시스팀 개발에 관한 연구)

Koo, Myoung-Wan;Sohn, Il-Hyun;Doh, Sam-Joo;Lee, Jong-Rak
- Annual Conference on Human and Language Technology / 1992.10a / pp.185-192 / 1992
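This paper describes KARS (Korea Telecom Automatic voice Recognition System), a research center department guidance system based on speech recognition technology. The system is basically similar to a voice response system, but differs in that it uses speech instead of push buttons for command input. When a user enters a voice command through a microphone, the system recognizes the command and provides a brief introduction, the telephone number, and the location of each department in the research center. The system is a speaker-independent isolated-word recognition system using HMMs (Hidden Markov Models) and can recognize 123 words, consisting of 116 department names and 7 control words. It uses phoneme-like Korean subword units as the basic HMM unit, and recognition experiments yielded a recognition rate of 98.6%.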

Error Correction Methode Improve System using Out-of Vocabulary Rejection (미등록어 거절을 이용한 오류 보정 방법 개선 시스템)

Ahn, Chan-Shik;Oh, Sang-Yeob
- Journal of Digital Convergence / v.10 no.8 / pp.173-178 / 2012
When a model is generated for the recognition vocabulary, tri-phones that were not prepared in advance are produced. As a drawback, the model cannot generate initial parameter estimates for such words, and the system cannot configure the model. As a result, the Gaussian models become less precise, which degrades recognition. We propose an error correction system that uses an out-of-vocabulary rejection algorithm. When the system builds a vocabulary recognition model, recognition rates improve because vocabulary that is not registered is rejected. In addition, the system captures lexical analysis and meaning using probability distributions, and it deactivates the string before the phoneme change is applied. The system determines the error correction rate using phoneme similarity and reliability. In a performance comparison, the error correction rate improved by 2.8% over methods using error patterns, fault patterns, and meaning patterns.
https://doi.org/10.14400/JDPM.2012.10.8.173

A Study on the Improvement of Automatic Text Recognition of Road Signs Using Location-based Similarity Verification (위치기반 유사도 검증을 이용한 도로표지 안내지명 자동인식 개선방안 연구)

Chong, Kyusoo
- The Journal of The Korea Institute of Intelligent Transport Systems / v.18 no.6 / pp.241-250 / 2019
Road signs are guide facilities for road users, and the Ministry of Land, Infrastructure and Transport has established and operates a system to make managing these road signs more convenient. The role of road signs will decrease with future autonomous driving, but they will continue to be needed. For accurate machine recognition of text on road signs, automatic road sign recognition equipment has been developed that applies image-based text recognition technology. Yet there are many cases of misrecognition due to irregular specifications and external environmental factors such as manual manufacturing, illumination, light reflection, and rainfall. The purpose of this study is to derive location-based destination names for finding misrecognition errors that cannot be overcome by image analysis, and to improve the automatic recognition of destination names on road signs by using a Levenshtein similarity verification method based on phoneme separation.
https://doi.org/10.12815/kits.2019.18.6.241

Performance Improvement of Connected Digit Recognition by Considering Phonemic Variations in Korean Digit and Speaking Styles (한국어 숫자음의 음운변화 및 화자 발성특성을 고려한 연결숫자 인식의 성능향상)

송명규;김형순
- The Journal of the Acoustical Society of Korea / v.21 no.4 / pp.401-406 / 2002
Each Korean digit consists of a single syllable, so recognizers, and even Korean speakers, often have difficulty recognizing them. When digit strings are pronounced, the original pronunciation of each digit changes considerably due to co-articulation. In addition, distortion caused by various channels and noise degrades the recognition performance for Korean connected digit strings. This paper presents techniques to improve this performance, including defining a set of PLUs that accounts for phonemic variations in Korean digits and constructing a recognizer that handles speakers' various speaking styles. In speaker-independent connected digit recognition experiments using telephone speech, the proposed techniques with 1 Gaussian/state gave a string accuracy of 83.2%, i.e., a 7.2% error rate reduction relative to the baseline system. With 11 Gaussians/state, we achieved the highest string accuracy of 91.8%, i.e., a 4.7% error rate reduction.

A study on the connected-digit recognition using MLP-VQ and Weighted DHMM (MLP-VQ와 가중 DHMM을 이용한 연결 숫자음 인식에 관한 연구)

Chung, Kwang-Woo;Hong, Kwang-Seok
- Journal of the Korean Institute of Telematics and Electronics S / v.35S no.8 / pp.96-105 / 1998
This paper proposes WDHMM (Weighted DHMM), which uses MLP-VQ to improve a speaker-independent connected-digit recognition system. The output distribution of an MLP neural network is a probability distribution that represents the degree of similarity between patterns through a non-linear mapping between input patterns and learned patterns. The proposed MLP-VQ generates codewords from the index of the output node with the highest value in the MLP output distribution. Unlike conventional VQ, MLP-VQ allows the degree of similarity between the current input pattern and each learned class pattern to be reflected in the recognition model. The proposed WDHMM uses the MLP output distribution to weight the symbol generation probabilities of DHMMs. This method can shorten HMM parameter estimation and recognition time because, unlike the conventional SCHMM, it does not need to treat the symbol generation probability as a multi-dimensional normal distribution. It also improves recognition by 14.7% over DHMM with only a small increase in computation, because it can reflect phone class relations in the recognition model. The experimental results show that speaker-independent connected-digit recognition using MLP-VQ and WDHMM achieves 84.22%.

Search Result 18, Processing Time 0.019 seconds
