Search | Korea Science

Speech Recognition in the Car Noise Environment (자동차 소음 환경에서 음성 인식)

김완구;차일환;윤대희
- Journal of the Korean Institute of Telematics and Electronics B / v.30B no.2 / pp.51-58 / 1993
This paper describes the development of a speaker-dependent isolated word recognizer as applied to voice dialing in a car noise environment. For this purpose, several methods to improve performance under such conditions are evaluated using a database collected in a small car moving at 100 km/h. The main features of the recognizer are as follows. The endpoint detection error can be reduced by using the magnitude of the signal inverse filtered by the AR model of the background noise, and it can be compensated for by using variants of the DTW algorithm. To remove the noise, an autocorrelation subtraction method is used with the constraint that the residual energy obtainable by linear predictive analysis should be positive. By using a noise-robust distance measure, distortion of the feature vector is minimized. The speech recognizer is implemented using the Motorola DSP56001 (a 24-bit general-purpose digital signal processor). The recognition database is composed of 50 Korean names spoken by 3 male speakers. The recognition error rate of the system is reduced to 4.3% using a single reference pattern for each word and to 1.5% using 2 reference patterns for each word.
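As an illustration of the template-matching step, the following is a minimal sketch of classic DTW with a nearest-template decision, assuming each utterance is a list of feature vectors. It is not the DTW variant used in the paper, and the names `dtw_distance`, `recognize` and the layout of `templates` are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors.

    Assumption: plain Euclidean local distance and the basic
    (i-1, j), (i, j-1), (i-1, j-1) step pattern, not the paper's variant.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(query, templates):
    """Return the word whose closest reference pattern has the lowest DTW cost.

    `templates` (hypothetical layout) maps each word to a list of one or
    more reference patterns, mirroring the 1- and 2-reference setups.
    """
    return min(
        templates,
        key=lambda word: min(dtw_distance(query, ref) for ref in templates[word]),
    )
```

With more than one reference pattern per word, the decision simply takes the best match among that word's references.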

Developments of Glove-based Input Device. (장갑형 입력장치의 개발)

원대희;이호길;김진영;박종현
- Proceedings of the Korean Society of Precision Engineering Conference / 2001.04a / pp.211-216 / 2001
Recently, research on mobile computing technologies such as PDAs, Palm PCs and wearable computing has been widely under way, especially for input devices. Mobile input methods include speech recognition, handwriting recognition and the cording type. However, these systems have problems as data input apparatus, such as input speed and recognition rate. This paper presents a glove-based input device which could solve this data input problem. The experimental results suggest that the proposed input method, which utilizes the hand's movement, is appropriate for effective mobile input devices.

Robust Speech Recognition Using Real-Time High Order Statistics Normalization and Smoothing Filter (실시간 고차통계 정규화와 Smoothing 필터를 이용한 강인한 음성인식)

Jeong, Ju-Hyun;Song, Hwa-Jeon;Kim, Hyung-Soon
- Proceedings of the KSPS conference / 2005.04a / pp.91-94 / 2005
The performance of speech recognition is degraded by the mismatch between training and test environments. Many methods have been presented to compensate for additive noise and channel effects in the cepstral domain, and Cepstral Mean Subtraction (CMS) is the representative method among them. Recently, a high-order cepstral moment normalization method was introduced to improve recognition accuracy. In this paper, we apply a high-order moment normalization method and a smoothing filter for real-time processing. In experiments using the Aurora2 DB, we obtained an error rate reduction of 49.7% with the proposed algorithm in comparison with the baseline system.
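For reference, Cepstral Mean Subtraction in its basic batch form can be sketched as below, assuming `frames` is a non-empty list of equal-length cepstral vectors for one utterance. This covers only first-order (mean) normalization, not the high-order moment normalization or the smoothing filter of the paper, and the function name is hypothetical.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over an utterance from every frame.

    Assumption: batch processing over the whole utterance; a real-time
    system would need a running estimate of the mean instead.
    """
    n = len(frames)
    dims = len(frames[0])
    # Mean of each cepstral coefficient across all frames
    mean = [sum(frame[k] for frame in frames) / n for k in range(dims)]
    return [[frame[k] - mean[k] for k in range(dims)] for frame in frames]
```

A constant channel offset added to every frame shifts the mean by the same amount, so it cancels out after the subtraction.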

Chinese Pronunciation Correction System for Korean learners (한국인을 위한 중국어 발음 교정 시스템)

Kim, Hyo-Sook;Kim, Sun-Ju;Kang, Hyo-Won;Kim, Mu-Jung;Ha, Jin-Young
- Proceedings of the KSPS conference / 2005.04a / pp.45-48 / 2005
This study is about constructing an L2 pronunciation correction system for L1 speakers using speech technology. The Chinese pronunciation system consists of initials, finals and tones. Initials and finals are at the segmental level and tones are at the suprasegmental level, so different methods could be used in assessing Korean users' Chinese. The recognition rate of initials is 81.9% and that of finals is 68.7% with the standard acoustic model. Unlike native speech recognition, nonnative speech recognition could be improved by additional modeling using L2 speakers' speech. As a first step for this task, we analysed nonnative speech and then set a strategy for modeling Korean speakers' speech.

A Distinction of the Korean Character, Chinese Character and English Character using the Threshold Stroke Density (임계 획 밀도를 이용한 한글, 한자, 영문구분)

원남식
- Journal of Korea Society of Industrial Information Systems / v.5 no.4 / pp.32-38 / 2000
Distinguishing the kind of character before character recognition is an important factor in increasing the recognition rate of a document recognition system that handles multi-font and multi-letter documents. The letters of each country have various characteristics in their composition. In this paper, we use the stroke density as a method to distinguish the letters, and it is applied to Korean, English and Chinese characters. Input data is normalized in order to handle multi-font documents. Experimental results show that the proposed method distinguishes Korean and English with a probability of more than 80%.

Learning Rules for Partially Occluded Object Recognition (부분적으로 가려진 물체의 인식 룰의 습득)

정재영;김문현
- Journal of the Korean Institute of Telematics and Electronics / v.27 no.6 / pp.954-962 / 1990
Expertise in recognizing an object despite every possible occlusion among objects is difficult to provide directly to a system. In this paper, we propose a method for inferring inherent shape characteristics of an object from the training views provided. The method learns rules incrementally by alternating the rule induction process from a limited number of training views and the rule verification process from the following training views. The learned rules are represented using logical expressions to enhance readability. The proposed method is tested by simulating occlusions on 2-dimensional objects to examine the learning process and to show improvement of the recognition rate. The result shows that it can be applied to a practical system for 3-dimensional object recognition.

A MNN(Modular Neural Network) for Robot Endeffector Recognition (로봇 Endeffector 인식을 위한 모듈라 신경회로망)

김영부;박동선
- Proceedings of the IEEK Conference / 1999.06a / pp.496-499 / 1999
This paper describes a modular neural network (MNN) for a vision system which tracks a given object using a sequence of images from a camera unit. The MNN is used to precisely recognize the given robot endeffector and to minimize the processing time. Since the robot endeffector can be viewed in many different shapes in 3-D space, an MNN structure, which contains a set of feedforward neural networks, can be more attractive in recognizing the given object. Each single neural network learns the endeffector with a cluster of training patterns. The training patterns for a neural network share similar characteristics so that they can be easily trained. The trained MNN is less sensitive to noise and shows better performance in recognizing the endeffector. The recognition rate of the MNN is enhanced by 14% over the single neural network. A vision system with the MNN can precisely recognize the endeffector and place it at the center of a display for a remote operator.

A Study on the Speech Recognition using Advanced Competitive Learning (개선된 경쟁학습을 이용한 음성인식)

Song, Joon-Gyu;Lee, Dong-Wook;Kim, Young-T.
- Proceedings of the KIEE Conference / 1997.11a / pp.594-596 / 1997
This paper presents a speaker-dependent Korean isolated digit recognition system using advanced competitive learning. Since competitive learning algorithms are easy and simple to implement, they are used in various fields. The proposed recognition algorithm consists of three procedures: comparing the winning number of codebook vectors, selecting the representative vector out of the codebook vectors, and generating a new codebook with the representative vectors. In this paper, we use a Sound Blaster 16 for obtaining speech data. Speech data are sampled at 16 bits and an 11 kHz sampling rate.
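The basic idea behind competitive learning of a codebook can be sketched as follows. This is plain winner-take-all learning under assumed settings, not the advanced scheme proposed in the paper; `train_codebook`, `lr` and `epochs` are hypothetical names and values.

```python
def train_codebook(vectors, codebook, lr=0.1, epochs=10):
    """Plain winner-take-all competitive learning.

    For each input vector, only the nearest codebook vector (the winner)
    is moved a fraction `lr` of the way toward that input.
    """
    codebook = [list(c) for c in codebook]  # copy so the caller's list is untouched
    for _ in range(epochs):
        for x in vectors:
            # Winner = index of the codebook vector with the smallest squared distance
            winner = min(
                range(len(codebook)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])),
            )
            codebook[winner] = [c + lr * (a - c) for a, c in zip(x, codebook[winner])]
    return codebook
```

Counting how often each codebook vector wins would be a natural starting point for the "winning number" comparison mentioned in the abstract, but how the paper uses those counts is not specified here.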

Discriminative Training of Predictive Neural Network Models (예측신경회로망 모델의 변별력 있는 학습)

Na, Kyung-Min;Rheem, Jae-Yeol;Ann, Sou-Guil
- The Journal of the Acoustical Society of Korea / v.13 no.1E / pp.64-70 / 1994
Predictive neural network models are powerful speech recognition models based on nonlinear pattern prediction, but they suffer from poor discrimination between acoustically similar words. In this paper, we propose a discriminative training algorithm for predictive neural network models. This algorithm is derived from the GPD (Generalized Probabilistic Descent) algorithm coupled with MCEF (Minimum Classification Error Formulation). It allows direct minimization of the recognition error rate. Evaluation of our training algorithm on ten Korean digits shows its effectiveness with a 30% reduction in recognition error.

Performance Improvement of SPLICE-based Noise Compensation for Robust Speech Recognition (강인한 음성인식을 위한 SPLICE 기반 잡음 보상의 성능향상)

Kim, Hyung-Soon;Kim, Doo-Hee
- Speech Sciences / v.10 no.3 / pp.263-277 / 2003
One of the major problems in speech recognition is performance degradation due to the mismatch between the training and test environments. Recently, Stereo-based Piecewise LInear Compensation for Environments (SPLICE), a frame-based bias removal algorithm for cepstral enhancement that uses stereo training data and models noisy speech as a mixture of Gaussians, was proposed and showed good performance in noisy environments. In this paper, we propose several methods to improve the conventional SPLICE. First, we apply Cepstral Mean Subtraction (CMS) as a preprocessor to SPLICE, instead of applying it as a postprocessor. Second, to compensate for residual distortion after SPLICE processing, a two-stage SPLICE is proposed. Third, we employ phonetic information for training the SPLICE model. In experiments on the Aurora 2 database, the proposed method outperformed the conventional SPLICE, and we achieved a 50% decrease in word error rate over the Aurora baseline system.
