• Title/Summary/Keyword: continuous speech

검색결과 314건 처리시간 0.03초

CPAP(Continuous Positive Airway Pressure) 치료 프로그램이 취학 전 구순.구개열 아동의 과대비성 개선에 미치는 효과 (The Effects of CPAP Therapy Program on Hypernasality in Preschool Children with Cleft Lips and Palates)

  • 조성미;정옥란;한기환
    • 음성과학
    • /
    • 제14권4호
    • /
    • pp.261-271
    • /
    • 2007
  • The purpose of this study was to determine the effectiveness of CPAP (Continuous Positive Airway Pressure) therapy on the treatment of hypernasality in patients with cleft lips and palates. 7 preschool children with severe hypernasality participated in the study. Acoustic measurements of nasality were done by using the NasalView (version 1.31). Results showed that the nasalance values were reduced linearly in both vowels according to the treatment period. The sharp treatment effect was observed at the beginning stage. The nasality values of the vowel /i/ showed a sharp decrease at the Evaluation Phase 1 and 2 and a small increase at the Phase 4 followed by a drop in the end. Further studies would be desirable for various patients with different disorder types.

  • PDF

A Korean Flight Reservation System Using Continuous Speech Recognition

  • Choi, Jong-Ryong;Kim, Bum-Koog;Chung, Hyun-Yeol;Nakagawa, Seiichi
    • The Journal of the Acoustical Society of Korea
    • /
    • 제15권3E호
    • /
    • pp.60-65
    • /
    • 1996
  • This paper describes on the Korean continuous speech recognition system for flight reservation. It adopts a frame-synchronous One-Pass DP search algorithm driven by syntactic constraints of context free grammar(CFG). For recognition, 48 phoneme-like units(PLU) were defined and used as basic units for acoustic modeling of Korean. This modeling was conducted using a HMM technique, where each model has 4-states 3-continuous output probability distributions and 3-discrete-duration distributions. Language modeling by CFG was also applied to the task domain of flight reservation, which consisted of 346 words and 422 rewriting rules. In the tests, the sentence recognition rate of 62.6% was obtained after speaker adaptation.

  • PDF

Frame-Correlated HMM을 이용한 음성 인식 (On the Use of a Frame-Correlated HMM for Speech Recognition)

  • 김남수
    • 한국음향학회:학술대회논문집
    • /
    • 한국음향학회 1994년도 제11회 음성통신 및 신호처리 워크샵 논문집 (SCAS 11권 1호)
    • /
    • pp.223-228
    • /
    • 1994
  • We propose a novel method to incorporate temporal correlations into a speech recognition system based on the conventional hidden Markov model. With the proposed method using the extended logarithmic pool, we approximate a joint conditional PD by separate conditional PD's associated with respective components of conditions. We provide a constrained optimization algorithm with which we can find the optimal value for the pooling weights. The results in the experiments of speaker-independent continuous speech recognition with frame correlations show error reduction by 13.7% with the proposed methods as compared to that without frame correlations.

  • PDF

한국어 음성인식을 위한 효율적인 사전 구성에 관한 연구 (Study on Efficient Generation of Dictionary for Korean Vocabulary Recognition)

  • 이상복;최대림;김종교
    • 대한음성학회:학술대회논문집
    • /
    • 대한음성학회 2002년도 11월 학술대회지
    • /
    • pp.41-44
    • /
    • 2002
  • This paper is related to the enhancement of speech recognition rate using enhanced pronunciation dictionary. Modern large vocabulary, continuous speech recognition systems have pronunciation dictionaries. A pronunciation dictionary provides pronunciation information for each word in the vocabulary in phonemic units, which are modeled in detail by the acoustic models. But in most speech recognition system based on Hidden Markov Model, actual pronunciation variations are disregarded. Without the pronunciation variations in the speech recognition system, the phonetic transcriptions in the dictionary do not match the actual occurrences in the database. In this paper, we proposed the unvoiced rule of semivowel in allophone rules to pronunciation dictionary. Experimental results on speech recognition system give higher performance than existing pronunciation dictionaries.

  • PDF

Review And Challenges In Speech Recognition (ICCAS 2005)

  • Ahmed, M.Masroor;Ahmed, Abdul Manan Bin
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 제어로봇시스템학회 2005년도 ICCAS
    • /
    • pp.1705-1709
    • /
    • 2005
  • This paper covers review and challenges in the area of speech recognition by taking into account different classes of recognition mode. The recognition mode can be either speaker independent or speaker dependant. Size of the vocabulary and the input mode are two crucial factors for a speech recognizer. The input mode refers to continuous or isolated speech recognition system and the vocabulary size can be small less than hundred words or large less than few thousands words. This varies according to system design and objectives.[2]. The organization of the paper is: first it covers various fundamental methods of speech recognition, then it takes into account various deficiencies in the existing systems and finally it discloses the various probable application areas.

  • PDF

음성신호의 Sub-Nyquist 비균일 표준화 및 완전 복구에 관한 연구 (Sub-Nyquist Nonuniform Sampling and Perfect Reconstruction of Speech Signals)

  • 이희영
    • 음성과학
    • /
    • 제12권2호
    • /
    • pp.153-170
    • /
    • 2005
  • The sub-Nyquist nonuniform sampling (SNNS) and the perfect reconstruction (PR) formula are proposed for the development of a systematic method to obtain minimal representation of a speech signal. In the proposed method, the instantaneous sampling frequency (ISF) varies, depending on the least upper boundary of spectral support of a speech signal in time-frequency domain (TFD). The definition of the instantaneous bandwidth (IB), which determines the ISF and is used for generating the set of samples that represent continuous-time signals perfectly, is given. Also, the spectral characteristics of the sampled data generated by the sub-Nyquist nonuniform sampling method is analyzed. The proposed method doesn't generate the redundant samples due to the time-varying property of the instantaneous bandwidth of a speech signal.

  • PDF

자동차 잡음환경 고립단어 음성인식에서의 VTS와 PMC의 성능비교 (Performance Comparison between the PMC and VTS Method for the Isolated Speech Recognition in Car Noise Environments)

  • 정용주;이승욱
    • 음성과학
    • /
    • 제10권3호
    • /
    • pp.251-261
    • /
    • 2003
  • There has been many research efforts to overcome the problems of speech recognition in noisy conditions. Among the noise-robust speech recognition methods, model-based adaptation approaches have been shown quite effective. Particularly, the PMC (parallel model combination) method is very popular and has been shown to give considerably improved recognition results compared with the conventional methods. In this paper, we experimented with the VTS (vector Taylor series) algorithm which is also based on the model parameter transformation but has not attracted much interests of the researchers in this area. To verify the effectiveness of it, we employed the algorithm in the continuous density HMM (Hidden Markov Model). We compared the performance of the VTS algorithm with the PMC method and could see that the it gave better results than the PMC method.

  • PDF

반향제거기를 갖는 자동차 실내 환경에서의 음성인식 (Robust speech recognition in car environment with echo canceller)

  • 박철호;허원철;배건성
    • 대한음성학회:학술대회논문집
    • /
    • 대한음성학회 2005년도 추계 학술대회 발표논문집
    • /
    • pp.147-150
    • /
    • 2005
  • The performance of speech recognition in car environment is severely degraded when there is music or news coming from a radio or a CD player. Since reference signals are available from the audio unit in the car, it is possible to remove them with an adaptive filter. In this paper, we present experimental results of speech recognition in car environment using the echo canceller. For this, we generate test speech signals by adding music or news to the car noisy speech from Aurora2 DB. The HTK-based continuous HMT system is constructed for a recognition system. In addition, the MMSE-STSA method is used to the output of the echo canceller to remove the residual noise more.

  • PDF

자동차 제어용 음성 인식시스템 구현 (An Implementation of Speech Recognition System for Car's Control)

  • 이광석;김현덕
    • 한국정보통신학회논문지
    • /
    • 제5권3호
    • /
    • pp.451-458
    • /
    • 2001
  • 본 연구는 자동차내의 각종 제어장치들을 음성으로 실시간 제어하기 위한 음성제어 시스템을 제안하고 실험적으로 검증하였다. 실시간 제어음성 인식시스템은 8bit-l0MHz로 A/D변환된 음성 데이터를 실시간으로 시작점과 끝점을 검출한 후, One Pass DP법으로 인식하였으며 그 결과를 모니터에 문장으로 출력하며 제어용 인터페이스에 제어데이터를 보내도록 구성하였다. HMM모델은 자동차내의 장치들을 제어하기 위한 제어음성 및 숫자음들로 구성되는 연속음성을 학습 및 모델링 하였다. 단어.제어문들의 인식률은 평균 97.3%, 숫자음의 경우는 평균 96.3% 정도의 인식률을 얻을 수 있었다.

  • PDF

순환 신경망 모델을 이용한 한국어 음소의 음성인식에 대한 연구 (A Study on the Speech Recognition of Korean Phonemes Using Recurrent Neural Network Models)

  • 김기석;황희영
    • 대한전기학회논문지
    • /
    • 제40권8호
    • /
    • pp.782-791
    • /
    • 1991
  • In the fields of pattern recognition such as speech recognition, several new techniques using Artifical Neural network Models have been proposed and implemented. In particular, the Multilayer Perception Model has been shown to be effective in static speech pattern recognition. But speech has dynamic or temporal characteristics and the most important point in implementing speech recognition systems using Artificial Neural Network Models for continuous speech is the learning of dynamic characteristics and the distributed cues and contextual effects that result from temporal characteristics. But Recurrent Multilayer Perceptron Model is known to be able to learn sequence of pattern. In this paper, the results of applying the Recurrent Model which has possibilities of learning tedmporal characteristics of speech to phoneme recognition is presented. The test data consist of 144 Vowel+ Consonant + Vowel speech chains made up of 4 Korean monothongs and 9 Korean plosive consonants. The input parameters of Artificial Neural Network model used are the FFT coefficients, residual error and zero crossing rates. The Baseline model showed a recognition rate of 91% for volwels and 71% for plosive consonants of one male speaker. We obtained better recognition rates from various other experiments compared to the existing multilayer perceptron model, thus showed the recurrent model to be better suited to speech recognition. And the possibility of using Recurrent Models for speech recognition was experimented by changing the configuration of this baseline model.