• Title/Summary/Keyword: Phonemes

Continuous Digit Recognition Using the Weight Initialization and LR Parser

  • Choi, Ki-Hoon;Lee, Seong-Kwon;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.15 no.2E / pp.14-23 / 1996
  • This paper presents a study on a neural network that recognizes phonemes, a weight initialization method that reduces learning time, and an LR parser for continuous speech recognition. The neural network spots the phonemes in continuous speech, and the LR parser parses the output of the neural network. The full set of phonemes to be recognized is divided into several groups according to phoneme similarity, and each group is handled by its own neural network. Each group's network consists of a network that recognizes the phonemes of its own group and a VGNN (Verify Group Neural Network), which judges whether an input belongs to that group. The weights of the neural networks are initialized from the learning data rather than with random values, in order to reduce learning time. Because the output of the neural network is not fully accurate, the LR parsing method used in this paper traces several possible paths rather than a single unique path. The parser processes the continuous speech frame by frame, accumulating the output of the neural network along each possible path. If the accumulated path value drops below a threshold, the path is deleted from the set of possible parsing paths. The continuous speech recognition system is applied to continuous Korean digit recognition. The recognition rate for isolated digits is 97% in the speaker-dependent case and 75% in the speaker-independent case. The recognition rate for continuous digits is 74% in the speaker-dependent case.
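The frame-by-frame path pruning described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `prune_paths`, the use of summed log scores as the accumulated path value, the `1e-6` floor for missing labels, and the idea of passing candidate paths in directly (a real LR parser would derive them from its parsing table) are all assumptions made for illustration.

```python
import math

def prune_paths(frame_scores, candidates, threshold):
    """Keep only candidate paths whose accumulated score stays above a threshold.

    frame_scores: list of dicts, one per frame, mapping a phoneme label to a
                  (hypothetical) network output in (0, 1].
    candidates:   list of label sequences, one label per frame. Assumption: a
                  real LR parser would generate these from its parsing table.
    threshold:    minimum accumulated log score a path needs to survive.
    """
    alive = {tuple(path): 0.0 for path in candidates}
    for t, scores in enumerate(frame_scores):
        survivors = {}
        for path, total in alive.items():
            # Accumulate the log of the network output for this frame;
            # labels the network did not score get a small floor value.
            total += math.log(scores.get(path[t], 1e-6))
            if total >= threshold:
                survivors[path] = total  # path stays in the search
        alive = survivors
    return alive

# Hypothetical two-frame example with two competing paths.
frames = [{"i": 0.9, "o": 0.1}, {"l": 0.8, "r": 0.2}]
paths = [("i", "l"), ("o", "r")]
kept = prune_paths(frames, paths, threshold=math.log(0.5))
```

With these made-up scores, the path `("o", "r")` falls below the threshold after the first frame and is dropped, while `("i", "l")` survives both frames.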

Speech Recognition Error Compensation using MFCC and LPC Feature Extraction Method (MFCC와 LPC 특징 추출 방법을 이용한 음성 인식 오류 보정)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence / v.11 no.6 / pp.137-142 / 2013
  • In speech recognition systems, inaccurate feature extraction can cause vocabulary to be input incorrectly, so that an utterance is either left unrecognized or recognized as a similar phoneme. In this paper, we therefore propose a speech recognition error correction method that uses a phoneme similarity rate and reliability measures based on the characteristics of the phonemes. The phoneme similarity rate is measured, together with a reliability rate, on phonemes of the learning model obtained with the MFCC and LPC feature extraction methods. Measuring the similarity rate of phonemes and their reliability minimizes unrecognized errors. Speech found to be in error during recognition is then subjected to error compensation. Applying the proposed system yielded a recognition rate of 98.3% and an error compensation rate of 95.5%.

The Primitive Representation in Speech Perception: Phoneme or Distinctive Features (말지각의 기초표상: 음소 또는 변별자질)

  • Bae, Moon-Jung
    • Phonetics and Speech Sciences / v.5 no.4 / pp.157-169 / 2013
  • Using a target detection task, this study compared the processing automaticity of phonemes and features in spoken syllable stimuli to determine whether the primitive representation in speech perception is the phoneme or the distinctive feature. To do so, we adapted for auditory stimuli the visual search task (Treisman et al., 1992) that was developed to investigate the processing of visual features (e.g., color, shape, or their conjunction). In our task, the distinctive features (e.g., aspiration or coronal) corresponded to visual primitive features (e.g., color and shape), and the phonemes (e.g., /tʰ/) to visual conjunctive features (e.g., colored shapes). Automaticity was measured by the set size effect, that is, the increase in reaction time as the number of distractors increased. Three experiments were conducted, comparing phonemes with the laryngeal features (Experiment 1), the manner features (Experiment 2), and the place features (Experiment 3). The results showed that the distinctive features were consistently processed faster and more automatically than the phonemes. In addition, processing automaticity differed among the classes of distinctive features: the laryngeal features were the most automatic, the manner features moderately automatic, and the place features the least automatic. These results are consistent with previous studies (Bae et al., 2002; Bae, 2010) that showed a perceptual hierarchy of distinctive features.
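The set size effect mentioned in this abstract is commonly summarized as the slope of reaction time against the number of items. The Python sketch below computes that slope with ordinary least squares. The function name `set_size_slope` and all reaction times are hypothetical values chosen for illustration; they are not data from the study.

```python
def set_size_slope(set_sizes, mean_rts_ms):
    """Least-squares slope of mean reaction time (ms) against set size.

    A slope near zero suggests search time barely depends on the number
    of distractors; a steep slope suggests each added item costs time.
    """
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts_ms) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, mean_rts_ms))
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    return sxy / sxx

# Hypothetical mean reaction times (ms) for set sizes of 2, 4, and 6 syllables.
sizes = [2, 4, 6]
feature_slope = set_size_slope(sizes, [500, 505, 510])   # feature targets
phoneme_slope = set_size_slope(sizes, [520, 560, 600])   # phoneme targets
```

With these made-up numbers, the feature condition has a slope of 2.5 ms per item and the phoneme condition 20 ms per item, which is the kind of contrast the set size measure is meant to expose.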

The Phonetic Difference Between the Korean Stop Series /p,t,k/ and the English /b,d,g/ Based on the VOT Value

  • Kang, Insun
    • Korean Journal of English Language and Linguistics / v.3 no.3 / pp.427-452 / 2003
  • Korean is well known for having only voiceless stops. Korean does have voiced stops, but they are considered to exist only as allophones of word-initial /p, t, k/. My experiment shows that English word-initial stops [b, d, g] and the Korean lax stop series /p, t, k/ in word-initial position have similar ranges of voice onset time (VOT). If English word-initial [b, d, g] are posited as voiced, then Korean word-initial /p, t, k/ should also be classified as voiced. Phonetically, English /b, d, g/ and Korean /p, t, k/ are very similar, except that Korean word-initial [p, t, k] are devoiced slightly more than English word-initial [b, d, g], though not significantly enough to be classified as voiceless. Positing /b, d, g/ as Korean phonemes explains why the Korean /p, t, k/ series has the allophones [b, d, g] rather than the fortis stops /p', t', k'/, even though /p', t', k'/ have a less positive VOT value than /p, t, k/. Positing /b, d, g/ as Korean phonemes also causes no spelling or pronunciation confusion, either when Koreans learn English or when English speakers learn Korean.
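Voice onset time is the interval between the release of the stop burst and the start of vocal fold vibration. The Python sketch below shows that calculation and a rough three-way labeling. The function names, the timestamps, and the 30 ms short-lag boundary are illustrative assumptions, not values taken from the paper.

```python
def voice_onset_time_ms(burst_release_s, voicing_onset_s):
    """VOT in milliseconds: voicing onset minus burst release.

    Negative values mean voicing starts before the release (voicing lead).
    """
    return (voicing_onset_s - burst_release_s) * 1000.0

def vot_category(vot_ms, short_lag_max_ms=30.0):
    """Rough three-way label; the 30 ms boundary is an assumed illustration."""
    if vot_ms < 0:
        return "lead (prevoiced)"
    if vot_ms <= short_lag_max_ms:
        return "short lag"
    return "long lag"

# Hypothetical measurement: burst at 0.100 s, voicing begins at 0.115 s.
vot = voice_onset_time_ms(0.100, 0.115)
label = vot_category(vot)
```

Two stop series whose measured VOT values fall in the same band would receive the same label under this scheme, which is the kind of comparison the abstract draws between English [b, d, g] and Korean lax /p, t, k/.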

Natural 3D Lip-Synch Animation Based on Korean Phonemic Data (한국어 음소를 이용한 자연스러운 3D 립싱크 애니메이션)

  • Jung, Il-Hong;Kim, Eun-Ji
    • Journal of Digital Contents Society / v.9 no.2 / pp.331-339 / 2008
  • This paper presents the development of a highly efficient and accurate system for producing animation key data for 3D lip-synch animation. The system automatically extracts Korean phonemes from sound and text data and then computes animation key data from the segmented phonemes. This key data can be used both by the 3D lip-synch animation system developed here and by commercial 3D facial animation systems. Conventional 3D lip-synch animation systems segment the sound data into phonemes based on the English phonemic system and produce lip-synch animation key data from those phonemes. One drawback of this approach is that it produces unnatural animation for Korean content; another is that it requires supplementary manual work. In this paper, we propose a 3D lip-synch animation system that automatically segments sound and text data into phonemes based on the Korean phonemic system and produces natural lip-synch animation from the segmented phonemes.

A Study on the Phonemic Analysis for Korean Speech Segmentation (한국어 음소분리에 관한 연구)

  • Lee, Sou-Kil;Song, Jeong-Young
    • The Journal of the Acoustical Society of Korea / v.23 no.4E / pp.134-139 / 2004
  • Accurate segmentation is generally known to be essential for recognizing both isolated words and continuous utterances. Techniques are being developed to distinguish voiced from unvoiced sounds and plosives from fricatives, but a method for accurately recognizing phonemes has not yet been scientifically established. In this study, we therefore analyze the Korean language using the classification of 'Hunminjeongeum' and contemporary phonetics. Using the frequency band, Mel band, and Mel cepstrum, we extract notable features of the phonemes from Korean speech and segment the speech into phoneme units in order to normalize them. Finally, through analysis and verification, we intend to establish a phonemic segmentation system that can be applied to both isolated words and continuous utterances.
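For readers unfamiliar with Mel bands, the Python sketch below converts between hertz and mels and lays out band edges that are evenly spaced on the Mel scale. It uses one widely used formula, mel = 2595 · log10(1 + f/700); the abstract does not say which variant the authors used, and the function names and the 0 to 4000 Hz range are assumptions for illustration.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (common 2595 * log10(1 + f/700) form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(f_min_hz, f_max_hz, n_bands):
    """Edge frequencies (Hz) for n_bands bands equally spaced in mels.

    Returns n_bands + 2 values: the lower edge, the band centers, and the
    upper edge, as used when building overlapping triangular filters.
    """
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]

# Hypothetical layout: 10 Mel bands between 0 Hz and 4000 Hz.
edges = mel_band_edges(0.0, 4000.0, 10)
```

Because the spacing is uniform in mels, the bands come out narrow at low frequencies and progressively wider at high frequencies.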

Performance Evaluation of English Word Pronunciation Correction System (한국인을 위한 외국어 발음 교정 시스템의 개발 및 성능 평가)

  • Kim Mu Jung;Kim Hyo Sook;Kim Sun Ju;Kim Byoung Gi;Ha Jin-Young;Kwon Chul Hong
    • MALSORI / no.46 / pp.87-102 / 2003
  • In this paper, we present an English pronunciation correction system for Korean speakers and report some experimental results. The aim of the system is to detect mispronounced phonemes in spoken words and to give users appropriate correction comments. Several English pronunciation correction systems adopt speech recognition technology, but most of them use conventional speech recognition engines and therefore cannot give phoneme-based correction comments. In our system, we build two kinds of phoneme models: standard native speaker models and Koreans' error models. We also design a phoneme-based recognition network to detect Koreans' common mispronunciations. We obtain a 90% detection rate for insertion, deletion, and replacement of phonemes, but we do not obtain a high detection rate for diphthong splitting and accents.
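To make the three error types concrete, the Python sketch below aligns a reference phoneme sequence with a recognized one and labels each difference as an insertion, deletion, or replacement. Note that this uses a plain edit-distance alignment, a simpler stand-in for the recognition network and error models described in the abstract. The function name and the phoneme labels are hypothetical.

```python
def phoneme_errors(reference, spoken):
    """Align two phoneme lists by edit distance and list the differences."""
    n, m = len(reference), len(spoken)
    # dist[i][j] = edits needed to turn reference[:i] into spoken[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == spoken[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # match or replacement
    # Walk back from the end to recover which edits were used.
    errors = []
    i, j = n, m
    while i > 0 or j > 0:
        mismatch = i > 0 and j > 0 and reference[i - 1] != spoken[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + int(mismatch):
            if mismatch:
                errors.append(("replacement", reference[i - 1], spoken[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            errors.append(("deletion", reference[i - 1], None))
            i -= 1
        else:
            errors.append(("insertion", None, spoken[j - 1]))
            j -= 1
    errors.reverse()
    return errors

# Hypothetical example: an extra vowel inserted into a consonant cluster.
found = phoneme_errors(["s", "t", "r", "ay", "k"], ["s", "u", "t", "r", "ay", "k"])
```

Each reported tuple names the error type, the expected phoneme, and the phoneme actually produced, which is the information a phoneme-level correction comment would need.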

A Study on the Categorization of Context-dependent Phoneme using Decision Tree Modeling (결정 트리 모델링에 의한 한국어 문맥 종속 음소 분류 연구)

  • 이선정
    • Journal of the Korea Computer Industry Society / v.2 no.2 / pp.195-202 / 2001
  • In this paper, we study how to model a phoneme whose acoustic features change depending on both its left and right neighboring phonemes. To this end, we compare two kinds of algorithms: a unit reduction algorithm and decision tree modeling. The unit reduction algorithm uses only statistical information, whereas decision tree modeling uses statistical information and Korean acoustic information together. In particular, we focus on how to model context-dependent phonemes with decision tree modeling. Finally, we report the recognition rate obtained when the context-dependent phonemes are derived by decision tree modeling.
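A context-dependent phoneme is often written as a triphone: the phoneme together with its left and right neighbors. The Python sketch below builds such units and applies one yes/no question about the left context, which is the basic splitting step in decision tree modeling. The `left-center+right` notation, the `sil` boundary label, the function names, and the example question are assumptions for illustration, not details from the paper.

```python
def to_triphones(phonemes, boundary="sil"):
    """Turn a phoneme sequence into left-center+right context-dependent units."""
    padded = [boundary] + list(phonemes) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]

def split_by_left_context(triphones, left_set):
    """Apply one yes/no question: is the left neighbor in left_set?"""
    yes = [t for t in triphones if t.split("-")[0] in left_set]
    no = [t for t in triphones if t.split("-")[0] not in left_set]
    return yes, no

# Hypothetical phoneme sequence and question ("is the left context a vowel?").
units = to_triphones(["k", "a", "m"])
yes_units, no_units = split_by_left_context(units, left_set={"a", "i", "u"})
```

A full decision tree would choose among many such questions at each node, typically by how much each split improves a statistical fit to the training data. This sketch shows only a single split.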

On the Syllabic Consonants in Present-Day English

  • Oda, Toshihiro
    • Proceedings of the KSPS conference / 2000.07a / pp.189-198 / 2000
  • /tən/, /dən/, /təl/ and /dəl/, on the one hand, are the typical phonemes of syllabic consonants. On the other hand, /ʃən/ most plausibly gives rise to syllabic consonants. /tər/ and /dər/ can be syllabic. However, because lip-rounded consonants strengthen the character of consonantal phonemes, they are not so appropriate. Apart from phonemes, some familiar words can also be syllabic almost always. From a historical perspective, we can say that syllabic consonants typically occupy the second syllable of two-syllable words and that the underlying schwa does not always exist. In terms of syllable structure, syllables that include syllabic consonants are entirely different from both stressed syllables and other unstressed syllables.

Typical Behaviors of Young Children Reading Hangul (유아의 한글읽기 행동 유형)

  • Seo, Myung-Suk;Kim, Young-Sil
    • Korean Journal of Child Studies / v.27 no.1 / pp.113-124 / 2006
  • The Hangul reading behaviors of Korean children between 2 and 5 years of age were studied. Subjects were 400 young children in each age group from kindergartens or day care centers in 6 cities of Jeon-buk Province. Teachers used a checklist based on Lee, Cha-Suk (2003) to assess children's reading ability. Data were analyzed by frequency, percentage, and χ² using the SPSS 10.0 program. Results showed age differences in young children's reading of Hangul. Developmental levels consisted of looking at pictures because of an absence of linguistic awareness of words, skipping pages of text without pictures, pronouncing phonemes, being aware of phonemes and of the difference between pictures and print, and knowing that the same phonemes can be applied to different words.
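The χ² analysis mentioned in this abstract tests whether the distribution of a behavior differs across groups. The Python sketch below computes the Pearson χ² statistic for a contingency table. The study itself used SPSS, and the counts here are hypothetical, not the study's data.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a count table.

    table: list of rows, each a list of observed counts
           (e.g. rows = age groups, columns = behavior shown or not shown).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts: two age groups by (pronounces phonemes / does not).
stat, dof = chi_square_statistic([[10, 20], [20, 10]])
```

The statistic is then compared against the χ² distribution with `dof` degrees of freedom to judge whether the age difference is significant.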
