• Title/Summary/Keyword: Korean phoneme


Selective Adaptation of Speaker Characteristics within a Subcluster Neural Network

  • Haskey, S.J.; Datta, S.
    • Proceedings of the KSPS conference / 1996.10a / pp.464-467 / 1996
  • This paper exploits inter-/intra-speaker phoneme sub-class variations as criteria for adaptation in a phoneme recognition system based on a novel neural network architecture. The design uses a subcluster neural network built from One-Class-in-One-Network (OCON) feed-forward subnets, similar to those proposed by Kung [2] and Jou [1], joined by a common front-end layer; the idea is to adapt only the neurons within this common front-end layer, so that adaptation concentrates primarily on the speaker's vocal characteristics. Since the adaptation occurs in an area common to all classes, convergence on a single class improves the recognition of the remaining classes in the network. Results show that adaptation towards a phoneme in the vowel sub-class for speakers MDABO and MWBTO improves the recognition of the remaining vowel sub-class phonemes from the same speaker.

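The shared front-end idea above can be sketched as a small numpy model; the layer sizes, sigmoid activations, and weight scales below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class OCONNet:
    """Sketch of a subcluster network: a shared front-end layer feeding
    one small subnet per phoneme class (One-Class-in-One-Network).
    Speaker adaptation would update only the shared `front` weights."""
    def __init__(self, n_in, n_front, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        self.front = rng.normal(0.0, 0.1, (n_front, n_in))   # shared, adapted
        self.subnets = [rng.normal(0.0, 0.1, (1, n_front))   # per-class, kept fixed
                        for _ in range(n_classes)]

    def forward(self, x):
        h = sigmoid(self.front @ x)        # speaker-normalised features
        return np.array([sigmoid(w @ h)[0] for w in self.subnets])

net = OCONNet(n_in=12, n_front=8, n_classes=5)
scores = net.forward(np.zeros(12))        # one score per phoneme class
```

Because every subnet reads the same front-end output, adapting only `front` toward one speaker benefits all classes at once, which is the effect the abstract reports.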

Performance of the Phoneme Segmenter in Speech Recognition System (음성인식 시스템에서의 음소분할기의 성능)

  • Lee, Gwang-seok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2009.10a / pp.705-708 / 2009
  • This research describes a neural network-based phoneme segmenter for recognizing spontaneous speech. The inputs to the segmenter are a 16th-order mel-scaled FFT, normalized frame energy, and the ratio of energy in the 0-3 kHz band to that above 3 kHz. All features are differences between two consecutive 10 ms frames. The body of the segmenter is a single-hidden-layer MLP (Multi-Layer Perceptron) with 72 inputs, 20 hidden nodes, and one output node. The segmentation accuracy is 78%, with a 7.8% insertion rate.

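The topology just described (72 inputs, 20 hidden nodes, one output) is small enough to write out directly; the sigmoid activation and weight initialization here are assumptions, since the abstract does not state them:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class BoundaryMLP:
    """Single-hidden-layer MLP: 72 inputs -> 20 hidden -> 1 output,
    a phoneme-boundary score for one frame-difference feature vector."""
    def __init__(self, n_in=72, n_hidden=20, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (1, n_hidden))
        self.b2 = np.zeros(1)

    def forward(self, x):
        h = sigmoid(self.W1 @ x + self.b1)
        return sigmoid(self.W2 @ h + self.b2)[0]

net = BoundaryMLP()
score = net.forward(np.zeros(72))   # boundary score in (0, 1)
```

Thresholding this score per frame, and counting spurious boundaries, would yield the accuracy and insertion figures the abstract reports.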

A study on the Recognition of Korean Proverb Using Neural Network and Markov Model (신경회로망과 Markov 모델을 이용한 한국어 속담 인식에 관한 연구)

  • 홍기원;김선일;이행세
    • Journal of the Korean Institute of Telematics and Electronics B / v.32B no.12 / pp.1663-1669 / 1995
  • This paper studies the recognition of Korean proverbs using a neural network and Markov models. At the training stage, the neural network uses features such as the zero-crossing rate, short-term energy, and PLP cepstrum, covering a 300 ms window. Markov models were generated from the recognized phoneme strings, and word and proverb recognition using these models was carried out. Experimental results show phoneme and word recognition rates of 81.2% and 94.0%, respectively, for the Korean proverb recognition experiments.

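Of the features listed, the zero-crossing rate and short-term energy are simple frame-level computations; the sketch below assumes 16 kHz audio and 10 ms frames, neither of which is stated in the abstract:

```python
import numpy as np

def frame_features(signal, frame_len=160):
    """Zero-crossing rate and short-term energy per frame."""
    n_frames = len(signal) // frame_len
    feats = []
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)   # fraction of sign flips
        energy = float(np.sum(frame ** 2))
        feats.append((zcr, energy))
    return feats

# 300 ms (the window length used in the paper) of a 100 Hz sine at 16 kHz
t = np.arange(4800) / 16000.0
feats = frame_features(np.sin(2 * np.pi * 100.0 * t))   # 30 frames of 10 ms
```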

Korean Phoneme Recognition using Modified Self Organizing Feature Map (수정된 자기 구조화 특징 지도를 이용한 한국어 음소 인식)

  • Choi, Doo-Il; Lee, Su-Jin; Park, Sang-Hui
    • Proceedings of the KOSOMBE Conference / v.1991 no.11 / pp.38-43 / 1991
  • In order to cluster the input patterns neatly, a neural network modified from Kohonen's self-organizing feature map is introduced, and Korean phoneme recognition experiments are performed using the modified self-organizing feature map (MSOFM) and an auditory model.

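The base algorithm being modified, Kohonen's self-organizing feature map, reduces to a single update rule; this sketch uses a 1-D map with a Gaussian neighbourhood, while the paper's specific modification is not described in the abstract:

```python
import numpy as np

def som_update(weights, x, lr=0.5, sigma=1.0):
    """One Kohonen SOFM step: find the best-matching unit (BMU) and
    pull it and its map neighbours toward the input vector."""
    bmu = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
    idx = np.arange(len(weights))
    h = np.exp(-((idx - bmu) ** 2) / (2.0 * sigma ** 2))   # neighbourhood kernel
    weights += lr * h[:, None] * (x - weights)
    return bmu, weights

rng = np.random.default_rng(0)
W = rng.random((10, 2))                       # 10 map units, 2-D inputs
bmu, W = som_update(W, np.array([0.5, 0.5]))  # one training step
```

Repeated over a stream of feature vectors, this pulls nearby map units toward similar inputs, which is what clusters phonemes onto neighbouring regions of the map.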

A Pilot Study on The Correlation of Acoustic Image and Sound Wave Form on Japanese /K/ (청각인상과 음성파형간의 관계구명을 위한 일본어 /k/의 기초 연구)

  • Lee Jae Kang; Kwon Chul Hong
    • Proceedings of the KSPS conference / 2003.05a / pp.52-55 / 2003
  • Most Korean students who have not studied Japanese pronounce the Japanese phoneme /k/ as Korean /kk/, regardless of sex. However, an analysis covering many phonemic environments gives different results. Although the middle syllable that comes after 'the joon' does not show any specific distinction, in the remaining cases half of the subjects pronounced it as /kk/ and the other half as /k/. To draw concrete conclusions, further studies must be done.


Mouth Shape Trajectory Generation Using Hangul Phoneme Analysis (한글 음절 분류를 통한 입 모양 궤적 생성)

  • 박유신;김종수;김태용;최종수
    • Proceedings of the IEEK Conference / 2003.11a / pp.53-56 / 2003
  • In this paper, we propose a new method that generates mouth-shape trajectories for characters entered by the user. The method classifies characters by basic syllable, which makes it well suited to mouth-shape generation. We examine the principles of Hangul syllable formation, identify similarities among mouth shapes, and select representative basic syllables accordingly. We then consider the articulation of the phonemes involved, create a new mouth-shape trajectory, and apply it to the face of a 3D avatar.

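Classifying characters by syllable rests on the compositional structure of Hangul: each precomposed syllable's Unicode code point arithmetically encodes its initial, medial, and optional final jamo, so decomposition needs only integer division:

```python
def decompose_hangul(syllable):
    """Split a precomposed Hangul syllable (U+AC00..U+D7A3) into
    (initial, medial, final) jamo indices; final 0 means none."""
    code = ord(syllable) - 0xAC00
    if not 0 <= code <= 11171:
        raise ValueError("not a precomposed Hangul syllable")
    initial, rest = divmod(code, 588)    # 588 = 21 medials * 28 finals
    medial, final = divmod(rest, 28)
    return initial, medial, final

print(decompose_hangul('한'))   # (18, 0, 4): initial ㅎ, medial ㅏ, final ㄴ
```

Mapping each (initial, medial, final) triple to a mouth-shape class would then be a table lookup, with the medial vowel dominating the visible shape.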

An English-to-Korean Transliteration Model based on Grapheme and Phoneme (자소 및 음소 정보를 이용한 영어-한국어 음차표기 모델)

  • Oh Jong-Hoon; Choi Key-Sun
    • Journal of KIISE: Software and Applications / v.32 no.4 / pp.312-326 / 2005
  • There has been increasing interest in English-to-Korean transliteration recently. Previous work has taken either a direct approach (English graphemes → Korean graphemes) or a pivot approach (English graphemes → English phonemes → Korean graphemes). Though most previous work follows the direct method, transliteration is a phonetic process rather than an orthographic one. From this point of view, we present an English-to-Korean transliteration model that uses both grapheme and phoneme information. Unlike previous work, our method uses phonetic information such as phonemes and their context, together with the graphemes corresponding to those phonemes. Our method achieves about 60% word accuracy.

A Study on the Korean Syllable As Recognition Unit (인식 단위로서의 한국어 음절에 대한 연구)

  • Kim, Yu-Jin; Kim, Hoi-Rin; Chung, Jae-Ho
    • The Journal of the Acoustical Society of Korea / v.16 no.3 / pp.64-72 / 1997
  • In this paper, experiments are performed to find a recognition unit suitable for a large-vocabulary recognition system. Specifically, the phoneme, the unit currently in common use, is compared with the syllable, which characterizes Korean well, to determine whether the syllable can serve as the recognition unit of a Korean recognition system. For an objective comparison, we collected speech data from a male speaker and constructed a speech database by hand-segmenting and labeling the phoneme boundaries. For HMM training and recognition we used HTK (HMM Tool Kit) 2.0, a commercial tool from Entropic, so that both units were tested under identical conditions. Two continuous-HMM topologies were applied to each recognition unit: 3 emitting states out of 5, and 6 emitting states out of 8. Three sets of PBW (Phonetically Balanced Words) and one set of POW (Phonetically Optimized Words) were used for training, and another PBW set for recognition, i.e. speaker-dependent, medium-vocabulary recognition. The experiments report a recognition rate of 95.65% for the phoneme unit and 94.41% for the syllable unit, with decoding 25% faster for the syllable unit.

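The two topologies compared (3 emitting states of 5, 6 emitting states of 8) are HTK-style left-to-right models whose first and last states are non-emitting; the transition structure can be sketched as follows, with an arbitrary placeholder self-loop probability:

```python
import numpy as np

def left_to_right_hmm(n_states, p_stay=0.6):
    """Transition matrix for an HTK-style left-to-right HMM:
    non-emitting entry (0) and exit (n-1) states bracket the
    emitting states, each of which self-loops or advances."""
    A = np.zeros((n_states, n_states))
    A[0, 1] = 1.0                      # entry -> first emitting state
    for i in range(1, n_states - 1):
        A[i, i] = p_stay               # stay in the same state
        A[i, i + 1] = 1.0 - p_stay     # advance to the next state
    return A                           # exit row stays all-zero

A = left_to_right_hmm(5)   # 5 states, 3 emitting, as in the first topology
```

More emitting states give a unit more room to model internal temporal structure, which is one reason the longer syllable units use the 8-state variant.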

An ERP Study of the Perception of English High Front Vowels by Native Speakers of Korean and English (영어전설고모음 인식에 대한 ERP 실험연구: 한국인과 영어원어민을 대상으로)

  • Yun, Yungdo
    • Phonetics and Speech Sciences / v.5 no.3 / pp.21-29 / 2013
  • The mismatch negativity (MMN) is known to be a fronto-central negative component of the auditory event-related potential (ERP). Näätänen et al. (1997) and Winkler et al. (1999) argue that the MMN acts as a cue to phoneme perception in the ERP paradigm. In this study, a perception experiment based on an ERP paradigm was conducted to examine how Korean and American English speakers perceive American English high front vowels. The study found that the MMN appeared at about the same time in both Korean and American English speakers after they heard the F1s of English high front vowels. However, when the same groups heard English words containing these vowels, the American English listeners' MMN appeared slightly earlier than the Korean listeners'. These findings suggest that non-speech sounds, such as vowel F1s, may be processed similarly across speakers of different languages, whereas phonemes are processed differently: a native-language phoneme is processed faster than a non-native one.

Prosodic Contour Generation for Korean Text-To-Speech System Using Artificial Neural Networks

  • Lim, Un-Cheon
    • The Journal of the Acoustical Society of Korea / v.28 no.2E / pp.43-50 / 2009
  • To make the synthetic speech generated by a Korean TTS (Text-To-Speech) system more natural, we have to know all the possible prosodic rules of spoken Korean. These rules must be found in linguistic and phonetic information or extracted from real speech, and in general they are integrated into the prosody-generation algorithm of a TTS system. But such an algorithm cannot cover all the prosodic rules of a language, so the naturalness of the synthesized speech falls short of expectations. ANNs (Artificial Neural Networks) can instead be trained to learn the prosodic rules of spoken Korean. To train and test the ANNs, we prepared the prosodic patterns of all phonemic segments in a prosodic corpus, which comprises meaningful sentences representing all the possible prosodic rules. Sentences in the corpus were composed by picking series of words from a list of PB (Phonetically Balanced) isolated words; they were read by speakers, recorded, and collected as a speech database. By analyzing the recorded speech, we extracted the prosodic pattern of each phoneme and assigned these as target and test patterns for the ANNs. The ANNs thus learn prosody from natural speech and, given the phoneme string of a sentence as input, generate the prosodic pattern of the central phonemic segment of the string as output.
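The input/output arrangement described, phoneme strings in and the central segment's prosodic pattern out, implies a sliding context window over the phoneme sequence; the one-hot layout below is a hypothetical encoding, since the abstract does not give the actual feature format:

```python
import numpy as np

def context_windows(phonemes, inventory, width=2):
    """One-hot windows over a phoneme string: each row encodes the
    central phoneme plus `width` neighbours on each side, the kind
    of input an ANN could map to that phoneme's prosodic pattern."""
    sym2idx = {p: i for i, p in enumerate(inventory)}
    n = len(inventory)
    padded = ['<pad>'] * width + list(phonemes) + ['<pad>'] * width
    rows = []
    for c in range(width, width + len(phonemes)):
        vec = np.zeros((2 * width + 1) * n)
        for j, p in enumerate(padded[c - width:c + width + 1]):
            if p in sym2idx:                     # padding slots stay all-zero
                vec[j * n + sym2idx[p]] = 1.0
        rows.append(vec)
    return np.stack(rows)

X = context_windows(['k', 'a', 'n'], inventory=['k', 'a', 'n'])
# X.shape == (3, 15): one row per central phoneme, 5-slot window, 3 symbols
```

Each row would be paired with the prosodic pattern (e.g. pitch and duration values) extracted for its central phoneme from the recorded corpus, forming the target patterns for training.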