Search | Korea Science

Effective Syllable Modeling for Korean Speech Recognition Using Continuous HMM (연속 은닉 마코프 모델을 이용한 한국어 음성 인식을 위한 효율적 음절 모델링)

Kim Bong-Wan;Lee Yong-Ju
- The Journal of the Acoustical Society of Korea / v.22 no.1 / pp.23-27 / 2003
Recently, attempts to use the syllable as the recognition unit to enhance performance in continuous speech recognition have been reported. However, syllables are less trainable than phones, and context-dependent modeling across the syllable boundary is difficult because the number of models is much larger for syllables than for phones. In this paper, we propose a method to enhance the trainability of Korean syllables, together with phoneme-context-dependent syllable modeling across the syllable boundary. In a word recognition experiment, the proposed method shows an average error reduction of 46.23% compared with common syllable modeling. The right-phone-dependent syllable model shows a 16.7% error reduction compared with a triphone model.

Phonetic Tied-Mixture Syllable Model for Efficient Decoding in Korean ASR (효율적 한국어 음성 인식을 위한 PTM 음절 모델)

Kim Bong-Wan;Lee Yong-Ju
- MALSORI / no.50 / pp.139-150 / 2004
A Phonetic Tied-Mixture (PTM) model has been proposed as a way of decoding efficiently in large vocabulary continuous speech recognition (LVCSR) systems. It has been reported that the PTM model performs better in decoding than triphones by sharing a set of mixture components among states of the same topological location [5]. In this paper, we propose a Phonetic Tied-Mixture Syllable (PTMS) model, which extends the PTM technique to syllables. The proposed PTMS model decodes 13% faster than PTM. Despite the difference in context-dependent modeling (PTM: cross-word context-dependent modeling; PTMS: word-internal left-phone-dependent modeling), the proposed model shows less than 1% degradation in word accuracy relative to PTM at the same beam width. With a different beam width, it achieves better word accuracy than PTM at the same or higher speed.

The influence of syllable frequency, syllable type and its position on naming two-syllable Korean words and pseudo-words (한글 두 글자 단어와 비단어의 명명에 글자 빈도, 글자 유형과 위치가 미치는 영향)

Myong Seok Shin;ChangHo Park
- Korean Journal of Cognitive Science / v.35 no.2 / pp.97-112 / 2024
This study investigated how syllable-level variables such as syllable frequency, syllable (i.e. vowel) type, presence of final consonants (i.e. batchim) and syllable position influence naming of both words and pseudo-words. The results of the linear mixed-effects model analysis showed that, for words, naming time decreased as the frequency of the first syllable increased, and when the first syllable had a final consonant. Additionally, words were named more accurately when they had vertical vowels compared to horizontal vowels. For pseudo-words, naming time decreased and accuracy rate increased as the frequency of the first or the second syllable increased. Furthermore, pseudo-words were named more accurately when they had vertical vowels compared to horizontal vowels. These results suggest that while the frequency of the second syllable had differential effects between words and pseudo-words, the frequency of the first syllable and the syllable type had consistent effects for both words and pseudo-words. The implications of this study were discussed concerning visual word recognition processing.
https://doi.org/10.19066/cogsci.2024.35.2.001

Performance Improvement of Continuous Digits Speech Recognition using the Transformed Successive State Splitting and Demi-syllable pair (반음절쌍과 변형된 연쇄 상태 분할을 이용한 연속 숫자음 인식의 성능 향상)

Kim Dong-Ok;Park No-Jin
- Journal of the Korea Institute of Information and Communication Engineering / v.9 no.8 / pp.1625-1631 / 2005
This paper describes the optimization of a language model and an acoustic model to improve speech recognition of Korean unit digits. The language model reduces recognition errors by analyzing the grammatical features of Korean unit digits and is built from FSN nodes of disyllables. The acoustic model uses demi-syllable pairs to reduce the recognition errors caused by inaccurate segmentation of phones and syllables, which results from monosyllables, short pronunciation, and articulation. For efficient modeling of the features of the recognition units, we used the k-means clustering algorithm with transformed successive state splitting at the feature level. In the experiments, the proposed language model raised the recognition rate by 10.5%. The acoustic model with demi-syllable pairs raised the recognition rate by 12.5%, and transformed successive state splitting improved it by 1.5%.

Automatic Detection and Extraction of Transliterated Foreign Words Using Hidden Markov Model (은닉 마르코프 모델을 이용한 음차표기된 외래어의 자동인식 및 추출 기법)

오종훈;최기선
- Korean Journal of Cognitive Science / v.12 no.3 / pp.19-28 / 2001
In this paper, we describe an algorithm for extracting transliterated foreign words in Korean. We reformulate the extraction problem as a syllable-tagging problem in which each syllable is tagged with either a transliterated foreign syllable tag or a pure Korean syllable tag. Syllable sequences of Korean strings are modeled by a Hidden Markov Model whose states represent a character with a binary marking that indicates whether the character forms a Korean word or not. The proposed method extracts transliterated foreign words with high recall and precision. Moreover, it performs well even with small training corpora.
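The syllable-tagging formulation can be illustrated with a small Viterbi decoder. This is only a minimal sketch under stated assumptions, not the authors' model: it uses a plain two-tag HMM with syllable emissions (a simplification of the state design described in the abstract), and the tag names `K`/`F`, the `viterbi` function, and every probability in the toy tables are hypothetical values chosen for illustration.

```python
import math

# Hypothetical tag set: K = pure Korean syllable, F = transliterated foreign syllable.
TAGS = ("K", "F")


def viterbi(syllables, start, trans, emit, floor=1e-6):
    """Return the most likely tag sequence for a sequence of syllables."""

    def lp(table, key):
        # Unseen events get a small floor probability instead of zero.
        return math.log(table.get(key, floor))

    # best[t] = (log-probability, tag path) of the best path ending in tag t.
    best = {t: (lp(start, t) + lp(emit, (t, syllables[0])), [t]) for t in TAGS}
    for syl in syllables[1:]:
        new_best = {}
        for t in TAGS:
            score, path = max(
                (best[p][0] + lp(trans, (p, t)), best[p][1]) for p in TAGS
            )
            new_best[t] = (score + lp(emit, (t, syl)), path + [t])
        best = new_best
    return max(best.values())[1]


# Toy parameters (assumed, not estimated from any corpus).
toy_start = {"K": 0.7, "F": 0.3}
toy_trans = {("K", "K"): 0.8, ("K", "F"): 0.2, ("F", "F"): 0.7, ("F", "K"): 0.3}
toy_emit = {
    ("F", "컴"): 0.3, ("F", "퓨"): 0.3, ("F", "터"): 0.2,
    ("K", "터"): 0.05, ("K", "를"): 0.3,
}

tags = viterbi(["컴", "퓨", "터", "를"], toy_start, toy_trans, toy_emit)
print(tags)  # ['F', 'F', 'F', 'K']
```

A maximal run of `F` tags then marks one extracted transliterated word, here the first three syllables.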

Prosodic Break Index Estimation using LDA and Tri-tone Model (LDA와 tri-tone 모델을 이용한 운율경계강도 예측)

강평수;엄기완;김진영
- The Journal of the Acoustical Society of Korea / v.18 no.7 / pp.17-22 / 1999
In this paper, we propose a new method that combines LDA and a tri-tone model to predict Korean prosodic break indices (PBI) for a given utterance. PBI can be used as an important cue to syntactic discontinuity in continuous speech recognition (CSR). The model consists of three steps. In the first step, PBI is predicted from syllable and pause duration information using linear discriminant analysis (LDA). In the second step, syllable tone information is used to estimate PBI: the syllable tones are coded by vector quantization (VQ), and PBI is estimated by the tri-tone model. In the last step, the two PBI predictors are integrated by a weight factor. The proposed method was tested on 200 literal-style spoken sentences and achieved 72% accuracy.
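The final integration step can be sketched as a weighted sum of the two predictors' scores. The abstract does not give the actual combination formula, so the linear interpolation below is an assumption, as are the function name `combine_pbi`, the break-index labels 0 to 2, and the example scores.

```python
def combine_pbi(lda_scores, tritone_scores, weight=0.5):
    """Combine per-class scores from two PBI predictors with one weight factor.

    lda_scores / tritone_scores: dicts mapping a break index to a score.
    weight: share given to the LDA predictor (assumed to lie in [0, 1]).
    Returns the winning break index and the combined scores.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    combined = {
        index: weight * lda_scores[index] + (1.0 - weight) * tritone_scores[index]
        for index in lda_scores
    }
    return max(combined, key=combined.get), combined


# Hypothetical scores for one syllable boundary.
lda = {0: 0.6, 1: 0.3, 2: 0.1}
tritone = {0: 0.2, 1: 0.5, 2: 0.3}

print(combine_pbi(lda, tritone, weight=0.7)[0])  # 0: the LDA predictor dominates
print(combine_pbi(lda, tritone, weight=0.2)[0])  # 1: the tri-tone predictor dominates
```

In practice the weight would be tuned on held-out data; the values 0.7 and 0.2 here only show how the decision shifts.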

The Korean Text-to-speech Using Syllable Units (음절 단위를 이용한 한국어 음성 합성)

김병수;윤기선;박성한
- Journal of the Korean Institute of Telematics and Electronics / v.27 no.1 / pp.143-150 / 1990
In this paper, a rule-based method for improving the intelligibility of synthetic speech is proposed. A 12-pole linear predictive coding (LPC) method is used to model syllable speech signals. A syllable concatenation rule for pauses and frame rejection between syllables is developed to improve the naturalness of the synthetic speech. In addition, a phonological structure transformation rule and a prosody rule are applied to the LPC synthetic speech. The illustrative results demonstrate that the synthetic speech obtained by applying these rules is more natural than the synthetic speech produced by LPC alone.

A Reranking Model for Korean Morphological Analysis Based on Sequence-to-Sequence Model (Sequence-to-Sequence 모델 기반으로 한 한국어 형태소 분석의 재순위화 모델)

Choi, Yong-Seok;Lee, Kong Joo
- KIPS Transactions on Software and Data Engineering / v.7 no.4 / pp.121-128 / 2018
A Korean morphological analyzer can adopt a sequence-to-sequence (seq2seq) model, which can generate an output sequence whose length differs from that of the input. In general, a seq2seq-based Korean morphological analyzer takes a syllable-unit sequence as input and outputs a syllable-unit sequence. Syllable-based morphological analysis has the advantage that unknown words can be handled easily, but the disadvantage that morpheme-based information is ignored. In this paper, we propose a reranking model, used as a post-processor of the seq2seq model, that can improve the accuracy of morphological analysis. The seq2seq-based morphological analyzer can generate K results by using beam search. The reranking model exploits morpheme-unit embedding information as well as morpheme n-grams to reorder the K results. The experimental results show that the reranking model improves the F1 score by 1.17% compared with the original seq2seq model.
https://doi.org/10.3745/KTSDE.2018.7.4.121
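The reranking idea in the abstract above can be sketched as rescoring the K beam-search candidates with morpheme-level evidence. This is a simplified illustration under stated assumptions, not the paper's model: it uses only a morpheme bigram score (the embedding component is left out), and the function names, the interpolation weight `alpha`, the tagged morphemes, and all scores are hypothetical.

```python
import math


def bigram_logprob(morphemes, bigram_probs, floor=1e-4):
    """Log-probability of a morpheme sequence under a toy bigram table."""
    padded = ["<s>"] + list(morphemes) + ["</s>"]
    return sum(
        math.log(bigram_probs.get(pair, floor))
        for pair in zip(padded, padded[1:])
    )


def rerank(candidates, bigram_probs, alpha=0.5):
    """Reorder (morphemes, seq2seq_logprob) candidates, best first.

    alpha weights the seq2seq score against the morpheme bigram score.
    """

    def score(candidate):
        morphemes, seq2seq_logprob = candidate
        return alpha * seq2seq_logprob + (1.0 - alpha) * bigram_logprob(
            morphemes, bigram_probs
        )

    return sorted(candidates, key=score, reverse=True)


# Hypothetical K=2 beam output for one eojeol, with assumed seq2seq log-probabilities.
beam = [
    (["날/VV", "는/ETM"], -1.0),  # preferred by the seq2seq model
    (["나/NP", "는/JX"], -1.2),
]
# Hypothetical morpheme bigram probabilities.
toy_bigrams = {
    ("<s>", "나/NP"): 0.2,
    ("나/NP", "는/JX"): 0.5,
    ("는/JX", "</s>"): 0.3,
}

best_morphemes, _ = rerank(beam, toy_bigrams)[0]
print(best_morphemes)  # ['나/NP', '는/JX']
```

Here the morpheme-level score overrides the seq2seq ranking, which is the kind of correction a reranker is meant to provide.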

A Syllable Kernel based Sentiment Classification for Movie Reviews (음절 커널 기반 영화평 감성 분류)

Kim, Sang-Do;Park, Seong-Bae;Park, Se-Young;Lee, Sang-Jo;Kim, Kweon-Yang
- Journal of the Korean Institute of Intelligent Systems / v.20 no.2 / pp.202-207 / 2010
In this paper, we present an automatic sentiment classification method for online movie reviews that do not contain explicit sentiment rating scores. To classify sentiment polarity as positive or negative, we use a Support Vector Machine classifier based on a syllable kernel, which is an extension of the string kernel. We give experimental results showing that the proposed syllable kernel model can be used effectively in sentiment classification tasks for online movie reviews, which usually contain many grammatical errors such as spacing or spelling errors.
https://doi.org/10.5391/JKIIS.2010.20.2.202
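To give a rough idea of what a kernel over syllables looks like, the sketch below computes an n-gram (spectrum-style) string kernel on Hangul syllables. The abstract does not define the paper's syllable kernel, so this is a stand-in technique, not the authors' formulation; the function names, the choice of bigrams, and the whitespace handling are all assumptions.

```python
import math
from collections import Counter


def syllable_ngrams(text, n=2):
    """Count syllable n-grams; every non-space character is treated as one syllable."""
    syllables = [ch for ch in text if not ch.isspace()]
    return Counter(
        tuple(syllables[i:i + n]) for i in range(len(syllables) - n + 1)
    )


def syllable_kernel(a, b, n=2):
    """Spectrum-style kernel: inner product of the two n-gram count vectors."""
    counts_a, counts_b = syllable_ngrams(a, n), syllable_ngrams(b, n)
    return sum(count * counts_b[gram] for gram, count in counts_a.items())


def normalized_kernel(a, b, n=2):
    """Length-normalized kernel value in [0, 1]."""
    denom = math.sqrt(syllable_kernel(a, a, n) * syllable_kernel(b, b, n))
    return syllable_kernel(a, b, n) / denom if denom else 0.0


print(syllable_kernel("재미있다", "재미 있다"))    # 3
print(normalized_kernel("재미있다", "재미 있다"))  # 1.0
print(syllable_kernel("재미있다", "지루하다"))     # 0
```

Because whitespace is dropped before counting, the two spellings of the same phrase get the maximum similarity, which is one way a syllable-level kernel can tolerate spacing errors. A kernel matrix built this way could be passed to any SVM implementation that accepts precomputed kernels.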

Cloning of Korean Morphological Analyzers using Pre-analyzed Eojeol Dictionary and Syllable-based Probabilistic Model (기분석 어절 사전과 음절 단위의 확률 모델을 이용한 한국어 형태소 분석기 복제)

Shim, Kwangseob
- KIISE Transactions on Computing Practices / v.22 no.3 / pp.119-126 / 2016
In this study, we verified the feasibility of a Korean morphological analyzer that uses a pre-analyzed Eojeol dictionary and a syllable-based probabilistic model. For the verification, two Korean morphological analyzers, MACH and KLT2000, were cloned with a pre-analyzed Eojeol dictionary and a syllable-based probabilistic model. The analysis results of the cloned analyzers were compared with those of MACH and KLT2000. The 10-million-Eojeol Sejong corpus was segmented into 10 sets for cross-validation. The 10-fold cross-validated precision and recall were 97.16% and 98.31% for the cloned MACH, and 96.80% and 99.03% for the cloned KLT2000. The analysis speed of the cloned MACH was 308,000 Eojeols per second, and that of the cloned KLT2000 was 436,000 Eojeols per second. The experimental results indicate that a Korean morphological analyzer that uses a pre-analyzed Eojeol dictionary and a syllable-based probabilistic model could be used in practical applications.
https://doi.org/10.5626/KTCP.2016.22.3.119

Search Result 78, Processing Time 0.019 seconds
