Search | Korea Science

Improvement of an Automatic Segmentation for TTS Using Voiced/Unvoiced/Silence Information (유/무성/묵음 정보를 이용한 TTS용 자동음소분할기 성능향상)

Kim Min-Je;Lee Jung-Chul;Kim Jong-Jin
- MALSORI / no.58 / pp.67-81 / 2006
HMM-based approaches are the most widely used methods for automatically segmenting large time-aligned speech corpora because they provide a consistent and accurate phone labeling scheme. HMMs can be trained in two ways. The flat-start method minimizes human intervention but has low accuracy. The bootstrap method has high accuracy but requires manual segmentation. This paper proposes a new algorithm that minimizes manual work while improving automatic segmentation performance. In the first phase, each speech frame is classified as voiced, unvoiced, or silence. In the second phase, the phoneme sequence is dynamically aligned to the voiced/unvoiced/silence sequence according to acoustic-phonetic rules. Finally, the resulting segmented speech data are used as a bootstrap to train HMM-based phoneme model parameters. The hand-labeled ETRI speech DB was used for the performance test. The experimental results showed that the algorithm improved segmentation accuracy by 10% within a 20 ms error tolerance, and by 30% for unvoiced consonants.
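
The first phase described in this abstract (frame-level voiced/unvoiced/silence classification) can be illustrated with a minimal sketch. The abstract does not say which features or thresholds the authors used; short-time energy, zero-crossing rate, the threshold values, and every function name below are assumptions chosen for illustration only.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into frames of frame_len, advancing by hop (partial tail dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def classify_frame(frame, silence_energy=1e-4, zcr_threshold=0.25):
    """Return 'S' (silence), 'V' (voiced) or 'U' (unvoiced).

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    if short_time_energy(frame) < silence_energy:
        return "S"
    return "U" if zero_crossing_rate(frame) > zcr_threshold else "V"

def classify(samples, frame_len=160, hop=80):
    return [classify_frame(f) for f in frame_signal(samples, frame_len, hop)]

# Synthetic demo at an assumed 8 kHz sampling rate: silence, a 100 Hz tone
# (low zero-crossing rate), and a sign-alternating signal (high zero-crossing rate).
silence = [0.0] * 160
voiced = [0.5 * math.sin(2 * math.pi * 100 * n / 8000) for n in range(160)]
unvoiced = [0.5 if n % 2 == 0 else -0.5 for n in range(160)]
labels = classify(silence + voiced + unvoiced, frame_len=160, hop=160)
```

A label sequence of this kind is what the second phase would align the phoneme sequence against. That alignment step and the HMM training are not sketched here.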

Korean Phoneme Recognition Model with Deep CNN (Deep CNN 기반의 한국어 음소 인식 모델 연구)

Hong, Yoon Seok;Ki, Kyung Seo;Gweon, Gahgene
- Annual Conference of KIPS / 2018.05a / pp.398-401 / 2018
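This study proposes a phoneme recognition model, built on a deep convolutional neural network (Deep CNN) and the Connectionist Temporal Classification (CTC) algorithm, that can be trained without a force-aligned corpus. Recently, deep learning phoneme recognition models that use recurrent neural networks (RNNs) with CTC have been actively studied abroad. For Korean phoneme recognition, however, HMM-GMM systems or hybrid systems combining artificial neural networks with HMMs have mainly been used; compared with recent overseas work, this approach leaves less room for performance improvement and cannot be trained without a force-aligned corpus prepared by experts. RNNs, in turn, need large amounts of training data and are difficult to train, which limits their use for Korean, where corpora are scarce and foundational research has not been active. This study therefore adopts the CTC algorithm, which does not require a force-aligned corpus, and builds the deep learning model on a convolutional neural network (CNN), which trains faster than an RNN and can learn from less data, to perform Korean phoneme recognition. With this model, three kinds of phoneme recognizers were built that extract the 49 phonemes of Korean, and the finally selected model achieved a phoneme error rate (PER) of 9.44. An indirect comparison with prior studies suggests that the proposed model performs on par with or slightly better than existing work.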
https://doi.org/10.3745/PKIPS.y2018m05a.398
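
Two ideas in the abstract above lend themselves to a short illustration: how a CTC output is turned into a phoneme sequence without frame-level alignment, and how a phoneme error rate is computed. The sketch below uses the standard CTC best-path rule (collapse repeats, then drop blanks) and Levenshtein distance. It is not the authors' model, and the blank symbol, function names, and toy labels are assumptions.

```python
def ctc_greedy_decode(frame_labels, blank="_"):
    """Collapse consecutive repeated labels, then drop blanks (CTC best-path rule)."""
    decoded = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded

def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, using one rolling row."""
    row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        prev_diag, row[0] = row[0], i
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            prev_diag, row[j] = row[j], min(row[j] + 1,        # deletion
                                            row[j - 1] + 1,    # insertion
                                            prev_diag + cost)  # substitution or match
    return row[-1]

def phoneme_error_rate(ref, hyp):
    """PER in percent: edit distance divided by reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)

# Toy per-frame argmax labels (hypothetical), one label per frame.
frames = ["_", "k", "k", "_", "a", "a", "_", "_", "m", "m"]
hypothesis = ctc_greedy_decode(frames)
per = phoneme_error_rate(["k", "a", "m", "i"], hypothesis)
```

Because the blank separates repeated labels, a genuinely doubled phoneme survives decoding only if at least one blank frame falls between its two occurrences.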

Psychological Effects of Gamification on Young Learners: Focusing on a Serious Game for English Phoneme Discrimination (기능성게임을 활용한 게이미피케이션 영어 발음 학습이 초등학생의 정의적 영역에 미치는 영향)

Lee, Sun-Young;Park, Joo-Hyun;Choi, Jung-Hye Fran
- Journal of Korea Game Society / v.19 no.2 / pp.111-122 / 2019
This study investigated the psychological effects of using a serious game with young learners in the English classroom compared with those of a dictionary application. A tablet PC-based serious game was created for the training of English phoneme discrimination for Korean 6th graders, and its psychological effects were measured using a paper-based survey and face-to-face interviews. The overall results revealed that the serious game had more positive psychological effects on young learners than the dictionary app. These findings provide supporting empirical evidence for using serious games in English classrooms.
https://doi.org/10.7583/JKGS.2019.19.2.111

A COMPARATIVE STUDY ON AUDITORY ATTENTION AND PHONEME DIFFERENTIAL ABILITY AMONG CHILDREN WITH READING DISABILITY AND WITH ATTENTION DEFICIT/HYPERACTIVITY (읽기 장애와 주의력 결핍/과잉 운동 장애아동의 주의력 과제와 음소 변별 과제 수행 비교 - 청각 과제를 중심으로 -)

Lee, Kyung-Hee;Shin, Min-Sup;Kim, Boong-Nyun;Cho, Soo-Churl
- Journal of the Korean Academy of Child and Adolescent Psychiatry / v.14 no.2 / pp.197-208 / 2003
Objective: We hypothesized that a deficit in processing rapid linguistic stimuli is at the heart of Reading Disability (RD) and that a deficit in response inhibition is at the heart of Attention Deficit/Hyperactivity Disorder (ADHD). We conducted experiments to identify the core cognitive characteristics of children with RD, with ADHD, or with both, using attentional tasks and phoneme differential tests. Method: In Study 1, 28 children with ADHD and 16 children with RD+ADHD were individually administered visual and auditory performance tests. Performance on the attentional tasks was then compared between the two groups while controlling for IQ. In Study 2, 13 children with RD+ADHD/RD, 13 children with ADHD, and 13 normal children were administered computerized phoneme differential tests. Result: Visual attentional tasks did not distinguish the ADHD group from the RD+ADHD group. On auditory attentional tasks, however, the comorbid group showed significantly more difficulty, with a large variance in reaction time. The RD, RD+ADHD, and ADHD groups made more errors on the phoneme differential tests than the normal control group, and each group showed a distinctive performance pattern. Discussion: The ADHD group had difficulty with response inhibition and sustained attention, and these auditory attentional difficulties were magnified in children who had RD along with ADHD. Although children with RD had more trouble responding correctly to target stimuli, their responses were not significantly different from those of children with ADHD.

On the statistics of Korean Phonetic Dictionary - Basic Survey to make corpus of Korean Speech DB - (발음사전 표제어중의 음소의 통계적 성질-음성 DB용 단어선정을 위하여-)

Lee, Y.J.;Kim, K.T.;Jo, C.W.;Rhee, T.W.
- Proceedings of the KIEE Conference / 1987.07b / pp.1606-1609 / 1987
Statistical information about spoken Korean was obtained by analyzing the Korean phonetic dictionary. This is one of the basic surveys for building a phoneme-balanced corpus for the Korean Speech Data Base (KSDB).

Statistical Analysis of Korean Phonological Variations Using a Grapheme-to-phoneme System (발음열 자동 생성기를 이용한 한국어 음운 변화 현상의 통계적 분석)

이경님;정민화
- The Journal of the Acoustical Society of Korea / v.21 no.7 / pp.656-664 / 2002
We present a statistical analysis of Korean phonological variations using a Grapheme-to-Phoneme (GTP) system. The GTP system used in the experiments generates pronunciation variants by applying rules that model obligatory and optional phonemic changes and allophonic changes. These rules are derived from morphophonological analysis and the government's standard pronunciation rules. The GTP system is optimized for continuous speech recognition: it generates phonetic transcriptions for training and constructs a pronunciation dictionary for recognition. In this paper, we describe Korean phonological variations by analyzing the statistics of phonemic change rule applications for the 60,000 sentences in the Samsung PBS Speech DB. Our results show that the most frequent obligatory phonemic variations are, in order, liaison, tensification, aspiration, and nasalization of obstruents, and that the most frequent optional phonemic variations are, in order, deletion of initial consonant h, insertion of a final consonant with the same place of articulation as the next consonant, and deletion of a final consonant with the same place of articulation as the next consonant. These statistics can be used to improve the performance of speech recognition systems.
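
To make the idea of a rule-based phonemic change concrete, here is a minimal sketch of a single rule, liaison, which the abstract reports as the most frequent obligatory variation. It is not the GTP system described above: it handles only a final consonant moving into a following syllable that begins with silent ㅇ, skips cluster finals and ㅎ, and assumes every character is a precomposed Hangul syllable (U+AC00 to U+D7A3). All names are hypothetical.

```python
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

def decompose(syllable):
    """Split one precomposed Hangul syllable into [initial, medial, final]."""
    code = ord(syllable) - 0xAC00
    return [INITIALS[code // 588], MEDIALS[(code % 588) // 28], FINALS[code % 28]]

def compose(initial, medial, final):
    """Inverse of decompose, using Unicode Hangul syllable arithmetic."""
    index = (INITIALS.index(initial) * 21 + MEDIALS.index(medial)) * 28
    return chr(0xAC00 + index + FINALS.index(final))

def apply_liaison(word):
    """Move a single final consonant onto a following syllable that starts with ㅇ."""
    jamo = [decompose(ch) for ch in word]
    applications = 0
    for cur, nxt in zip(jamo, jamo[1:]):
        final = cur[2]
        # Only finals that also exist as initials can move; ㅇ and ㅎ are skipped
        # in this simplified rule.
        movable = final != "" and final in INITIALS and final not in ("ㅇ", "ㅎ")
        if nxt[0] == "ㅇ" and movable:
            nxt[0], cur[2] = final, ""
            applications += 1
    return "".join(compose(*j) for j in jamo), applications

pronounced, count = apply_liaison("한국어")
```

Counting rule applications over a corpus, as `count` does for one word here, is the kind of statistic the abstract describes. A full system would apply many interacting rules in a defined order.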

Automatic Phoneme Generator based on Standard Korean Pronunciation (표준어 규정에 따른 한국어 음소열 자동생성기)

이도관;강미영;윤근수;이교운;권혁철
- Proceedings of the Korean Information Science Society Conference / 2003.04c / pp.528-530 / 2003
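Pronunciation is nearly as difficult in Korean as word spacing. The purpose of this paper is to implement an automatic Korean phoneme sequence generator that follows the Standard Pronunciation rules of the Standard Language Regulations, so that it can be used for education and ease the confusion in everyday pronunciation and the resulting difficulty of choosing the correct pronunciation.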

Phoneme-based Recognition of Korean Speech Using HMM(Hidden Markov Model) and Genetic Algorithm (HMM과 GA를 이용한 한국어 음성의 음소단위 인식)

박준하;조성원
- Proceedings of the Korean Institute of Intelligent Systems Conference / 1997.10a / pp.291-295 / 1997
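Most speech recognition systems now being developed and beginning to be commercialized are based on word recognition. Their recognition range can be extended by increasing the vocabulary, but as the number of words to be searched grows, the overall speed and performance of the system tend to degrade. To overcome this drawback, this paper implements a phoneme-level recognition system for Korean speech using an HMM (Hidden Markov Model) and a GA (Genetic Algorithm). LPC cepstrum coefficients are used as speech features. At recognition time, the GA separates each phoneme of the target word, and HMM parameters trained per phoneme are applied, so that recognition is performed phoneme by phoneme.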

A Revising Method using Phoneme Comparison for Databases with Korean Character Set (데이터베이스상의 한글 자모단위 비교를 통한 데이터 정정기법)

김대환;백두권
- Proceedings of the Korean Information Science Society Conference / 2003.10a / pp.532-534 / 2003
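As various attributes in databases that are not managed as codes gain value as information over time, demand for normalizing non-coded Korean (Hangul) data is increasing. One peculiarity of Hangul in normalization is that Hangul data are stored and managed syllable by syllable, using syllable-based character sets such as KSC5601 and CP949. At input time, however, data are entered phoneme (jamo) by phoneme on a keyboard, which introduces errors and unnormalized data. To solve this problem, we propose a technique that corrects data by comparing them at the phoneme level rather than at the syllable level at which they are stored.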
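
The contrast between syllable-level and phoneme-level comparison can be sketched as follows. The abstract does not describe the authors' matching procedure, so this example substitutes two standard-library tools: Unicode NFD normalization to split syllables into jamo, and `difflib.SequenceMatcher` as the similarity measure. The reference list, threshold, and function names are hypothetical.

```python
import unicodedata
from difflib import SequenceMatcher

def to_jamo(text):
    """Decompose precomposed Hangul syllables into conjoining jamo via NFD."""
    return unicodedata.normalize("NFD", text)

def similarity(a, b):
    """Similarity ratio in [0, 1] over whatever units the strings contain."""
    return SequenceMatcher(None, a, b).ratio()

def jamo_similarity(a, b):
    """Similarity measured on jamo rather than on whole syllables."""
    return similarity(to_jamo(a), to_jamo(b))

def correct(value, canonical_values, threshold=0.85):
    """Replace value with its closest canonical form if it is similar enough.

    The threshold is an illustrative assumption.
    """
    best = max(canonical_values, key=lambda c: jamo_similarity(value, c))
    return best if jamo_similarity(value, best) >= threshold else value

# Hypothetical reference values and a typo differing by one vowel (별 vs 벌).
CITIES = ["서울특별시", "부산광역시"]
typo = "서울특벌시"
fixed = correct(typo, CITIES)
syllable_score = similarity(typo, CITIES[0])
jamo_score = jamo_similarity(typo, CITIES[0])
```

A single mistyped vowel makes a whole syllable mismatch when comparing stored characters, but changes only one unit at the jamo level, so the jamo score stays closer to 1. Note that NFD gives initial and final consonants different code points, so the same consonant in different syllable positions does not match.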

Development of the algorithm for Korean vowel recognition (한국어 인식을 위한 알고리즘의 개발)

Ahn, Chang;Chin, Sang-Hyun;Rhee, Sang-Burm
- Proceedings of the KIEE Conference / 1988.07a / pp.620-623 / 1988
Recognizing vowels is a basic step in phoneme recognition, so an algorithm for it is needed to achieve speech recognition. In this paper, the cepstrum is used for the voiced/unvoiced decision, and speech parameters are extracted by linear predictive coding. Using these parameters, a speech understanding algorithm has been developed.

