Search | Korea Science

On the Simple Speaker Verification System Using Tolerance Interval Analysis Without Background Speaker Models (Tolerance Interval Analysis를 이용한 배경화자 없는 간단한 화자인증시스템에 관한 연구)

Choi, Hong-Sub
- MALSORI / no.56 / pp.147-158 / 2005
In this paper, we focus on developing a simplified speaker verification algorithm that requires no background speaker models, intended for speaker verification systems built into portable terminals such as mobile phones and PMPs. According to tolerance interval analysis, the population of a speaker's model can be represented by a suitable number of independent samples drawn from that speaker's model. We can therefore build a representative speaker model and a threshold under a specified confidence level and coverage. Using the proposed algorithm with 40 samples, experiments show a false rejection rate of $3.0\%$ and a false acceptance rate of $4.3\%$, which compare favorably with the conventional method's $5.4\%$ and $5.5\%$, respectively. The next step of this research will address suitable adaptation methods to overcome speech variation caused by aging and operating environments.
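The link between sample count, coverage, and confidence can be illustrated with the standard one-sided nonparametric tolerance bound. This is only a sketch: the abstract does not say which form of tolerance interval the paper uses, and the function names, the coverage value of 0.9, and the use of the lowest score as a threshold are assumptions made here for illustration.

```python
import math


def one_sided_confidence(n: int, coverage: float) -> float:
    """Confidence that at least `coverage` of the population lies above
    the smallest of n independent samples (one-sided, nonparametric)."""
    return 1.0 - coverage ** n


def min_samples(coverage: float, confidence: float) -> int:
    """Smallest n for which the one-sided bound reaches `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(coverage))


def threshold_from_scores(scores):
    """Hypothetical decision threshold: the lowest genuine-speaker score."""
    return min(scores)


if __name__ == "__main__":
    # 40 samples is the figure quoted in the abstract; 0.9 coverage is assumed.
    print(round(one_sided_confidence(40, 0.9), 4))
    print(min_samples(0.9, 0.95))
```

Under these assumptions, 40 samples give a confidence of roughly 0.985 that the lowest observed score bounds 90% of the speaker's score population from below, and 29 samples would already suffice for 95% confidence.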

A Situation-Based Dialogue Management with Dialogue Examples (대화 예제를 이용한 상황 기반 대화 관리 시스템)

Lee, Cheong-Jae;Jung, Sang-Keun;Lee, Geun-Bae
- MALSORI / no.56 / pp.185-194 / 2005
In this paper, we present POSSDM (POSTECH Situation-Based Dialogue Manager) for a spoken dialogue system, using a new example- and situation-based dialogue management technique for effective generation of appropriate system responses. A spoken dialogue system should generate cooperative responses to smoothly control the dialogue flow with its users. We introduce a new dialogue management technique that incorporates dialogue examples and situation-based rules for the EPG (Electronic Program Guide) domain. For system response inference, we automatically construct and index a dialogue example database from a dialogue corpus, and the best dialogue example for a proper system response is retrieved with a query built from the dialogue situation, including the current user utterance, dialogue act, and discourse history. When the dialogue corpus is not large enough to cover the domain, we also apply manually constructed situation-based rules, mainly for meta-level dialogue management.

Performance Improvement of Korean Connected Digit Recognition Using Various Discriminant Analyses (다양한 변별분석을 통한 한국어 연결숫자 인식 성능향상에 관한 연구)

Song Hwa Jeon;Kim Hyung Soon
- MALSORI / no.44 / pp.105-113 / 2002
In Korean, each digit is monosyllabic and some pairs are known to be highly confusable, which degrades the performance of connected digit recognition systems. To improve performance, in this paper we employ various discriminant analyses (DA), including Linear DA (LDA), Weighted Pairwise Scatter LDA (WPS-LDA), Heteroscedastic Discriminant Analysis (HDA), and Maximum Likelihood Linear Transformation (MLLT). We also examine several combinations of these DA methods for additional improvement. Experimental results show that applying any of the DA methods above improves string accuracy, but the amount of improvement from each method varies with the model complexity, that is, the number of mixtures per state. In particular, applying MLLT after WPS-LDA reduces the string error by more than 20% compared with the baseline system when the class level of DA is defined as a tied state and 1 mixture per state is used.

Classification of Asthma Disease Using Thoracic Data (흉부음 데이터를 이용한 천식 질환 판별)

Moon In-Seob;Choi Hyoung-Ki;Lee Chul-Hee;Park Ki-Young;Kim Chong-Kyo
- MALSORI / no.49 / pp.135-144 / 2004
In this paper, we study the classification of thoracic sounds as normal or abnormal (asthma) by analyzing sounds captured with a thoracic sound detection system. The system stores thoracic sounds and analyzes the data. The waveform of a thoracic sound resembles noise and is generated regularly by inhalation and exhalation. Therefore, to identify asthma sounds among thoracic sounds, we discriminate between normal and abnormal cases using the level crossing rate (LCR) and the spectrogram energy rate.
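A level crossing rate can be computed as the fraction of adjacent sample pairs that straddle a chosen amplitude level. The sketch below shows that generic definition only; the abstract does not state the level, frame length, or decision threshold used in the paper, so the function name and the default `level` are assumptions.

```python
def level_crossing_rate(samples, level=0.0):
    """Fraction of adjacent sample pairs that lie on opposite sides of `level`.

    `samples` is a list of amplitudes; with level=0.0 this reduces to the
    familiar zero crossing rate.
    """
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1
        for a, b in zip(samples, samples[1:])
        if (a - level) * (b - level) < 0  # strict sign change around level
    )
    return crossings / (len(samples) - 1)


if __name__ == "__main__":
    # A signal that alternates around the level crosses it on every step.
    print(level_crossing_rate([0, 2, 0, 2, 0], level=1.0))
```

A noise-like segment would be expected to give a higher rate than a smooth one, which is the kind of contrast a classifier could then threshold.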

Speech Rhythm Metrics for Automatic Scoring of English Speech by Korean EFL Learners

Jang, Tae-Yeoub
- MALSORI / no.66 / pp.41-59 / 2008
Knowledge of the linguistic rhythm of the target language plays a major role in foreign language proficiency. This study attempts to discover valid rhythm features that can be used in automatic assessment of non-native English pronunciation. Eight previously proposed and two novel rhythm metrics are investigated with 360 English read speech tokens obtained from 27 Korean learners and 9 native speakers. Some of the speech-rate normalized interval measures and above-word level metrics are found to be effective enough for automatic scoring, as they correlate significantly with speakers' proficiency levels. It is also shown that metrics need to be selected dynamically depending on the structure of the target sentences. Results from a preliminary auto-scoring experiment using multiple regression analysis suggest that appropriate control of unexpected input utterances is also desirable for better performance.

On Detecting the Steady State Segments of Speech Waveform by using the Normalized AMDF (규준화된 AMDF 이용한 음성파형의안정상태 구간검출)

Bae, Myung-Jin;Kim, Ul-Je;Ahn, Sou-Guil
- The Journal of the Acoustical Society of Korea / v.10 no.3 / pp.44-50 / 1991
To recognize continuous speech, it is necessary to segment the connected acoustic signal into phonetic units. In this paper, we propose a new normalized AMDF as a parameter for detecting the transition regions in continuous speech. The suggested parameter represents the rate of change in the magnitude of the speech signal. By comparing this value with that of adjacent frames, the state of each frame can be graded on a scale between the steady state and the transient state.
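For reference, the standard average magnitude difference function (AMDF) on which such a parameter builds can be sketched as follows. The abstract does not give the paper's normalization or its frame comparison rule, so only the plain AMDF is shown; the function name and the test signal are illustrative assumptions.

```python
def amdf(frame, lag):
    """Average magnitude difference of `frame` against itself shifted by `lag`.

    Computes the mean of |x[i] - x[i + lag]| over all valid i. Values near
    zero indicate that the frame repeats with that lag.
    """
    n_terms = len(frame) - lag
    if lag < 0 or n_terms <= 0:
        raise ValueError("lag must satisfy 0 <= lag < len(frame)")
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n_terms)) / n_terms


if __name__ == "__main__":
    # Toy signal with a period of 4 samples (not real speech).
    frame = [1, 0, -1, 0] * 4
    print(amdf(frame, 4))  # lag equal to the period
    print(amdf(frame, 2))  # lag equal to half the period
```

For this toy signal the AMDF vanishes at the period and peaks at half the period, which is the behavior that makes the function useful for tracking how stable a frame is.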

Semantic-oriented Error Correction for Spoken Query Processing (음성 질의 처리를 위한 의미 기반 오류 수정)

Jeong Minwoo;Kim Byeongchang;Lee Gary Geunbae
- Proceedings of the KSPS conference / 2003.10a / pp.153-156 / 2003
Voice input is often required in new application environments such as telephone-based information retrieval, car navigation systems, and user-friendly interfaces, but the low success rate of speech recognition makes it difficult to extend its application to new fields. Post-processing of recognition results has been a popular approach to improving recognition accuracy, but previous approaches to post error correction were mainly lexical-oriented. We suggest a new semantic-oriented approach that corrects both semantic-level and lexical errors and is more accurate, especially for domain-specific speech error correction. Through extensive experiments with a speech-driven in-vehicle telematics information application, we demonstrate the superior performance of our approach and some of its advantages over previous lexical-oriented approaches.

A Study on the Vowel lengthening and a Morphophonological Interpretation for its function (홀소리 길이의 늘어짐(Vowel lengthening)의 기능 및 형태음운론적 해석)

Kim, Chong-Dok
- Proceedings of the KSPS conference / 2005.04a / pp.9-13 / 2005
The aim of this paper is to analyze vowel lengthening in Korean, whose function is distinctive at the word level. I examined two acoustic parameters, vowel length and formants (F1 and F2), to distinguish or identify a long vowel and its short counterpart, for example, /a:/ and /a/. Based on the results of the experimental analysis and on a discussion of vowel length's relation to and influence on the Korean phonological system, I consider vowel lengthening a prosodeme, that is, a prosodic element in the Korean phonological system.

Automatic Synthesis Method Using Prosody-Rich Database (대용량 운율 음성데이타를 이용한 자동합성방식)

김상훈
- Proceedings of the Acoustical Society of Korea Conference / 1998.08a / pp.87-92 / 1998
In general, synthesis unit databases have been constructed by recording isolated words. In that case, each word boundary carries a typical prosodic pattern such as falling intonation or preboundary lengthening. To obtain natural synthetic speech from this kind of database, the original speech must be artificially distorted. However, that artificial process instead produces unnatural, unintelligible synthetic speech because of the excessive prosodic modification of the speech signal. To overcome these problems, we gathered thousands of sentences for the synthesis database. To make phone-level synthesis units, we trained a speech recognizer on the recorded speech and then segmented phone boundaries automatically. In addition, we used a laryngograph for epoch detection. From the automatically generated synthesis database, we chose the best phone and concatenated it directly without any prosody processing. To select the best phone among multiple candidates, we used prosodic information such as the break strength of word boundaries, phonetic context, cepstrum, pitch, energy, and phone duration. A pilot test yielded some positive results.

The Effects of Onomatopoeia and Mimetic Word Productive Training Program on Auditory Performance and Vocal Development in Children with Cochlear Implants (의성어.의태어 산출 프로그램이 인공와우 착용 아동의 청능 및 발성 발달에 미치는 효과)

Kim, Yu-Kyung;Seok, Dong-Il
- Speech Sciences / v.11 no.2 / pp.51-67 / 2004
The objective of this study was to investigate the effects of the Onomatopoeia and Mimetic Word Productive Training Program on auditory performance and vocal development in prelingually deafened children with cochlear implants. The effects were measured with the Listening Progress Profile (LiP), the number of utterances, vocal developmental level, and phonetic inventory. The subjects were four children with cochlear implants who were able to detect speech sounds and environmental sounds. The program consisted of 3 steps with 24 onomatopoeic and mimetic words. The study used a pre-post design. The results were as follows. First, after the program was administered, the LiP score was significantly higher. Second, after the program was administered, the number of utterances and the emergence of both canonical and postcanonical utterances increased. The emergence of vowel and consonant features increased and diversified. In conclusion, the Onomatopoeia and Mimetic Word Productive Training Program appeared to facilitate efficient auditory performance and vocal development.
