• Title/Summary/Keyword: target utterances


Building a Sentential Model for Automatic Prosody Evaluation

  • Yoon, Kyu-Chul
    • Phonetics and Speech Sciences
    • /
    • v.1 no.4
    • /
    • pp.47-59
    • /
    • 2009
  • The purpose of this paper is to propose an automatic evaluation technique for the prosodic aspect of an English sentence uttered by Korean speakers learning English. The underlying hypothesis is that the consistency of manual prosody scoring is reflected in an imaginary space of a prosody evaluation model constructed from the three physical properties of prosody considered in this paper: the fundamental frequency (F0) contour, the intensity contour, and the segmental durations. The evaluation proceeds first by building a prosody evaluation model for the sentence. To create the model, utterances of the target sentence from native speakers of English and from Korean learners are manually scored for prosody by either native teachers of English or Korean phoneticians. Multiple native utterances are selected on the basis of the manual scoring as the "model" native utterances, against which all the Korean learners' utterances, as well as the model utterances themselves, can be semi-automatically evaluated by comparison in terms of the three prosodic aspects [7]. Each learner utterance, when compared to the multiple model native utterances, produces multiple coordinates in a three-dimensional space of prosody evaluation, each axis of which corresponds to one of the three prosodic aspects. The 3D coordinates from all the comparisons form a prosody evaluation model for the particular sentence, and the associated manual scores can reveal regions of particular scores. The model can then be used as a predictive model against which other Korean utterances of the target sentence can be evaluated. The model from a Korean phonetician appears to support the hypothesis.
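The comparison step described in this abstract can be sketched roughly as follows. The abstract does not state which distance measure is used, so the root-mean-square difference over time-normalized, equal-length contours is an assumption made here for illustration, as are all function and key names.

```python
import math

def rms_diff(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def prosody_coordinates(learner, models):
    """Return one (F0, intensity, duration) point per model utterance.

    Each utterance is a dict with hypothetical keys "f0", "intensity",
    and "durations"; contours are assumed to be already time-normalized.
    """
    return [
        (
            rms_diff(learner["f0"], m["f0"]),
            rms_diff(learner["intensity"], m["intensity"]),
            rms_diff(learner["durations"], m["durations"]),
        )
        for m in models
    ]

# Toy data: one learner utterance compared against one model utterance.
learner = {"f0": [200.0, 220.0], "intensity": [60.0, 62.0], "durations": [0.10, 0.20]}
model = {"f0": [203.0, 224.0], "intensity": [60.0, 62.0], "durations": [0.10, 0.20]}
points = prosody_coordinates(learner, [model])
```

With several model utterances, each learner utterance yields several points; pairing those points with the manual scores would give the scored regions that the abstract mentions.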


Treatment Effect of a Modified Melodic Intonation Therapy (MMIT) in Korean Aphasics

  • Ko, Do-Heung;Jeong, Ok-Ran
    • Speech Sciences
    • /
    • v.4 no.2
    • /
    • pp.91-102
    • /
    • 1998
  • The present study attempted to modify the conventional Melodic Intonation Therapy (MIT) in three aspects: number of syllables of adjacent target utterances (ATU), melody patterns of ATU, and initial listening of melody and intoned speech with the eyes closed. The modified Melodic Intonation Therapy (MMIT) was applied to two severe Korean aphasics. The patients exhibited a severely nonfluent aphasia resulting from a left cerebrovascular accident (CVA). The purpose of the modification was to avoid perseveration and improve reflective listening skills. First, the treatment program avoided ATU with the same number of syllables. Second, four different patterns of melody were developed: rising type, falling type, V-type, and inverted V-type. One type of prosodic pattern was preceded and followed by another type of melody. These two variations were intended to decrease perseverative behaviors. Finally, the patients kept their eyes closed when the clinician played and hummed a target melody at the initial stage of the program in order to improve reflective listening skills. A single-subject alternating treatment design was used. The effects of MMIT were compared to those of the conventional MIT. Varying the number of syllables and the type of melodic patterns decreased perseverative behaviors and produced more correct names. The initial listening of the target melody with the patients' eyes closed seemed to increase their attentiveness and result in a more fluent production of target utterances. Probable reasons for the effectiveness of MMIT were discussed.


Target Speaker Speech Restoration via Spectral bases Learning (주파수 특성 기저벡터 학습을 통한 특정화자 음성 복원)

  • Park, Sun-Ho;Yoo, Ji-Ho;Choi, Seung-Jin
    • Journal of KIISE:Software and Applications
    • /
    • v.36 no.3
    • /
    • pp.179-186
    • /
    • 2009
  • This paper proposes a target speech extraction method that restores the speech signal of a target speaker from a noisy convolutive mixture of speech and an interference source. We assume that the target speaker is known and that his/her utterances are available at training time. Incorporating the additional information extracted from the training utterances into the separation, we combine convolutive blind source separation (CBSS) and non-negative decomposition techniques, e.g., a probabilistic latent variable model. The non-negative decomposition is used to learn a set of bases from the spectrogram of the training utterances, where the bases represent the spectral information corresponding to the target speaker. Based on the learned spectral bases, our method provides two postprocessing steps for CBSS. The channel selection step finds the desirable output channel from CBSS, i.e., the one that dominantly contains the target speech. The reconstruction step recovers the original spectrogram of the target speech from the selected output channel so that the remaining interference source and background noise are suppressed. Experimental results show that our method substantially improves the separation results of CBSS and, as a result, successfully recovers the target speech.

An Acoustic Study on the Voice Imitation (3) - Based on a professional voice imitator's speech - (모방 발화의 음향음성학적 연구(3) -전문 성대 모사자의 자료를 중심으로-)

  • Ahn Byoung-seob;Park Mi-young
    • MALSORI
    • /
    • no.52
    • /
    • pp.1-14
    • /
    • 2004
  • In this study, we investigated acoustic characteristics of imitated utterances by a professional voice imitator, focusing on prosodic properties such as vowel formants and f0 distribution. To see the patterns of voice imitation by a professional voice imitator, we compared the imitator's voice data with the target speakers' voice data. The professional imitator, Mr. Bae, produced utterances imitating the voices of the former President Kim, the comedian Choi, and the singer Bae. Auditorily, the imitator was judged to imitate all the target speakers' voices successfully. However, acoustic examination showed that the imitator was better at imitating the singer Bae's voice, in that the imitator's and the singer Bae's voices are more alike with respect to vowel formants and f0 distribution. We infer this is because the imitator's normal voice is very similar to the singer Bae's voice. On the other hand, the patterns of vowel formants and f0 distribution found in the imitator's imitations of the other two target speakers were different from those of the target speakers' voices.


Improvement of Prosody Transplantation Technology for English Prosody Education and Its Application (운율교육을 위한 운율이식기술 개선 방안 연구)

  • Yi, So-Pae
    • MALSORI
    • /
    • no.61
    • /
    • pp.49-62
    • /
    • 2007
  • This study focused on the improvement of prosody transplantation technology to be used for effective prosody education. Issues that make the technology a less acceptable tool for prosody education were addressed. Instead of merely copying the target pitch onto a learner's utterances, the target pitch was rescaled in semitones before the transplantation. In so doing, distortion of the signal was minimized and the transplanted utterance could retain a sound quality no different from that of the learner's utterances. Instead of manual transplantation, an automatic procedure was proposed to increase the reliability and consistency of the outcome and to enable real-time processing. The perceptual performance of the automatic transplantation was evaluated in a perception experiment, which showed that the automatic transplantation was as good as the manual process.
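One way the semitone rescaling described in this abstract might work is sketched below. The abstract does not say which reference pitch is used; taking one reference value per speaker (for example a mean or median F0) is an assumption, and the function names are hypothetical.

```python
import math

def hz_to_semitones(f_hz, ref_hz):
    """Express a frequency as semitones above (or below) a reference."""
    return 12.0 * math.log2(f_hz / ref_hz)

def semitones_to_hz(st, ref_hz):
    """Convert a semitone offset from a reference back to Hz."""
    return ref_hz * 2.0 ** (st / 12.0)

def rescale_contour(target_hz, target_ref_hz, learner_ref_hz):
    """Re-express the target contour around the learner's reference pitch.

    The semitone shape of the target contour is preserved; only the
    reference it is anchored to changes.
    """
    return [
        semitones_to_hz(hz_to_semitones(f, target_ref_hz), learner_ref_hz)
        for f in target_hz
    ]

# Toy example: a target contour anchored at 100 Hz, moved to a 200 Hz voice.
rescaled = rescale_contour([100.0, 200.0, 150.0],
                           target_ref_hz=100.0, learner_ref_hz=200.0)
```

Working in semitones keeps the pitch movements perceptually comparable across speakers with different ranges, which is consistent with the stated goal of keeping the result close to the learner's own voice.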


Analysis of Communicative Features in an Excellent Elementary English Class Using COLT and TALOS (COLT와 TALOS 활용 동영상 분석으로 살펴본 우수 초등영어수업의 의사소통성 양상)

  • Yoo, Hee-yeon;Kim, Jeong-ryeol
    • The Journal of the Korea Contents Association
    • /
    • v.18 no.2
    • /
    • pp.269-279
    • /
    • 2018
  • The purpose of this research is to investigate how an elementary English class is presented in terms of communicative properties using both COLT and TALOS, because previous studies mainly used COLT and rarely used TALOS. This study also takes a close look at whether the English class is communicative, since previous studies criticized elementary English classes as not communicative. For the purposes of this research, COLT Part B and TALOS low-inference were used to analyze one elementary English class which had won the grand prize at an English class contest. The results of this study revealed that the class is communicative in terms of the high quantity and quality of students' utterances, the high ratio of students' discourse initiation, students' unpredictable information-giving utterances, and extension of utterances. The findings also revealed the characteristics that make this a good elementary English class: students' participation, focus on affective atmosphere, student-directed activities, and unconscious internalization of target expressions through repetition.

Concept-based Translation System in the Korean Spoken Language Translation System (한국어 대화체 음성언어 번역시스템에서의 개념기반 번역시스템)

  • Choi, Un-Cheon;Han, Nam-Yong;Kim, Jae-Hoon
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.8
    • /
    • pp.2025-2037
    • /
    • 1997
  • The concept-based translation system, which is a part of the Korean spoken language translation system, translates spoken utterances from a Korean speech recognizer into one of English, Japanese, and Korean in a travel planning task. Our system relies on semantic rather than syntactic categories in order to process spontaneous speech, which tends to be ungrammatical and subject to recognition errors. Utterances are parsed into concept structures, and the generation module produces the sentence in the specified target language. We have developed a token separator using base words and an automatic grammar corrector for Korean processing. We have also developed postprocessors for each target language in order to improve the readability of the generation results.


Korean Speaker Verification Using Speaker Adaptation Methods (화자 적응 기술을 이용한 한국어 화자 확인)

  • Choi Dong-Jin;Oh Yung-Hwan
    • Proceedings of the KSPS conference
    • /
    • 2006.05a
    • /
    • pp.139-142
    • /
    • 2006
  • Speaker verification systems can be implemented using speaker adaptation methods if the amount of speech available for each target speaker is too small to train the speaker model. This paper shows experimental results using well-known adaptation methods, namely Maximum A Posteriori (MAP) and Maximum Likelihood Linear Regression (MLLR). Experimental results using Korean speech show that MLLR is more effective than MAP for short enrollment utterances.
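For readers unfamiliar with MAP adaptation, the following is a minimal sketch of the standard MAP update for a single Gaussian mean. It is a simplification: the systems in the paper presumably adapt mixture or HMM parameters, and the relevance factor of 16 is a commonly used value, not one taken from the abstract. The function name is hypothetical.

```python
def map_adapt_mean(prior_mean, frames, relevance=16.0):
    """MAP-adapt a mean vector toward the enrollment data.

    prior_mean: mean of the speaker-independent model (list of floats)
    frames:     enrollment feature vectors (list of lists of floats)
    relevance:  assumed relevance factor; larger values trust the prior more
    """
    n = len(frames)
    if n == 0:
        return list(prior_mean)  # no data: keep the prior unchanged
    dim = len(prior_mean)
    sample_mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    alpha = n / (n + relevance)  # data weight grows with the frame count
    return [alpha * s + (1.0 - alpha) * p
            for s, p in zip(sample_mean, prior_mean)]

# With 16 frames and relevance 16, the result lies halfway between
# the prior mean and the sample mean.
adapted = map_adapt_mean([0.0, 0.0], [[2.0, 4.0]] * 16, relevance=16.0)
```

Because the data weight `alpha` is small when few frames are available, a MAP-adapted mean stays close to the prior for short enrollment utterances. MLLR instead estimates a shared linear transform of the means, which can move all parameters from little data.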


Korean Intonation Patterns from the Viewpoint of F0 Percentage Change (F0 변화율로 본 한국어 억양 패턴의 음향 특성)

  • Lee, Ji Yeon;Lee, Ho-Young
    • Phonetics and Speech Sciences
    • /
    • v.5 no.1
    • /
    • pp.123-130
    • /
    • 2013
  • Previous research on Korean intonation has mainly focused on $F_0$ target frequencies, $F_0$ slope, and the duration of intonation patterns. This study investigated Korean intonation patterns, both boundary and phrasal tones, in relation to the $F_0$ percentage change between pitch targets. We measured the percentage change between the pitch targets of both boundary and phrasal tones. Additionally, the $F_0$ change between the preceding pitch target and the first pitch target of the boundary tone and the $F_0$ targets of the sequence of two LH phrasal tones ('LH + LH') were also measured. Two phrasal tones, LHLH and HLH, were compared with 'LH + LH' and the 'HLH' in the LHLH pattern, respectively. We found that the percentage change between pitch targets in the phrasal tone is fixed to some extent. This helps explain why the slope of the phrasal tone is closely related to the number of syllables and the duration of the phrasal tone, as discussed in previous studies. Since we analyzed the intonation patterns with utterances from a large speech corpus, the results of this paper are expected to be used in building a larger annotated corpus of Korean.
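The measure named in this abstract can be illustrated with a short sketch. The abstract does not give the exact formula; computing the change relative to the preceding pitch target is an assumption, and the function name is hypothetical.

```python
def f0_percentage_changes(targets_hz):
    """Percentage change between each pair of consecutive pitch targets.

    Each change is expressed relative to the earlier target of the pair
    (an assumed convention).
    """
    return [100.0 * (b - a) / a
            for a, b in zip(targets_hz, targets_hz[1:])]

# Toy L-H-L sequence of pitch targets in Hz.
changes = f0_percentage_changes([200.0, 250.0, 200.0])
# changes == [25.0, -20.0]
```

Note that a rise and a fall between the same two frequencies give different percentages under this convention, because the denominator is always the earlier target.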

Speech Rhythm Metrics for Automatic Scoring of English Speech by Korean EFL Learners

  • Jang, Tae-Yeoub
    • MALSORI
    • /
    • no.66
    • /
    • pp.41-59
    • /
    • 2008
  • Knowledge of the linguistic rhythm of the target language plays a major role in foreign language proficiency. This study attempts to discover valid rhythm features that can be utilized in automatic assessment of non-native English pronunciation. Eight previously proposed and two novel rhythm metrics are investigated with 360 English read speech tokens obtained from 27 Korean learners and 9 native speakers. It is found that some of the speech-rate normalized interval measures and above-word level metrics are effective enough to be further applied for automatic scoring, as they are significantly correlated with speakers' proficiency levels. It is also shown that metrics need to be dynamically selected depending upon the structure of target sentences. Results from a preliminary auto-scoring experiment using multiple regression analysis suggest that appropriate control of unexpected input utterances is also desirable for better performance.
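The abstract does not list the ten metrics it examines. As an illustration of what a speech-rate normalized interval measure can look like, the sketch below computes two metrics that are well known in the rhythm literature, the normalized Pairwise Variability Index (nPVI) and Varco. Whether either is among the metrics in this study is not stated, and the use of the population standard deviation in Varco is an assumption.

```python
import statistics

def npvi(durations):
    """Normalized Pairwise Variability Index over successive intervals."""
    pairs = list(zip(durations, durations[1:]))
    if not pairs:
        raise ValueError("need at least two intervals")
    # Each difference is normalized by the local mean of the pair,
    # which factors out overall speech rate.
    return 100.0 * sum(abs(a - b) / ((a + b) / 2.0) for a, b in pairs) / len(pairs)

def varco(durations):
    """Standard deviation of interval durations as a percentage of the mean."""
    return 100.0 * statistics.pstdev(durations) / statistics.mean(durations)

# Toy vocalic interval durations in seconds.
even = npvi([0.1, 0.1, 0.1])      # perfectly even timing
uneven = npvi([1.0, 3.0])         # strongly alternating timing
```

Both measures divide by a duration mean, so uniformly speeding up or slowing down an utterance leaves them unchanged.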
