• Title/Summary/Keyword: Korean speech


Visual Presentation of Connected Speech Test (CST)

  • Jeong, Ok-Ran;Lee, Sang-Heun;Cho, Tae-Hwan
    • Speech Sciences / v.3 / pp.26-37 / 1998
  • The Connected Speech Test (CST) was developed to test hearing aid performance using realistic stimuli (connected speech) presented in background noise with a visible speaker. The CST has not been investigated as a measure of speech reading ability using only its visual portion. Thirty subjects were administered the 48 test lists of the CST in visual-only presentation mode. Statistically significant differences were found among the 48 test lists and among the 12 passages of the CST (the 48 lists were divided into 12 groups of 4, which were averaged). No significant differences were found between male and female subjects; however, in all but one case, females scored better than males. No significant differences were found between students in communication disorders and students in other departments. Intra- and inter-subject variability across test lists and passages was high. Suggestions for further research include making the scoring of the CST more contextually based and changing the speaker for the CST.


A Reliability Study on the Auditory-perceptual Evaluation of Parkinsonian Dysarthria (파킨슨증으로 인한 마비말장애의 청지각적 평가에 대한 신뢰도 연구)

  • Kim, Hyang-Hee;Lee, Mi-Sook;Kim, Sun-Woo;Lee, Won-Yong
    • Speech Sciences / v.11 no.4 / pp.129-141 / 2004
  • Auditory-perceptual evaluation has long been used to assess dysarthric speech. The process involves subjective judgement, and the results may vary with the clinical experience or training of the listeners. This study investigated the reliability of the auditory-perceptual evaluation of 22 multi-dimensional variables in 6 patients with Parkinsonian speech disorders. Listeners were divided into two groups: one consisted of 6 speech therapists with three or more years of clinical experience, and the other of 6 graduate students without any previous clinical background. The results showed that the former evaluated dysarthric speech with higher inter-rater and intra-rater reliability than the latter. Furthermore, speech variables such as 'precise consonant', 'speech intelligibility', and 'SMR regularity' were more influenced by clinical experience than the others. We therefore postulate that a reliable auditory-perceptual evaluation of dysarthric speech may require an adequate amount of clinical training for listeners.


Two Simultaneous Speakers Localization using harmonic structure (하모닉 구조를 이용한 두 명의 동시 발화 화자의 위치 추정)

  • Kim, Hyun-Kyung;Lim, Sung-Kil;Lee, Hyon-Soo
    • Proceedings of the KSPS conference / 2005.11a / pp.121-124 / 2005
  • In this paper, we propose a sound localization algorithm for two simultaneous speakers. Because speech is a wide-band signal, there are many frequency sub-bands in which the two speech sounds are mixed. In some sub-bands, however, one speech sound is more dominant than the other sounds. In such sub-bands, the dominant speech sound suffers little interference from other speech or noise. In speech, the overtones of the fundamental frequency have large amplitudes; this is called the 'harmonic structure of speech'. Sub-bands in the harmonic structure are more likely to be dominant. The proposed localization algorithm is therefore based on the harmonic structure of each speaker. First, the sub-bands that belong to the harmonic structure of each speech signal are selected. Then the two speakers are localized using the selected sub-bands. Simulation results show that localization using the selected sub-bands is more efficient and precise than localization methods that use all sub-bands.
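
The sub-band selection step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the tolerance `tol_hz`, the harmonic count, and the rule of discarding bins that fall on harmonics of both speakers are all hypothetical, and the fundamental frequencies are assumed to be known in advance.

```python
def harmonic_bins(f0, bin_freqs, n_harmonics=10, tol_hz=10.0):
    """Return indices of frequency bins lying within tol_hz of a harmonic k*f0.

    f0, n_harmonics and tol_hz are illustrative parameters (assumptions).
    """
    selected = []
    for idx, freq in enumerate(bin_freqs):
        k = round(freq / f0)  # nearest harmonic number
        if 1 <= k <= n_harmonics and abs(freq - k * f0) <= tol_hz:
            selected.append(idx)
    return selected


def split_subbands(f0_a, f0_b, bin_freqs, n_harmonics=10, tol_hz=10.0):
    """Assign bins to speaker A or B, dropping bins shared by both (assumed rule)."""
    bins_a = set(harmonic_bins(f0_a, bin_freqs, n_harmonics, tol_hz))
    bins_b = set(harmonic_bins(f0_b, bin_freqs, n_harmonics, tol_hz))
    return sorted(bins_a - bins_b), sorted(bins_b - bins_a)


# Toy example: bins every 50 Hz up to 1 kHz, speakers at 100 Hz and 150 Hz.
bin_freqs = [i * 50.0 for i in range(21)]
only_a, only_b = split_subbands(100.0, 150.0, bin_freqs)
```

Each speaker would then be localized from its own set of bins, for example by estimating an inter-microphone delay from those bins only; that step is omitted here because the abstract does not specify it.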


Correlation analysis of linguistic factors in non-native Korean speech and proficiency evaluation (비원어민 한국어 말하기 숙련도 평가와 평가항목의 상관관계)

  • Yang, Seung Hee;Chung, Minhwa
    • Phonetics and Speech Sciences / v.9 no.3 / pp.49-56 / 2017
  • Much research attention has been directed to identifying how native speakers perceive non-native speakers' oral proficiency. To investigate the generalizability of previous findings, this study examined segmental, phonological, accentual, and temporal correlates of native speakers' evaluation of the proficiency of L2 Korean speech produced by learners of various levels and nationalities. Our experimental results show that proficiency ratings by native speakers correlate significantly not only with rate of speech but also with segmental accuracy. Segmental errors have the highest correlation with the proficiency of L2 Korean speech. We further verified this finding across substitution, deletion, and insertion error rates. Although phonological accuracy was expected to be highly correlated with the proficiency score, it was the least influential measure. Another new finding of this study is that the role of pitch and accent has so far been underemphasized in studies of non-native Korean speech perception. This work will serve as the groundwork for developing an automatic assessment module in a Korean CAPT system.

Some Prosodic Aspects of Read Speech and Dialogue in Korean (대화체와 낭독체의 운율에 관한 연구)

  • Park Jihye
    • MALSORI / no.43 / pp.11-23 / 2002
  • In this paper, speech style is divided into two types: read speech and dialogue. In the experiment, read speech and dialogue use the same sentences to control for differences arising from sentence content. While read speech has fewer accentual phrases (APs) than dialogue, it has more intonational phrases (IPs). The number of syllables per AP varies more in dialogue. The intonational pattern of the first AP in an IP also differs between the two styles: in dialogue, there is a pattern with many high tones, LHH. The F0 range in dialogue is wider than in read speech.


Design and Implementation of Korean Text-to-Speech System (다이폰을 이용한 한국어 문자-음성 변환 시스템의 설계 및 구현)

  • 정준구
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.91-94 / 1994
  • This paper describes the design and implementation of a Korean text-to-speech system. Parametric synthesis is chosen as the speech synthesis method, and PARCOR coefficients, obtained through LPC analysis, are used as the acoustic parameters. The diphone is used as the synthesis unit because it preserves the basic naturalness of human speech. The diphone database consists of 1,228 PCM files. LPC synthesis has the drawback that the clarity of the synthesized speech degrades when unvoiced sounds are synthesized. In this paper, we improve the clarity of the synthesized speech by using the residual signal as the excitation signal for unvoiced sounds. In addition, to improve naturalness, we control the prosody of the synthesized speech by controlling the energy and pitch patterns. The synthesis system is implemented on a PC/486 and uses a 70 Hz-4.5 kHz band-pass filter for speech input/output, an amplifier, and a TMS320C30 DSP board.
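
PARCOR (reflection) coefficients are commonly obtained from a frame's autocorrelation sequence with the Levinson-Durbin recursion. The abstract does not say how the authors computed them, so the following is only a minimal sketch of that standard recursion. The function name is hypothetical, and sign conventions for PARCOR coefficients differ between texts; here `k` is the coefficient of the error filter A(z) = 1 + a1 z^-1 + ... .

```python
def levinson_durbin(r, order):
    """Levinson-Durbin recursion on autocorrelation values r[0..order].

    Returns (a, ks, err): error-filter coefficients with a[0] == 1,
    reflection coefficients ks[0..order-1], and final prediction error.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    ks = []
    for i in range(1, order + 1):
        # Correlation of the current predictor with lag i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        ks.append(k)
        # Update the predictor coefficients for order i.
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, ks, err


# Toy autocorrelation of a first-order process with r[n] = 0.5 ** n.
a, ks, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

For this toy input, only the first reflection coefficient is non-zero, as expected for a first-order process.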


Korean Broadcast News Transcription Using Morpheme-based Recognition Units

  • Kwon, Oh-Wook;Alex Waibel
    • The Journal of the Acoustical Society of Korea / v.21 no.1E / pp.3-11 / 2002
  • Broadcast news transcription is one of the hardest tasks in speech recognition because broadcast speech signals vary widely in speech quality, channel, and background conditions. We developed a Korean broadcast news speech recognizer. We used a morpheme-based dictionary and language model to reduce the out-of-vocabulary (OOV) rate. We concatenated original morpheme pairs of short length or high frequency in order to reduce insertion and deletion errors due to short morphemes. We used a lexicon with multiple pronunciations to reflect inter-morpheme pronunciation variations without severe modification of the search tree. By using the merged morphemes as recognition units, we achieved an OOV rate of 1.7% with a 64k vocabulary, comparable to that of European languages. We implemented a hidden Markov model-based recognizer with vocal tract length normalization and online speaker adaptation by maximum likelihood linear regression. Experimental results showed that the recognizer yielded a 21.8% morpheme error rate for anchor speech and 31.6% for mostly noisy reporter speech.
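
The two figures of merit quoted above can be illustrated with a short sketch. It assumes the conventional word-error-rate style definition, (substitutions + deletions + insertions) / reference length, applied to morpheme tokens; the abstract does not give the exact scoring procedure, and the function names and toy tokens below are hypothetical.

```python
def error_counts(ref, hyp):
    """Minimum-edit alignment of two token lists.

    Returns (total, substitutions, deletions, insertions).
    """
    n, m = len(ref), len(hyp)
    table = [[None] * (m + 1) for _ in range(n + 1)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        table[i][0] = (i, 0, i, 0)      # delete every reference token
    for j in range(1, m + 1):
        table[0][j] = (j, 0, 0, j)      # insert every hypothesis token
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, ins = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, d, ins)           # match, no cost
            else:
                diag = (t + 1, s + 1, d, ins)   # substitution
            t, s, d, ins = table[i - 1][j]
            delete = (t + 1, s, d + 1, ins)
            t, s, d, ins = table[i][j - 1]
            insert = (t + 1, s, d, ins + 1)
            table[i][j] = min(diag, delete, insert)
    return table[n][m]


def error_rate(ref, hyp):
    """(S + D + I) / number of reference tokens."""
    return error_counts(ref, hyp)[0] / len(ref)


def oov_rate(tokens, vocabulary):
    """Fraction of running tokens that are not in the vocabulary."""
    return sum(1 for tok in tokens if tok not in vocabulary) / len(tokens)


# Toy tokens standing in for morphemes (not data from the paper).
counts = error_counts(["a", "b", "c", "d"], ["a", "x", "c"])
rate = error_rate(["a", "b", "c", "d"], ["a", "x", "c"])
oov = oov_rate(["a", "b", "z", "a"], {"a", "b"})
```

Because the denominator is the number of reference tokens, merging short morphemes into longer units changes both the token count and the opportunities for insertion and deletion errors.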

Speech Intelligibility Analysis on the Vibration Sound of the Window Glass of a Conference Room (회의실 유리창 진동음의 명료도 분석)

  • Kim, Yoon-Ho;Kim, Hee-Dong;Kim, Seock-Hyun
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2006.11a / pp.150-155 / 2006
  • Speech intelligibility is investigated for a coupled system of a conference room and its window glass. Using an MLS (Maximum Length Sequence) signal as the sound source, the acceleration and velocity responses of the window glass are measured with an accelerometer and a laser Doppler vibrometer. The MTF (Modulation Transfer Function) is used to identify the speech transmission characteristics of the room and window system. The STI (Speech Transmission Index) is calculated from the MTF, and the speech intelligibility of the room and the window glass is estimated. The speech intelligibility obtained from the acceleration signal is compared with that obtained from the velocity signal, and the possibility of wiretapping is investigated. Finally, the intelligibility of the conversation sound is examined in a subjective test.
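
The step from MTF to STI can be sketched as below. Each modulation index m is converted to an apparent signal-to-noise ratio, 10·log10(m / (1 − m)), clipped to ±15 dB, and mapped to a transmission index between 0 and 1. This is a simplified sketch, not the procedure used in the paper: the equal band weighting is an assumption (the full standard method uses specific octave-band weights and correction terms), and the function names are hypothetical.

```python
import math


def transmission_index(m):
    """Map one modulation index m (0..1) to a transmission index (0..1)."""
    m = min(max(m, 1e-6), 1.0 - 1e-6)          # avoid log of 0 or division by 0
    snr_db = 10.0 * math.log10(m / (1.0 - m))  # apparent signal-to-noise ratio
    snr_db = max(-15.0, min(15.0, snr_db))     # clip to +/-15 dB
    return (snr_db + 15.0) / 30.0


def sti_from_mtf(mtf, band_weights=None):
    """Simplified STI from an MTF matrix.

    mtf: one list of modulation indices per octave band.
    band_weights: optional weights summing to 1; equal weights are an
    assumption made here for illustration.
    """
    band_mti = [sum(transmission_index(m) for m in band) / len(band)
                for band in mtf]
    if band_weights is None:
        band_weights = [1.0 / len(band_mti)] * len(band_mti)
    return sum(w * mti for w, mti in zip(band_weights, band_mti))


# Toy MTF with two bands and two modulation frequencies each.
sti = sti_from_mtf([[0.5, 0.5], [1.0, 0.0]])
```

A modulation index of 0.5 corresponds to an apparent signal-to-noise ratio of 0 dB and therefore to a transmission index of 0.5.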


Speech Intelligibility Analysis on the Vibration Sound of the Glass Window of a Conference Room (회의실 유리창 진동음의 음성 명료도 분석)

  • Kim, Hee-Dong;Kim, Yoon-Ho;Kim, Seock-Hyun
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.17 no.4 s.121 / pp.363-369 / 2007
  • The purpose of this study is to obtain acoustical information for preventing eavesdropping through a glass window. Speech intelligibility was investigated for the vibration sound detected from the glass window of a conference room. An objective test using the speech transmission index (STI) was performed to estimate the speech intelligibility quantitatively. The STI was determined from the modulation transfer function (MTF) of the room-glass window system. Using a Maximum Length Sequence (MLS) signal as the sound source, the impulse responses of the glass window and the MTF were determined from the signals of accelerometers and a laser Doppler vibrometer. Finally, the speech intelligibility of the interior sound and of the window vibration were compared under different sound pressure levels and amplifier gains to confirm the effect of the measurement conditions on speech intelligibility.

The Effect of the Disturbing Wave on the Speech Intelligibility of the Eavesdropping Sound of a Window Glass (교란파가 유리창 진동음의 음성명료도에 미치는 영향)

  • Kim, Seock-Hyun;Kim, Hee-Dong;Heo, Wook
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.17 no.9 / pp.888-894 / 2007
  • Speech sound can be detected by measuring the vibration of window glass. In this study, we investigate the effect of disturbing waves, produced by background noise and by window shaker excitation, on the speech intelligibility of the detected sound. Based on the Modulation Transfer Function (MTF), the speech intelligibility of the sound is estimated objectively with the Speech Transmission Index (STI). The variation of the speech intelligibility is examined as the level of the disturbing wave varies. The experimental results reveal how the STI is influenced by the level and frequency characteristics of the disturbing wave. Using a customized window shaker to generate the disturbing sound, we evaluate the efficiency and the frequency characteristics of the anti-eavesdropping system. The purpose of the study is to provide useful information for preventing eavesdropping through window glass.