• Title/Summary/Keyword: Korean speech

Channel Coder Implementation and Performance Analysis for Speech Coding: Considering bit Importance of Speech Information-part III (음성 부호기용 채널 부호화기의 구현 및 성능 분석)

  • 강법주;김선영;김상천;김영식
    • Journal of the Korean Institute of Telematics and Electronics / v.27 no.4 / pp.484-490 / 1990
  • In speech coding, information bits differ in their sensitivity to channel errors, so the channel coder paired with a speech coder should use a variable coding rate that reflects the importance of each speech information bit. To realize a 4 kbps channel coder for 12 kbps speech, this paper chooses the channel coding method by analyzing the hard-decision post-decoding error rate of RCPC (Rate-Compatible Punctured Convolutional) codes and the bit error sensitivity of 12 kbps speech. Under coherent QPSK and a Rayleigh fading channel, the performance analysis shows that 4-level unequal error protection yields a 10 dB gain in speech SEGSNR over the case of no channel coding at 7 dB channel SNR.
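
The rate-compatible puncturing idea behind this abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the rate-1/2 mother code, the puncturing period of 4, and the four masks are hypothetical and are not the codes analyzed in the paper. The sketch only shows how nested masks give four protection levels, with stronger levels transmitting every bit the weaker ones do.

```python
from fractions import Fraction

# Hypothetical puncturing masks for a rate-1/2 mother code, period 4.
# Each mask has one row per encoder output branch; 1 = transmit, 0 = puncture.
# Ordered from weakest protection (least sensitive bits) to strongest.
MASKS = [
    [[1, 1, 1, 1], [1, 0, 0, 0]],  # rate 4/5
    [[1, 1, 1, 1], [1, 0, 1, 0]],  # rate 2/3
    [[1, 1, 1, 1], [1, 1, 1, 0]],  # rate 4/7
    [[1, 1, 1, 1], [1, 1, 1, 1]],  # rate 1/2 (unpunctured mother code)
]

def code_rate(mask):
    """Code rate = input bits per period / transmitted bits per period."""
    period = len(mask[0])
    kept = sum(sum(row) for row in mask)
    return Fraction(period, kept)

def puncture(coded, mask):
    """Drop coded bits where the mask is 0.

    coded: list of tuples, one tuple of branch outputs per input bit.
    """
    period = len(mask[0])
    out = []
    for t, branches in enumerate(coded):
        for b, bit in enumerate(branches):
            if mask[b][t % period]:
                out.append(bit)
    return out

def is_rate_compatible(weaker, stronger):
    """True if the stronger code transmits every bit the weaker code does."""
    return all(s >= w
               for w_row, s_row in zip(weaker, stronger)
               for w, s in zip(w_row, s_row))

# Example: four input bits already encoded by the (assumed) mother code.
coded_example = [(1, 0), (0, 1), (1, 1), (0, 0)]
weakest_output = puncture(coded_example, MASKS[0])
```

In an unequal error protection scheme, bits in the most error-sensitive class would be sent with the last mask and the least sensitive bits with the first.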

An Experimental Study on Barging-In Effects for Speech Recognition Using Three Telephone Interface Boards

  • Park, Sung-Joon;Kim, Ho-Kyoung;Koo, Myoung-Wan
    • Speech Sciences / v.8 no.1 / pp.159-165 / 2001
  • In this paper, we conduct experiments on speech recognition systems with barging-in and non-barging-in utterances. Barging-in capability, which lets a user speak voice commands while a voice announcement is still playing, is an important element of practical speech recognition systems. It can be realized with echo cancellation techniques based on the LMS (least-mean-square) algorithm. We use three telephone interface boards with barging-in capability, made by Dialogic, Natural MicroSystems, and Korea Telecom, respectively. A speech database was collected with these three boards, and we run a comparative recognition experiment on it.
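
LMS-based echo cancellation, as mentioned in this abstract, can be sketched as an adaptive FIR filter that learns the echo path from the announcement signal and subtracts the predicted echo from the microphone signal. This is a minimal textbook sketch, not the implementation on any of the three boards; the tap count, step size `mu`, and the synthetic echo path are assumptions chosen for the demonstration.

```python
import random

def lms_echo_canceller(far_end, mic, num_taps=3, mu=0.1):
    """Return (residual, weights) after adapting an FIR echo estimate.

    far_end: samples of the outgoing announcement (reference signal).
    mic:     samples picked up on the line (echo plus any near-end speech).
    """
    weights = [0.0] * num_taps
    buf = [0.0] * num_taps          # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        echo_estimate = sum(w * b for w, b in zip(weights, buf))
        e = d - echo_estimate       # what remains after echo removal
        # LMS update: move weights along the instantaneous error gradient.
        weights = [w + mu * e * b for w, b in zip(weights, buf)]
        residual.append(e)
    return residual, weights

# Synthetic check: the microphone hears only the echo of the announcement.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
echo_path = [0.5, -0.3, 0.1]        # hypothetical echo impulse response
mic = [sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(far_end))]
residual, weights = lms_echo_canceller(far_end, mic)
```

With no near-end speech, the learned weights approach the assumed echo path and the residual decays toward zero; with a caller speaking over the announcement, the residual would carry that speech to the recognizer.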

Effects of Parents-centered Speech Intervention Program in Children with Cochlear Implant (부모중심 언어중재가 인공와우이식 아동의 수용언어능력에 미치는 영향)

  • Lee, Eun-Kyoung;Seok, Dong-Il
    • Speech Sciences / v.14 no.3 / pp.147-160 / 2007
  • This study evaluated the effect of a parent-centered speech intervention program on the overall speech and language performance of children with cochlear implants. Ten mother-child pairs were selected and assigned to two groups: an intervention group (G1) and a control group (G2). G1 included 5 children with cochlear implants and their mothers, who joined the parent-centered program. G2 consisted of 5 children with cochlear implants and their mothers, who did not participate in the program. Speech and language abilities were assessed with two instruments (the Preschool Language Scale and the Language Comprehension and Cognition Test). Pre- and post-treatment performance was analyzed by ANOVA. The results were as follows: in G1, speech and language performance differed significantly between pre- and post-treatment, whereas G2 (therapist-centered program) showed no significant difference. G1 also outperformed G2 in language comprehension. These results suggest that a parent-centered language intervention program is effective for the speech and language development of children with cochlear implants.

Experiment of VoIP Transmission with AMR Speech Codec in Wireless LAN (무선랜 환경에서 AMR 음성부호화기를 적용한 VoIP 전송 실험)

  • Shin, Hye-Jung;Bae, Keun-Sung
    • Speech Sciences / v.11 no.4 / pp.67-73 / 2004
  • Packet loss, jitter, and delay in the Internet are caused mainly by a shortage of network bandwidth. This is due to the queuing and routing processes in the intermediate nodes of the packet network. In the Internet, where bandwidth changes very rapidly over time depending on the number of users and the data traffic, controlling the peak transmission bit rate of a VoIP system according to the channel condition could help make good use of the available network bandwidth. Adapting the packet size to the channel condition can reduce packet loss and thereby improve speech quality. It has been shown in [1] that a VoIP system with an AMR speech codec provides better speech quality than VoIP systems with fixed-rate speech codecs. In this paper, using the adaptive codec mode assignment algorithm proposed in [1], we performed voice transmission experiments over a wireless LAN through the real Internet environment. The experimental results are analyzed and discussed along with our findings.
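
Rate adaptation with AMR can be pictured as choosing one of the eight narrowband codec modes to fit the bandwidth currently available. The sketch below is only an illustration of that idea and is not the mode assignment algorithm of [1], which the abstract does not describe. The selection rule (highest mode that fits a speech-payload budget) and the function names are assumptions; the mode bit rates and the 20 ms frame length are those of the AMR narrowband codec.

```python
# AMR narrowband codec modes in kbit/s.
AMR_NB_MODES_KBPS = (4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2)

def pick_amr_mode(available_kbps):
    """Pick the highest mode that fits the budget (hypothetical rule).

    available_kbps is assumed to be the budget for speech payload only;
    RTP/UDP/IP header overhead is ignored in this sketch.
    Falls back to the lowest mode when nothing fits.
    """
    fitting = [m for m in AMR_NB_MODES_KBPS if m <= available_kbps]
    return max(fitting) if fitting else AMR_NB_MODES_KBPS[0]

def speech_bits_per_frame(mode_kbps, frame_ms=20):
    """Speech bits carried by one 20 ms frame at the given mode."""
    return round(mode_kbps * frame_ms)
```

For example, a budget of 8 kbit/s selects the 7.95 kbit/s mode, and a frame at 12.2 kbit/s carries 244 speech bits, so lowering the mode directly shrinks each packet.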

Analysis of Mobile Application Trends for Speech and Language Therapy of Children with Disabilities in Korea (국내 장애 아동을 위한 언어치료용 모바일 어플리케이션 현황 분석)

  • Lee, Youngmee;Lee, Soobok;Sung, Minkyoung
    • Phonetics and Speech Sciences / v.7 no.3 / pp.153-163 / 2015
  • This study investigated trends in mobile applications developed to promote the speech and language skills of children with disabilities, and analyzed the functions and contents of these applications as tools for speech and language therapy. For this analysis, twenty of 71 applications were selected according to the exclusion criteria. The applications were classified into 8 types by how their contents are used, and their functions were analyzed with the revised mobile contents evaluation standard (ease of use, value of education, interest level, and interactivity). As a result, far more applications had been developed for augmentative and alternative communication than for any other type. In the overall evaluation, ease of use received the highest score and interest level the lowest. The results of this study suggest a way to evaluate applications for speech and language therapy and can contribute to developing the contents and functions of mobile applications that aim to help children with disabilities improve their speech and language skills.

Optimization of State-Based Real-Time Speech Endpoint Detection Algorithm (상태변수 기반의 실시간 음성검출 알고리즘의 최적화)

  • Kim, Su-Hwan;Lee, Young-Jae;Kim, Young-Il;Jeong, Sang-Bae
    • Phonetics and Speech Sciences / v.2 no.4 / pp.137-143 / 2010
  • In this paper, a speech endpoint detection algorithm is proposed. The proposed algorithm is a state transition-based speech detection algorithm. To reject short-duration acoustic pulses, which can be regarded as noise, it uses the duration information of all detected pulses. To optimize the parameters related to pulse lengths and the energy threshold for detecting speech intervals, an exhaustive search scheme is adopted, with the speech recognition rate as its performance index. Experimental results show that the proposed algorithm outperforms the baseline state-based endpoint detection algorithm. At 5 dB input SNR for the beamforming input, the word recognition accuracies of its outputs were 78.5% for human voice noise and 81.1% for music noise.
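
The combination of state transitions and duration-based pulse rejection described in this abstract can be sketched with a two-state detector over frame energies. This is a simplified illustration and not the authors' algorithm: the state names, the single energy threshold, and the `min_pulse_frames` parameter are assumptions standing in for the pulse-length and threshold parameters that the paper tunes by exhaustive search.

```python
def detect_speech_intervals(frame_energies, energy_threshold, min_pulse_frames):
    """Return (start, end) frame intervals judged to be speech.

    A pulse starts when frame energy reaches the threshold and ends when it
    drops below it. Pulses shorter than min_pulse_frames are treated as noise
    and discarded. The end index is exclusive.
    """
    intervals = []
    state = "SILENCE"
    start = 0
    for i, energy in enumerate(frame_energies):
        if state == "SILENCE" and energy >= energy_threshold:
            state, start = "PULSE", i
        elif state == "PULSE" and energy < energy_threshold:
            if i - start >= min_pulse_frames:
                intervals.append((start, i))
            state = "SILENCE"
    # Close a pulse that is still open when the input ends.
    if state == "PULSE" and len(frame_energies) - start >= min_pulse_frames:
        intervals.append((start, len(frame_energies)))
    return intervals

# A one-frame click at index 2 is rejected; the four-frame pulse is kept.
energies = [0, 0, 5, 0, 0, 6, 7, 8, 6, 0, 0]
speech = detect_speech_intervals(energies, energy_threshold=3, min_pulse_frames=3)
```

An exhaustive search in the spirit of the abstract would loop over candidate values of the threshold and the minimum pulse length, run recognition on the detected intervals, and keep the pair with the best recognition rate.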

Synthetic Speech Quality Improvement By Glottal parameter Interpolation - Preliminary study on open quotient interpolation in the speech corpus - (성대특성 보간에 의한 합성음의 음질향상 - 음성코퍼스 내 개구간 비 보간을 위한 기초연구 -)

  • Bae, Jae-Hyun;Oh, Yung-Hwa
    • Proceedings of the KSPS conference / 2005.11a / pp.63-66 / 2005
  • For large-corpus-based TTS, the consistency of the speech corpus is very important, because inconsistent speech quality within the corpus may cause distortion at the concatenation points. Because of this inconsistency, a large corpus must be tuned repeatedly. One of the reasons for the inconsistency is that the speech sentences in the corpus have different glottal characteristics. In this paper, we adjust the glottal characteristics of the speech in the corpus to prevent this distortion, and present the experimental results.

A Model of English Part-Of-Speech Determination for English-Korean Machine Translation (영한 기계번역에서의 영어 품사결정 모델)

  • Kim, Sung-Dong;Park, Sung-Hoon
    • Journal of Intelligence and Information Systems / v.15 no.3 / pp.53-65 / 2009
  • Part-of-speech determination is necessary for resolving part-of-speech ambiguity in English-Korean machine translation. This ambiguity causes high parsing complexity and makes accurate translation difficult. To solve the problem, the ambiguity must be resolved after lexical analysis and before parsing. This paper proposes the CatAmRes model, which resolves part-of-speech ambiguity, and compares its performance with that of other part-of-speech tagging methods. The CatAmRes model determines the part of speech using the probability distribution obtained from Bayesian network training and statistical information, both based on the Penn Treebank corpus. The model consists of Calculator and POSDeterminer. Calculator calculates the degree of appropriateness of each part of speech, and POSDeterminer determines the part of speech of the word based on the calculated values. In the experiment, we measure the performance using sentences from the WSJ, Brown, and IBM corpora.

Effects of Background Noises on Speech-related Variables of Adults who Stutter (배경소음상황에 따른 성인 말더듬화자의 발화 관련 변수 비교)

  • Park, Jin;Oh, Sunyoung;Jun, Je-Pyo;Kang, Jin Seok
    • Phonetics and Speech Sciences / v.7 no.1 / pp.27-37 / 2015
  • This study investigated the effects of background noise (i.e., white noise and multi-speaker conversational babble) on stuttering rate and other speech-related measures (i.e., articulation rate and speech effort). Nine Korean-speaking adults who stutter participated. Each participant was asked to read a series of passages under each of four experimental conditions: typical solo reading (TR), choral reading (CR), reading with white noise presented (WR), and reading with multi-speaker conversational babble presented (BR). Stuttering rate was computed as the percentage of syllables stuttered (%SS), and articulation rate was also assessed as another speech-related measure under each condition. To examine the amount of physical effort needed to read, speech effort was measured with the 9-point Speech Effort Self Rating Scale originally employed by Ingham et al. (2006). The results showed no significant differences among the passage reading conditions in stuttering rate, articulation rate, or speech effort. In conclusion, it can be argued that the two types of background noise (i.e., white noise and multi-speaker conversational babble) do not differ in the extent to which they enhance the fluency of adults who stutter. Self-ratings of speech effort may also be useful in measuring speech-related variables associated with the vocal changes induced under each fluency-enhancing condition.
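
The stuttering rate measure used in this abstract, percentage of syllables stuttered (%SS), is a simple ratio. The helper below is a minimal sketch; the function name and the example counts are hypothetical and are not data from the study.

```python
def percent_syllables_stuttered(stuttered_syllables, total_syllables):
    """%SS = stuttered syllables / total syllables spoken x 100."""
    if total_syllables <= 0:
        raise ValueError("total_syllables must be positive")
    if not 0 <= stuttered_syllables <= total_syllables:
        raise ValueError("stuttered_syllables must be between 0 and total_syllables")
    return 100.0 * stuttered_syllables / total_syllables

# Hypothetical passage: 12 stuttered syllables out of 300 read aloud.
example_ss = percent_syllables_stuttered(12, 300)
```

Computing this value per participant and per reading condition gives the quantities that the study compares across TR, CR, WR, and BR.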

End-to-end speech recognition models using limited training data (제한된 학습 데이터를 사용하는 End-to-End 음성 인식 모델)

  • Kim, June-Woo;Jung, Ho-Young
    • Phonetics and Speech Sciences / v.12 no.4 / pp.63-71 / 2020
  • Speech recognition is one of the areas being actively commercialized with deep learning and machine learning techniques. However, most speech recognition systems on the market are developed on data with limited speaker diversity and tend to perform well only on typical adult speakers, because most speech recognition models are trained on speech databases collected from adult males and females. This tends to cause problems in recognizing the speech of the elderly, children, and people who speak dialects. To solve these problems, it may be necessary to maintain a large database or to collect data for speaker adaptation. Instead, this paper proposes a new end-to-end speech recognition method consisting of an acoustic augmented recurrent encoder and a transformer decoder with linguistic prediction. The proposed method can deliver reliable acoustic and language model performance under limited data conditions. It was evaluated on recognizing the speech of Korean elderly speakers and children with a limited amount of training data and showed better performance than a conventional method.