• Title/Summary/Keyword: continuous speech

Literature Analysis on PROMPT Treatment (1984-2020) (프롬프트(PROMPT) 치료기법에 관한 문헌 분석(1984-2020년))

  • Kim, Wha-soo;Lee, Rio;Lee, Ji-woo
    • Journal of Digital Convergence / v.19 no.2 / pp.447-456 / 2021
  • This study analyzed 28 domestic and international studies on Prompts for Restructuring Oral Muscular Phonetic Targets (PROMPT) treatment techniques, published from 1984 to 2020, to prepare basic data for developing PROMPT intervention programs and examination tools. According to the analysis, research has been conducted continuously since 1984, when the first PROMPT study appeared. Intervention studies were the most common research method (16 studies), speech disorders were the most frequently studied condition, and the most frequent target group was young children aged 3 to 5 years. Treatment programs of 16 sessions were the most frequent, and the activities were based on the Motor Speech Hierarchy (MSH), except for subjects with non-verbal autism spectrum disorder. Among the dependent variables, 'speech production' was the most common, followed by 'speech motor control', 'articulation', and 'speech intelligibility'. Taken together, these studies suggest that PROMPT, which is directly useful for practicing spoken word production, is being used effectively outside Korea and that a PROMPT program applicable in Korea needs to be developed.

Post-operative Continuous Positive Airway Pressure (CPAP) Therapy in Velopharyngeal Insufficiency Patient (지속성 양압 치료법을 이용한 구개인두기능부전증의 치료)

  • Kim, Kyu Nam;Koh, Kyung Suck;Jung, Seung Eun;Ha, Seung Hee;Park, Mi Kyung
    • Archives of Craniofacial Surgery / v.11 no.2 / pp.73-76 / 2010
  • Purpose: There are several surgical methods for correcting velopharyngeal insufficiency (VPI), but in some cases complete recovery of velopharyngeal function cannot be achieved. This paper introduces a new therapy that treats hypernasality without further surgery by using continuous positive airway pressure (CPAP). Methods: CPAP therapy was applied to seven VPI patients for eight weeks each, between April 2007 and September 2009. All patients had undergone palatoplasty for cleft palate, and six had undergone palatal lengthening for VPI before CPAP therapy. A speech pathologist performed an auditory perceptual evaluation to assess the improvement in hypernasality after the 8-week CPAP therapy. Results: Six patients showed an improvement in hypernasality after CPAP therapy according to the auditory perceptual evaluation. One patient with severe hypernasality responded during the early part of therapy, but the hypernasality had not improved by the end of therapy. Conclusion: CPAP therapy might be effective in reducing hypernasality in patients with VPI by providing resistance training that strengthens the velopharyngeal closure muscles. In particular, CPAP therapy could be more effective for patients who show mild to moderate hypernasality after surgery.

The continuous or categorical effects for HH vs. HL and HH vs. LH in lexical pitch accent contrasts of Korean

  • Kim, Jungsun
    • Phonetics and Speech Sciences / v.6 no.4 / pp.53-65 / 2014
  • The current research examines whether pitch contour shapes in North Kyungsang pitch accent contrasts provide a phonetic dimension for phonological discreteness in a mimicry task. Two resynthesized pitch accent continua were created, one for HH vs. HL and one for HH vs. LH. To confirm a phonetic dimension that accounts for pitch accent categories in North Kyungsang Korean, the mimicries of speakers of two dialects (North Kyungsang and South Cholla) were compared. One finding was that, for North Kyungsang speakers, the range of mean f0 peak times was a phonetic dimension that shifted continuously within a stimulus continuum for both HH vs. HL and HH vs. LH. For South Cholla speakers, on the other hand, there were no apparent shifts around categorical boundaries for either HH vs. HL or HH vs. LH. Individual mimicries varied considerably in f0 peak timing. For HH vs. LH, three North Kyungsang speakers showed a discrete pattern reflecting a shift in phonological categories, but for HH vs. HL no such categorical shift appeared, though there were statistically significant differences for two speakers. Interestingly, one of the North Kyungsang speakers showed a continuous phonetic dimension for both HH vs. HL and HH vs. LH. Lastly, f0 valley timing did not exhibit a discrete or gradient phonetic dimension for speakers of either dialect. These results suggest that a tonal target such as the high tone in North Kyungsang pitch accent categories within autosegmental-metrical (AM) theory may be realized within individual cognitive systems that represent the interaction of perception and production.

Noise Reduction using Spectral Subtraction in the Discrete Wavelet Transform Domain (이산 웨이브렛 변환영역에서의 스펙트럼 차감법을 이용한 잡음제거)

  • 김현기;이상운;홍재근
    • Journal of Korea Multimedia Society / v.4 no.4 / pp.306-315 / 2001
  • When reducing noise in noisy speech for speech recognition in noisy environments, the conventional spectral subtraction method has the disadvantage that noise and speech are difficult to distinguish and the characteristics of the noise cannot be estimated accurately. Noise reduction in the wavelet transform domain, in turn, has the disadvantage that signal loss occurs in the high-frequency region. To compensate for these disadvantages, this paper proposes a spectral subtraction method in the continuous wavelet transform domain, in which speech and non-speech intervals are distinguished by the standard deviation of the wavelet coefficients and the signal is divided into three scales. Through endpoint detection and band division, the proposed method accurately extracts the noise characteristics needed to apply spectral subtraction. In experiments, the proposed method outperforms noise reduction using conventional spectral subtraction and using the wavelet transform in terms of signal-to-noise ratio and Itakura-Saito distance.
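
The idea in the abstract above (flag non-speech intervals by the standard deviation of the coefficients, estimate the noise there, then subtract it) can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it works on a plain list of coefficients for a single band, and the function names, the frame length, and the threshold are all hypothetical.

```python
from statistics import pstdev


def split_frames(coeffs, frame_len):
    """Cut a coefficient sequence into non-overlapping frames.

    A trailing partial frame is dropped.
    """
    return [coeffs[i:i + frame_len]
            for i in range(0, len(coeffs) - frame_len + 1, frame_len)]


def estimate_noise_magnitude(frames, std_threshold):
    """Mean |coefficient| over frames treated as non-speech.

    Assumption: a frame is non-speech when its standard deviation is
    below std_threshold (a hypothetical tuning value).
    """
    quiet = [f for f in frames if pstdev(f) < std_threshold]
    if not quiet:
        return 0.0
    total = sum(abs(c) for f in quiet for c in f)
    return total / sum(len(f) for f in quiet)


def subtract_noise(frames, noise_mag, floor=0.0):
    """Subtract the noise magnitude from every coefficient magnitude.

    The sign of each coefficient is kept, and the result never drops
    below floor * |coefficient|.
    """
    cleaned = []
    for frame in frames:
        out = []
        for c in frame:
            mag = max(abs(c) - noise_mag, floor * abs(c))
            out.append(mag if c >= 0 else -mag)
        cleaned.append(out)
    return cleaned
```

In a multi-band setting, the same three steps would presumably be run once per scale, each with its own noise estimate.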

Vector Quantizer Based Speaker Normalization for Continuous Speech Recognition (연속음성 인식기를 위한 벡터양자화기 기반의 화자정규화)

  • Shin Ok-keun
    • The Journal of the Acoustical Society of Korea / v.23 no.8 / pp.583-589 / 2004
  • This paper proposes a vector quantizer based speaker normalization method for continuous speech recognition (CSR) systems that makes no use of acoustic information. The proposed method, which improves on a previously reported speaker normalization scheme for a simple digit recognizer, builds a canonical codebook by training the codebook iteratively while increasing its size after each iteration from a relatively small initial size. Once the codebook is established, the warp factors of the speakers are estimated by exhaustively comparing warped versions of each speaker's utterances with the codebook. Two sets of phones are used to estimate the warp factors: one consists of vowels only, and the other of all the phonemes. A piecewise linear warping function corresponding to the estimated warp factor is adopted to warp the power spectrum of the utterance. The warped feature vectors are then extracted and used to train and test the speech recognizer. The effectiveness of the proposed method is investigated in a set of recognition experiments using the TIMIT corpus and the HTK speech recognition toolkit. The experimental results show an improvement in recognition rate comparable to that of the formant-based warping method.
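
As a rough illustration of the two steps named in the abstract above, the sketch below shows a common form of piecewise linear frequency warping and a grid search that picks the warp factor with the lowest vector-quantization distortion against a codebook. The breakpoint ratio, the candidate grid, the distortion measure, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
def piecewise_linear_warp(freq, alpha, f_max, breakpoint_ratio=0.8):
    """Warp a frequency with a two-segment linear function.

    Below the breakpoint the frequency is scaled by alpha; above it a
    second segment joins (f0, alpha * f0) to (f_max, f_max) so that the
    band edge stays fixed. breakpoint_ratio is an assumed value.
    """
    f0 = breakpoint_ratio * f_max
    if alpha * f0 >= f_max:
        raise ValueError("alpha too large for this breakpoint")
    if freq <= f0:
        return alpha * freq
    slope = (f_max - alpha * f0) / (f_max - f0)
    return alpha * f0 + slope * (freq - f0)


def vq_distortion(vectors, codebook):
    """Mean squared distance from each vector to its nearest codeword."""
    total = 0.0
    for v in vectors:
        total += min(sum((a - b) ** 2 for a, b in zip(v, cw))
                     for cw in codebook)
    return total / len(vectors)


def estimate_warp_factor(extract_features, codebook, candidates):
    """Exhaustive search: return the candidate with the lowest distortion.

    extract_features(alpha) is a hypothetical callback that returns the
    speaker's feature vectors after warping with factor alpha.
    """
    return min(candidates,
               key=lambda a: vq_distortion(extract_features(a), codebook))
```

For example, `estimate_warp_factor(extract, codebook, [0.9, 1.0, 1.1])` would try three warp factors and keep the one whose warped features sit closest to the canonical codebook.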

Effects of Continuous Speech Therapy in Patients with Non-fluent Aphasia Using kMIT (kMIT를 이용한 비유창성 실어증 환자 음성 언어의 치료효과 연구)

  • Lee Ju Hee;Ko Myun Hwan;Kim Hyun Gi;Hong Ki Hwan
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.16 no.2 / pp.158-164 / 2005
  • Melodic intonation therapy (MIT) aims to improve the linguistic aspects of verbal utterances in aphasic patients by utilizing the intact right brain. Aphasic patients with good comprehension, poor fluency, and little available speech are thought to be ideal candidates. The purpose of this study was to investigate the effects of Korean melodic intonation therapy (kMIT) in patients with non-fluent aphasia. Five male patients with non-fluent aphasia participated in the study. Their mean age was 49.9 years. Each therapy session lasted 45-50 minutes and was held once a week for six months. The Aphasic Screen Test (RISS) was used to assess language parameters such as auditory comprehension, oral expression, reading, writing, and calculation ability before and after kMIT. Mean length of utterance, verbal intelligibility, and articulation disorder were also assessed. The Computerized Speech Lab was used to assess the acoustic characteristics of the patients before and after kMIT. The results are as follows: 1) Auditory comprehension, oral expression, reading, writing, and calculation ability of the subjects increased after kMIT, but only oral expression showed a significant difference (p<0.05). 2) The mean length of utterance of the five patients generally increased after kMIT. 3) After kMIT, verbal intelligibility increased and showed a significant difference (p<0.05). 4) The misarticulation rate generally decreased after kMIT. 5) The voice onset time of the alveolar lenis /t/ and the velar lenis /k/ gradually decreased after kMIT. 6) However, intonation patterns in yes/no questions gradually increased after kMIT.

Packet Loss Concealment Algorithm Using Pitch Harmonic Motion Estimation and Adaptive Signal Scale Estimation (피치 하모닉 움직임 예측과 적응적 신호 크기 예측을 이용한 패킷 손실 은닉 알고리즘)

  • Kim, Tae-Ha;Lee, In-Sung
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.14 no.4 / pp.247-256 / 2021
  • In this paper, we propose a packet loss concealment (PLC) algorithm using pitch harmonic motion prediction and adaptive signal amplitude prediction. The spectral motion prediction method divides the spectral motion of the previous usable frame into predetermined sub-bands to predict and restore the motion of the lost signal. In the proposed algorithm, the speech signal is classified into voiced and unvoiced sounds. A voiced sound is further divided into pitch harmonics using the pitch frequency in order to predict and restore the pitch harmonic motion of the lost frame; for an unvoiced sound, the lost frame is restored using the spectral motion prediction method. For the case in which speech frames are lost consecutively, a method of adjusting the gain using a least mean square (LMS) predictor is proposed. The performance of the proposed algorithm was evaluated with the objective evaluation method PESQ (Perceptual Evaluation of Speech Quality) and showed a MOS improvement of 0.1 over the conventional method.
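
The LMS predictor mentioned in the abstract above can be illustrated with a textbook least mean square update that predicts the next frame gain from the gains of the preceding frames. This is only a sketch under assumed settings (predictor order, step size, and the use of raw per-frame gains as input are hypothetical); the abstract does not describe how the paper applies the predictor.

```python
def lms_predict_next(history, order=2, mu=0.1):
    """Predict the next value of a sequence with an LMS linear predictor.

    The weights start at zero and are adapted once per sample:
        prediction = w . x
        error      = target - prediction
        w          = w + mu * error * x
    where x holds the `order` most recent values, newest first.
    """
    if len(history) < order:
        raise ValueError("history must contain at least `order` values")
    weights = [0.0] * order
    for n in range(order, len(history)):
        x = history[n - order:n][::-1]
        prediction = sum(w * xi for w, xi in zip(weights, x))
        error = history[n] - prediction
        weights = [w + mu * error * xi for w, xi in zip(weights, x)]
    # Apply the adapted weights to the latest samples.
    x = history[-order:][::-1]
    return sum(w * xi for w, xi in zip(weights, x))
```

When several frames in a row are lost, each predicted gain could be appended to the history and the function called again, which is one simple way to extend the prediction over a burst of losses.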

1-Pass Semi-Dynamic Network Decoding Using a Subnetwork-Based Representation for Large Vocabulary Continuous Speech Recognition (대어휘 연속음성인식을 위한 서브네트워크 기반의 1-패스 세미다이나믹 네트워크 디코딩)

  • Chung Minhwa;Ahn Dong-Hoon
    • MALSORI / no.50 / pp.51-69 / 2004
  • In this paper, we present a one-pass semi-dynamic network decoding framework that inherits both the fast decoding speed of static network decoders and the memory efficiency of dynamic network decoders. Our method is based on a novel language model network representation that is essentially a finite state machine (FSM). The static network derived from the language model network [1][2] is partitioned into smaller subnetworks that are static by nature or self-structured. The whole network is managed dynamically so that the subnetworks required for decoding are cached in memory. The network is near-minimized by applying the tail-sharing algorithm. Our decoder is evaluated on a 25k-word Korean broadcast news transcription task. The search network itself is reduced by 73.4% by the tail-sharing algorithm. Compared with the equivalent static network decoder, the semi-dynamic network decoder increases decoding time by at most 6%, while it can be adapted flexibly to various memory configurations, with a minimum memory usage of 37.6% of the complete network size.
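
The abstract above says that only the subnetworks required for decoding are cached in memory. One simple way to picture this is a bounded cache that loads a subnetwork on first use and evicts the least recently used one when it is full. The class below is a hypothetical sketch: the eviction policy, the loader callback, and the names are assumptions, not details taken from the paper.

```python
from collections import OrderedDict


class SubnetworkCache:
    """Keep at most `capacity` subnetworks in memory, evicting by LRU."""

    def __init__(self, loader, capacity):
        self._loader = loader        # hypothetical: builds a subnetwork by id
        self._capacity = capacity
        self._cache = OrderedDict()  # ordered from least to most recently used
        self.loads = 0               # number of times the loader was called

    def get(self, subnet_id):
        if subnet_id in self._cache:
            # Cache hit: mark as most recently used.
            self._cache.move_to_end(subnet_id)
            return self._cache[subnet_id]
        # Cache miss: load the subnetwork, then evict the oldest if over capacity.
        subnet = self._loader(subnet_id)
        self.loads += 1
        self._cache[subnet_id] = subnet
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)
        return subnet
```

A smaller capacity lowers memory use at the cost of more reloads, which mirrors the trade-off between memory footprint and decoding time reported in the abstract.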

A Study on Variation and Determination of Gaussian function Using SNR Criteria Function for Robust Speech Recognition (잡음에 강한 음성 인식에서 SNR 기준 함수를 사용한 가우시안 함수 변형 및 결정에 관한 연구)

  • 전선도;강철호
    • The Journal of the Acoustical Society of Korea / v.18 no.7 / pp.112-117 / 1999
  • Spectral subtraction for noise-robust speech recognition often causes loss of the speech signal. In this study, we propose a method in which the Gaussian functions of a semi-continuous hidden Markov model (HMM) are varied and determined on the basis of an SNR criterion function, where SNR is the signal-to-noise ratio between the estimated noise and the subtracted signal in each frame. To demonstrate the effectiveness of this method, we show through signal waveforms that the estimation error is related to the magnitude of the estimated noise. For this reason, the Gaussian function is varied and determined by the SNR. In computer simulations of recognition in the noise of a car driving at over 80 km/h, the proposed SNR-based Gaussian decision method achieves a higher recognition rate than both the frequency-subtracted and the non-subtracted cases.
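
The per-frame SNR used as the criterion in the abstract above can be computed along the following lines. The sketch covers only the ratio between the subtracted signal and the estimated noise in one frame; how that value then varies the Gaussian functions is not described in the abstract and is not attempted here. The function name and the handling of zero-power frames are assumptions.

```python
import math


def frame_snr_db(subtracted, noise_estimate):
    """SNR in dB between the subtracted signal and the estimated noise of one frame.

    Returns -inf when the subtracted signal has no energy, and +inf when
    only the noise estimate has no energy.
    """
    signal_power = sum(s * s for s in subtracted)
    noise_power = sum(n * n for n in noise_estimate)
    if signal_power == 0.0:
        return float("-inf")
    if noise_power == 0.0:
        return float("inf")
    return 10.0 * math.log10(signal_power / noise_power)
```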

Categorization and production in lexical pitch accent contrasts of North Kyungsang Korean

  • Kim, Jungsun
    • Phonetics and Speech Sciences / v.10 no.1 / pp.1-7 / 2018
  • Categorical production in language processing helps speakers to produce phonemic contrasts. This categorization and production is utilized for the production-based and imitation-based approach in the present study. Contrastive signals in speakers' speech reflect the shapes of boundaries with categorical characteristics. Signals that provide information about lexical pitch accent contrasts can introduce categorical distinctions for productive and cognitive selection. This experiment was conducted with nine North Kyungsang speakers for a production task and nine North Kyungsang speakers for an imitation task. The first finding of the present study is the rigidity of categorical production, which controls the boundaries of lexical pitch accent contrasts. The categorization of North Kyungsang speakers' production allows them to classify minimal pitch accent contrasts. The categorical production in imitation appeared in two clusters, representing two meaningful contrasts. The second finding of the present study is that there are individual differences in speakers' production and imitation responses. The distinctive performances of individual speakers showed a variety of curves. For the HL-LH patterns, the categorical production tended to be highly distinctive as compared to the other pitch accent patterns (HH-HL and HH-LH), showing that there are more continuous curves than categorical curves. Finally, the present study shows that, for North Kyungsang speakers, imitative production is the core type of categorical production for determining the existence of the lexical pitch accent system. However, several questions remain for defining that categorical production, which leads to ideas for future research.