• Title/Summary/Keyword: Speech problem

Search results: 473 items (processing time: 0.024 s)

A Study on Objective Speech Quality Measure under CDMA Telephone Networks Environment (CDMA 통신망에서의 객관적 음질 평가 척도에 관한 연구)

  • 김광수;김민정;석수영;정호열;정현열
    • Journal of the Institute of Convergence Signal Processing
    • /
    • Vol. 2, No. 4
    • /
    • pp.53-58
    • /
    • 2001
  • To develop an objective speech quality measure for CDMA telephone network environments, this paper first investigates recently developed measures. These measures, however, perform poorly on CDMA telephone networks. To solve this problem, a new objective speech quality measure that adopts a noise masking threshold is proposed and studied. For better performance, a scaled noise masking threshold is calculated for speech signals rather than for conventional tone signals. To verify the effectiveness of the proposed method, performance comparison experiments were carried out on CDMA telephone network speech databases; in the results, the proposed method showed improved performance compared with existing measures.


Korean Part-Of-Speech Tagging by using Head-Tail Tokenization (Head-Tail 토큰화 기법을 이용한 한국어 품사 태깅)

  • Suh, Hyun-Jae;Kim, Jung-Min;Kang, Seung-Shik
    • Smart Media Journal
    • /
    • Vol. 11, No. 5
    • /
    • pp.17-25
    • /
    • 2022
  • Korean part-of-speech taggers decompose a compound morpheme into unit morphemes and attach part-of-speech tags. A disadvantage is that the parts of speech of morphemes are classified in excessive detail, and complex word types are generated depending on the purpose of the tagger. When a part-of-speech tagger is used for keyword extraction in deep learning based language processing, compound particles and verb endings do not need to be decomposed. In this study, the part-of-speech tagging problem is simplified by a Head-Tail tokenization technique that produces only two types of tokens, a lexical morpheme part and a grammatical morpheme part, which solves the problem of excessively decomposed morphemes. Part-of-speech tagging was attempted with a statistical technique and a deep learning model on the Head-Tail tokenized corpus, and the accuracy of each model was evaluated. The statistical tagger was the TnT tagger and the deep learning tagger was a Bi-LSTM tagger; both were trained on the Head-Tail tokenized corpus to measure part-of-speech tagging accuracy. The Bi-LSTM tagger achieved a tagging accuracy of 99.52%, compared with 97.00% for the TnT tagger.

The Effect of Rehabilitation in Stroke Patients and Factors Influencing Outcome and Length of Hospitalization (뇌졸중의 재활치료에 대한 고찰)

  • Choi, Keum-Sook;Kim, Seon-Hee;Son, Jin-Chul;Choi, Soon-Chul;Park, Joo-Hyun
    • Journal of Korean Physical Therapy Science
    • /
    • Vol. 6, No. 1
    • /
    • pp.879-887
    • /
    • 1999
  • The purpose of this study was to assess the state of rehabilitation treatment for stroke, to compare treatment with and without Bobath therapy, to establish which factors influence the treatment effect and the hospitalization period, and to help guide treatment and the education of patients and their families. We retrospectively analyzed 87 stroke patients for age, diagnostic subtype, the period at the start of treatment, the duration of treatment, the duration of hospitalization, speech problems, co-morbid complications, and ambulatory function at discharge. These patients visited the Department of Rehabilitation Medicine, Holy Family Hospital, Catholic University of Korea, from June 1993 to June 1998. The patients were classified into two groups: one group (47 patients) was treated with Bobath therapy and the other (40 patients) was not. The results were as follows: 1) The period at the start of treatment was 15.3 days and the duration of treatment was 32.4 days. 2) The shorter the period at the start of treatment, the shorter the duration of admission. 3) There was no significant difference between the two groups in the duration of hospitalization; 72% of patients who received Bobath treatment were able to walk, compared with 25% of patients who did not. 4) There was no relation between speech problems and the duration of admission, but the group without speech problems showed better ambulation results than the group with speech problems. In conclusion, rehabilitation treatment of stroke patients should begin as early as possible in order to reduce the duration of hospitalization. Special (or professional) treatment with Bobath therapy showed greater functional recovery than treatment without it. Bobath therapy should therefore also be put into active practice.


On a Performance Evaluation of the Pitch Alteration Techniques of speech waveform coding (피치 변경법의 성능평가)

  • Kim, Hong;Bae, Seong-Gyun;Jo, Wang-Rae;Bae, Myung-Jin
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • Acoustical Society of Korea, Proceedings of the 11th Workshop on Speech Communication and Signal Processing, 1994 (SCAS Vol. 11, No. 1)
    • /
    • pp.103-106
    • /
    • 1994
  • Waveform coding is generally used to obtain high-quality synthesized speech. However, two problems, memory capacity and pitch alteration, must be solved before waveform coding can be applied to speech synthesis by rule. The former has been overcome by advances in integrated semiconductor technology, but the latter remains. In this paper, we compare the pitch alteration methods proposed in our laboratory to date. These methods do not change the properties of the vocal tract formants and only alter the pitch [...] halving method, 1.14% for the cepstrum analysis method, and 2.36% for the harmonics compensated with the phase method.


Improvements on Phrase Breaks Prediction Using CRF (Conditional Random Fields) (CRF를 이용한 운율경계추성 성능개선)

  • Kim Seung-Won;Lee Geun-Bae;Kim Byeong-Chang
    • MALSORI
    • /
    • No. 57
    • /
    • pp.139-152
    • /
    • 2006
  • In this paper, we present a phrase break prediction method using CRF (Conditional Random Fields), which performs well on classification problems. In our research, the phrase break prediction problem was mapped onto a classification problem. We trained the CRF on various linguistic features extracted from the POS (part-of-speech) tag, the lexicon, the length of the word, and the location of the word in the sentence. Combined linguistic features were used in the experiments, and we identified several linguistic features that perform well in phrase break prediction. The experimental results show that the proposed method improves on previous methods. Additionally, because the linguistic features are independent of each other in our research, the proposed method is more flexible than other methods.


On the Frequency Domain Pitch Detection of Noise Corrupted Speech Signals -Minimizing the Effects of the F1 by the Spectral AMDF- (배경잡음하에서 주파수영역 피치검출에 관한 연구 -스펙트럼 AMDF에 의한 제 1포먼트 영향 제거법-)

  • Bae, Myung-Jin;Park, Chan-Sou;Ann, Sou-Guil
    • The Journal of the Acoustical Society of Korea
    • /
    • Vol. 10, No. 4
    • /
    • pp.12-18
    • /
    • 1991
  • Detecting the fundamental frequency (F0) of the speech signal is a problem in many speech applications. In frequency-domain pitch detection, errors are caused by the first formant and the background noise. In this paper, we therefore propose a frequency-domain pitch detection algorithm that uses the spectral AMDF (average magnitude difference function) to reduce the effects of the first formant and the background noise. Several computer simulation results showed that the proposed algorithm was very effective for fundamental frequency detection.
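
As a rough illustration of the AMDF itself, the following Python sketch computes the average magnitude difference of a sequence at each lag and picks the lag with the deepest valley. It is not the paper's algorithm: the abstract does not say how the AMDF is applied to the spectrum, so the sketch runs on a plain time-domain sequence, and the names `amdf` and `estimate_period` and their parameters are hypothetical.

```python
import math


def amdf(x, max_lag):
    """Average magnitude difference of x with itself at lags 0..max_lag."""
    n = len(x)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    return [
        sum(abs(x[i] - x[i + k]) for i in range(n - k)) / (n - k)
        for k in range(max_lag + 1)
    ]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] where the AMDF is smallest."""
    d = amdf(x, max_lag)
    return min(range(min_lag, max_lag + 1), key=lambda k: d[k])


# Illustrative input (an assumption): a pure sinusoid with a period of 20 samples.
tone = [math.sin(2 * math.pi * i / 20) for i in range(200)]
period = estimate_period(tone, min_lag=5, max_lag=30)
```

A periodic sequence differs least from itself when shifted by one period, so the AMDF valley marks the period. The search range is kept below two periods here so that the first valley is the only candidate.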


Recovery of Lost Speech Segments Using Incremental Subspace Learning

  • Huang, Jianjun;Zhang, Xiongwei;Zhang, Yafei
    • ETRI Journal
    • /
    • Vol. 34, No. 4
    • /
    • pp.645-648
    • /
    • 2012
  • An incremental subspace learning scheme to recover lost speech segments online is presented. Our contributions in this work are twofold. First, the recovery problem is transformed into an interpolation problem of the time-varying gains via nonnegative matrix factorization. Second, incremental nonnegative matrix factorization is employed to allow online processing and to track the evolution of speech statistics. The effectiveness of the proposed scheme is confirmed by the experimental results.

A Speaker Recognition Based on Strange Attractor with Vector Average (벡터 평균값을 갖는 스트레인지 어트랙터 기반 화자인식)

  • Kim, Tae-Sik
    • Speech Sciences
    • /
    • Vol. 8, No. 3
    • /
    • pp.133-142
    • /
    • 2001
  • In speech processing, raw signals have usually been represented in a 2D format, and various algorithms use that format to solve their problems. However, such representations limit how well characteristics can be extracted from the signal, even when the algorithms are quite good, because not much information can be detected from a 2D signal. The strange attractor from chaos theory provides a 3D representation. In recognition problems, the signal construction method is very important because good features can be detected from well-shaped attractors. This paper discusses a new representation method that constructs the strange attractor in a different way: whereas the usual strange attractor uses time delay, the new method uses time delay and a vector average. This method provides good information for the speaker recognition problem.
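
The time-delay construction mentioned in the abstract can be sketched in Python as follows. This shows only the standard delay embedding that maps a 1D signal to points in 3D; the paper's vector-average step is not described in the abstract and is not reproduced here. The function name `delay_embed` and the parameters `dim` and `tau` are assumptions made for illustration.

```python
def delay_embed(signal, dim=3, tau=1):
    """Map a 1D signal to dim-dimensional points (s[i], s[i+tau], s[i+2*tau], ...)."""
    n_points = len(signal) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("signal is too short for this dim and tau")
    return [
        tuple(signal[i + k * tau] for k in range(dim))
        for i in range(n_points)
    ]


# Each sample is paired with the samples tau and 2*tau steps later.
points = delay_embed([0, 1, 2, 3, 4, 5], dim=3, tau=2)
```

Plotting such points for a speech frame traces out the attractor shape from which features would then be extracted.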


Research on Noise Reduction Algorithm Based on Combination of LMS Filter and Spectral Subtraction

  • Cao, Danyang;Chen, Zhixin;Gao, Xue
    • Journal of Information Processing Systems
    • /
    • Vol. 15, No. 4
    • /
    • pp.748-764
    • /
    • 2019
  • To deal with the filtering delay of the least mean square (LMS) adaptive filter noise reduction algorithm and the musical noise of the spectral subtraction algorithm in speech signal processing, we combine the two algorithms and propose a novel noise reduction method whose performance is on par with or better than state-of-the-art methods. We first use the LMS algorithm to reduce the average intensity of the noise, and then apply the spectral subtraction algorithm to reduce the remaining noise. Experiments show that applying spectral subtraction after the LMS adaptive filter overcomes the shortcomings of the two individual algorithms. The novel method also increases the signal-to-noise ratio of the original speech data and improves the final noise reduction performance.
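
A minimal Python sketch of the first stage only, an LMS adaptive noise canceller, is given below. It assumes a separate reference input that is correlated with the noise, which the abstract does not state, and it omits the spectral subtraction stage entirely. The function name `lms_cancel`, the tap count, the step size `mu`, and the synthetic test signals are all illustrative assumptions, not values from the paper.

```python
import math
import random


def lms_cancel(primary, reference, n_taps=4, mu=0.01):
    """Subtract an adaptively filtered copy of `reference` from `primary`."""
    w = [0.0] * n_taps      # adaptive filter weights
    buf = [0.0] * n_taps    # most recent reference samples, newest first
    out = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # current noise estimate
        e = d - y                                    # error = cleaned sample
        w = [wi + 2 * mu * e * bi for wi, bi in zip(w, buf)]  # LMS update
        out.append(e)
    return out


# Synthetic data (assumed for illustration): a sinusoid standing in for speech,
# plus scaled white noise that is also available as the reference input.
rng = random.Random(0)
speech = [0.5 * math.sin(2 * math.pi * i / 50) for i in range(3000)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
noisy = [s + 0.5 * n for s, n in zip(speech, noise)]
cleaned = lms_cancel(noisy, noise)
```

The filter needs some samples to adapt, so the output is still noisy at the start of the signal; this adaptation time is one plausible reading of the filtering delay the abstract refers to.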

DYNAMIC TIME WARPING METHOD AND ITS APPLICATION

  • Youn Sang-Youn;Kim Woo Youl
    • Journal of the Military Operations Research Society of Korea
    • /
    • Vol. 17, No. 1
    • /
    • pp.105-129
    • /
    • 1991
  • Dynamic time warping (DTW) is a sequence comparison method that is widely used in human speech recognition. The timing difference between two speech patterns to be compared is removed by warping their time axes so as to minimise the time-normalised distance between them. The minimum time-normalised distance can be found efficiently by dynamic programming. This paper describes the concept of the dynamic time warping method, its mathematical formulation, and an application.
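
The dynamic programming idea can be sketched in a few lines of Python. This is a generic textbook form rather than the paper's formulation: the absolute difference as local distance, the three-way step pattern, and normalisation by the sum of the two sequence lengths are assumptions, and `dtw_distance` is a hypothetical name.

```python
def dtw_distance(a, b):
    """Time-normalised DTW distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = smallest accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])           # local distance
            cost[i][j] = d + min(
                cost[i - 1][j],                     # a advances, b repeats
                cost[i][j - 1],                     # b advances, a repeats
                cost[i - 1][j - 1],                 # both advance
            )
    return cost[n][m] / (n + m)                     # assumed normalisation


# The same pattern spoken at half speed still matches exactly after warping.
same_shape = dtw_distance([1, 2, 3], [1, 1, 2, 2, 3, 3])
```

Because each table cell depends only on three neighbours, the full alignment costs on the order of `len(a) * len(b)` operations instead of enumerating every possible warping path.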
