• Title/Summary/Keyword: Speech Transition

Speech Transition Detection and approximate-synthesis Method for Speech Signal Compression and Recovery (음성신호 압축 및 복원을 위한 음성 천이구간 검출과 근사합성 방식)

  • Lee, Kwang-Seok;Kim, Bong-Gi;Kang, Seong-Soo;Kim, Hyun-Deok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.05a / pp.763-767 / 2008
  • In a speech coding system that uses voiced and unvoiced excitation sources, speech quality is distorted when voiced and unvoiced consonants coexist in a single frame. We therefore propose a method that searches for and extracts the TS (Transition Segment), including unvoiced consonants, so that voiced and unvoiced consonants do not coexist in a frame. This research presents a new method of TS approximate synthesis using Least Mean Square and frequency band division. As a result, the method obtains high-quality approximate-synthesis waveforms within the TS by using frequency information below 0.547 kHz and above 2.813 kHz. Importantly, even for the maximum error signal, the approximate-synthesis waveform within the TS has low distortion. The method can be applied to a new Voiced/Silence/TS speech coding scheme, as well as to speech analysis and speech synthesis.

  • PDF
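The band division mentioned in the abstract above can be illustrated with a small sketch. It covers only the idea of keeping frequency content below 0.547 kHz and above 2.813 kHz; the Least Mean Square fitting step is not shown, and the 10 kHz sampling rate, the frame length, the test tones, and the function names are all assumptions made for illustration, not details taken from the paper.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (fine for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def keep_outer_bands(x, fs, low_hz=547.0, high_hz=2813.0):
    """Zero every DFT bin whose frequency lies strictly between low_hz and high_hz."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        freq = min(k, n - k) * fs / n  # fold negative-frequency bins
        if low_hz < freq < high_hz:
            spectrum[k] = 0
    return idft(spectrum)


FS = 10_000  # assumed sampling rate in Hz (hypothetical)
N = 200      # assumed frame length, so each bin is 50 Hz wide


def tone(freq_hz):
    return [math.sin(2 * math.pi * freq_hz * t / FS) for t in range(N)]


low, mid, high = tone(300), tone(1500), tone(3500)
mixed = [a + b + c for a, b, c in zip(low, mid, high)]

filtered = keep_outer_bands(mixed, FS)
expected = [a + c for a, c in zip(low, high)]
max_err = max(abs(f - e) for f, e in zip(filtered, expected))
print(f"largest deviation from low + high tones: {max_err:.2e}")
```

In this toy frame the 1.5 kHz tone falls in the discarded middle band, so the filtered signal should match the sum of the 300 Hz and 3.5 kHz tones up to rounding error.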

Speech Signal Compression and Recovery Using Transition Detection and Approximate-Synthesis (천이구간 추출 및 근사합성에 의한 음성신호 압축과 복원)

  • Lee, Kwang-Seok;Lee, Byeong-Ro
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.2 / pp.413-418 / 2009
  • In a speech coding system that uses voiced and unvoiced excitation sources, speech quality is distorted when voiced and unvoiced consonants coexist in a single frame. We therefore propose a method that searches for and extracts the TS (Transition Segment), including unvoiced consonants, so that voiced and unvoiced consonants do not coexist in a frame. This research presents a new method of TS approximate synthesis using Least Mean Square and frequency band division. As a result, the method obtains high-quality approximate-synthesis waveforms within the TS by using frequency information below 0.547 kHz and above 2.813 kHz. Importantly, even for the maximum error signal, the approximate-synthesis waveform within the TS has low distortion. The method can be applied to a new Voiced/Silence/TS speech coding scheme, as well as to speech analysis and speech synthesis.

Robust Speech Decoding Using Channel-Adaptive Parameter Estimation.

  • Lee, Yun-Keun;Lee, Hwang-Soo
    • The Journal of the Acoustical Society of Korea / v.18 no.1E / pp.3-6 / 1999
  • In digital mobile communication systems, transmission errors seriously affect the quality of the output speech. Many error concealment techniques use the a posteriori probability, which provides information about any transmitted parameter. To estimate the transmitted parameters, they need knowledge of the channel transition probability as well as the first-order Markov transition probability of the codec parameters. However, in mobile communication applications, the channel transition probability varies with the nonstationary channel characteristics. A mismatch between the channel transition probability assumed in the estimator design and the actual channel transition probability degrades the performance of the estimator. In this paper, we propose a new parameter estimator that adapts to the channel characteristics using a short-time average of maximum a posteriori probabilities (MAPs). The proposed scheme, when applied to LSP parameter estimation, performed better than the conventional estimator, which does not adapt to the channel characteristics.

  • PDF
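As a rough illustration of the estimation step described in the abstract above, the following sketch combines a channel transition probability with a first-order Markov prior to pick the maximum a posteriori parameter index. The probability tables, the function name `map_estimate`, and the three-symbol alphabet are hypothetical; the paper's channel adaptation rule based on the short-time average of MAPs is not reproduced here.

```python
def map_estimate(received, previous, channel, markov):
    """Return (index, posterior) of the most probable transmitted index.

    channel[s][r] : assumed P(received = r | sent = s)
    markov[p][s]  : assumed P(sent = s | previous sent = p)
    The posterior is proportional to channel[s][received] * markov[previous][s].
    """
    scores = [channel[s][received] * markov[previous][s]
              for s in range(len(markov))]
    total = sum(scores)  # assumed non-zero for the tables used here
    posteriors = [score / total for score in scores]
    best = max(range(len(posteriors)), key=posteriors.__getitem__)
    return best, posteriors[best]


# Hypothetical tables: a fairly noisy channel and a slowly varying parameter.
channel = [[0.8, 0.1, 0.1],
           [0.1, 0.8, 0.1],
           [0.1, 0.1, 0.8]]
markov = [[0.90, 0.05, 0.05],
          [0.05, 0.90, 0.05],
          [0.05, 0.05, 0.90]]

best, posterior = map_estimate(received=1, previous=0,
                               channel=channel, markov=markov)
print(best, round(posterior, 4))
```

With these made-up tables the strong Markov prior outweighs the received index, so the estimator keeps the previous value. If the assumed `channel` table were badly mismatched to the real channel, the same computation would favor the wrong index, which is the problem the abstract addresses.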

Speech Recognition in Noisy environment using Transition Constrained HMM (천이 제한 HMM을 이용한 잡음 환경에서의 음성 인식)

  • Kim, Weon-Goo;Shin, Won-Ho;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea / v.15 no.2 / pp.85-89 / 1996
  • In this paper, a transition constrained Hidden Markov Model (HMM), in which transitions between states occur only within prescribed time slots, is proposed and its performance is evaluated in noisy environments. The transition constrained HMM can explicitly limit state durations and accurately describe the temporal structure of the speech signal simply and efficiently. It is not only superior to the conventional HMM but also requires much less computation time. To evaluate its performance, speaker-independent isolated word recognition experiments were conducted using a semi-continuous HMM with noisy speech at 20, 10, and 0 dB SNR. Experimental results show that the proposed method is robust to environmental noise. The 81.08% and 75.36% word recognition rates of the conventional HMM were increased by 7.31% and 10.35%, respectively, by using the transition constrained HMM when two kinds of noise were added at 10 dB SNR.

  • PDF
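One way to picture a transition constraint of the kind described in the abstract above is a left-to-right Viterbi search in which the move from one state to the next is allowed only inside a prescribed frame window. The sketch below is an assumption-laden toy: the `slots` representation, the uniform (omitted) transition probabilities, and the emission scores are invented for illustration and are not the paper's formulation.

```python
NEG_INF = float("-inf")


def constrained_viterbi(log_emit, slots):
    """Best left-to-right path under time-slot transition constraints.

    log_emit[t][s] : log-likelihood of frame t in state s
    slots[s]       : (first, last) frames at which the path may enter state s + 1
    Transition probabilities are treated as uniform (an assumption).
    The path starts in state 0 and ends in the last state.
    """
    n_frames, n_states = len(log_emit), len(log_emit[0])
    score = [[NEG_INF] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    score[0][0] = log_emit[0][0]

    for t in range(1, n_frames):
        for s in range(n_states):
            stay = score[t - 1][s]
            move = NEG_INF
            if s > 0:
                first, last = slots[s - 1]
                if first <= t <= last:  # transition allowed only in its slot
                    move = score[t - 1][s - 1]
            if move > stay:
                best, back[t][s] = move, s - 1
            else:
                best, back[t][s] = stay, s
            score[t][s] = best + log_emit[t][s]

    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, score[-1][-1]


# Frames 1..5 look like state 1, but the 0 -> 1 move is only allowed at frames 3..4.
log_emit = [[0.0, -5.0]] + [[-5.0, 0.0] for _ in range(5)]
path, total = constrained_viterbi(log_emit, slots=[(3, 4)])
print(path, total)
```

Because each state can only be entered inside its slot, the search never considers paths with implausibly short or long state durations, which also reduces the number of transitions that need to be evaluated per frame.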

On Detecting the Transition Regions of Phonemes by Using the Asymmetrical Rate of Speech Waveforms (음성파형의 비대칭율을 이용한 음소의 전이구간 검출)

  • Bae, Myung-Jin;Lee, Eul-jae;Ann, Sou-Guil
    • The Journal of the Acoustical Society of Korea / v.9 no.4 / pp.55-65 / 1990
  • To recognize continuous speech, it is necessary to segment the connected acoustic signal into phonetic units. In this paper, we propose a new asymmetrical rate as a parameter for detecting transition regions in continuous speech. The suggested rate represents the rate of change of the magnitude of the speech signal. By comparing this rate with the rate in the adjacent frame, each frame can be classified as steady state or transient state.

  • PDF

On Detecting the Transition Segments of Speech Signals by the Energy Approximation Degree of the Synchronized Pitch (피치 동기된 에너지 유사도에 의한 음성신호의 전이구간 검출)

  • 김종득;박형빈;김대호;배명진
    • Proceedings of the IEEK Conference / 1998.06a / pp.603-606 / 1998
  • In large-vocabulary and continuous speech recognition systems that use the phoneme as the recognition unit, segmentation processing is necessary. In this paper, a new method based on a normalized AMDF is proposed. The suggested parameter represents the degree of sharpness at the valley point. This method can detect the speech segments between the steady-state and transient regions of continuous speech without prior information about the speech signal.

  • PDF
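For readers unfamiliar with the AMDF (average magnitude difference function) named in the abstract above, a minimal sketch follows. It computes the AMDF of one frame, normalizes it by its maximum (this particular normalization is an assumption), and locates the deepest valley. The sharpness parameter proposed in the paper is not defined in the abstract, so it is not implemented here; the test signal and lag range are hypothetical.

```python
import math


def amdf(frame, max_lag):
    """Average magnitude difference function for lags 0..max_lag."""
    n = len(frame)
    return [sum(abs(frame[i] - frame[i + lag]) for i in range(n - lag)) / (n - lag)
            for lag in range(max_lag + 1)]


def normalized_amdf(frame, max_lag):
    """AMDF scaled to a peak of 1 (assumed normalization, for illustration)."""
    values = amdf(frame, max_lag)
    peak = max(values)
    return [v / peak for v in values] if peak > 0 else values


def deepest_valley(values, min_lag, max_lag):
    """Lag in [min_lag, max_lag] where the AMDF is smallest."""
    return min(range(min_lag, max_lag + 1), key=values.__getitem__)


# Hypothetical voiced-like frame: a sinusoid with a period of 40 samples.
frame = [math.sin(2 * math.pi * i / 40) for i in range(200)]
d = normalized_amdf(frame, max_lag=60)
pitch_lag = deepest_valley(d, min_lag=20, max_lag=60)
print(pitch_lag)
```

For a steady periodic frame the valley at the pitch lag is deep and narrow; in a transient frame it tends to be shallower, which is the intuition behind using valley shape as a segmentation cue.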

A Study on Estimation of Formants and Articulatory Motion Trajectories using RLSL Adaptive Linear Prediction Filter (RLSL 적응선형예측필터를 이용한 형성음 및 조음운동궤적 추정에 관한 연구)

  • 김동준;송영수
    • Journal of Biomedical Engineering Research / v.14 no.1 / pp.1-8 / 1993
  • In this study, formants and articulatory motion trajectories of Korean complex vowels are extracted using the RLSL adaptive linear prediction filter. This filter enables accurate spectrum extraction in the transition parts of the speech signal. The study shows that the RLSL algorithm is superior to the Levinson algorithm, especially in the transition parts of speech.

  • PDF

An Alteration Rule of Formant Transition for Improvement of Korean Demisyllable Based Synthesis by Rule (한국어 반음절단위 규칙합성의 개선을 위한 포만트천이의 변경규칙)

  • Lee, Ki-Young;Choi, Chang-Seok
    • The Journal of the Acoustical Society of Korea / v.15 no.4 / pp.98-104 / 1996
  • This paper proposes an alteration rule that compensates the formant transitions of several connected vowels, in order to improve unnatural synthesized continuous speech that is concatenated from demisyllables without coarticulated formant transitions, for use in demisyllable-based synthesis by rule. To fill in each formant transition part, a database of 42 stationary vowels, segmented from the stable part of each vowel, is appended to the database of Korean demisyllables, and the resonance circuit used in formant synthesis is employed to change the formant frequencies of the speech signals. To evaluate the speech synthesized by this rule, we applied the alteration rule to connected vowels in demisyllable-based synthesized speech and compared spectrograms and MOS test scores with those of the original speech and of demisyllable-based synthesized speech without the rule. The results show that the proposed rule can synthesize more natural speech.

  • PDF

A User-Friendly Remote Speech Input Unit in Spontaneous Speech Translation System

  • Lee, Kwang-Seok;Kim, Heung-Jun;Song, Jin-Kook;Choo, Yeon-Gyu
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.05a / pp.784-788 / 2008
  • In this research, we propose a remote speech input unit, a new method of user-friendly speech input for speech recognition systems. We focus user friendliness on hands-free operation and microphone independence in speech recognition applications. Our module adopts two algorithms: automatic speech detection and speech enhancement based on microphone-array beamforming. In the performance evaluation of speech detection, the within-200 ms accuracy with respect to manually detected positions is about 97 percent in noise environments with an SNR of 25 dB. The microphone-array speech enhancement using the delay-and-sum beamforming algorithm shows a maximum SNR gain of about 6 dB over a single microphone and an error reduction rate of more than 12% in speech recognition.

  • PDF
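The delay-and-sum beamforming named in the abstract above can be sketched in a few lines: each microphone channel is shifted by its known arrival delay so the speech aligns, and the aligned channels are averaged so that uncorrelated noise partially cancels. The integer sample delays, the four-microphone setup, the noise level, and the helper names below are assumptions for illustration; the paper's array geometry and delay estimation are not described in the abstract.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Advance each channel by its integer sample delay, then average."""
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [sum(ch[d + i] for ch, d in zip(channels, delays)) / len(channels)
            for i in range(usable)]


rng = random.Random(0)  # fixed seed so the toy example is repeatable
N = 2000
source = [math.sin(2 * math.pi * t / 50) for t in range(N)]
delays = [0, 3, 6, 9]  # assumed arrival delay (in samples) at each microphone

# Each microphone hears the delayed source plus independent Gaussian noise.
channels = [[(source[t - d] if t >= d else 0.0) + rng.gauss(0.0, 0.5)
             for t in range(N)]
            for d in delays]

enhanced = delay_and_sum(channels, delays)


def noise_power(signal):
    """Mean squared deviation from the clean source over the overlapping part."""
    return sum((s - r) ** 2 for s, r in zip(signal, source)) / len(signal)


single = noise_power(channels[0])  # microphone 0 has zero delay
array = noise_power(enhanced)
print(f"SNR gain: {10 * math.log10(single / array):.1f} dB")
```

With independent noise of equal power, averaging M aligned channels reduces the noise power by roughly a factor of M, so the gain in this toy setup depends only on the number of microphones assumed.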

Low Rate Speech Coding Using the Harmonic Coding Combined with CELP Coding (하모닉 코딩과 CELP방법을 이용한 저 전송률 음성 부호화 방법)

  • 김종학;이인성
    • The Journal of the Acoustical Society of Korea / v.19 no.3 / pp.26-34 / 2000
  • In this paper, we propose a 4 kbps speech coder that combines harmonic vector excitation coding with time-separated transition coding. Harmonic vector excitation coding uses harmonic excitation coding in voiced frames and vector excitation coding with an analysis-by-synthesis structure in unvoiced frames. However, this two-mode coding method is not effective for transition frames that mix voiced and unvoiced signals, so a method beyond unvoiced/voiced mode coding is needed. We therefore designed a time-separated transition coding method for transition frames, in which a voiced/unvoiced decision algorithm separates the unvoiced and voiced durations within a frame, and harmonic-harmonic excitation coding or vector-harmonic excitation coding is selected depending on the U/V decision of the previous frame. In the decoder, the voiced excitation signals are generated efficiently through the inverse FFT of the harmonic magnitudes, and the unvoiced excitation signals are produced by inverse vector quantization. The reconstructed speech signal is synthesized by the Overlap/Add method.

  • PDF
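The Overlap/Add synthesis mentioned at the end of the abstract above can be illustrated generically. The sketch below windows a signal into 50%-overlapping frames with a periodic Hann window and adds them back together; the frame length, hop size, and test signal are assumptions, and the coder's actual frame parameters are not given in the abstract.

```python
import math


def overlap_add(frames, hop):
    """Sum equal-length frames placed hop samples apart."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for index, frame in enumerate(frames):
        start = index * hop
        for offset, value in enumerate(frame):
            out[start + offset] += value
    return out


FRAME_LEN = 8       # assumed frame length in samples
HOP = FRAME_LEN // 2

# Periodic Hann window: copies shifted by half a frame sum to exactly one.
window = [0.5 - 0.5 * math.cos(2 * math.pi * n / FRAME_LEN)
          for n in range(FRAME_LEN)]

signal = [math.sin(0.3 * t) + 0.1 * t for t in range(32)]

# "Decoded" frames: here simply windowed slices of the original signal.
frames = [[signal[start + n] * window[n] for n in range(FRAME_LEN)]
          for start in range(0, len(signal) - FRAME_LEN + 1, HOP)]

rebuilt = overlap_add(frames, HOP)

# Samples covered by two overlapping frames are reconstructed exactly.
interior = range(HOP, len(signal) - HOP)
max_err = max(abs(rebuilt[i] - signal[i]) for i in interior)
print(f"largest interior reconstruction error: {max_err:.2e}")
```

Because neighboring windows cross-fade, each frame can be synthesized with a different method (for example harmonic or vector excitation) without an abrupt discontinuity at the frame boundary.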