• Title/Summary/Keyword: Speech signals

Search Result 499, Processing Time 0.023 seconds

Implementation of Real-Time Adaptive Noise Cancellation System Using DSP Processor (DSP 프로세서를 이용한 실시간 ANC 시스템 구현에 관한 연구)

  • Lee Young Il;Choi Hong Sub
    • MALSORI
    • /
    • no.52
    • /
    • pp.121-132
    • /
    • 2004
  • This paper is aiming at real-time implementation of adaptive noise cancellation system using DSP processor. ACHARF algorithm, which guarantees stability and fast convergence by adaptive compensator, is used on this DSP system. For the experiments, TLV320AIC23 stereo CODEC of TI Inc. is used with TMS320C6413 DSP processor. Signals of primary input and reference input are obtained by two microphones. The primary input is the voice plus noise signal and the reference input is white noise or real noise. The experimental results show that ANC system using DSP processor with ACHARF is verified to be an effective speech enhancement method for various speech processing units.

  • PDF

Modified Generic Mode Coding Scheme for Enhanced Sound Quality of G.718 SWB (G.718 초광대역 코덱의 음질 향상을 위한 개선된 Generic Mode Coding 방법)

  • Cho, Keun-Seok;Jeong, Sang-Bae
    • Phonetics and Speech Sciences
    • /
    • v.4 no.3
    • /
    • pp.119-125
    • /
    • 2012
  • This paper describes a new algorithm for encoding spectral shape and envelope in the generic mode of G.718 super-wide band (SWB). In the G.718 SWB coder, generic mode coding and sinusoidal enhancement are used for the quantization of modified discrete cosine transform (MDCT)-based parameters in the high frequency band. In the generic mode, the high frequency band is divided into sub-bands and for every sub-band the most similar match with the selected similarity criteria is searched from the coded and envelope normalized wideband content. In order to improve the quantization scheme in high frequency region of speech/audio signals, the modified generic mode by the improvement of the generic mode in G.718 SWB is proposed. In the proposed generic mode, perceptual vector quantization of spectral envelopes and the resolution increase for spectral copy are used. The performance of the proposed algorithm is evaluated in terms of objective quality. Experimental results show that the proposed algorithm increases the quality of sounds significantly.

Vowel Recognition Using the Fractal Dimensioin (프랙탈 차원을 이용한 모음인식)

  • 최철영
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1994.06c
    • /
    • pp.364-367
    • /
    • 1994
  • In this paper, we carried out some experiments on the Korean vowel recognition using the fractal dimension of the speech signals. We chose the Mincowski-Bouligand dimensioni as the fractal dimension, and computed it using the morphological covering method. For our experiments, we used both the fractal dimension and the LPC cepstrum which is conventionally known to be one of the best parameters for speech recognition, and examined the usefulness of the fractal dimension. From the vowel recognition experiments under various consonant contexts, we achieved the vowel recognition error rats of 5.6% and 3.2% for the case with only LPC cepstrum and that with both LPC cepstrum and the fractal dimension, respectively. The results indicate that the incorporation of the fractal dimension with LPC cepstrum gies more than 40% reduction in recognition errors, and indicates that the fractal dimension is a useful feature parameter for speech recognition.

  • PDF

Stereo Vision Neural Networks with Competition and Cooperation for Phoneme Recognition

  • Kim, Sung-Ill;Chung, Hyun-Yeol
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.1E
    • /
    • pp.3-10
    • /
    • 2003
  • This paper describes two kinds of neural networks for stereoscopic vision, which have been applied to an identification of human speech. In speech recognition based on the stereoscopic vision neural networks (SVNN), the similarities are first obtained by comparing input vocal signals with standard models. They are then given to a dynamic process in which both competitive and cooperative processes are conducted among neighboring similarities. Through the dynamic processes, only one winner neuron is finally detected. In a comparative study, with, the average phoneme recognition accuracy on the two-layered SVNN was 7.7% higher than the Hidden Markov Model (HMM) recognizer with the structure of a single mixture and three states, and the three-layered was 6.6% higher. Therefore, it was noticed that SVNN outperformed the existing HMM recognizer in phoneme recognition.

The Effects of a Massage and Oro-facial Exercise Program on Spastic Dysarthrics' Lip Muscle Function

  • Hwang, Young-Jin;Jeong, Ok-Ran;Yeom, Ho-Joon
    • Speech Sciences
    • /
    • v.11 no.1
    • /
    • pp.55-64
    • /
    • 2004
  • This study was to determine the effects of a massage and oro-facial exercise program on spastic dysarthric patients' lip muscle function using an electromyogram (EMG). Three subjects with Spastic Dysarthria participated in the study. The surface electrodes were positioned on the Levator Labii Superior Muscle (LLSM), Depressor Labii Inferior Muscle (DLIM), and Orbicularis Oris Muscle (OOM). To examine lip muscle function improvement, the EMG signals were analyzed in terms of RMS (Root Mean Square) values and Median Frequency. In addition, the diadochokinetic movements and the rate of sentence reading were measured. The results revealed that the RMS values were decreased and the Median Frequency moved to a high frequency area. Diadochokinesis and sentence reading rates were improved.

  • PDF

A Speech Representation and Recognition Method using Sign Patterns (부호패턴에 의한 음성표현과 인식방법)

  • Kim Young Hwa;Kim Un Il;Lee Hee Jeong;Park Byung Chul
    • The Journal of the Acoustical Society of Korea
    • /
    • v.8 no.5
    • /
    • pp.86-94
    • /
    • 1989
  • In this paper the method using a sign pattern( +,- ) of Mel-cepstrum coefficients as a new speech representation is proposed. Relatively stable patterns can be obtained for speech signals which has strong stationarity like vowels and nasals, and the phonemic difference according to the individuality of speakers can be absorbed without affecting characteristics of the phoneme. In this paper we show that the reduction of recognition procedure of phonemes and training procedure of phoneme models can be achieved through the representation of Korean phonemes using such a sign pattern.

  • PDF

Preemphasis of Speech Signals in the Estimation of Time Difference of Arrival with Two Microphones (마이크로폰 쌍을 이용한 음원의 도달시간차이 추정에서 음성신호의 프리엠퍼시스 영향 분석)

  • Kwon Hongseok;Kim Siho;Bae Keunsung
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • autumn
    • /
    • pp.35-38
    • /
    • 2004
  • In this paper, we investigate and analyze the problems encountered in frame-based estimation of TDOA(Time Difference of Arrival) using CPSP function. Spectral leakage occurring in framing of a speech signal by a rectangular window makes estimation of CPSP spectrum inaccurate. Framing with a Hamming window to reduce the spectral leakage effect distorts the signal due to the different weighting at temporally same sample, which make the TDOA estimation using CPSP function inaccurate. In this paper, we solve this problem by reducing the dynamic range of the spectrum of a speech signal with preemphasis. Experimental results confirm that the framing of pre-emphasized microphone output with a rectangular window shows higher success ratio of TDOA estimation than any other framing methods.

  • PDF

Performance Comparison of Speech Recognition Using Body-conducted Signals in Noisy Environment (소음 환경에서 body-conducted 신호를 이용한 음성인식 성능 비교)

  • Choi Dae-Lim;Lee Kwang-Hyun;Lee Yong-Ju;Kim Chong-Kyo
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • autumn
    • /
    • pp.57-60
    • /
    • 2004
  • 본 논문에서는 음성정보기술산업지원센터(SiTEC)에서 현재 배포중인 고소음 환경 음성 DB를 이용하여 air-conducted 음성과 body-conducted 음성의 인식 성능을 비교 실험하였다. 소음 환경에서 일반적인 마이크로폰으로부터 수집된 air-conducted 음성은 잡음의 영향을 받기 쉬우며 이는 인식률을 저하시킨다. 반면에 진동 픽업 마이크로폰에서 수집된 body-conducted 음성은 소음에 보다 강인한 특성을 보인다. 이러한 특성에 근거하여 소음 환경에서 일반 다이나믹 마이크로폰 음성에 음질 개선 방법과 채널 보상 방법을 적용한 인식 결과와 3종류의 진동 픽업 마이크로폰에서 수집된 음성과의 인식 성능을 비교 분석하여 body-conducted 음성 인식 시스템의 환용 가능성을 살펴보았다.

  • PDF

A New Variable Bit Rate Scheme for Waveform Interpolative Coders (파형보간 코더에서 파라미터간 거리차를 이용한 가변비트율 기법)

  • Yang, Hee-Sik;Jeong, Sang-Bae;Hahn, Min-Soo
    • MALSORI
    • /
    • no.65
    • /
    • pp.81-91
    • /
    • 2008
  • In this paper, we propose a new variable bit-rate speech coder based on the waveform interpolation concept. After the coder extracted all parameters, the amounts of the distortions between the current and the predicted parameters which are estimated by extrapolation using past two parameters are measured for all parameters. A parameter would not be transmitted unless the distortion exceeds the preset threshold. At the decoder side, the non-transmitted parameter is reconstructed by extrapolation with past two parameters used to synthesize signals. In this way, we can reduce 26% of the total bit rate while retaining the speech quality degradation below 0.1 PESQ score.

  • PDF

On the Frequency Domain Pitch Detection of Noise Corrupted Speech Signals -Minimizing the Effects of the F1 by the Spectral AMDF- (배경잡음하에서 주파수영역 피치검출에 관한 연구 -스펙트럼 AMDF에 의한 제 1포먼트 영향 제거법-)

  • Bae, Myung-Jin;Park, Chan-Sou;Ann, Sou-Guil
    • The Journal of the Acoustical Society of Korea
    • /
    • v.10 no.4
    • /
    • pp.12-18
    • /
    • 1991
  • Detecting the fundamental frequency(Fo) of the speech signal is a problem in many speech applications. A problem of the pitch detection method in the frequency domain is occurred by the first formant and the background noise. Thus, in this paper, we proposed a pitch detection algorithm in the frequency domain that reduces the effects of the first formant and the background noise by the spectral AMDF function. Several computer simulation results showed that the proposed algorithm was very effective for fundamental frequency detection.

  • PDF