• Title/Summary/Keyword: Speech signals

State Encoding of Hidden Markov Linear Prediction Models

  • Krishnamurthy, Vikram;Poor, H.Vincent
    • Journal of Communications and Networks / v.1 no.3 / pp.153-157 / 1999
  • In this paper, we derive finite-dimensional non-linear filters for optimally reconstructing speech signals in Switched Prediction vocoders, Code Excited Linear Prediction (CELP) and Differential Pulse Code Modulation (DPCM). Our filter is an extension of the Hidden Markov filter.

Pitch Extraction of Speech Signals by the Harmonics analysis (고조파 분석에 의한 음성신호의 피치 검출)

  • Kim, Kee-Hee;Choi, Jung-Ah;Bae, Myung-Jin;Ann, Sou-Guil
    • Proceedings of the KIEE Conference / 1987.07b / pp.1610-1614 / 1987
  • The harmonics of the fundamental frequency in a speech signal form a fine line spectrum in the frequency domain. In this paper, we propose a new algorithm that detects the pitch interval in voiced sound, based on the fact that the number of harmonics can represent the pitch period in the time domain.
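
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea: harmonic peaks in the magnitude spectrum are spaced by the fundamental frequency, so counting them gives the pitch period. The function names, the relative peak threshold and the direct DFT are all assumptions made for illustration, not the authors' method.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitude of the DFT for bins 0 .. N/2 (direct O(N^2) evaluation)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def pitch_from_harmonics(frame, fs, rel_threshold=0.1):
    """Estimate (F0 in Hz, pitch period in samples) from harmonic peaks.

    rel_threshold is a hypothetical tuning parameter: peaks below this
    fraction of the largest spectral magnitude are ignored.
    """
    mag = magnitude_spectrum(frame)
    floor = rel_threshold * max(mag)
    # A harmonic is taken to be a local maximum above the floor.
    peaks = [
        k for k in range(1, len(mag) - 1)
        if mag[k] > floor and mag[k] > mag[k - 1] and mag[k] >= mag[k + 1]
    ]
    if len(peaks) < 2:
        return None  # not enough harmonics to measure a spacing
    # Average spacing between the first and last harmonic, in bins.
    spacing_bins = (peaks[-1] - peaks[0]) / (len(peaks) - 1)
    f0 = spacing_bins * fs / len(frame)
    return f0, fs / f0
```

For a synthetic voiced frame sampled at 8 kHz with harmonics at multiples of 200 Hz, the peak spacing corresponds to a pitch period of 40 samples. Real speech would need windowing and a more careful peak picker.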

DSP Implementation of Speech Enhancement System Using Microphone Array with Adaptive Post-processing (적응 후처리 과정을 갖는 마이크로폰 배열을 이용한 잡음제거기의 DSP 구현)

  • 권홍석;김시호;배건성
    • Proceedings of the IEEK Conference / 2002.06d / pp.413-416 / 2002
  • In this paper, a speech enhancement system using a microphone array with adaptive post-processing is implemented in real time on a TMS320C6201 DSP. It consists of a delay-and-sum beamformer and adaptive post-processing filters based on the NLMS (Normalized Least Mean Square) algorithm. A THS1206 ADC is used to acquire the 4-channel microphone signals. The program memory, data ROM and data RAM of the implemented system occupy 15,744, 748 and 47,540 bytes, respectively. Real-time operation requires $21.839 \times 10^{6}$ clocks per second.
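
As a rough illustration of the delay-and-sum stage mentioned in the abstract, the sketch below aligns each microphone channel by an integer sample delay and averages the result. Integer delays that are already known, and the function name `delay_and_sum`, are assumptions for this example; the paper's DSP implementation is not described here.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer delay (in samples) and average.

    channels: list of equal-rate sample lists, one per microphone.
    delays:   arrival delay of the target signal at each microphone.
    """
    if len(channels) != len(delays):
        raise ValueError("one delay per channel is required")
    # Only keep samples for which every shifted channel has data.
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(usable)
    ]
```

The target signal adds coherently after alignment, while noise that is uncorrelated across microphones is averaged down.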

Design of Emotion Recognition Model Using Fuzzy Logic (퍼지 로직을 이용한 감정인식 모델설계)

  • 김이곤;배영철
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2000.05a / pp.268-282 / 2000
  • Speech is one of the most efficient communication media, and it carries several kinds of information about the speaker, the context, emotion and so on. Human emotion is expressed in speech, gesture and physiological phenomena (breathing, pulse rate, etc.). In this paper, a method for recognizing emotion from a speaker's voice signal is presented and simulated using a neuro-fuzzy model.

Performance Improvement of Acoustic Echo Canceller Using Post-Processor (후처리기를 이용한 음향 반향 제거기의 성능향상)

  • 박장식;김현태;손경식
    • The Journal of the Acoustical Society of Korea / v.18 no.5 / pp.35-43 / 1999
  • In this paper, a new robust adaptive algorithm and a post-processing method are proposed to improve the performance of an acoustic echo canceller (AEC) without added computational burden. The step size is normalized by the sum of the powers of the reference input signal and the desired signal. When the near-end speaker's speech and noise enter the microphone, the step size becomes small and the misalignment of the coefficients is reduced. To reduce the residual echoes, a new post-processing method that operates together with the proposed noise-robust adaptive algorithm is also proposed. The method is based on the correlation between the desired signal and the estimation error signal. The residual echoes are attenuated in proportion to this correlation normalized by the power of the desired signal, so the normalized correlation acts as a Wiener filter for the residual echoes. In a double-talk situation, the estimation error signal, which otherwise carries the residual echoes, is dominated by the near-end speaker's speech, and the normalized correlation approaches 1. Therefore, the near-end speaker's speech can be transmitted without being attenuated. When the desired signal consists only of acoustic echoes, the residual echoes are mostly attenuated and canceled by the proposed post-processor. The computational cost of the AEC with the proposed post-processor is comparable to that of the NLMS algorithm.
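
The two ideas in this abstract can be sketched as follows: an NLMS-style update whose step size is divided by the summed power of the reference and desired signals, and a post-processor gain equal to the correlation of the desired and error signals normalized by the desired-signal power. The window over which the powers are measured, the clipping of the gain to [0, 1], and all names and default values are assumptions for illustration; the paper may define them differently.

```python
import random


def robust_nlms(x, d, num_taps=8, mu=0.5, eps=1e-8):
    """Adaptive FIR filter with step size normalized by P_x + P_d.

    x: reference (far-end) signal, d: desired (microphone) signal.
    Returns the final weights and the error signal.
    """
    w = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Most recent num_taps samples, newest first, zero-padded at the start.
        x_vec = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        d_vec = [d[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        e = d[n] - sum(wi * xi for wi, xi in zip(w, x_vec))
        # Assumed normalization: power of both signals over the same window.
        power = sum(v * v for v in x_vec) + sum(v * v for v in d_vec)
        step = mu / (power + eps)
        w = [wi + step * e * xi for wi, xi in zip(w, x_vec)]
        errors.append(e)
    return w, errors


def postprocessor_gain(d_frame, e_frame, eps=1e-12):
    """Gain = corr(d, e) / power(d), clipped to [0, 1] (clipping is assumed)."""
    corr = sum(dv * ev for dv, ev in zip(d_frame, e_frame))
    power = sum(dv * dv for dv in d_frame)
    return min(1.0, max(0.0, corr / (power + eps)))


def simulate_echo(x, h):
    """Hypothetical test helper: echo = x convolved with room response h."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]
```

When near-end speech raises the power of `d`, the step size shrinks, which is the robustness property the abstract describes. The gain stays near 1 when the error closely tracks the desired signal (double talk) and falls toward 0 when the echo has been cancelled.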

Detection of Underwater Transient Signals Using Noise Suppression Module of EVRC Speech Codec (EVRC 음성부호화기의 잡음억제단을 이용한 수중 천이신호 검출)

  • Kim, Tae-Hwan;Bae, Keun-Sung
    • The Journal of the Acoustical Society of Korea / v.26 no.6 / pp.301-305 / 2007
  • In this paper, we propose a simple algorithm for detecting underwater transient signals, based on the fact that their frequency range is similar to the audio frequency range. For this, we use a preprocessing module of the EVRC speech codec, a standard speech codec for mobile communications. When a signal is fed into the EVRC noise suppression module, we obtain parameters such as the update flag, the energy of each channel, the noise-suppressed signal, the energy of the input signal, the energy of the background noise, and the energy of the enhanced signal. The energy of the enhanced signal, normalized by the energy of the background noise, is compared with a pre-defined detection threshold to detect the transient signal. The detection threshold is updated using its previous value during noisy periods. The experimental results show that the proposed algorithm has an error rate of 0 to 4% in AWGN or colored noise environments.
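
The decision rule described above can be sketched in a few lines. This is not the EVRC noise suppression module: the frame energy is computed directly from the samples, the background-noise energy is passed in as a known value, and the recursive threshold update (with hypothetical parameters `alpha` and `margin`) is only one plausible reading of "updated using the previous value".

```python
def detect_transients(frames, noise_energy, threshold=4.0, alpha=0.9, margin=3.0):
    """Flag frames whose noise-normalized energy exceeds a running threshold.

    frames:       list of sample lists (stand-ins for enhanced-signal frames).
    noise_energy: background-noise energy estimate (assumed given).
    alpha/margin: assumed smoothing factor and safety margin for the update.
    """
    flags = []
    for frame in frames:
        energy = sum(s * s for s in frame) / len(frame)
        ratio = energy / max(noise_energy, 1e-12)
        detected = ratio > threshold
        flags.append(detected)
        if not detected:
            # Noise-only frame: blend the previous threshold with the
            # currently observed noise level.
            threshold = alpha * threshold + (1.0 - alpha) * margin * ratio
    return flags
```

Called with a quiet frame, a loud frame and another quiet frame, only the loud frame is flagged. The threshold adapts only during frames classified as noise, so a transient does not raise it.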

Analysis on Vowel and Consonant Sounds of Patent's Speech with Velopharyngeal Insufficiency (VPI) and Simulated Speech (구개인두부전증 환자와 모의 음성의 모음과 자음 분석)

  • Sung, Mee Young;Kim, Heejin;Kwon, Tack-Kyun;Sung, Myung-Whun;Kim, Wooil
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.7 / pp.1740-1748 / 2014
  • This paper focuses on listening tests and acoustic analysis of speech from patients with velopharyngeal insufficiency (VPI) and of VPI speech simulated by normal speakers. A set consisting of 50 words, vowels and single syllables is defined for constructing the speech database. A web-based listening evaluation system is developed to provide a convenient, automated evaluation procedure. The analysis results show that the trends of incorrect recognition for VPI speech and for simulated speech are similar. This similarity is also confirmed by comparing the formant locations of vowels and the spectra of consonant sounds. These results show that the simulation method is effective at generating speech signals similar to actual VPI patients' speech. The simulated speech data is expected to be useful in our future work, such as acoustic model adaptation.

The Acoustic Analysis of Korean Read Speech - with respect to the prosodic phrasing - (한국어 낭독체 문장의 음향분석 -바람과 햇님의 운율구 생성을 중심으로-)

  • Sung Chuljae
    • Proceedings of the KSPS conference / 1996.02a / pp.157-172 / 1996
  • This study aims to suggest a theoretical methodology for analyzing the prosodic patterns of Korean read speech. Engineering efforts related to phonetic study have focused on the importance of prosodic phrasing, which may play a major role in analyzing a phonetic database. Before establishing the prosodic phrase as the prosodic unit, we should describe the features of the boundary signals in a target sentence. With this in mind, the general characteristics of read speech and ToBI (Tones and Break Indices), a currently popular prosodic labelling system, are presented as the first step. The concrete analysis was carried out on the Korean version of the fable 'The North Wind and the Sun', in which about 25 prosodic units were distinguished by a perceptual approach for 5 subjects. Once the various kinds of information that can be used to decide boundary positions systematically have been established, we can proceed to the next step, namely acoustic analysis of the prosodic units. To improve the naturalness of synthetic speech, the most important first task is to detect the boundary signals in the speech file and then re-establish them in the raw text.

Performance Improvement of Speech Recognition Based on Independent Component Analysis (독립성분분석법을 이용한 음성인식기의 성능향상)

  • 김창근;한학용;허강인
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2001.06a / pp.285-288 / 2001
  • In this paper, we propose a new speech feature extraction method using ICA (Independent Component Analysis), which minimizes the dependency and correlation among speech signals in order to separate the components of the speech signal. ICA removes redundancy in the data after finding the axis direction with the greatest variance in the input dimension. Through training and recognition experiments with HMMs, we verified that ICA features improve recognition performance compared with conventional mel-cepstrum features. We also observed that ICA mitigates the decline in recognition performance caused by environmental noise.

A Study on TSIUVC Approximate-Synthesis Method using Least Mean Square (최소 자승법을 이용한 TSIUVC 근사합성법에 관한 연구)

  • Lee, See-Woo
    • The KIPS Transactions:PartB / v.9B no.2 / pp.223-230 / 2002
  • In a speech coding system that uses voiced and unvoiced excitation sources, the speech waveform is distorted when voiced sound and an unvoiced consonant coexist in a frame. This paper presents a new method of TSIUVC (Transition Segment Including Unvoiced Consonant) approximate synthesis using Least Mean Square. TSIUVC extraction is based on the zero crossing rate and an IPP (Individual Pitch Pulses) extraction algorithm that uses the residual signal of the FIR-STREAK digital filter. As a result, the method obtains a high-quality approximate-synthesis waveform using Least Mean Square. Importantly, the frequency signals in a maximum error signal can be used to produce an approximate-synthesis waveform with low distortion. The method can be applied to a new Voiced/Silence/TSIUVC speech coding scheme, as well as to speech analysis and speech synthesis.
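
Of the two cues named in the abstract, the zero crossing rate is easy to illustrate. The sketch below computes it for a single frame; the function name and the convention of treating zero as non-negative are assumptions, and the IPP extraction and FIR-STREAK filter are not reproduced.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (0.0 to 1.0)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)
```

Noise-like unvoiced consonants tend to give a high rate and voiced segments a low one, so a frame-wise rate can help locate candidate transition segments. Any decision threshold would have to be tuned on real data.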