• Title/Summary/Keyword: Signal to silence ratio


Voiced/Unvoiced/Silence Classification of Speech Signal by Level Crossing and DPCM (Level Crossing과 DPCM을 사용한 유성음/무성음/묵음의 분류)

  • Kim, Jin-Young;Sung, Koeng-Mo
    • Proceedings of the KIEE Conference / 1987.07b / pp.1615-1618 / 1987
  • This paper proposes a new algorithm for classifying speech signal frames into voiced, unvoiced, and silence frames, using parameters extracted from the time-domain behavior of the speech signal. The parameters used in this paper are the absolute magnitude, the sum of peaks larger than a reference level (T-peak), the ratio of T-peak to absolute magnitude, and the magnitude of the DPCM output signal. Using these parameters, the speech signal is more easily classified into voiced/unvoiced/silence frames.

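A minimal sketch of three of the time-domain parameters named in the abstract above (absolute magnitude, T-peak, and their ratio). The abstract does not give exact definitions, so the peak test, the frame representation as a list of samples, and the names `frame_features` and `ref_level` are assumptions made for illustration; the DPCM output magnitude is not reproduced here.

```python
def frame_features(frame, ref_level):
    """Return (absolute magnitude, T-peak, T-peak ratio) for one frame.

    Assumed definitions (not taken from the paper):
      - absolute magnitude: sum of |x[n]| over the frame
      - T-peak: sum of local peaks of |x[n]| that exceed ref_level
    """
    abs_mag = sum(abs(x) for x in frame)
    t_peak = 0.0
    for i in range(1, len(frame) - 1):
        a = abs(frame[i])
        # A local peak of the rectified signal: not below the left
        # neighbour and strictly above the right neighbour.
        is_peak = a >= abs(frame[i - 1]) and a > abs(frame[i + 1])
        if is_peak and a > ref_level:
            t_peak += a
    ratio = t_peak / abs_mag if abs_mag > 0 else 0.0
    return abs_mag, t_peak, ratio


# Hypothetical frame with two large peaks and one small one.
example_frame = [0, 5, 0, 1, 0, -6, 0]
example_features = frame_features(example_frame, ref_level=2)
```

A frame dominated by a few large peaks gives a ratio close to 1, while a frame of low-level fluctuation gives a ratio close to 0; how these values map to voiced, unvoiced, and silence decisions is left to the paper.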

Voice Activity Detection in Noisy Environment using Speech Energy Maximization and Silence Feature Normalization (음성 에너지 최대화와 묵음 특징 정규화를 이용한 잡음 환경에 강인한 음성 검출)

  • Ahn, Chan-Shik;Choi, Ki-Ho
    • Journal of Digital Convergence / v.11 no.6 / pp.169-174 / 2013
  • In speech recognition, performance degrades because of the mismatch between the model training and recognition environments. Silence feature normalization is used as a way to reduce this mismatch. With existing silence feature normalization methods, however, the energy level of the silence interval rises at low signal-to-noise ratio, so the accuracy of voice/non-voice classification falls and recognition performance degrades. This paper proposes a robust speech detection method for noisy environments that uses silence feature normalization and speech energy maximization. At high signal-to-noise ratio, the proposed method maximizes the speech energy, a feature that is less affected by noise. At low signal-to-noise ratio, it uses the cepstral feature distribution of voice/non-voice characteristics, which improves recognition performance. In recognition experiments, recognition performance improved compared to the conventional method.

Hybrid Companding Delta Modulation with Silence Detection (묵음 검출 기능을 사용한 하이브리드 압신 델타 변조기)

  • 조동호;은종관
    • Journal of the Korean Institute of Telematics and Electronics / v.19 no.6 / pp.84-90 / 1982
  • In this paper we exploit the intermittent property of speech to reduce the transmission rate or to increase the signal-to-quantization noise ratio (SQNR) in coding speech by hybrid companding delta modulation (HCDM). In this scheme we detect silence in speech with a speech/silence discriminator. HCDM coding is done only for the speech portion. For silence, which is detected in every block of 5 ms, only the information indicating that the block is silent is transmitted. Since the HCDM coder transmits a binary signal synchronously at a fixed rate, the use of a buffer and its efficient control is essential. By using HCDM with silence detection in coding speech, we could improve SQNR by as much as 6 dB over conventional HCDM or reduce the transmission rate by one third of the HCDM rate.


Speech Compression by Non-uniform Sampling at the maxima and minima (극대 및 극소점에서의 비균일 표본화에 의한 음성압축)

  • Rheem, Jae-Yeol;Baek, Sung-Joon;Ann, Sou-Guil;Kim, Bum-Hoon
    • The Journal of the Acoustical Society of Korea / v.11 no.4 / pp.36-44 / 1992
  • To reduce the redundancy among samples that results from uniform sampling, nonuniform sampling or nonredundant-sample coding methods can be considered. But it is well known that when conventional nonuniform sampling methods are applied directly to speech signals, the amount of data required is comparable to or greater than that required by a uniform sampling method such as PCM. To overcome this problem, we consider the perceptual properties of speech signals and suggest a nonuniform sampling method at the maxima and minima of the speech wave. To analyze the performance of the suggested method, the compression ratio is considered. We show that the compression ratio can be improved by silence detection, which cannot be implemented by conventional methods based on uniform sampling. As experimental results, compression ratios of 1.54 without silence detection and 2.88 with silence detection are obtained for 8 kHz, 8-bit PCM signals.

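The idea of sampling only at the maxima and minima of the waveform can be sketched as follows. This is an illustration under assumptions, not the paper's coder: the function name `extrema_samples` is hypothetical, flat plateaus are simply skipped, and the coding of the retained points and the silence detection step (which produce the reported compression ratios) are not reproduced.

```python
def extrema_samples(signal):
    """Return (index, value) pairs at local maxima and minima of the signal.

    A sample is kept when the slope changes sign across it. Samples on
    flat plateaus (zero slope on either side) are skipped in this sketch.
    """
    points = []
    for i in range(1, len(signal) - 1):
        left = signal[i] - signal[i - 1]
        right = signal[i + 1] - signal[i]
        if left * right < 0:  # slope changes sign: a peak or a valley
            points.append((i, signal[i]))
    return points


# Hypothetical 8-sample waveform with two peaks and one valley.
example_wave = [0, 2, 5, 3, 1, 4, 6, 2]
example_points = extrema_samples(example_wave)
```

Each retained point carries both its position and its amplitude, so a real coder has to spend bits on the intervals between extrema as well as on the values.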

Cepstral Distance and Log-Energy Based Silence Feature Normalization for Robust Speech Recognition (강인한 음성인식을 위한 켑스트럼 거리와 로그 에너지 기반 묵음 특징 정규화)

  • Shen, Guang-Hu;Chung, Hyun-Yeol
    • The Journal of the Acoustical Society of Korea / v.29 no.4 / pp.278-285 / 2010
  • The difference between training and test environments is one of the major causes of performance degradation in noisy speech recognition, and many silence feature normalization methods have been proposed to resolve this mismatch. The conventional silence feature normalization method achieves high classification performance at high SNR, but its performance degrades at low SNR because of the low accuracy of speech/silence classification. On the other hand, cepstral distance represents the characteristic distribution of speech/silence (or noise) well at low SNR. In this paper, we propose a Cepstral distance and Log-energy based Silence Feature Normalization (CLSFN) method, which uses both log-energy and cepstral Euclidean distance to classify speech/silence for better performance. Because the proposed method combines the merit of log-energy, which is less affected by noise at high SNR, with the merit of cepstral distance, which discriminates speech from silence accurately at low SNR, the classification accuracy is expected to improve. The experimental results showed that the proposed CLSFN improved recognition performance compared with the conventional SFN-I/II and CSFN methods in all kinds of noisy environments.

A Study on SNR Estimation of Continuous Speech Signal (연속음성신호의 SNR 추정기법에 관한 연구)

  • Song, Young-Hwan;Park, Hyung-Woo;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea / v.28 no.4 / pp.383-391 / 2009
  • In speech signal processing, a speech signal corrupted by noise should be enhanced to improve quality. Noise estimation methods usually need to adapt to changing environments. The noise profile is updated in silence regions to avoid the effects of speech properties, so the voice regions must be found in a preprocessing step before noise estimation. However, if the received signal has no silence region, that method cannot be applied. In this paper, we propose an SNR estimation method for continuous speech signals. In a stationary region of voiced speech, the waveform is highly correlated from one pitch period to the next, so we can estimate the SNR from the correlation of neighboring waveforms after dividing a frame into pitch periods. For unvoiced speech, the vocal tract characteristic is reflected in the noise, so we can estimate the SNR using the spectral distance between the spectrum of the received signal and the estimated vocal tract. Lastly, the energy of a speech signal is mostly distributed in voiced regions, so we can estimate the SNR from the ratio of voiced-region energy to unvoiced-region energy.
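For the voiced case, one way to turn the correlation of neighboring pitch periods into an SNR value is sketched below. The abstract does not state the estimator, so the formula is an assumption: if two adjacent periods contain the same speech waveform plus noise that is uncorrelated with the speech and between periods, the normalized correlation is rho = S / (S + N), which gives SNR = rho / (1 - rho). The function name `snr_from_adjacent_periods` is hypothetical, and pitch segmentation is assumed to have been done already.

```python
import math


def snr_from_adjacent_periods(period_a, period_b):
    """Estimate linear SNR from two equal-length adjacent pitch periods.

    Assumed model (not from the paper): each period is the same speech
    waveform plus noise uncorrelated with the speech and across periods.
    Then rho = S / (S + N) and SNR = rho / (1 - rho).
    """
    cross = sum(a * b for a, b in zip(period_a, period_b))
    energy_a = sum(a * a for a in period_a)
    energy_b = sum(b * b for b in period_b)
    if energy_a == 0 or energy_b == 0:
        return 0.0
    rho = cross / math.sqrt(energy_a * energy_b)
    if rho >= 1.0:
        return math.inf  # identical periods: no measurable noise
    if rho <= 0.0:
        return 0.0       # no positive correlation: treat as all noise
    return rho / (1.0 - rho)


# Hypothetical periods: speech part [1, 1, 1, 1] (energy 4) plus two
# mutually orthogonal noise patterns of energy 1 each, so S / N = 4.
period_1 = [1.5, 0.5, 1.5, 0.5]
period_2 = [1.5, 1.5, 0.5, 0.5]
example_snr = snr_from_adjacent_periods(period_1, period_2)
example_snr_db = 10 * math.log10(example_snr)
```

In practice the two periods rarely have exactly the same length or shape, so the estimate is only as good as the pitch segmentation and the stationarity of the voiced region.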

IMBE Model Based SNR Estimation of Continuous Speech Signals (연속음성신호에서 IMBE 모델을 이용한 SNR 추정 연구)

  • Park, Hyung-Woo;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea / v.29 no.2 / pp.148-153 / 2010
  • In speech signal processing, a speech signal corrupted by noise should be enhanced to improve quality. Noise estimation methods usually need to adapt to changing environments. The noise profile is updated in silence regions to avoid the effects of speech properties, so the voice regions must be found in a preprocessing step before noise estimation. However, if the received signal has no silence region, that method cannot be applied. In this paper, we propose an SNR estimation method for continuous speech signals. In the MBE excitation model, a speech signal consists of voiced and unvoiced bands. The energy of a speech signal is mostly distributed in voiced regions, so we can estimate the SNR from the ratio of voiced-region energy to unvoiced-region energy. We use the IMBE vocoder to make the voiced/unvoiced decision for each band of the segmented speech signal. We then calculate the segmental SNR using that information and the energy of each band, and from it we estimate the SNR of the continuous speech signal.

Adaptive noise cancellation algorithm reducing path misadjustment due to speech signal (음성신호로 인한 잡음전달경로의 오조정을 감소시킨 적응잡음제거 알고리듬)

  • 박장식;김형순;김재호;손경식
    • The Journal of Korean Institute of Communications and Information Sciences / v.21 no.5 / pp.1172-1179 / 1996
  • A general adaptive noise canceller (ANC) suffers from misadjustment of the adaptive filter weights because of gradient-estimate noise at steady state. In this paper, we propose an adaptive noise cancellation algorithm with a speech detector that distinguishes speech from silence and from adaptation-transient regions. The speech detector uses the property that an adaptive prediction-error filter can filter out highly correlated speech. To detect the speech region, the estimation error, which is the output of the adaptive filter, is applied to the adaptive prediction-error filter. When a speech signal appears at the input of the adaptive prediction-error filter, the ratio of input and output energy of the filter becomes relatively low. The ratio becomes large when white noise appears at the input. The speech region is therefore detected from this ratio. The sign algorithm is applied in the speech region to prevent the weights from being perturbed by the output speech of the ANC. In computer simulations, the proposed algorithm improves segmental SNR and SNR by up to about 4 dB and 11 dB, respectively.

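The detector principle described above can be sketched with a one-step NLMS linear predictor. This is an illustration under assumptions, not the authors' implementation: the ratio is taken as output (prediction-error) energy over input energy, and the filter order, step size, and the name `prediction_error_energy_ratio` are hypothetical choices. A sinusoid stands in for highly correlated speech.

```python
import math
import random


def prediction_error_energy_ratio(x, order=4, mu=0.5, eps=1e-8):
    """Output-to-input energy ratio of an NLMS prediction-error filter.

    The filter predicts x[n] from the previous `order` samples and
    outputs the prediction error. A correlated input is predictable, so
    the ratio is small; white noise is not, so the ratio stays near or
    above 1.
    """
    w = [0.0] * order
    in_energy = 0.0
    out_energy = 0.0
    for n in range(order, len(x)):
        past = [x[n - k] for k in range(1, order + 1)]
        e = x[n] - sum(wk * pk for wk, pk in zip(w, past))
        norm = sum(p * p for p in past) + eps
        # NLMS weight update, normalized by the energy of the past samples.
        w = [wk + mu * e * pk / norm for wk, pk in zip(w, past)]
        in_energy += x[n] ** 2
        out_energy += e ** 2
    return out_energy / in_energy if in_energy > 0 else 0.0


# Hypothetical test signals: a tone (correlated) and seeded white noise.
tone = [math.sin(0.3 * n) for n in range(1000)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]
ratio_tone = prediction_error_energy_ratio(tone)
ratio_noise = prediction_error_energy_ratio(noise)
```

A detector built on this would compare the ratio, computed over short blocks, against a threshold; the threshold value and block length are not given in the abstract.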

Voice Activity Detection with Run-Ratio Parameter Derived from Runs Test Statistic

  • Oh, Kwang-Cheol
    • Speech Sciences / v.10 no.1 / pp.95-105 / 2003
  • This paper describes a new parameter for voice activity detection, which serves as a front end for automatic speech recognition systems. The new parameter, called the run-ratio, is derived from the runs test statistic, which is used in the statistical test for randomness of a given sequence. The run-ratio has the property that its value for a random sequence is about 1. To apply the run-ratio to voice activity detection, the samples of the input audio signal are converted to a binary sequence of positive and negative values. The silence region of the audio signal can then be regarded as a random sequence, so its run-ratio is about 1. The run-ratio is far lower than 1 for voiced regions and higher than 1 for fricative sounds. The newly derived run-ratio parameter can therefore discriminate speech signals from background sounds. The proposed voice activity detector outperformed the conventional energy-based detector in terms of error mean and variance, small deviation from true speech boundaries, and low chance of missing real utterances.

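A minimal sketch of a run-ratio style parameter follows. The abstract does not give the paper's exact definition, so this assumes the ratio of the observed number of runs in the sign sequence to the number expected under the runs test for a random sequence, E[R] = 2 * n_pos * n_neg / (n_pos + n_neg) + 1. The function name `run_ratio` is hypothetical.

```python
def run_ratio(samples):
    """Observed runs in the sign sequence divided by the expected runs.

    Assumed definition (not taken from the paper): samples are mapped to
    a binary sequence (non-negative vs. negative), runs are counted, and
    the count is divided by the runs-test expectation for a random
    sequence with the same number of positives and negatives.
    """
    if not samples:
        return 0.0
    signs = [s >= 0 for s in samples]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    # One run to start with, plus one more for every sign change.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    expected = 2 * n_pos * n_neg / (n_pos + n_neg) + 1
    return runs / expected


# Hypothetical sign patterns: slow alternation (few runs, as in voiced
# speech) and sample-by-sample alternation (many runs, as in fricatives).
slow = [1, 1, 1, 1, -1, -1, -1, -1]
fast = [1, -1, 1, -1, 1, -1, 1, -1]
ratio_slow = run_ratio(slow)
ratio_fast = run_ratio(fast)
```

Under this assumed definition a slowly varying waveform gives a value well below 1 and a rapidly alternating one gives a value above 1, which matches the behavior the abstract reports for voiced and fricative sounds.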

Performance Improvement of Adaptive Noise Cancellation Using a Speech Detector

  • Park, Jang-Sik
    • The Journal of the Acoustical Society of Korea / v.15 no.2E / pp.39-44 / 1996
  • The performance of a two-channel adaptive noise canceller is often degraded by perturbation of the weights due to the speech signal. In this paper, we propose an adaptive noise canceller that employs a speech detector and two adaptation algorithms, which are switched according to the speech detector. When a highly correlated speech signal is detected, the tap weights of the adaptive filter are adapted by the sign algorithm. On the other hand, the weights are adapted by the NLMS algorithm when silence is detected or when the characteristics of the noise propagation channel change. The speech detector uses the power ratio of the input and the output of an adaptive linear prediction-error filter. According to the computer simulation, the proposed method yields better performance than conventional ones.
