• Title/Summary/Keyword: PESQ


VoIP Receiver Structure for Enhancing Speech Quality Based on Telematics (텔레메틱스 기반의 VoIP 음성 통화품질 향상을 위한 수신단 구조)

  • Kim, Hyoung-Gook;Seo, Kwang-Duk
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.11 no.3 / pp.48-54 / 2012
  • The quality of real-time voice communication over telematics-based Internet Protocol networks is affected by network impairments such as delay, jitter, and packet loss. To address this issue, this paper proposes a receiver-based method for enhancing VoIP speech quality. The proposed method delivers high-quality voice using playout control and signal reconstruction, which consist of concealment of lost packets, adaptive playout-buffer scheduling using active jitter estimation, and smooth interpolation between two signals in a transition region. The proposed algorithm achieves higher Perceptual Evaluation of Speech Quality (PESQ) values and lower buffering delay than the reference algorithm.
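
As a rough illustration of jitter-driven playout scheduling, the sketch below uses the running interarrival jitter estimator from RFC 3550 and sets the playout delay to a multiple of that estimate. This is not the paper's "active jitter estimation", whose details the abstract does not give; the function names and the safety factor of 4 are assumptions made for the example.

```python
def update_jitter(jitter, transit_prev, transit_now):
    """RFC 3550 style running estimate: J += (|D| - J) / 16."""
    d = abs(transit_now - transit_prev)
    return jitter + (d - jitter) / 16.0


def playout_delays(send_times, recv_times, safety_factor=4.0):
    """Return one playout delay per packet, in the units of the timestamps.

    safety_factor is a hypothetical tuning constant, not taken from the paper.
    """
    jitter = 0.0
    prev_transit = None
    delays = []
    for sent, received in zip(send_times, recv_times):
        transit = received - sent  # one-way transit time of this packet
        if prev_transit is not None:
            jitter = update_jitter(jitter, prev_transit, transit)
        prev_transit = transit
        delays.append(safety_factor * jitter)
    return delays
```

With constant transit time the estimate stays at zero, so the buffer adds no delay; any variation in transit time raises the estimate and therefore the playout delay.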

A Selection Method of Reliable Codevectors using Noise Estimation Algorithm (잡음 추정 알고리즘을 이용한 신뢰성 있는 코드벡터 조합의 선정 방법)

  • Jung, Seungmo;Kim, Moo Young
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.7 / pp.119-124 / 2015
  • Speech enhancement is required as a preprocessor for noise-robust speech recognition systems. Codebook-based Speech Enhancement (CBSE) is highly robust in nonstationary noise environments compared with conventional noise estimation algorithms. However, because CBSE depends on trained codebook information, its performance degrades severely for codevector combinations that have low correlation with the input signal. To overcome this problem, only reliable codevector combinations are selected, so that combinations with low correlation with the input signal are excluded. The proposed method outperforms conventional CBSE in terms of Log-Spectral Distortion (LSD) and Perceptual Evaluation of Speech Quality (PESQ).
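
For readers unfamiliar with the LSD measure mentioned above, here is a minimal sketch of one common definition: the per-frame root-mean-square difference between log power spectra in dB, averaged over frames. Definitions vary between papers, and the abstract does not say which variant the authors use; the function name and the `eps` guard are assumptions.

```python
import math


def log_spectral_distortion(clean_power, enhanced_power, eps=1e-12):
    """Mean over frames of the RMS log-spectral difference in dB.

    clean_power, enhanced_power: lists of frames, each a list of
    power-spectrum bins. eps avoids log(0) on empty bins.
    """
    per_frame = []
    for clean, enhanced in zip(clean_power, enhanced_power):
        squared_db = [
            (10.0 * math.log10((c + eps) / (e + eps))) ** 2
            for c, e in zip(clean, enhanced)
        ]
        per_frame.append(math.sqrt(sum(squared_db) / len(squared_db)))
    return sum(per_frame) / len(per_frame)
```

A lower value means the enhanced spectrum is closer to the clean one; identical spectra give 0 dB.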

Global Soft Decision Using Probabilistic Outputs of Support Vector Machine for Speech Enhancement (SVM의 확률 출력을 이용한 새로운 Global Soft Decision 기반의 음성 향상 기법)

  • Jo, Q-Haing;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.27 no.2 / pp.75-79 / 2008
  • In this paper, we propose a novel speech enhancement technique using global soft decision (GSD) based on the probabilistic outputs of a support vector machine (SVM). Generally, speech enhancement algorithms that apply soft decision (SD) to gain modification and noise power estimation perform better than those employing hard decision. In particular, the global speech absence probability (GSAP), which is known as an effective measure of speech absence in each frame, has been adopted in SD-based speech enhancement methods. For this reason, we introduce a new GSAP estimated from the probabilistic output of the SVM using a sigmoid function. The performance of the proposed algorithm is evaluated by PESQ and MOS tests under various noise environments, and it yields better results than the conventional GSD scheme.
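
The idea of turning an SVM decision value into a probability with a sigmoid can be sketched as follows. This is a Platt-style mapping under stated assumptions: a positive SVM score is taken to mean "speech present", and the parameters `a` and `b` are hypothetical placeholders that would normally be fitted on training data. The abstract does not give the authors' exact formulation.

```python
import math


def speech_presence_probability(svm_score, a=-1.0, b=0.0):
    """Platt-style sigmoid: P(speech | score) = 1 / (1 + exp(a * score + b))."""
    z = a * svm_score + b
    z = max(-700.0, min(700.0, z))  # keep math.exp within float range
    return 1.0 / (1.0 + math.exp(z))


def global_speech_absence_probability(svm_score, a=-1.0, b=0.0):
    """GSAP for one frame, taken here as 1 - P(speech | score)."""
    return 1.0 - speech_presence_probability(svm_score, a, b)
```

A frame whose score lies on the decision boundary gets a GSAP of 0.5, and the value moves smoothly toward 0 or 1 as the score moves away from the boundary, which is what makes the decision "soft".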

Improved Global-Soft Decision Incorporating Second-Order Conditional MAP for Speech Enhancement (음성향상을 위한 2차 조건 사후 최대 확률기법 기반 Global Soft Decision)

  • Kum, Jong-Mo;Chang, Joon-Hyuk
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.6C / pp.588-592 / 2009
  • In this paper, we propose a novel method to improve the performance of global soft decision, based on the second-order conditional maximum a posteriori (CMAP) criterion. The conventional global soft decision scheme has the disadvantage that the speech absence probability, adjusted by a fixed parameter, is sensitive to varying noise environments. In the proposed approach using the second-order CMAP, the speech absence probability is more flexible because it exploits not only the current observation but also the speech activity decisions in the previous two frames. Experimental results show that the proposed method yields better results than the conventional global soft decision technique under the ITU-T P.862 perceptual evaluation of speech quality (PESQ) criterion.

Low-Delay LSF FEC Technique Robust in Lossy VoIP Environment (VoIP 손실 환경에 강인한 저지연 LSF FEC 기법)

  • Yang, Hae-Yong;Lee, Kyung-Hoon;Hwang, In-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP / v.39 no.6 / pp.687-695 / 2002
  • Media-specific FEC techniques, proposed to cope with VoIP speech packet loss, improve speech quality at the expense of an additional one-frame delay. In this paper, we propose a new media-specific FEC technique, LSF FEC, which improves speech quality with a much shorter additional delay. In the proposed technique, the LSF parameters of the future frame are used to recover a lost packet. To evaluate the performance of the proposed technique, we use the ITU-T G.723.1 and G.729 codecs, apply the Gilbert packet loss model, and estimate the MOS at each packet loss rate using the PESQ speech quality estimation algorithm. The proposed technique shortens the delay by 6.5 ms to 27 ms compared with existing media-specific FEC techniques. Simulation results comparing reconstructed speech quality show that the technique improves the MOS by more than 0.1 in a practical lossy environment with a 5% packet loss rate.
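
The Gilbert packet loss model mentioned above is a two-state Markov chain that produces bursty losses. Below is a minimal sketch of the simplest variant, in which every packet sent while the channel is in the "bad" state is lost; the transition probabilities and the seed are illustrative assumptions, not values from the paper.

```python
import random


def gilbert_loss_pattern(n_packets, p_good_to_bad, p_bad_to_good, seed=0):
    """Return a list of booleans, True where a packet is lost.

    The channel starts in the good state; a packet is lost whenever the
    channel is in the bad state after the state update for that packet.
    """
    rng = random.Random(seed)
    bad = False
    lost = []
    for _ in range(n_packets):
        if bad:
            if rng.random() < p_bad_to_good:
                bad = False
        elif rng.random() < p_good_to_bad:
            bad = True
        lost.append(bad)
    return lost


def stationary_loss_rate(p_good_to_bad, p_bad_to_good):
    """Long-run fraction of lost packets for the chain above."""
    return p_good_to_bad / (p_good_to_bad + p_bad_to_good)
```

To target a given average loss rate r with a chosen recovery probability q, one can set the good-to-bad probability to r * q / (1 - r); a smaller q then gives longer loss bursts at the same average rate.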

Transient Noise Reduction in Speech Signal Utilizing a Long-term Predictor (장구간 예측 필터를 이용한 음성 신호에서의 돌발 잡음 제거)

  • Choi, Min-Seok;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea / v.31 no.1 / pp.29-38 / 2012
  • This paper presents a system for reducing transient noise in speech signals. The proposed system uses a median filter to reduce the transient noise. Because the median filter can distort speech during noise reduction, a long-term prediction (LTP) filter is adopted as a pre-processor to minimize speech distortion. The speech information preserved by the LTP filter is re-synthesized after the noise is reduced. This paper verifies the weakness of a linear prediction (LP) filter and the superiority of the LTP filter for preserving the speech component in the presence of transient noise. With the proposed system, the output signal-to-noise ratio (SNR) is improved by 8 dB in regions where both speech and noise are present, and the PESQ score is increased by 1 point compared with the noisy input.
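
The median filtering step can be illustrated with a short sliding-window sketch. It covers only the median filter, not the LTP pre-processing or re-synthesis described in the abstract, and the window length and edge handling are assumptions made for the example.

```python
from statistics import median


def median_filter(signal, window=5):
    """Sliding median with an odd window length.

    Near the edges the window is truncated, so an even number of samples
    may be used there and the median is the mean of the two middle values.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        out.append(median(signal[lo:hi]))
    return out
```

An isolated spike shorter than half the window is removed entirely, which is why the filter suits transient noise; the same property also flattens short speech details, which motivates the pre-processing step the paper describes.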

Real-time implementation of the 2.4 kbps EHSX Speech Coder Using a TMS320C6701™ DSP Core (TMS320C6701™을 이용한 2.4kbps EHSX 음성 부호화기의 실시간 구현)

  • 양용호;이인성;권오주
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.7C / pp.962-970 / 2004
  • This paper presents an efficient implementation of the 2.4 kbps EHSX (Enhanced Harmonic Stochastic Excitation) speech coder on a TMS320C6701™ floating-point digital signal processor. The EHSX speech codec models the excitation signal with either a harmonic model or a CELP (Code Excited Linear Prediction) model, depending on whether the frame is voiced or unvoiced speech. In this paper, we present optimization methods that reduce the complexity for real-time implementation. The complexity of the filtering in the CELP algorithm, which accounts for the main part of the EHSX algorithm's complexity, can be reduced by converting code that uses floating-point variables to code that uses fixed-point variables. We also present efficient optimization methods, including code allocation that takes the DSP architecture into account and a low-complexity harmonic/pitch search algorithm in the encoder. Finally, we obtained a MOS of 3.28 in a speech quality test using PESQ (perceptual evaluation of speech quality, ITU-T Recommendation P.862) and achieved the goal of real-time operation of the EHSX codec.

Robust Speech Enhancement Based on Soft Decision Employing Spectral Deviation (스펙트럼 변이를 이용한 Soft Decision 기반의 음성향상 기법)

  • Choi, Jae-Hun;Chang, Joon-Hyuk;Kim, Nam-Soo
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.5 / pp.222-228 / 2010
  • In this paper, we propose a new approach to noise estimation that incorporates spectral deviation into a soft decision scheme to enhance the intelligibility of degraded speech in non-stationary noisy environments. Because the conventional noise estimation technique based on soft decision estimates and updates the noise power spectrum using a fixed smoothing parameter, chosen under the assumption of stationary noise, it is difficult to obtain robust estimates of the noise power spectrum in non-stationary environments, such as a restaurant, where the spectral characteristics of the noise change constantly. We first classify the environment as stationary or non-stationary noise based on an analysis of the spectral deviation of the noise signal, and then adaptively estimate and update the noise power spectrum according to the classified noise type. The performance of the proposed algorithm is evaluated by ITU-T P.862 perceptual evaluation of speech quality (PESQ) under various ambient noise environments and is better than that of the conventional method.

An Improved Speech Absence Probability Estimation based on Environmental Noise Classification (환경잡음분류 기반의 향상된 음성부재확률 추정)

  • Son, Young-Ho;Park, Yun-Sik;An, Hong-Sub;Lee, Sang-Min
    • The Journal of the Acoustical Society of Korea / v.30 no.7 / pp.383-389 / 2011
  • In this paper, we propose an improved speech absence probability estimation algorithm that applies environmental noise classification for speech enhancement. In previous work, the a priori probability of speech absence required for the speech absence probability was derived from the microphone input signal and the noise signal, based on an estimated a posteriori SNR threshold. In contrast to the conventional approach with a fixed threshold and smoothing parameter, the proposed algorithm estimates the speech absence probability using a noise classification algorithm based on a Gaussian mixture model, so that the optimal parameters can be applied for each noise type. The performance of the proposed enhancement algorithm is evaluated by ITU-T P.862 PESQ (perceptual evaluation of speech quality) and a composite measure under various noise environments. The results verify that the proposed algorithm outperforms the conventional speech absence probability estimation algorithm.

A Study on the Performance of Noise Reduction using Multi-Microphones for Digital Hearing Aids (디지털 보청기를 위한 다중 마이크로폰을 이용한 잡음제거 성능 연구)

  • Kang, Hyun-Deok;Song, Young-Rok;Lee, Sang-Min
    • Journal of IKEEE / v.14 no.1 / pp.47-54 / 2010
  • In this study, we analyzed noise reduction in a noisy environment using 2, 3, 4, or 5 microphones in digital hearing aids. So that the results would apply to actual digital hearing aids, we built an experimental microphone set similar to the behind-the-ear (BTE) type and recorded the signal in each situation. We then reduced the noise in each recorded signal with a multi-microphone noise reduction algorithm. Comparing the SNR (signal-to-noise ratio) and PESQ (Perceptual Evaluation of Speech Quality) measurements before and after noise reduction showed that the improvement in performance was highest when three or four microphones were used. In general, when two or more microphones were used, performance increased as the number of microphones increased.
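
The before/after SNR comparison described above can be sketched as follows, assuming a time-aligned clean reference signal is available (as in a controlled recording). The function names are hypothetical, and this is a plain global SNR rather than whatever segmental or weighted variant the authors may have used.

```python
import math


def snr_db(clean, observed):
    """Global SNR in dB, treating (observed - clean) as the noise."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((o - s) ** 2 for s, o in zip(clean, observed))
    if signal_power == 0:
        raise ValueError("clean reference has zero power")
    if noise_power == 0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)


def snr_improvement_db(clean, noisy, enhanced):
    """SNR after noise reduction minus SNR before, in dB."""
    return snr_db(clean, enhanced) - snr_db(clean, noisy)
```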