• Title/Summary/Keyword: PESQ

Correlation Analysis of PESQ and MOS Evaluation for HMM-based Synthetic Korean Speech (HMM 기반의 한국어 합성음에 대한 PESQ 및 MOS 평가의 상관도 분석)

  • Lin, Cang-Song; Bae, Keun-Sung
    • Phonetics and Speech Sciences / v.2 no.1 / pp.71-75 / 2010
  • PESQ is an objective speech quality measure known to correlate highly with subjective measures such as MOS. To examine whether it is useful as an objective quality measure for synthetic speech, we carried out subjective evaluation tests with MOS and DMOS and an objective evaluation test with PESQ on HMM-based Korean synthetic speech signals, and analyzed the correlation between them. The experimental results show that PESQ has a correlation of 0.87 with MOS and 0.92 with DMOS. This suggests that PESQ holds much promise for evaluating the quality of synthetic Korean speech.

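The abstract above reports correlations between PESQ and MOS/DMOS but does not say which coefficient was used. A minimal sketch, assuming a Pearson correlation over per-condition scores; the function name `pearson` and the score lists are illustrative placeholders, not data from the paper:

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up placeholder scores for illustration only (NOT from the paper).
pesq_scores = [1.8, 2.3, 2.9, 3.4, 3.9]
mos_scores = [2.0, 2.4, 3.1, 3.3, 4.2]
r = pearson(pesq_scores, mos_scores)
print(f"Pearson r between PESQ and MOS: {r:.3f}")
```

A value of `r` close to 1 indicates that the objective score tracks the listeners' ratings closely, which is the kind of relationship the 0.87 and 0.92 figures describe.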

Evaluation of a signal segregation by FDBM (FDBM의 음원분리 성능평가)

  • Lee, Chai-Bong
    • The Journal of the Korea institute of electronic communication sciences / v.8 no.12 / pp.1793-1802 / 2013
  • Various approaches to sound source segregation have been proposed. Among them, the frequency domain binaural model (FDBM) has the advantages of low computational load and effective howling cancellation. A binaural hearing assistance system based on FDBM has been proposed; it can enhance the desired signal using directivity information. Although FDBM has been evaluated in terms of signal-to-noise ratio (SNR) and the coherence function, those results do not always agree with human impressions, because these are physical measures that do not take human perception into account. Since a binaural hearing assistance system is one of the major applications, the quality of the segregated sound must remain sufficiently high. In this paper, the signal segregation performance of FDBM is evaluated with three objective methods, namely SNR, coherence, and Perceptual Evaluation of Speech Quality (PESQ), to characterize FDBM's sound source segregation performance. The simulation results show that FDBM improves the quality of the left and right channel signals to an equivalent level. They also suggest that PESQ may be a more useful measure than SNR and coherence for assessing the segregation performance of FDBM. The PESQ results show the effects of the segregation parameters and indicate appropriate parameter values under the tested conditions.

Conversational Quality Measurement System for Mobile VoIP Speech Communication (모바일 VoIP 음성통신을 위한 대화음질 측정 시스템)

  • Cho, Jae-Man; Kim, Hyoung-Gook
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.10 no.4 / pp.71-77 / 2011
  • In this paper, we propose a conversational quality measurement (CQM) system that provides an objective QoS measure for high-quality mobile VoIP voice communication. To measure conversational quality, a VoIP communication system is implemented on two smartphones connected over VoIP. The VoIP system consists of echo cancellation, noise reduction, speech encoding/decoding, packet generation with RTP (Real-time Transport Protocol), jitter buffer control, and POS (Play-out Schedule) with LC (Loss Concealment). The CQM system is connected to the microphone and speaker of each smartphone. The voice signal of each speaker is recorded and used to measure CE (Conversational Efficiency), CS (Conversational Symmetry), PESQ (Perceptual Evaluation of Speech Quality), and the CE-CS-PESQ correlation. We validate the CQM system by measuring CE, CS, and PESQ under various SNR, delay, and loss conditions arising from the IP network environment.

VoIP QoS Network Modeling using Network Simulator (네트워크 시뮬레이터를 이용한 PESQ 기반 VoIP QoS 보장 네트워크 조건 모델링)

  • Kim, Sun-mi; Kim, Jong-seong; An, Suk-hwan
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.5 no.3 / pp.110-116 / 2012
  • With the rapid development of the Internet, interest is growing in the quality of service of VoIP carried over the Internet and the established PSTN. Earlier studies sought to improve quality of service mainly from the supplier's point of view. Recently, however, as customer interest has grown, there is demand for objective standards of the service quality that providers can offer to customers. In this study, we analyze test results obtained with PESQ, a measure of quality of service, and present the network conditions that guarantee a minimum call quality for VoIP.

Two-Channel Noise Reduction Using Beamforming and DOA-Based Masking (빔포밍 및 DOA 기반의 마스킹을 이용한 2채널 잡음제거)

  • Kim, Youngil; Jeong, Sangbae
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.1 / pp.32-40 / 2013
  • In this paper, we propose a multi-channel speech enhancement algorithm using beamforming and direction-of-arrival (DOA)-based masking. The proposed algorithm first enhances noisy speech with the linearly constrained minimum variance (LCMV) algorithm, and then applies a mel-scale Wiener filter designed using DOA-based masking to remove the remaining noise. To improve performance, we optimize the learning rate of the adaptive filters in LCMV and the DOA threshold used to detect the target speech spectrum. The perceptual evaluation of speech quality (PESQ) score and the output SNR are measured as performance indices. Experimental results show that the proposed algorithm outperforms the conventional LCMV beamformer by 0.09 in PESQ score and 5.75 dB in output SNR.

Speech enhancement system using the multi-band coherence function and spectral subtraction method (다중 주파수 밴드 간섭함수와 스펙트럼 차감법을 이용한 음성 향상 시스템)

  • Oh, Inkyu; Lee, Insung
    • The Journal of the Acoustical Society of Korea / v.38 no.4 / pp.406-413 / 2019
  • This paper proposes a speech enhancement method for a closely spaced two-microphone array that combines a gain function with the spectral subtraction method. A speech enhancement method that uses a gain function estimated from the SNR (Signal-to-Noise Ratio) based on the multi-band coherence function degrades in performance when the input noises of the two channels are highly correlated. The proposed method instead uses a weighted gain function obtained by combining it with the gain function from spectral subtraction. The performance of the proposed method was evaluated by comparing PESQ (Perceptual Evaluation of Speech Quality) values, an objective quality measure provided by the ITU-T (International Telecommunication Union Telecommunication Standardization Sector). In the PESQ tests, the PESQ value improved by up to 0.217 across various background noise environments.

Design of the Noise Suppressor Using the Perceptual Model and Wavelet Packet Transform (인지 모델과 웨이블릿 패킷 변환을 이용한 잡음 제거기 설계)

  • Kim, Mi-Seon; Park, Seo-Young; Kim, Young-Ju; Lee, In-Sung
    • The Journal of the Acoustical Society of Korea / v.25 no.7 / pp.325-332 / 2006
  • In this paper, we propose a noise suppressor based on a perceptual model and the wavelet packet transform. The objective is to enhance speech corrupted by colored or non-stationary noise. If the corrupting noise is colored, a subband approach is more efficient than a whole-band one. To avoid serious residual noise and speech distortion, the Wavelet Coefficient Threshold (WCT) must be adjusted. In this paper, the subbands are designed to match the critical bands, and the WCT is adapted to the noise masking threshold (NMT) and the segmental signal-to-noise ratio (seg_SNR). As a result, the suppressor performs similarly to EVRC in PESQ-MOS, and outperforms the wavelet packet transform with a universal threshold by about 0.289 in PESQ-MOS. More importantly, it is more useful than EVRC for coded speech, where its PESQ-MOS is about 0.23 higher than that of EVRC.
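The abstract above adapts the threshold to the segmental SNR but does not define how seg_SNR is computed. A minimal sketch of one common definition, assuming non-overlapping frames and a per-frame clamp; the function name `segmental_snr`, the frame length of 160 samples, and the clamp range of -10 to 35 dB are illustrative assumptions, not values from the paper:

```python
import math


def segmental_snr(clean, processed, frame_len=160, lo=-10.0, hi=35.0):
    """Mean per-frame SNR in dB, with each frame clamped to [lo, hi].

    Frames are non-overlapping; frames with zero clean-signal energy are skipped.
    """
    n = min(len(clean), len(processed))
    frame_values = []
    for start in range(0, n - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        out = processed[start:start + frame_len]
        signal_energy = sum(a * a for a in ref)
        error_energy = sum((a - b) ** 2 for a, b in zip(ref, out))
        if signal_energy == 0:
            continue  # silent frame: SNR is undefined, so skip it
        if error_energy == 0:
            snr_db = hi  # identical frame: use the upper clamp
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        frame_values.append(max(lo, min(hi, snr_db)))
    if not frame_values:
        raise ValueError("no usable frames")
    return sum(frame_values) / len(frame_values)


# Synthetic example: a 440 Hz tone at 8 kHz and a copy scaled by 0.9.
clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(1600)]
scaled = [0.9 * s for s in clean]
print(f"seg_SNR: {segmental_snr(clean, scaled):.2f} dB")
```

Averaging per frame rather than over the whole signal keeps loud segments from dominating the result, which is why a segmental measure suits a threshold that adapts over time.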

An Objective Estimation for Simulating of Asymmetrical Auditory Filter of the Hearing Impaired According to Hearing Loss Degree (난청인의 난청 정도에 따른 비대칭 청각 필터 구현의 객관적 평가)

  • Joo, S.I.; Jeon, Y.Y.; Song, Y.R.; Lee, S.M.
    • Journal of rehabilitation welfare engineering & assistive technology / v.3 no.1 / pp.27-34 / 2009
  • Hearing loss takes a different shape in each hearing-impaired person, so the existing symmetrical auditory filter based on frequency bands could not properly simulate the variety of hearing loss shapes. The shape of the auditory filter is asymmetrical and differs with center frequency and input level. The auditory filter of a hearing-impaired person differs from that of a person with normal hearing, and speech passed through it has a different quality. In this study, an asymmetrical auditory filter was simulated and tests were performed to estimate its performance objectively. The perceptual evaluation of speech quality (PESQ) and the log likelihood ratio (LLR) of speech passed through the filter were used to evaluate the simulated filter. In the tests, the objective quality and distortion of the processed speech were evaluated using PESQ and LLR values. When hearing loss was applied, the PESQ and LLR values differed greatly between the symmetrical and asymmetrical auditory filters. This suggests that the shape of the auditory filter may affect speech quality. In particular, when hearing loss was present, changing the auditory filter asymmetrically for each center frequency affected the speech quality perceived by the hearing impaired.

Non-Intrusive Speech Quality Estimation of G.729 Codec using a Packet Loss Effect Model (G.729 코덱의 패킷 손실 영향 모델을 이용한 비 침입적 음질 예측 기법)

  • Lee, Min-Ki; Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea / v.32 no.2 / pp.157-166 / 2013
  • This paper proposes a non-intrusive speech quality estimation method that considers the effects of packet loss on perceptual quality. Packet loss is a major cause of quality degradation in packet-based speech communication networks, and its effects differ according to the input speech characteristics and the performance of the embedded packet loss concealment (PLC) algorithm. To build a quality estimation system that accounts for packet loss effects, we first observe packet loss in the G.729 codec, a narrowband codec used in VoIP systems. To quantify the effects of lost packets, we design a classification algorithm that uses only the speech parameters of the G.729 decoder. The degradation value of each class is then selected iteratively to maximize the correlation with the PESQ-LQ degradation scores, and the total quality degradation is modeled as their weighted sum. Correlation analysis yielded values of 0.8950 for the intrusive model and 0.8911 for the non-intrusive method.

Speech Enhancement Based on Improved Minima Controlled Recursive Averaging Incorporating GSAP (전역 음성 부재 확률 기반의 향상된 최소값 제어 재귀평균기법을 이용한 음성 향상 기법)

  • Song, Ji-Hyun; Bang, Dong-Hyeouck; Lee, Sang-Min
    • Journal of the Institute of Electronics Engineers of Korea SP / v.49 no.1 / pp.104-111 / 2012
  • In this paper, we propose a novel method to improve the performance of improved minima controlled recursive averaging (IMCRA). An examination of various noise environments shows that IMCRA has a fundamental drawback in estimating the noise power at the offset region of continuous speech signals. In particular, it is difficult to obtain robust estimates of the noise power in non-stationary noise environments whose spectral characteristics change rapidly, such as babble noise. To overcome this drawback, we apply the global speech absence probability (GSAP), conditioned on both the a priori SNR and the a posteriori SNR, to the speech detection algorithm of IMCRA. Using the ITU-T P.862 perceptual evaluation of speech quality (PESQ) and a composite measure test as performance criteria, we show that the proposed algorithm yields better results than the conventional IMCRA-based scheme under various noise environments. In particular, for babble noise at 5 dB, the proposed method produced a remarkable improvement compared to IMCRA (PESQ = 0.026, composite measure = 0.029).