• Title/Summary/Keyword: speech quality evaluation

Search Result 175, Processing Time 0.024 seconds

Correlation Analysis of PESQ and MOS Evaluation for HMM-based Synthetic Korean Speech (HMM 기반의 한국어 합성음에 대한 PESQ 및 MOS 평가의 상관도 분석)

  • Lin, Cang-Song;Bae, Keun-Sung
    • Phonetics and Speech Sciences
    • /
    • v.2 no.1
    • /
    • pp.71-75
    • /
    • 2010
  • The PESQ is an objective speech quality evaluation measure that is known to have a high correlation with a subjective speech quality measure such as MOS. To examine whether it could be useful as an objective quality measure of synthetic speech, we carried out both subjective evaluation tests with MOS and DMOS and an objective evaluation test with PESQ for HMM-based Korean synthetic speech signals and analyzed the correlation between them. Experimental results have shown that the PESQ has correlations of 0.87 with MOS and 0.92 with DMOS. It means that the PESQ holds much promise for evaluating the quality of synthetic Korean speech.

  • PDF

A Study on Objective Quality Assessment for Synthesized speech by Rule (규칙합성음의 객관적 품질평가에 관한 연구)

  • 홍진우;김순협
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.30B no.10
    • /
    • pp.42-49
    • /
    • 1993
  • In this paper, we evaluate the quality of synthesized speech by rule using the LPC CD as a objective measure, and then compare the test result with the subjective one. Speech used for the test consists of 108 words which are selected by word construction method using Korean attribute and frequency distribution, synthesized by demi-syllable rule. By evaluating the quality of synthesized speech by reule objectively, we have tried to resolve the problems such as lots of evaluation time, expansion of test scale, and variables of analysis result arised by subjective measure. We have, also, proved the validity of the objective test using the LPC CD, by comparing intelligibility which is the index for the subjective quality evaluation of synthesized speech by rule with MOS. From this results, we can provide a guide for quality assessment that would be useful in the R&D of synthesis method and the commercial products using synthesized speech.

  • PDF

A Study on Objective Quality Assessment of Synthesized Speech by Rule (규칙 합성음의 객관적 품질평가에 관한 연구)

  • 홍진우
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1991.06a
    • /
    • pp.67-72
    • /
    • 1991
  • This paper evaluates thequality of synthesized speech by rule using the LPC CD in the objective measure and then compares the result with the subjective analysis. By evaluating the quality of synthesized speech by rule objectively. We have tried to resolve the problems (Evaluation time or size expansion, variables within the analysis results) that arise when the evaluation is done subjectively. Also by comparing intelligibility-the index for the subjective quality evaluation of synthesized speech by rule-with evaluation results obtained using MOS and the objective evaluation. We have proved the validity of the objective analysis and thus provides a guide that would be useful when R&D and marketing of synthesis by rule method is done.

  • PDF

A Speech Enhancement Algorithm based on Human Psychoacoustic Property (심리음향 특성을 이용한 음성 향상 알고리즘)

  • Jeon, Yu-Yong;Lee, Sang-Min
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.59 no.6
    • /
    • pp.1120-1125
    • /
    • 2010
  • In the speech system, for example hearing aid as well as speech communication, speech quality is degraded by environmental noise. In this study, to enhance the speech quality which is degraded by environmental speech, we proposed an algorithm to reduce the noise and reinforce the speech. The minima controlled recursive averaging (MCRA) algorithm is used to estimate the noise spectrum and spectral weighting factor is used to reduce the noise. And partial masking effect which is one of the human hearing properties is introduced to reinforce the speech. Then we compared the waveform, spectrogram, Perceptual Evaluation of Speech Quality (PESQ) and segmental Signal to Noise Ratio (segSNR) between original speech, noisy speech, noise reduced speech and enhanced speech by proposed method. As a result, enhanced speech by proposed method is reinforced in high frequency which is degraded by noise, and PESQ, segSNR is enhanced. It means that the speech quality is enhanced.

Performance Evaluation of Novel AMDF-Based Pitch Detection Scheme

  • Kumar, Sandeep
    • ETRI Journal
    • /
    • v.38 no.3
    • /
    • pp.425-434
    • /
    • 2016
  • A novel average magnitude difference function (AMDF)-based pitch detection scheme (PDS) is proposed to achieve better performance in speech quality. A performance evaluation of the proposed PDS is carried out through both a simulation and a real-time implementation of a speech analysis-synthesis system. The parameters used to compare the performance of the proposed PDS with that of PDSs that are based on either a cepstrum, an autocorrelation function (ACF), an AMDF, or circular AMDF (CAMDF) methods are as follows: percentage gross pitch error (%GPE); a subjective listening test; an objective speech quality assessment; a speech intelligibility test; a synthesized speech waveform; computation time; and memory consumption. The proposed PDS results in lower %GPE and better synthesized speech quality and intelligibility for different speech signals as compared to the cepstrum-, ACF-, AMDF-, and CAMDF-based PDSs. The computational time of the proposed PDS is also less than that for the cepstrum-, ACF-, and CAMDF-based PDSs. Moreover, the total memory consumed by the proposed PDS is less than that for the ACF- and cepstrum-based PDSs.

A Study on the Objective Evaluation Model of Telephone Transmission Quality (통화품질 객관평가 모델링에 관한 연구)

  • 조재철;박순영;방만원
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.16 no.6
    • /
    • pp.509-516
    • /
    • 1991
  • In this paper, we propose on objective evaluation model of telephone transmission qulity in order to estimate a satisfaction score regarding speech quality in a relephone network. As the degradantion factors of telephone transmission quality, this model takes into account transmission loss, noise, distortion, talker echo and sidetone. A performance index[PI] is introduced for five psychological factors affecting telephone speech qualty, and a Mean Opinion Score(MOS) is estimated from the sum of all Pis. The simulation results indicate theat the MOS obtained from the objective evaluation model is in good agreement with that of subjective evaluation.

  • PDF

Characteristics of voice quality on clear versus casual speech in individuals with Parkinson's disease (명료발화와 보통발화에서 파킨슨병환자 음성의 켑스트럼 및 스펙트럼 분석)

  • Shin, Hee-Baek;Shim, Hee-Jeong;Jung, Hun;Ko, Do-Heung
    • Phonetics and Speech Sciences
    • /
    • v.10 no.2
    • /
    • pp.77-84
    • /
    • 2018
  • The purpose of this study is to examine the acoustic characteristics of Parkinsonian speech, with respect to different utterance conditions, by employing acoustic/auditory-perceptual analysis. The subjects of the study were 15 patients (M=7, F=8) with Parkinson's disease who were asked to read out sentences under different utterance conditions (clear/casual). The sentences read out by each subject were recorded, and the recorded speech was subjected to cepstrum and spectrum analysis using Analysis of Dysphonia in Speech and Voice (ADSV). Additionally, auditory-perceptual evaluation of the recorded speech was conducted with respect to breathiness and loudness. Results indicate that in the case of clear speech, there was a statistically significant increase in the cepstral peak prominence (CPP), and a decrease in the L/H ratio SD (ratio of low to high frequency spectral energy SD) and CPP F0 SD values. In the auditory-perceptual evaluation, a decrease in breathiness and an increase in loudness were noted. Furthermore, CPP was found to be highly correlated to breathiness and loudness. This provides objective evidence of the immediate usefulness of clear speech intervention in improving the voice quality of Parkinsonian speech.

Performance Evaluation of Frame Erasure Concealment Algorithms in VoIP Coders (VoIP 코더들의 프레임손실은닉 알고리즘 성능평가)

  • Han, Seung-Ho;Moon, Kwang;Han, Min-Soo
    • Proceedings of the KSPS conference
    • /
    • 2004.05a
    • /
    • pp.235-238
    • /
    • 2004
  • Frame erasures cause speech quality degradation in wireless communication networks or packet networks. The degradation becomes worse when consecutive frame erasures occur. Speech coders have a frame erasure concealment(FEC) mechanism to compensate for frame erasures. It is meaningful to evaluate the performance of FEC mechanisms for frame erasures that occur in communications networks. In this paper, various frame erasures are designed. And the FEC algorithms of speech coders are evaluated and analyzed with the Perceptual Evaluation of Speech Quality(PESQ). It is found that the performances vary in accordance with frame erasure types, frame erasure rates, and utterance lengths.

  • PDF

Bandwidth Expansion Method Using Spline Codebook Based Spectral Folding (Spline 코드북 기반의 spectral folding을 이용한 대역폭 확장 방법)

  • Park, Ji-Hoon;Han, Seung-Ho;Yang, Hee-Sik;Jeong, Sang-Bae;Hahn, Min-Soo
    • Proceedings of the KSPS conference
    • /
    • 2006.11a
    • /
    • pp.131-134
    • /
    • 2006
  • Quality of narrowband speech $(0{\sim}4kHz)$ can be enhanced by the bandwidth expansion technique, by which the high- band components are estimated. This paper proposes the bandwidth expansion method using the spline codebook based spectral folding. For the performance evaluation, the PESQ(Perceptual Evaluation of Speech Quality) scores are measured as the objective measurement In addition, the MOS (Mean Opinion Score) and the preference tests are performed as the subjective measurement. The results show our proposed method outperforms the existing spline based one.

  • PDF

Discussions on Auditory-Perceptual Evaluation Performed in Patients With Voice Disorders (음성장애 환자에서 시행되는 청지각적 평가에 대한 논의)

  • Lee, Seung Jin
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
    • /
    • v.32 no.3
    • /
    • pp.109-117
    • /
    • 2021
  • The auditory-perceptual evaluation of speech-language pathologists (SLP) in patients with voice disorders is often regarded as a touchstone in the multi-dimensional voice evaluation procedures and provides important information not available in other assessment modalities. Therefore, it is necessary for the SLPs to conduct a comprehensive and in-depth evaluation of not only voice but also the overall speech production mechanism, and they often encounter various difficulties in the evaluation process. In addition, SLPs should strive to avoid bias during the evaluation process and to maintain a wide and constant spectrum of severity for each parameter of voice quality. Lastly, it is very important for the SLPs to perform a team approach by documenting and delivering important information pertaining to auditory-perceptual characteristics in an appropriate and efficient way through close communication with the laryngologists.