• Title/Summary/Keyword: speech quality evaluation

A Study on PCFBD-MPC in 8kbps (8kbps에 있어서 PCFBD-MPC에 관한 연구)

  • Lee, See-woo
    • Journal of Internet Computing and Services / v.18 no.5 / pp.17-22 / 2017
  • In MPC coding that uses voiced and unvoiced excitation sources, the speech waveform can be distorted. The distortion is caused by normalizing the synthesized voiced waveform while restoring the multi-pulses of the representative section. This paper presents PCFBD-MPC (Position Compensation Frequency Band Division-Multi Pulse Coding), which uses V/UV/S (Voiced/Unvoiced/Silence) switching, position compensation of the multi-pulses in each pitch interval, and approximate synthesis of unvoiced speech using specific frequencies in order to reduce distortion of the synthesized waveform. The PCFBD-MPC system was implemented and its SNRseg was evaluated under an 8kbps coding condition. The SNRseg of PCFBD-MPC was 13.4dB for female voice and 13.8dB for male voice. Future work will evaluate the sound quality of an 8kbps speech coding method that simultaneously compensates the amplitude and position of the multi-pulse source. These methods are expected to be applicable to low-bit-rate speech coding using sound sources, for example in cellular phones or smartphones.
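
The abstract reports SNRseg figures without defining the measure. As a rough illustration only, the following sketch computes a commonly used form of segmental SNR: the SNR is evaluated per frame in dB and then averaged. The frame length (160 samples, i.e. 20 ms at an assumed 8 kHz sampling rate) and the clamping range of -10 to 35 dB are assumptions chosen for the example, not values taken from the paper.

```python
import math

def segmental_snr(reference, decoded, frame_len=160, floor_db=-10.0, ceil_db=35.0):
    """Average of per-frame SNRs in dB between a reference and a decoded signal.

    frame_len, floor_db and ceil_db are illustrative assumptions.
    """
    n_frames = min(len(reference), len(decoded)) // frame_len
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")
    eps = 1e-12  # avoids log(0) and division by zero on silent or perfect frames
    total = 0.0
    for i in range(n_frames):
        start = i * frame_len
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(r * r for r in ref)
        noise_energy = sum((r - d) ** 2 for r, d in zip(ref, dec))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))
        # Clamp each frame so silent or near-perfect frames do not dominate the mean.
        total += min(max(snr_db, floor_db), ceil_db)
    return total / n_frames

# Toy check: a 10% amplitude error corresponds to an SNR of about 20 dB in every frame.
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(1600)]
scaled = [1.1 * x for x in tone]
print(round(segmental_snr(tone, scaled), 2))
```

Averaging in the dB domain weights quiet and loud frames equally, which is why segmental SNR tends to track perceived quality more closely than a single whole-signal SNR.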

An ACLMS-MPC Coding Method Integrated with ACFBD-MPC and LMS-MPC at 8kbps bit rate. (8kbps 비트율을 갖는 ACFBD-MPC와 LMS-MPC를 통합한 ACLMS-MPC 부호화 방식)

  • Lee, See-woo
    • Journal of Internet Computing and Services / v.19 no.6 / pp.1-7 / 2018
  • This paper presents an 8kbps ACLMS-MPC (Amplitude Compensation and Least Mean Square - Multi Pulse Coding) coding method that integrates ACFBD-MPC (Amplitude Compensation Frequency Band Division - Multi Pulse Coding) and LMS-MPC (Least Mean Square - Multi Pulse Coding). The method uses V/UV/S (Voiced/Unvoiced/Silence) switching, compensation of the multi-pulses in each pitch interval, and approximate synthesis of unvoiced speech using specific frequencies in order to reduce distortion of the synthesized waveform. When integrating several methods, it is important to adjust the bit rate of the voiced and unvoiced sound sources to 8kbps while reducing the distortion of the speech waveform. In doing so, the speech waveform can be synthesized efficiently by restoring the individual pitch intervals from the multi-pulses of the representative interval. The ACLMS-MPC method was implemented and its SNR was evaluated under an 8kbps coding condition. The SNR of ACLMS-MPC was 15.0dB for female voice and 14.3dB for male voice. ACLMS-MPC thus improved on the existing MPC, ACFBD-MPC, and LMS-MPC by 0.3dB~1.8dB for male voice and 0.3dB~1.6dB for female voice. These methods are expected to be applicable to low-bit-rate speech coding using sound sources, for example in cellular phones or internet phones. Future work will evaluate the sound quality of a 6.9kbps speech coding method that simultaneously compensates the amplitude and position of the multi-pulse source.

Transcoding Algorithm for AMR and EVRC Vocoders Via Direct Parameter Transformation (AMR과 EVRC 음성부호화기를 위한 파라미터 직접 변환 방식의 상호부호화 알고리듬)

  • Lee, Sun-Il;Yu, Chang-Dong
    • Journal of the Institute of Electronics Engineers of Korea SP / v.39 no.6 / pp.696-708 / 2002
  • In this paper, a novel transcoding algorithm for the Adaptive Multi-Rate (AMR) and Enhanced Variable Rate Codec (EVRC) vocoders via direct parameter transformation is proposed. In contrast to the conventional tandem transcoding algorithm, the proposed algorithm converts the parameters of one coder into those of the other without going through the full decoding and encoding processes. The proposed algorithm consists of parameter decoding, frame classification, mode decision, and transcoders for two frame types. The transcoders convert parameters such as the LSP, frame energy, pitch delay for the adaptive codebook, fixed codebook vector, and codebook gains. Evaluation results show that the proposed algorithm has better computational and delay characteristics while producing speech quality equivalent to that of the tandem transcoding algorithm.

Effects of Motor Learning Guided Laryngeal Motor Control Therapy for Muscle Misuse Dysphonia (운동학습이론에 기초한 발성운동조절법이 근오용성 발성장애의 음성에 미치는 효과)

  • Seo, In-Hyo;Lee, Ok-Bun;Lee, Sang-Joon;Chung, Phil-Sang
    • Phonetics and Speech Sciences / v.3 no.3 / pp.133-140 / 2011
  • Muscle misuse dysphonia (MMD) is defined as a behavioral voice disorder resulting from inappropriate contractions of the intrinsic and/or extrinsic laryngeal muscles. The purpose of this study was to investigate the effect of motor learning guided laryngeal motor control therapy (MLG-LMCT), which is designed to improve on existing laryngeal manual therapy (LMT) and provide more effective voice treatment for people with MMD. Forty-six people with MMD (M:F=16:30) participated in this study. Voice samples were recorded before and after the voice therapy to investigate the effect of MLG-LMCT. The samples were analyzed with an electroglottograph (EGG), and contact quotient (CQ), speed quotient (SQ), and waveform were reported. In addition, perceptual and acoustic evaluations were conducted to determine the change in voice after treatment. The experimenter massaged the tense muscles around the neck. To help subjects find more appropriate phonation, the experimenter showed them their EGG waveforms, which indicated whether they were moving the vocal folds to the appropriate position; the EGG waveforms thus served as a type of visual feedback. Using the waveform, the experimenter helped subjects move the vocal folds and laryngeal muscles toward more appropriate voice production. The sensory stimuli from the experimenter were gradually faded out. A paired t-test revealed significant differences in CQ between pre- and post-therapy. Perceptually, the overall, rough, breathy, strain, and transition ratings were significantly reduced. Acoustically, there were significant differences in Fo, jitter, shimmer, and NHR. After MLG-LMCT, most of the subjects showed improvements in voice quality. The results led to the following conclusion: MLG-LMCT reduces muscle misuse dysphonia. These results may arise because visual feedback from the EGG waveform can maintain the muscle tension reduction achieved by laryngeal manual therapy. For people with MMD whose muscle tension was reduced by LMT but who still do not appropriately manipulate the location of the larynx or adduct the vocal folds, MLG-LMCT might be an alternative therapy approach.

Objective Evaluation of Beamforming Techniques for Hearing Devices with Two-channel Microphone (2채널 마이크로폰을 이용한 청각 기기에서의 빔포밍에 대한 객관적 검증)

  • Cho, Kyeong-Won;Han, Jong-Hee;Hong, Sung-Hwa;Lee, Sang-Min;Kim, Dong-Wook;Kim, In-Young;Kim, Sun-I.
    • Journal of Biomedical Engineering Research / v.32 no.3 / pp.198-206 / 2011
  • Hearing devices such as cochlear implants and the Vibrant Soundbridge aim to offer better sound for people. In hearing devices, several beamformers, including the conventional directional microphone, are applicable to noise reduction. Each beamformer has a different directional response, which can change sound intelligibility or quality for listeners. Therefore, we investigated the performance of three beamformers, namely first- and second-order directional microphones and a broadband beamformer (BBF), with a computer simulation that assumed a hearing device microphone configuration. We also calculated objective measures that have been used to evaluate speech enhancement algorithms. In the simulation, a single speech source and a single babble noise source were propagated from the front and from $135^{\circ}$ azimuth, respectively. The microphones were configured as an end-fire array, and the spacing was varied for comparison. With 3 cm spacing, the BBF had about 3 dB higher enhanced SNR than the directional microphones. However, the enhancement of segmental SNR and frequency-weighted segmental SNR was similar between the first-order directional microphone and the broadband beamformer. In addition, when steady-state noise was used, the broadband beamformer showed increased performance and had the highest enhanced SNR and segmental SNR.
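
To make the idea of a directional response concrete, here is a minimal sketch of a textbook first-order directional microphone built from two omnidirectional microphones in an end-fire arrangement (delay-and-subtract). It is not the authors' simulation: the speed of sound, the cardioid delay setting, the 1 kHz test frequency, and the function name `first_order_response` are all assumptions made for illustration. Only the 3 cm spacing and the 135° noise direction come from the abstract.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def first_order_response(freq_hz, angle_deg, spacing_m=0.03, internal_delay_s=None):
    """Magnitude response of a delay-and-subtract microphone pair to a plane wave.

    angle_deg = 0 is the front (end-fire) direction. The rear microphone signal
    is delayed by internal_delay_s and subtracted from the front microphone signal.
    """
    travel_time = spacing_m / SPEED_OF_SOUND
    if internal_delay_s is None:
        internal_delay_s = travel_time  # assumed cardioid setting: null at 180 degrees
    # Acoustic delay between the microphones plus the electrical delay on the rear one.
    total_delay = internal_delay_s + travel_time * math.cos(math.radians(angle_deg))
    return abs(1.0 - cmath.exp(-2j * math.pi * freq_hz * total_delay))

# Compare the gain toward the speech (front) and toward the noise (135 degrees) at 1 kHz.
front = first_order_response(1000.0, 0.0)
side = first_order_response(1000.0, 135.0)
print(f"rejection at 135 degrees: {20 * math.log10(side / front):.1f} dB")
```

Changing `internal_delay_s` moves the null direction, which is one reason different first-order patterns suppress a noise source at a given azimuth by different amounts.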

Electro-Acupuncture on Aphasia after Stroke: A Systemic Review of Randomized Controlled Trials (뇌졸중 환자의 실어증에 대한 전침 치료 : 체계적 문헌 고찰)

  • Ha, Jeong-been;Lee, Su-jung;Yang, Ji-soo;Lew, Jae-hwan
    • The Journal of Internal Korean Medicine / v.42 no.3 / pp.323-339 / 2021
  • Objectives: This study investigates the effect of electro-acupuncture on aphasia after stroke. Methods: A search of OASIS, NDSL, PubMed, Cochrane, and CNKI was conducted between 4 January 2021 and 4 February 2021, with no limitation on publication year. Study selection and data extraction were performed by 3 authors. The quality of the studies was evaluated using Cochrane's risk of bias (RoB) tool. Results: Ten studies met the selection criteria. GV20 (Baihui) was the most frequently used treatment site for electro-acupuncture. In all studies, regions on the head were used for treatment without distinguishing between acupoints and areas of scalp acupuncture, and the stimulation was described by 3 conditions: speed, intensity, and time. The outcome indicators used before and after treatment focused on the evaluation of language function and the degree of aphasia. The results showed that electro-acupuncture combined with speech rehabilitation therapy was more effective for aphasia after stroke than speech rehabilitation therapy alone. Conclusions: In this review, electro-acupuncture for aphasia after stroke was found to have a significant effect compared to the previous treatment alone. However, because of the limitations of the included studies, the evidence was not sufficiently reliable. Additional research is needed to produce more objective evidence.

Blind Noise Separation Method of Convolutive Mixed Signals (컨볼루션 혼합신호의 암묵 잡음분리방법)

  • Lee, Haeng-Woo
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.3 / pp.409-416 / 2022
  • This paper addresses a blind noise separation method for time-delayed convolutive mixed signals. Since the mixing model of acoustic signals in a closed space is multi-channel, a convolutive blind signal separation method is applied, and time-delayed data samples of the two microphone input signals are used. For signal separation, the mixing coefficients are calculated using an inverse model rather than calculating the separation coefficients directly, and the coefficients are updated iteratively based on second-order statistical properties to estimate the speech signal. Many simulations were performed to verify the performance of the proposed blind signal separation. The simulation results show that noise separation with this method operates stably regardless of convolutive mixing, and that PESQ improves by 0.3 points compared to the general adaptive FIR filter structure.

A Study of Acoustic Characteristics of Two Syllables Words and Sustained Vowel (병적음성에 대한 지속 모음 및 이음절어 발화시 나타나는 음향학적 차이에 대한 연구)

  • 채윤정;김범규;홍기환
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.11 no.1 / pp.104-112 / 2000
  • There are two methods for evaluating voice disorders: perceptual analysis and acoustic analysis. Both methods focus only on sustained vowels, and analysis of voice disorders at the level of conversational speech has not been sufficiently pursued. The purpose of the present study is to compare two-syllable words and sustained vowels in patients with vocal polyps and in normal male speakers, and to provide basic data for vocal assessment and voice therapy. Fifteen male patients with vocal polyps formed the subject group, and fifteen healthy males formed the control group. The voices of the subject and control groups, saved in MDVP of CSL, were analyzed with its own analysis program. As a result, in the subject group, voice quality did not differ between the vowel following a lenis stop and the sustained vowel, but differed significantly between the vowel following a heavily aspirated stop and the sustained vowel. In the control group, the vowels following stops and the sustained vowel also showed many differences in voice quality, with an especially significant difference between the vowel following a glottal stop and the sustained vowel.

Velopharyngeal Insufficiency Accompanied with Hypertrophic Tonsils: A Case Report (편도비대를 동반한 구개인두부전 환자의 치험례)

  • Kim, Eun Key;Koh, Kyung Suck;Park, Mi Kyong
    • Archives of Plastic Surgery / v.32 no.5 / pp.660-662 / 2005
  • It is well documented that adenoidectomy contributes to hypernasality in certain cases, but it is not clear whether enlarged tonsils affect the quality of speech. Hypertrophied tonsils may cause and complicate the problem of velopharyngeal incompetency. Huge tonsils prevent the lateral pharyngeal walls from moving medially and interfere with velar elevation, resulting in hypernasality. Hyponasality develops as the tonsils encroach on the nasopharyngeal space. Voluminous tonsils also interfere with airflow in the oropharyngeal passage and produce the phenomenon of cul-de-sac resonance, or muffled sound. The authors present a case of velopharyngeal insufficiency accompanied by hypertrophic tonsils. Improved lateral pharyngeal wall constriction and velar elevation after tonsillectomy minimized the velopharyngeal gap. Accordingly, sphincter pharyngoplasty and palatal lengthening, rather than a pharyngeal flap, resolved the problem of hypernasality. Tonsillectomy prior to pharyngeal flap surgery tends to reduce postoperative airway problems. Sometimes, however, tonsillectomy alone suffices without a pharyngeal flap. A staged surgical approach with intermittent evaluation at intervals of at least six weeks is recommended.

Real-time Implementation of Variable Transmission Bit Rate Vocoder Integrating G.729A Vocoder and Reduction of the Computational Amount SOLA-B Algorithm Using the TMS320C5416 (TMS320C5416을 이용한 G.729A 보코더와 계산량 감소된 SOLA-B 알고리즘을 통합한 가변 전송율 보코더의 실시간 구현)

  • 함명규;배명진
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.6 / pp.84-89 / 2003
  • In this paper, we present a real-time implementation on the TMS320C5416 of a variable bit rate vocoder that applies the SOLA-B algorithm by Henja to the ITU-T G.729A vocoder with an 8kbps transmission rate. The proposed method uses the SOLA-B algorithm to shorten the duration of the speech at encoding and to restore normal playback speed by extending the duration of the speech at decoding. To decrease the computational load of the SOLA-B algorithm, the cross-correlation function is evaluated at an interval that skips every 3 samples. The real-time implementation of the G.729A vocoder and the SOLA-B algorithm has a maximum complexity of 10.2MIPS in the encoder and 2.8MIPS in the decoder at the 8kbps transmission rate. The maximum complexity is 18.5MIPS in the encoder and 13.1MIPS in the decoder at 6kbps, and 18.5MIPS in the encoder and 13.1MIPS in the decoder at 4kbps. Memory usage is about 9.7kwords of program ROM, 4.5kwords of table ROM, and 5.1kwords of RAM. The output waveform was shown to be bit-exact with the result of the C simulator. In the speech quality evaluation of the real-time variable bit rate vocoder, the MOS was estimated at 3.69 at 4kbps.
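
The abstract mentions reducing computation by skipping samples in the cross-correlation search. The sketch below shows one possible reading of that idea in a generic SOLA-style alignment search, where the normalized cross-correlation is evaluated only at every third candidate lag. It is not the SOLA-B algorithm itself, and the function name `best_overlap_lag`, the `step` parameter, and the toy signal are hypothetical choices made for this example.

```python
import math

def best_overlap_lag(tail, head, max_lag, step=3):
    """Pick the lag at which `head` best aligns with `tail`.

    Only lags 0, step, 2*step, ... up to max_lag are tested, so the search
    costs roughly 1/step of an exhaustive search (an assumed interpretation
    of the sample-skipping described in the abstract).
    """
    best_lag, best_score = 0, -math.inf
    for lag in range(0, max_lag + 1, step):
        overlap = min(len(tail), len(head) - lag)
        if overlap <= 0:
            break
        a = tail[:overlap]
        b = head[lag:lag + overlap]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        score = num / den if den > 0 else 0.0  # normalized cross-correlation
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Toy signal: `head` contains the same samples as `tail`, starting 6 samples in.
base = [math.sin(2 * math.pi * n / 40) for n in range(200)]
tail = base[100:140]
head = base[94:160]
print(best_overlap_lag(tail, head, max_lag=12))
```

The trade-off in such a scheme is that the true best lag may fall between tested lags, so the alignment can be off by up to `step - 1` samples in exchange for the reduced search cost.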