• Title/Summary/Keyword: Perceptual acoustic parameter

Experimental Study on Subjective Evaluation of Car Interior Sound Quality (승용차 내부소음의 음질평가 실험연구)

  • 최병호;아우구스트쉬크
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • 2003.11a
    • /
    • pp.177-182
    • /
    • 2003
  • This study aims to determine the number and characteristics of the psychologically meaningful perceptual dimensions required for assessing sound quality with respect to vehicle interior and/or exterior noise, and to identify the acoustical or psychoacoustical bases underlying that perception. Nonmetric MDS and clustering analysis of our own sound quality data sets indicate that two perceptual dimensions are of critical importance; the subjective verdicts on them can be interpreted as loudness and sharpness. The perceptual dimensions based on similarity judgments accounted for 48% and 24% of the variance, and may correspond to the acoustic parameter "A-weighted maximum pressure level" (r = .85) and the psychoacoustic parameter "sharpness" (r = .65), respectively. The perceptual dimensions based on preference ratings, on the other hand, explained 66% and 10% of the variance; here the acoustic parameter "A-weighted maximum pressure level" (r = .92) may be taken as the best predictor, whereas sharpness appeared less suitable for describing preference behavior. In connection with these results, the problems of quantitatively modelling subjective sound quality evaluation, and of implementing a corresponding cognitive combination rule for technical and industrial applications (for example, obtaining a "winner sound quality" according to preference criteria), are briefly discussed.

SPATIAL EXPLANATIONS OF SPEECH PERCEPTION: A STUDY OF FRICATIVES

  • Choo, Won;Mark Huckvale
    • Proceedings of the KSPS conference
    • /
    • 1996.10a
    • /
    • pp.399-403
    • /
    • 1996
  • This paper addresses issues of perceptual constancy in speech perception through the use of a spatial metaphor for speech sound identity as opposed to a more conventional characterisation with multiple interacting acoustic cues. This spatial representation leads to a correlation between phonetic, acoustic and auditory analyses of speech sounds which can serve as the basis for a model of speech perception based on the general auditory characteristics of sounds. The correlations between the phonetic, perceptual and auditory spaces of the set of English voiceless fricatives /f θ s ʃ h/ are investigated. The results show that the perception of fricative segments may be explained in terms of a 2-dimensional auditory space in which each segment occupies a region. The dimensions of the space were found to be the frequency of the main spectral peak and the 'peakiness' of spectra. These results support the view that perception of a segment is based on its occupancy of a multi-dimensional parameter space. In this way, final perceptual decisions on segments can be postponed until higher level constraints can also be met.

The Utility of Perturbation, Non-linear dynamic, and Cepstrum measures of dysphonia according to Signal Typing (음성 신호 분류에 따른 장애 음성의 변동률 분석, 비선형 동적 분석, 캡스트럼 분석의 유용성)

  • Choi, Seong Hee;Choi, Chul-Hee
    • Phonetics and Speech Sciences
    • /
    • v.6 no.3
    • /
    • pp.63-72
    • /
    • 2014
  • The current study assessed the utility of the acoustic analyses most commonly used in routine clinical voice assessment, including perturbation analysis, nonlinear dynamic analysis, and spectral/cepstral analysis, based on signal typing of dysphonic voices, and investigated the applicability of these clinical acoustic analysis methods. A total of 70 dysphonic voice samples were classified by signal type using narrowband spectrograms. The traditional parameters %jitter, %shimmer, and signal-to-noise ratio (SNR) were calculated using TF32. The correlation dimension (D2), a nonlinear dynamic parameter, and spectral/cepstral measures including mean CPP, CPP_sd, CPPf0, CPPf0_sd, L/H ratio, and L/H ratio_sd were calculated with ADSV (Analysis of Dysphonia in Speech and Voice™). Auditory-perceptual analysis was performed by two blinded speech-language pathologists using GRBAS. The results showed that the nearly periodic Type 1 signals were all from functional dysphonia, while Type 4 signals comprised neurogenic and organic voice disorders. Only Type 1 voice signals were reliable for perturbation analysis in this study. Significant differences related to signal type were found in all acoustic and auditory-perceptual measures. SNR, CPP, and L/H ratio values for Type 4 were significantly lower than those of the other signal types, and significantly higher %jitter and %shimmer were observed in Type 4 voice signals (p<.001). Additionally, as the signal type increased, D2 values increased significantly and more complex, nonlinear patterns appeared. Nevertheless, D2 could not be obtained for voice signals with a high noise component associated with breathiness. In particular, CPP was more sensitive to the voice qualities 'G', 'R', and 'B' than any other acoustic measure. Thus, spectral and cepstral analyses may be applied to more severely dysphonic voices such as Type 4 signals, and CPP may be a more accurate and predictive acoustic marker of voice quality and severity in dysphonia.
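The %jitter and %shimmer values in the abstract above are cycle-to-cycle perturbation measures. The sketch below assumes the common "local" definition (mean absolute difference between consecutive cycles, divided by the mean value, times 100). The abstract does not describe the exact algorithm used by TF32, which may differ, and the function name and sample values here are hypothetical.

```python
def percent_perturbation(cycle_values):
    """Mean absolute difference between consecutive cycles, as a percentage
    of the mean cycle value. Applied to cycle periods this gives a local
    %jitter; applied to cycle peak amplitudes, a local %shimmer."""
    if len(cycle_values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(cycle_values, cycle_values[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_value = sum(cycle_values) / len(cycle_values)
    return 100.0 * mean_abs_diff / mean_value


# Hypothetical glottal cycle periods (seconds) and peak amplitudes.
periods = [0.0100, 0.0102, 0.0098, 0.0100]
amplitudes = [1.0, 0.9, 1.1, 1.0]

print(f"%jitter  = {percent_perturbation(periods):.2f}")
print(f"%shimmer = {percent_perturbation(amplitudes):.2f}")
```

Both measures presuppose that individual cycles can be identified, which is consistent with the abstract's finding that only nearly periodic Type 1 signals were reliable for perturbation analysis.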

The Effect of An Increase of Closed Quotient on Improvement of Voice Quality after Type I Thyroplasty in Patients with Unilateral Vocal Cord Paralysis (일측 성대마비 환자에서 성대내전술 후 성대접촉율의 증가가 음질 개선에 미치는 영향)

  • Kim, Han-Su;Choi, Seung-Hee;Lim, Jae-Yol;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
    • /
    • v.15 no.1
    • /
    • pp.16-20
    • /
    • 2004
  • Purpose: To assess perceptual, acoustic and aerodynamic measures of voice quality in patients with unilateral vocal cord paralysis before and after type I thyroplasty. Methods: The clinical records of patients who underwent type I thyroplasty in the Department of Otorhinolaryngology, Yongdong Severance Hospital, from November 2001 to November 2003 were reviewed. All patients underwent a vocal function evaluation including perceptual, acoustic and aerodynamic measures of voice preoperatively and on the 60th postoperative day. The perceptual and acoustic measures were obtained from recordings of the patients reading the 'Sanchak' passage. The perceptual evaluation was performed by 2 speech pathologists using a 4-point rating scale. Acoustic parameters (voice range profile low (RAL), voice range profile high (RAH), average fundamental frequency (AFx), closed quotient, harmonic-to-noise ratio, jitter and shimmer) were measured with Lx Speech Studio. Mean flow rate (MFR), subglottic pressure (Psub) and intensity were measured using the Phonatory Function Analyzer. The maximum phonation time (MPT) was also measured. The data were statistically analyzed. A paired t-test (p<0.1) was used to compare preoperative and postoperative results, and a multiple regression test was used to find which parameter was most correlated with the improvement of postoperative voice quality. Results: Among the aerodynamic parameters, Psub (88.11 mmH2O → 58.7 mmH2O), MPT (7.87 sec → 12.53 sec) and MFR (359.8 ml/sec → 161.06 ml/sec) improved significantly. AFx (205.5 Hz → 163.27 Hz), AQx (23.9% → 48.3%), RAL, RAH, jitter and shimmer also improved. In the multiple regression test, AFx and AQx were the two parameters most correlated with the improvement of postoperative breathiness, but the general grade of voice quality was more correlated with Psub and shimmer. Conclusion: Vocal fold medialization procedures effectively reduce the glottic gap. The increased contact area of the two vocal folds improved the aerodynamic parameters and stabilized vocal fold vibration. That effect results in improvement in acoustic parameters (shimmer, jitter, signal-to-noise ratio, voice range profile) and voice quality.

A Cepstral Analysis of Breathy Voice with Vocal Fold Paralysis (성대마비로 인한 기식 음성에 대한 Cepstral 분석)

  • Kang, Young-Ae;Seong, Cheol-Jae
    • Phonetics and Speech Sciences
    • /
    • v.4 no.2
    • /
    • pp.89-94
    • /
    • 2012
  • The aim of this study is to investigate the usefulness of the parameters CPP (cepstral peak prominence) and LTAS (long-term average spectrum) band energy for the analysis of breathy voice with vocal fold paralysis. Thirty-four female subjects with vocal fold paralysis after thyroidectomy participated in this study. According to the perceptual judgements of three speech pathologists and one phonetician, subjects were divided into two groups: a breathy voice group (n = 21) and a non-breathy voice group (n = 13). A maximum sustained phonation task was recorded for acoustic analysis. CPP-related (i.e. mean F0, mean CPP, and mean CPPs) and LTAS-related (i.e. minimum, maximum, and mean) parameters were used. An independent samples t-test was conducted. Regarding CPP, there are significant differences in mean CPP and mean CPPs between groups. The values of mean CPP and CPPs in the non-breathy voice group are higher than those in the breathy voice group. CPP could be regarded as a useful parameter for breathy voice analysis in the clinic. As for LTAS, the energy from 0 to 2 kHz is significantly different between groups. The minimum value of the non-breathy group is lower than that of the breathy group, whereas the maximum value of the non-breathy group is higher. The frequency band below 2 kHz seems to be related to breathy voice.

Acoustic analysis of wet voice among patients with swallowing disorders (삼킴장애 환자의 wet voice 관련 음향학적 분석)

  • Kang, Young Ae;Koo, Bon Seok;Kwon, In Sun;Seong, Cheoljae
    • Phonetics and Speech Sciences
    • /
    • v.10 no.4
    • /
    • pp.147-154
    • /
    • 2018
  • Wet voice quality (WVQ) is a characteristic that appears after swallowing. Although the concept is accepted by many clinicians worldwide, it is nevertheless ambiguous. In this study, we investigated WVQ in patients with swallowing disorders using acoustic analysis. A total of 106 patients diagnosed with penetration-aspiration by the videofluoroscopic swallowing study (VFSS) were recruited. A voice recording of the vowel /a/ was made before and after the VFSS, and an acoustic analysis was then performed using PRAAT. The post-VFSS voice was used for a perceptual judgment, and the patients were divided into two groups: the Wet group (48 patients) and the Non-wet group (58 patients). At the post-VFSS stage, the two groups displayed significant differences in many acoustic parameters including F0_SD, Jitter, RAP, Shimmer, APQ, HNR, NHR, FUF, DVB, and CPP. A logistic regression test identified Jitter and NHR as the parameters affecting the judgment of wetness. At the pre-VFSS stage, the two groups differed significantly in many acoustic parameters including Intensity, Jitter, RAP, Shimmer, NHR, FUF, DVB, and CPP. Both pre- and post-VFSS, the mean values of all significant parameters, except Intensity, HNR, and CPP, were higher in the Wet group. Between the pre- and post-VFSS stages, the two groups displayed interactions in many parameters (Intensity, F0_SD, Jitter, RAP, Shimmer, APQ, HNR, NHR, FUF, DVB, and CPP). In particular, Intensity increased in both groups after the VFSS, although the increase in the Non-wet group was greater. Based on these results, it was conjectured that the WVQ after swallowing resulted from the secretion effect of the mucous membrane due to the dry laryngeal characteristic of elderly patients, rather than aspiration resulting in food on the vocal cords.
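The abstract above reports a logistic regression relating wetness judgments to Jitter and NHR. As a rough illustration of that kind of model, the sketch below fits a two-predictor logistic regression by plain batch gradient descent. The data values, function names, learning rate, and epoch count are all hypothetical and are not taken from the study, and the authors most likely used a statistics package rather than code like this.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.1, epochs=5000):
    """Fit weights and a bias by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for k in range(n_features):
                grad_w[k] += err * x[k]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_wet(weights, bias, x):
    """Modelled probability that a voice is judged 'wet'."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


# Hypothetical (jitter %, NHR) pairs; label 1 = judged wet, 0 = non-wet.
VOICES = [(0.4, 0.05), (0.6, 0.08), (0.8, 0.06), (1.0, 0.12),
          (1.2, 0.07), (1.5, 0.15), (1.8, 0.18), (2.2, 0.20)]
IS_WET = [0, 0, 0, 1, 0, 1, 1, 1]

weights, bias = fit_logistic(VOICES, IS_WET)
print("P(wet | high jitter, high NHR):", round(predict_wet(weights, bias, (2.2, 0.20)), 3))
print("P(wet | low jitter, low NHR): ", round(predict_wet(weights, bias, (0.4, 0.05)), 3))
```

In this toy data, higher jitter and NHR go with the wet label, so the fitted model assigns a higher wetness probability to the more perturbed voice.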

The Management and Evaluation of Speech in Cleft Palate Patients (구개열환자의 언어관리 및 평가)

  • Shin Hyo-Keun;Kim Hyun-Gi
    • Proceedings of the KSPS conference
    • /
    • 1996.02a
    • /
    • pp.23-40
    • /
    • 1996
  • The communicative disorders of cleft palate patients are related to acoustic and physiological phenomena. In particular, hypernasality is a parameter of cleft palate speech that has been studied by many clinicians and speech pathologists. The degree of hypernasality has been assessed by listeners' judgement, but perceptual assessments have poor scientific reliability, so objective instruments are needed to test hypernasality with diagnostic accuracy. This study analyzed the nasalance scores of cleft palate patients using a Nasometer. The simple vowels /a/, /i/, /e/ and the approximants /j/, /w/ were tested for the degree of hypernasality after operation. Phrases of long and short duration were used to assess hypernasality. Fiberoptic views show the open velopharyngeal port that results in hypernasality in cleft palate patients. The authors assert the importance of the management of cleft palate patients.

Aerodynamics of Speech using Aerophone II (Aerophone II를 이용한 조음적 공기역학검사)

  • 홍기환
    • Proceedings of the KSLP Conference
    • /
    • 1995.11a
    • /
    • pp.165-180
    • /
    • 1995
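  • To understand complex voice disorders, quantitative examinations must be carried out at several levels of the vocal tract. Various methods are used for this purpose, including perceptual assessment of the voice (perceptual intelligibility), acoustic analysis, aerodynamic study, observation of the movement of laryngeal structures, and functional testing of muscles and nerves (electromyographic study). Of these, perceptual assessment has been questioned with regard to listener agreement and test methodology, and it provides only inferential information about the pathophysiology of laryngeal function during speech. Acoustic analysis is already well established and many parameters have been measured, but its usefulness is still debated, and because it relies only on phenomena produced by vocal fold vibration it amounts to a kind of static study. (abridged)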

Audio Contents Adaptation Technology According to User′s Preference on Sound Fields (사용자의 음장선호도에 따른 오디오 콘텐츠 적응 기술)

  • 강경옥;홍재근;서정일
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.6
    • /
    • pp.437-445
    • /
    • 2004
  • In this paper, we describe a novel method for transforming audio contents according to the user's sound field preference. Sound field effect technologies, which transform or simulate acoustic environments according to the user's preference, are very important for enhancing the realism of an acoustic scene. However, a huge amount of computational power is required to process sound field effects in real time, so it is hard to implement this functionality on portable audio devices such as MP3 players. We propose an efficient method for providing sound field effects to audio contents independently of the terminal's computational power, by processing this functionality at the server using the user's sound field preference, which is transferred from the terminal side. To describe the sound field preference, the user can use perceptual acoustic parameters as well as the URI address of a room impulse response signal. In addition, a novel fast convolution method is presented for implementing a sound field effect engine that convolves the audio with a room impulse response signal, and it is verified through experiments to be applicable to real-time applications. To verify the benefit of the proposed method, we performed two subjective listening tests, on sound field discrimination ability and on preference for sound-field-processed sounds. The results showed that the proposed sound field preference is applicable to the general public.
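The abstract above does not say how its fast convolution method works. For orientation only, the sketch below shows the textbook FFT-based approach to convolving a dry signal with a room impulse response, and checks it against direct convolution. It is not the authors' method, and the signal values and function names are hypothetical.

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


def ifft(spectrum):
    """Inverse FFT via the conjugate trick."""
    n = len(spectrum)
    conj = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]


def fft_convolve(signal, rir):
    """Linear convolution by zero-padding, multiplying spectra, and inverting."""
    out_len = len(signal) + len(rir) - 1
    n = 1
    while n < out_len:
        n *= 2
    a = fft([complex(v) for v in signal] + [0j] * (n - len(signal)))
    b = fft([complex(v) for v in rir] + [0j] * (n - len(rir)))
    y = ifft([p * q for p, q in zip(a, b)])
    return [v.real for v in y[:out_len]]


def direct_convolve(signal, rir):
    """Reference O(N*M) convolution used to check the FFT version."""
    out = [0.0] * (len(signal) + len(rir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out


# Hypothetical dry signal and a toy room impulse response (direct sound plus two echoes).
dry = [1.0, 0.5, -0.25, 0.0, 0.75]
rir = [1.0, 0.0, 0.6, 0.3]
wet = fft_convolve(dry, rir)
print([round(v, 4) for v in wet])
```

The FFT route costs O(N log N) instead of O(N·M), which matters because real room impulse responses can be tens of thousands of samples long. Real-time engines usually also partition the impulse response into blocks to bound latency.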

Perceptual cues for /o/ and /u/ in Seoul Korean (서울말 /ㅗ/와 /ㅜ/의 지각특성)

  • Byun, Hi-Gyung
    • Phonetics and Speech Sciences
    • /
    • v.12 no.3
    • /
    • pp.1-14
    • /
    • 2020
  • Previous studies have confirmed that /o/ and /u/ in Seoul Korean are undergoing a merger in the F1/F2 space, especially for female speakers. It has been reported that female speakers use phonation (H1-H2) differences, as a substitute for formants, to distinguish /o/ from /u/. This study aimed to explore whether H1-H2 values are being used as perceptual cues for /o/-/u/. A perception test was conducted with 35 college students using /o/ and /u/ spoken by 41 females, which overlap considerably in the vowel space. An acoustic analysis of 182 stimuli was also conducted to see if there is any correspondence between production and perception. The identification rate was 89% on average, 86% for /o/, and 91% for /u/. The results confirmed that, in production, when /o/ and /u/ cannot be distinguished in the F1/F2 space because they are too close, H1-H2 differences contribute significantly to the separation of the two vowels. In perception, however, this was not the case: H1-H2 values were not significantly involved in the identification process, and the formants (especially F2) were still the dominant cues. The study also showed that even though H1-H2 differences are apparent in females' production, males do not use H1-H2 in their production, and neither females nor males use H1-H2 in their perception. It is presumed that H1-H2 has not yet developed into a perceptual cue for /o/ and /u/.
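H1-H2, the phonation measure discussed in the abstract above, is the level difference in dB between the first and second harmonics of the voice spectrum. The sketch below computes an uncorrected H1-H2 on a synthetic signal, assuming the fundamental frequency is already known and the analysis window spans a whole number of periods. The function names and signal parameters are hypothetical. Studies on real speech typically estimate F0 first and often correct the harmonic levels for formant influence, which this sketch does not do.

```python
import cmath
import math


def harmonic_amplitude(samples, sample_rate, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate)
              for i, x in enumerate(samples))
    return 2.0 * abs(acc) / n


def h1_minus_h2(samples, sample_rate, f0_hz):
    """Uncorrected H1-H2 in dB: level of harmonic 1 minus level of harmonic 2."""
    h1 = harmonic_amplitude(samples, sample_rate, f0_hz)
    h2 = harmonic_amplitude(samples, sample_rate, 2.0 * f0_hz)
    return 20.0 * math.log10(h1 / h2)


# Synthetic "vowel": a 200 Hz fundamental plus a second harmonic at half the
# amplitude, sampled at 8 kHz for 400 samples (exactly 10 periods).
SAMPLE_RATE = 8000
F0 = 200.0
signal = [math.sin(2 * math.pi * F0 * i / SAMPLE_RATE)
          + 0.5 * math.sin(2 * math.pi * 2 * F0 * i / SAMPLE_RATE)
          for i in range(400)]

print(f"H1-H2 = {h1_minus_h2(signal, SAMPLE_RATE, F0):.2f} dB")
```

A larger H1-H2 is generally associated with breathier phonation and a smaller one with more pressed phonation, which is why it can in principle separate vowels that overlap in the F1/F2 space.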