• Title/Summary/Keyword: vocal tract

Search results: 172

Phonetic meaning of clarity and turbidity (청탁의 음성학적 의미)

  • Park, Hansang
    • Phonetics and Speech Sciences / v.9 no.4 / pp.77-89 / 2017
  • This study investigates the phonetic meaning of clarity and turbidity (淸濁), a distinction that has been used in psychoacoustics, musicology, and linguistics in both the East and the West. To clarify this meaning, the study conducts three perception tests. First, 34 subjects made a forced choice between Clear and Turbid for 5 pure tones and 5 complex tones ranging from A2 to A6 in octave steps. Second, they made the same choice for 25 pure tones and 25 complex tones ranging from A2 to A4 in semitone steps. Third, they made the same choice for 8 vowels differing in formant and fundamental frequencies. The results showed that a certain range of tones is perceived as clear, that the clarity level increases with fundamental frequency, and that pure tones have a higher clarity level than complex tones at the same fundamental frequency. The results also showed that vocal tract resonance enhances the clarity level on the whole, and that lower vowels have a higher clarity level than higher ones. The study is significant in demonstrating that the clarity level is proportional to fundamental frequency and first formant frequency, all else being equal.
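
The two tone sets described in the abstract above can be enumerated with a short sketch. It assumes equal temperament and a reference tuning of A4 = 440 Hz; the abstract states neither, so both are assumptions, and the function name is illustrative only.

```python
# Assumed reference tuning; the abstract does not state one.
A4_HZ = 440.0


def equal_tempered(f_start, n_steps, steps_per_octave=12):
    """Return n_steps + 1 frequencies rising from f_start in equal-tempered steps."""
    return [f_start * 2 ** (k / steps_per_octave) for k in range(n_steps + 1)]


a2_hz = A4_HZ / 4  # A2 lies two octaves below A4

# Test 1: A2 to A6 in octave steps (5 tones).
octave_tones = [a2_hz * 2 ** k for k in range(5)]

# Test 2: A2 to A4 in semitone steps (24 semitones, hence 25 tones).
semitone_tones = equal_tempered(a2_hz, 24)

print(len(octave_tones), len(semitone_tones))
```

The tone counts (5 and 25) match the numbers of stimuli reported for the first two tests.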

A comparison of acoustic & electroglottographic measures according to voiced lip trill methods (입술 트릴의 방법에 따른 음향학적 및 전기성문파형검사 측정치 비교)

  • Lee, Seung Jin;Lee, Kwang Yong;Lim, Jae-Yol;Choi, Hong-Shik
    • Phonetics and Speech Sciences / v.9 no.4 / pp.107-114 / 2017
  • The purpose of the current study was to compare selected acoustic and electroglottographic measures (closed quotient, pitch, and loudness) among vowel phonation, the traditional voiced lip trill ($VLT_T$), and the modified voiced lip trill ($VLT_M$). A total of 21 participants without voice complaints produced 4-second samples using each phonation method. Results indicated that the mean closed quotient of $VLT_M$ was higher than that of vowel phonation and $VLT_T$, while its range and standard deviation were higher than those of vowel phonation. The mean, range, standard deviation, and maximum of the pitch measures of $VLT_M$ were higher than those of vowel phonation. Lastly, the mean and maximum loudness of $VLT_M$ were higher than those of $VLT_T$. In conclusion, the current data suggest that $VLT_M$ could be used as a training method for singing or as a strategy to facilitate the generalization effect of voice therapy. The results also point to the need for further study of the long-term effect of the $VLT_M$ training method. Clinical implications are discussed.

On a Pitch Change of the Waveform Coding by the Cepstrum Analysis of Speech Waveforms (켑스트럼 분석에 의한 파형부호화의 피치변경에 관한 연구)

  • Bae, Myung-Jin;Lee, Mi-Suk
    • The Journal of the Acoustical Society of Korea / v.11 no.4 / pp.14-21 / 1992
  • Waveform coding is concerned with simply preserving the wave shape of the speech signal through a redundancy reduction process. In speech synthesis, high-quality waveform coding is mainly used for synthesis by analysis. However, because the parameters of this coding are not separated into excitation parameters and vocal tract parameters, it is difficult to apply waveform coding to synthesis by rule. In this paper, we propose a new pitch alteration method that can change the pitch periods in waveform coding by using cepstrum analysis. Thus, waveform coding can be applied to synthesis by rule in speech processing.
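
As background to the abstract above, the following is a minimal sketch of how a pitch period can be read off the real cepstrum of a frame. It illustrates only the generic estimation step, not the authors' pitch alteration method. The sampling rate, search range, spectral floor, and the synthetic test frame are all assumptions made for illustration.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform; adequate for short frames."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(frame, floor=1e-6):
    """Inverse DFT of the log magnitude spectrum (floor avoids log(0))."""
    log_mag = [math.log(abs(c) + floor) for c in dft(frame)]
    return [c.real for c in dft(log_mag, inverse=True)]


def pitch_period(frame, fs, f_min=70.0, f_max=400.0):
    """Quefrency (in samples) of the largest cepstral peak in the pitch range."""
    cep = real_cepstrum(frame)
    q_min, q_max = int(fs / f_max), int(fs / f_min)
    return max(range(q_min, q_max + 1), key=lambda q: cep[q])


# Synthetic voiced-like frame (assumed): ten harmonics of 100 Hz at 8 kHz.
FS = 8000
frame = [sum(math.cos(2 * math.pi * 100 * h * t / FS) / h for h in range(1, 11))
         for t in range(320)]

period = pitch_period(frame, FS)
f0 = FS / period
print(period, f0)
```

The harmonic spacing of the excitation shows up as a peak at the pitch period in the quefrency domain, while the slowly varying vocal tract envelope stays at low quefrencies. That separation is what makes the cepstrum a natural tool for manipulating pitch independently of the envelope.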

Analysis of the Relationship Between Sasang Constitutional Groups and Speech Features Based on a Listening Evaluation of Voice Characteristics (목소리 특성의 청취 평가에 기초한 사상체질과 음성 특징의 상관관계 분석)

  • Kwon, Chulhong;Kim, Jongyeol;Kim, Keunho;Jang, Junsu
    • Phonetics and Speech Sciences / v.4 no.4 / pp.71-77 / 2012
  • Sasang constitution experts use voice characteristics as an auxiliary measure for deciding a person's constitutional group. This study aims to establish a relationship between speech features and the constitutional groups through subjective listening evaluation of voice characteristics. A speech database was constructed from 841 speakers whose constitutional groups had already been diagnosed by Sasang constitution experts. Speech features related to the speech source and the vocal tract filter were extracted from five vowels and one sentence. Statistically significant speech features for classifying the groups were analyzed using SPSS. The features that contributed to constitution classification were speaking rate, Energy, A1, A2, A3, H1, H2, H4, and CPP for males in their 20s; F0_mean, CPP, SPI, HNR, Shimmer, Energy, A1, A2, A3, H1, H2, and H4 for females in their 20s; Energy, A1, A2, A3, H1, H2, H4, and CPP for males in their 60s; and Jitter, HNR, CPP, and SPI for females in their 60s. The experimental results show that speech technology is useful in classifying constitutional groups.

Voice Rehabilitation Other than Tracheo - Esophageal Shunt Method - (후두적출자의 음성재활 - 기관식도천자법 이외의 방법 -)

  • Kim, Young-Ho
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.19 no.1 / pp.28-30 / 2008
  • The problem of voice restoration after total laryngectomy has existed ever since Billroth's first total laryngectomy in 1873. Since then, efforts to restore the voice have tried to divert tracheal air to the pharynx to produce voice, which led to the tracheo-esophageal shunt voice currently used. With an intact pharyngoesophagus, however, there are two basic options for speech rehabilitation: the artificial larynx and esophageal voice. The artificial larynx is an electrically driven buzzer or sound transducer; its most common type is placed against a supple point on the patient's neck and introduces a mechanical sound into the tissues and air spaces of the neck. This sound, emanating from the mouth, is articulated by the intact structures of the remaining vocal tract as understandable speech. Esophageal voice, a commonly recommended method for alaryngeal speech rehabilitation, is produced by regurgitating the air stored in the esophagus. Successful esophageal voice is preferable to the artificial larynx, but most patients adopt only one of these methods according to their needs and how feasible it is for them to learn.

The Study on the Acoustical Characteristics and Speech Intelligibility of Vowels Produced by the Maxillectomized Patients before and after Obturator-Wearing (Palatal Cancer환자의 Obturator 장착전후 모음의 음향학적 특성과 말 명료도에 관한 연구)

  • 최성희;정문규;김호중;표화영;심현섭;최홍식
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.10 no.2 / pp.140-148 / 1999
  • An obturator is a prosthetic rehabilitation approach for restoring the shape and function of the defective maxilla in patients with a palatal defect. The obturator can change the shape of the vocal tract and nasality, but few reports on the effects of this change have been presented. The authors therefore performed an experimental study to compare the sizes of the vowel triangles produced by maxillectomized patients before and after obturator wearing and to consider how much improvement in speech intelligibility can be expected from obturator wearing. Eight patients who had undergone total maxillectomy for palatal cancer participated as subjects. They produced 5 vowels (/a/, /i/, /u/, /e/, /o/) before and after obturator wearing. The formants of the vowels were analyzed using the spectrogram of CSL, and speech intelligibility was judged by 8 normal listeners. The frequencies of the first and second formants showed no significant difference before and after wearing, but the sizes of the vowel triangles, which are related to speech intelligibility, differed significantly: the vowel triangle after wearing was larger than that before wearing. Before wearing, /i/ showed the lowest speech intelligibility score among the vowels. After wearing the obturators, scores increased on the whole, especially for /a/, but the intelligibility of /u/ decreased.
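
The vowel triangle size compared in the abstract above can be computed as the area of the /a/, /i/, /u/ triangle in the F1-F2 plane. The sketch below uses the shoelace formula; the abstract does not say how the authors measured the area, and the formant values are invented purely for illustration and are not data from the study.

```python
def triangle_area(p1, p2, p3):
    """Area of a triangle from three (F1, F2) points, via the shoelace formula."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2


# Hypothetical (F1, F2) values in Hz; NOT measurements from the study.
before = {"a": (700, 1200), "i": (350, 2000), "u": (400, 1000)}
after = {"a": (800, 1250), "i": (300, 2300), "u": (350, 900)}

area_before = triangle_area(before["a"], before["i"], before["u"])
area_after = triangle_area(after["a"], after["i"], after["u"])

print(area_before, area_after)  # areas in Hz^2
```

A larger area means the corner vowels lie farther apart in formant space. This is how the triangle can grow even when no single formant changes significantly on its own.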

Comparison of Adult and Child's Speech Recognition of Korean (한국어에서의 성인과 유아의 음성 인식 비교)

  • Yoo, Jae-Kwon;Lee, Kyoung-Mi
    • The Journal of the Korea Contents Association / v.11 no.5 / pp.138-147 / 2011
  • While most Korean speech databases are developed for adults' speech rather than children's speech, various children's speech databases exist for other languages. Because children's and adults' speech differ widely in acoustic and linguistic characteristics, a children's speech database needs to be developed. In this paper, to find the differences between them in Korean, we built speech recognizers using HMMs and tested them according to gender, age, and the presence of VTLN (Vocal Tract Length Normalization). The results show that the speech recognizer built from children's speech has a much higher recognition rate than that built from adults' speech, and that using VTLN helps to improve the recognition rate in Korean.

A Comparison of Resonance Parameters before and after Pharyngeal Flap Surgery: A Preliminary Report (인두피판술 전.후의 공명파라미터의 비교: 예비연구)

  • Kang, Young-Ae;Kang, Nak-Heon;Lee, Tae-Yong;Seong, Cheol-Jae
    • Phonetics and Speech Sciences / v.1 no.3 / pp.133-144 / 2009
  • Pharyngeal flap surgery changes the space and shape of the oral cavity and vocal tract, and these changes alter resonance. The purpose of this study was to determine the most reliable and valuable parameters for evaluating hypernasality in order to distinguish two patients before and after pharyngeal flap surgery. Each patient was asked to speak the vowels /a/, /i/, /u/, /e/, /o/ clearly for voice recording. Nine parameters were measured for each vowel: formants (F1, F2, F3), bandwidths (BW1, BW2, BW3), LPC energy slope ($\Delta$ |A2-A1/F2-F1|), and band energy (0-500 Hz, 500-1000 Hz). In the discriminant analyses of the acoustic parameters, the vowels /a/ and /e/ were not significant, whereas the vowels /i/, /u/, and /o/ were effective in the separation. Recognition scores of 95%, 100%, and 100% were reached when the vowels /i/, /u/, and /o/ were analyzed, respectively. The results showed that F2, BW3, and LPC slope are more important parameters than the others. Finally, a least-squares fit showed a relation between the perceptual evaluation score and the LPC energy slope.

Classification of Diphthongs using Acoustic Phonetic Parameters (음향음성학 파라메터를 이용한 이중모음의 분류)

  • Lee, Suk-Myung;Choi, Jeung-Yoon
    • The Journal of the Acoustical Society of Korea / v.32 no.2 / pp.167-173 / 2013
  • This work examines classification of diphthongs, as part of a distinctive feature-based speech recognition system. Acoustic measurements related to the vocal tract and the voice source are examined, and analysis of variance (ANOVA) results show that vowel duration, energy trajectory, and formant variation are significant. A balanced error rate of 17.8% is obtained for 2-way diphthong classification on the TIMIT database, and error rates of 32.9%, 29.9%, and 20.2% are obtained for /aw/, /ay/, and /oy/, for 4-way classification, respectively. Adding the acoustic features to widely used Mel-frequency cepstral coefficients also improves classification.
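
The abstract above reports a balanced error rate without defining it. A common definition, assumed here, is the unweighted mean of the per-class error rates, which keeps a large class from dominating the score. The labels below are invented for illustration and are not TIMIT results.

```python
from collections import Counter


def balanced_error_rate(y_true, y_pred):
    """Unweighted mean of per-class error rates (assumed definition)."""
    totals = Counter(y_true)
    errors = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    return sum(errors[c] / totals[c] for c in totals) / len(totals)


# Hypothetical 2-way labels: four diphthong tokens and two other tokens.
y_true = ["diphthong"] * 4 + ["other"] * 2
y_pred = ["diphthong", "diphthong", "diphthong", "other", "other", "diphthong"]

ber = balanced_error_rate(y_true, y_pred)
plain_error = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

print(ber, plain_error)
```

In this toy case the per-class error rates are 1/4 and 1/2, so the balanced error rate is 0.375, while the plain error rate is 2/6 because it weights the larger class more heavily.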

A Proposition of the Fuzzy Correlation Dimension for Speaker Recognition (화자인식을 위한 퍼지상관차원 제안)

  • Yoo, Byong-Wook;Kim, Chang-Seok;Park, Hyun-Sook
    • Journal of the Korean Institute of Telematics and Electronics S / v.36S no.1 / pp.115-122 / 1999
  • In this paper, we confirm that a speech signal is a chaotic signal and analyze its chaos dimension for use as a speaker recognition parameter. To improve speaker identification and pattern recognition, we propose the fuzzy correlation dimension, obtained by constructing a strange attractor that captures an individual's vocal tract characteristics well and applying a fuzzy membership function to the correlation dimension. Because the correlation between the points making up an attractor is estimated within limits set by the space dimension value, the fuzzy correlation dimension absorbs the variation between the reference pattern attractor and the test pattern attractor. We investigate the validity of the fuzzy correlation dimension as a speaker recognition parameter by estimating the distance according to the average discrimination error for each speaker and reference pattern.
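
For context on the abstract above, the sketch below shows the classical (crisp) correlation dimension in the Grassberger-Procaccia style: delay-embed the signal, count point pairs closer than a radius, and take the log-log slope between two radii. The fuzzy membership function proposed by the authors is not reproduced here, and the embedding parameters, radii, and test signal are assumptions made for illustration.

```python
import math
from itertools import combinations


def delay_embed(signal, dim, tau):
    """Reconstruct attractor points (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    n = len(signal) - (dim - 1) * tau
    return [tuple(signal[i + j * tau] for j in range(dim)) for i in range(n)]


def correlation_sum(points, r):
    """Fraction of point pairs whose Euclidean distance is below r."""
    close = total = 0
    for p, q in combinations(points, 2):
        total += 1
        if math.dist(p, q) < r:
            close += 1
    return close / total


def correlation_dimension(points, r_small, r_large):
    """Two-radius estimate of the slope of log C(r) against log r."""
    c_small = correlation_sum(points, r_small)
    c_large = correlation_sum(points, r_large)
    return math.log(c_large / c_small) / math.log(r_large / r_small)


# Sanity check on a trivial "attractor": points on a straight line,
# for which the estimate should come out close to 1.
line = delay_embed(list(range(300)), dim=1, tau=1)
estimate = correlation_dimension(line, 3.5, 35.5)
print(estimate)
```

A hard distance threshold like the one in `correlation_sum` is the step that a fuzzy membership function would soften; how the authors define that membership is not stated in the abstract.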
