• Title/Summary/Keyword: Formant frequency

Search Result 183, Processing Time 0.024 seconds

Relationship between Formants and Constriction Areas of Vocal Tract in 9 Korean Standard Vowels (우리말 모음의 발음시 음형대와 조음위치의 관계에 대한 연구)

  • 서경식;김재영;김영기
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
    • /
    • v.5 no.1
    • /
    • pp.44-58
    • /
    • 1994
  • The formants of the 9 Korean standard vowels(which used by the average people of Seoul, central-area of the Korean peninsula) were measured by analysis with the linear predictive coding(LPC) and fast Fourier transform(FFT). The author already had reported the constriction area for the Korean standard vowels, and with the existing data, the distance from glottis to the constriction area in the vocal tract of each vowel was newly measured with videovelopharyngograms and lateral Rontgenograms of the vocal tract. We correlated the formant frequencies with the distance from glottis to the constriction area of the vocal tract. Also we tried to correlate the formant frequencies with the position of tongue in the vocal tract which is divided into 2 categories : The position of tongue in oral cavity by the distance from imaginary palatal line to the highest point of tongue and the position in pharyngeal cavity by the distance from back of tongue to posterior pharyngeal wall. This study was performed with 10 adults(male : 5, female : 5) who spoke primary 9 Korean standard vowels. We had already reported that the Korean vowel [i], [e], $[{\varepsilon}]$ were articulated at hard palate level, [$\dot{+}$], [u] were at soft palate level, [$\wedge$] was at upper pharynx level and the [$\wedge$], [$\partial$], [a] in a previous article. Also we had noted that the significance of pharyngeal cavity in vowel articulation. From this study we have concluded that ; 1) The F$_1$ is related with the oral cavity articulated vowel [i, e, $\varepsilon$, $\dot{+}$, u]. 2) Within the oral cavity articulated vowel [i, e, $\varepsilon$, $\dot{+}$, u] and the upper pharynx articulated vowel [o], the F$_2$ is elevated when the diatance from glottis to the constriction area is longer. But within the lower pharynx articulated vowel [$\partial$, $\wedge$, a], the F$_2$ is elevated when the distance from glottis to the constriction area is shorter. 3) With the stronger tendency of back-vowel, the higher the elevation of the F$_1$ and F$_2$ frequencies. 4) The F$_3$ and F$_4$ showed no correaltion with the constriction area nor the position of tongue in the vocal tract 5) The parameter F$_2$- F$_1$, which is the difference between F$_2$ frequency and F$_1$ frequency showed an excellent indicator of differenciating the oral cavity articulated vowels from pharyngeal cavity articulated vowels. If the F$_2$-F$_1$ is less than about 600Hz which indicates the vowel is articulated in the pharyngeal cavity, and more than about 600Hz, which indicates that the vowel is articulated in the oral cavity.

  • PDF

Influence of Sexual Desire Caused by Watching Phonography on Human Body (음란물 시청으로 야기된 성욕이 인체에 미치는 영향)

  • Kim, Bong Hyun;Cho, Dong Uk;Kim, Hee Dae;Lee, Bum Joo;Park, Young;Jeong, Yeon Man
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.42 no.4
    • /
    • pp.831-837
    • /
    • 2017
  • The development of various electronic media such as the Internet and smart phones, each kinds of media informations has been accompanied by the fact that various types of media information are provided from one media, and on the other hand, various dysfunctions including smart phone addiction are also caused by a very large social problem. Especially, one of the biggest dysfunctions is the social crime problem such as sex crime caused by increased sexual desire according to watch the phonography, and even if it is not a social crime, watching the phonography has influenced bad mental and physical on human body. In this paper, we try to analyze what kind of change occurs in the voice in order to investigate what kind of bad influence it has on the human body after watching the phonography. In other words, the voice in the human body is the place where the human body signal is most expressed with the face. Therefore, the purpose of this study is to investigate the effects on the organs of the human body by comparing the change of voice before and after watching phonography. Experimental results showed that the stress hormone was increased by the inability to resolve sexual desire after watching the phonography, which resulted in an increase in the bandwidth of the 3rd formant frequency.

A Comparative Study of the Speech Signal Parameters for the Consonants of Pyongyang and Seoul Dialects - Focused on "ㅅ/ㅆ" (평양 지역어와 서울 지역어의 자음에 대한 음성신호 파라미터들의 비교 연구 - "ㅅ/ ㅆ"을 중심으로)

  • So, Shin-Ae;Lee, Kang-Hee;You, Kwang-Bock;Lim, Ha-Young
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.8 no.6
    • /
    • pp.927-937
    • /
    • 2018
  • In this paper the comparative study of the consonants of Pyongyang and Seoul dialects of Korean is performed from the perspective of the signal processing which can be regarded as the basis of engineering applications. Until today, the most of speech signal studies were primarily focused on the vowels which are playing important role in the language evolution. In any language, however, the number of consonants is greater than the number of vowels. Therefore, the research of consonants is also important. In this paper, with the vowel study of the Pyongyang dialect, which was conducted by phonological research and experimental phonetic methods, the consonant studies are processed based on an engineering operation. The alveolar consonant, which has demonstrated many differences in the phonetic value between Pyongyang and Seoul dialects, was used as the experimental data. The major parameters of the speech signal analysis - formant frequency, pitch, spectrogram - are measured. The phonetic values between the two dialects were compared with respect to /시/ and /씨/ of Korean language. This study can be used as the basis for the voice recognition and the voice synthesis in the future.

The comparative Study of the Acoustic Representation between Pansori singer's and Spasmodic dysphonia patient's Voice (병적인 소리 떨림증과 소리꾼 떨림증의 음향학적인 비교연구)

  • Hong, K.H.;Kim, H.G.;Lee, J.K.;Choi, J.S.
    • Proceedings of the KSPS conference
    • /
    • 2007.05a
    • /
    • pp.143-145
    • /
    • 2007
  • Muscle groups that are located in and around the vocal tract can produce audible changes in frequency and/or intensity of the voice. Vocal vibrato is a characteristic feature in the singing of performers trained in the western classical tradition and vibrato is generally considered to result from modulation in frequency amplitude and timbre. Vocal tremor is also characterized by periodic fluctuations in the voice frequency or intensity and vocal tremor is symptom of a neurological disease as Spasmodic dysphonia , Parkinson's disease. Vocal vibrato and Vocal tremor may have many of the same origins and mechanisms in the voice production systems. The purpose of this study is to find acostic character of Korean traditional song Pansori singer's vibrato and Spasmodic dysphonia patient's vocal tremor. twelve Pansori singers and seven Spasmodic dysponia patients participated to this study. Power spectrum and Real time Spectrogram are used to analyze the acoustic characteristics of Pansori singing and Spasmodic dysphonia patient's voice The results are as follows; First, vowel formant differences between Pansori singing and Spasmodic dysphonia patient's voice are higher F1, F3. Second, The vibrato rate show differences between Pansori singing and Spasmodic dysphonia patients;$4^{\sim}6/sec$ and $5{\sim}6/sec$ Vibrato rate of pitch is 5.7 Hz ${\sim}$ 42.4 Hz for Pansori singing , 3.8 Hz ${\sim}$ 27.9 Hz for Spasmodic dysphonia patients ;Vibrato rate of intensity range is 0.07 dB ${\sim}$ 8.26 dB for Pansori singing and 0.07 dB ${\sim}$ 4.81 dB for Spasmodic dysphonia patients

  • PDF

Phoneme Segmentation in Consideration of Speech feature in Korean Speech Recognition (한국어 음성인식에서 음성의 특성을 고려한 음소 경계 검출)

  • 서영완;송점동;이정현
    • Journal of Internet Computing and Services
    • /
    • v.2 no.1
    • /
    • pp.31-38
    • /
    • 2001
  • Speech database built of phonemes is significant in the studies of speech recognition, speech synthesis and analysis, Phoneme, consist of voiced sounds and unvoiced ones, Though there are many feature differences in voiced and unvoiced sounds, the traditional algorithms for detecting the boundary between phonemes do not reflect on them and determine the boundary between phonemes by comparing parameters of current frame with those of previous frame in time domain, In this paper, we propose the assort algorithm, which is based on a block and reflecting upon the feature differences between voiced and unvoiced sounds for phoneme segmentation, The assort algorithm uses the distance measure based upon MFCC(Mel-Frequency Cepstrum Coefficient) as a comparing spectrum measure, and uses the energy, zero crossing rate, spectral energy ratio, the formant frequency to separate voiced sounds from unvoiced sounds, N, the result of out experiment, the proposed system showed about 79 percents precision subject to the 3 or 4 syllables isolated words, and improved about 8 percents in the precision over the existing phonemes segmentation system.

  • PDF

Automatic Vowel Onset Point Detection Based on Auditory Frequency Response (청각 주파수 응답에 기반한 자동 모음 개시 지점 탐지)

  • Zang, Xian;Kim, Hag-Tae;Chong, Kil-To
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.13 no.1
    • /
    • pp.333-342
    • /
    • 2012
  • This paper presents a vowel onset point (VOP) detection method based on the human auditory system. This method maps the "perceptual" frequency scale, i.e. Mel scale onto a linear acoustic frequency, and then establishes a series of Triangular Mel-weighted Filter Bank simulate the function of band pass filtering in human ear. This nonlinear critical-band filter bank helps greatly reduce the data dimensionality, and eliminate the effect of harmonic waves to make the formants more prominent in the nonlinear spaced Mel spectrum. The sum of mel spectrum peaks energy is extracted as feature for each frame, and the instinct at which the energy amplitude starts rising sharply is detected as VOP, by convolving with Gabor window. For the single-word database which contains 12 vowels articulated with different kinds of consonants, the experimental results showed a good average detection rate of 72.73%, higher than other vowel detection methods based on short-time energy and zero-crossing rate.

Relationship between roar sound characteristics and body size of Steller sea lion

  • Park, Tae-Geon;Iida, Kohji;Mukai, Tohru
    • Journal of the Korean Society of Fisheries and Ocean Technology
    • /
    • v.46 no.4
    • /
    • pp.458-465
    • /
    • 2010
  • Hundreds of Steller sea lions, Eumetopias jubatus, migrate from Sakhalin and the northern Kuril Islands to Hokkaido every winter. During this migration, they may use their roaring sounds to navigate and to maintain their groups. We recorded the roars of wild Steller sea lions that had landed on reefs on the west coast of Hokkaido, and those of captive sea lions, while making video recordings. A total of 300 roars of wild sea lions and 870 roars of captive sea lions were sampled. The fundamental frequency ($F_0$), formant frequency ($F_1$), pulse repetition rate (PRR), and duration of syllables (T) were analyzed using a sonagraph. $F_0$, $F_1$, and PRR of the roars emitted by captive sea lions increased in the order male, female, and juvenile. By contrast, the $F_1$ of wild males was lower than that of females, while the $F_0$ and PRR of wild males and females did not differ statistically. Moreover, the $F_0$ and $F_1$ frequencies for captive sea lions were higher than those of wild sea lions, while PRR in captive sea lions was lower than in wild sea lions. Since there was a linear relationship between body length and the $F_0$ and $F_1$ frequencies in captive sea lions, the body length distribution of wild sea lions could be estimated from the $F_0$ and $F_1$ frequency distribution using a regression equation. These results roughly agree with the body length distribution derived from photographic geometry. As the volume of the oral cavity and the length of the vocal cords are generally proportional to body length, sampled roars can provide useful information about a population, such as the body length distribution and sex ratio.

Spectral Modeling of Haegeum Using Cepstral Analysis (캡스트럼 분석을 이용한 해금의 스펙트럼 모델링)

  • Hong, Yeon-Woo;Kang, Myeong-Su;Cho, Sang-Jin;Kim, Jong-Myon;Lee, Jung-Chul;Chong, Ui-Pil
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.4
    • /
    • pp.243-250
    • /
    • 2010
  • This paper proposes a spectral modeling of Korean traditional instrument, Haegeum, using cepstral analysis to naturally describe Haegeum sounds varying with time. To get a precise result of cepstral analysis, we set the frame size to 3 periods of input signal and more cepstral coefficients are used to extract formants. The performance is enhanced by flexibly controlling the cutoff frequency of bandpass filter depending on the resonances in the synthesis process of sinusoidal components and the deleting peaks remained in the residual signal. To detect the change of pitch, we divide the input frames into silence, attack, and sustain region and determine which region the current frame is involved in. Then, the proposed method readjusts the frame size according to the fundamental frequency in the case of the current frame is in attack region and corrects the extraction errors of the fundamental frequency for the frames in sustain region. With these processes, the synthesized sounds are much more similar to the originals. The evaluation result through the listening test by a Haegeum player says that the synthesized sounds are almost similar to originals (96~100 % similar to the original sounds).

A Study On Male-To-Female Voice Conversion (남녀 음성 변환 기술연구)

  • Choi Jung-Kyu;Kim Jae-Min;Han Min-Su
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • spring
    • /
    • pp.115-118
    • /
    • 2000
  • Voice conversion technology is essential for TTS systems because the construction of speech database takes much effort. In this paper. male-to-female voice conversion technology in Korean LPC TTS system has been studied. In general. the parameters for voice color conversion are categorized into acoustic and prosodic parameters. This paper adopts LSF(Line Spectral Frequency) for acoustic parameter, pitch period and duration for prosodic parameters. In this paper. Pitch period is shortened by the half, duration is shortened by $25\%, and LSFs are shifted linearly for the voice conversion. And the synthesized speech is post-filtered by a bandpass filter. The proposed algorithm is simpler than other algorithms. for example, VQ and Neural Net based methods. And we don't even need to estimate formant information. The MOS(Mean Opinion Socre) test for naturalness shows 2.25 and for female closeness, 3.2. In conclusion, by using the proposed algorithm. male-to-female voice conversion system can be simply implemented with relatively successful results.

  • PDF

A SPEECH-PHONETIC STUDY ON THE PRONUNCIATION OF THE OPENBITE PATIENTS (개교환자의 발성에 관한 언어 음성학적 연구)

  • Kim, Ki-Dal;Yang, Won Sik
    • The korean journal of orthodontics
    • /
    • v.21 no.2 s.34
    • /
    • pp.287-307
    • /
    • 1991
  • This study aimed at examining speech defects of openbite patients, which were analized in terms of formant frequency for vowels and word pronunciation length for consonants. In addition, the upper and lower lip (perioral m.) activity was tested by the EMG. The tongue force was measured by the strain gauge, and the speech discrimination test was carried out. One experimental group and one control group were used for this study and they were respectively composed of six female openbite patients and six normal-occlusion females. Eight monophthongs, two fricatives and two affricatives were chosen for speech analysis. Speeches of the above-mentioned groups were recorded and then analized by the ILS/PC-1 software. Four hundred most frequently used monosyllables were also chosen for discrimination score. Openbite patients showed the following characteristics: 1. Abnormality in case of /a/, $/\varepsilon/$, /e/, /i/ $F_2$ and /e/, /a/ $F_1$. 2. Significantly elongated length in their pronunciation of /h/ and $/C^h/$ and somewhat elongated length also in their pronunciation of /s/ and /c/. 3. Significant upper lip activity according to the EMG test during pronunciation of the bilabial consonants. 4. Relatively weak tongue force according to the strain gauge measurement. 5. According to the speech discrimination test, high rate of misarticulation in case of (a) initial /p/ /s'/ and /ts'/, (b) /a/,$/\varepsilon/$,/e/,/je/,/o/, $/\phi/$,/jo/,/u/,/we/, and /i/ (c) final (equation omitted).

  • PDF