Search | Korea Science

Statistical Analysis of Korean Phonological Rules Using a Automatic Phonetic Transcription (발음열 자동 변환을 이용한 한국어 음운 변화 규칙의 통계적 분석)

Lee Kyong-Nim;Chung Minhwa
- Proceedings of the KSPS conference / 2002.11a / pp.81-85 / 2002
We present a statistical analysis of Korean phonological variations based on automatically generated phonetic transcriptions. We constructed a system that generates Korean pronunciation variants by applying rules that model obligatory and optional phonemic changes as well as allophonic changes. The rules are derived from knowledge-based morphophonological analysis and the government's standard pronunciation rules. The system is optimized for continuous speech recognition: it generates phonetic transcriptions for training and builds a pronunciation dictionary for recognition. In this paper, we describe Korean phonological variations by analyzing how often each phonemic change rule applies across the 60,000 sentences of the Samsung PBS (Phonetic Balanced Sentence) Speech DB. Our results show that the most frequent obligatory phonemic variations are, in descending order, liaison, tensification, aspiration, and nasalization of obstruents, and that the most frequent optional phonemic variations are, in descending order, initial-consonant h-deletion, insertion of a final consonant with the same place of articulation as the following consonant, and deletion of a final consonant with the same place of articulation as the following consonant. These statistics can be used to improve the performance of speech recognition systems.
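
The idea of counting rule applications while generating a transcription can be illustrated with a minimal sketch. It is not the authors' system: it assumes input made only of precomposed Hangul syllables, covers just two obligatory rules (liaison of a simple final consonant into an empty onset, and nasalization of an obstruent coda before ㄴ/ㅁ), and every name in it (`transcribe`, `FINAL_TO_INITIAL`, `NASALIZE`) is hypothetical.

```python
from collections import Counter

HANGUL_BASE = 0xAC00  # code point of the first precomposed syllable, 가

# Assumed toy rule tables, indexed by Unicode jamo position.
# Liaison: simple final consonant -> matching initial consonant
# (ㄱ, ㄴ, ㄷ, ㄹ, ㅁ, ㅂ, ㅅ, ㅈ).
FINAL_TO_INITIAL = {1: 0, 4: 2, 7: 3, 8: 5, 16: 6, 17: 7, 19: 9, 22: 12}
# Nasalization: obstruent coda ㄱ/ㄷ/ㅂ -> nasal coda ㅇ/ㄴ/ㅁ.
NASALIZE = {1: 21, 7: 4, 17: 16}
NULL_INITIAL = 11        # ㅇ in onset position (no consonant sound)
NASAL_INITIALS = {2, 6}  # ㄴ, ㅁ


def decompose(ch):
    """Split a precomposed Hangul syllable into [initial, medial, final] indices."""
    code = ord(ch) - HANGUL_BASE
    if not 0 <= code < 11172:
        raise ValueError(f"not a precomposed Hangul syllable: {ch!r}")
    return [code // 588, (code % 588) // 28, code % 28]


def compose(initial, medial, final):
    """Inverse of decompose()."""
    return chr(HANGUL_BASE + (initial * 21 + medial) * 28 + final)


def transcribe(word, stats):
    """Apply the two toy rules left to right and count each application in stats."""
    syllables = [decompose(ch) for ch in word]
    for cur, nxt in zip(syllables, syllables[1:]):
        if cur[2] in FINAL_TO_INITIAL and nxt[0] == NULL_INITIAL:
            nxt[0] = FINAL_TO_INITIAL[cur[2]]  # coda moves into the empty onset
            cur[2] = 0
            stats["liaison"] += 1
        elif cur[2] in NASALIZE and nxt[0] in NASAL_INITIALS:
            cur[2] = NASALIZE[cur[2]]          # obstruent coda becomes a nasal
            stats["nasalization"] += 1
    return "".join(compose(*s) for s in syllables)


stats = Counter()
for word in ["음악", "국물"]:
    print(word, "->", transcribe(word, stats))
print(stats.most_common())
```

Running the same counter over a full corpus would give a frequency ranking of rules in the spirit of the one reported above, although a realistic rule set also needs morphological context, which this sketch ignores.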

Shapes of Vowel F0 Contours Influenced by Preceding Obstruents of Different Types - Automatic Analyses Using Tilt Parameters-

Jang, Tae-Yeoub
- Speech Sciences / v.11 no.1 / pp.105-116 / 2004
The fundamental frequency of a vowel is known to be affected by the identity of the preceding consonant. The general agreement is that strong consonants trigger higher F0 than weak consonants. However, there has been disagreement about the shape of these segmentally affected F0 contours. Some studies report that contour shapes differ by consonant type, but others regard this observation as misleading. This research attempts to resolve the controversy by investigating the shapes and slopes of F0 contours in Korean word-level speech data produced by four male speakers. Instead of relying entirely on traditional human intuition and judgment, I employed an automatic F0 contour analysis technique known as tilt parameterisation (Taylor 2000). After the necessary manipulation of the F0 contour of each data token, various parameters are collapsed into a single tilt value which directly indicates the shape of the contour. The result, in terms of statistical inference, shows that it is not viable to conclude that the type of consonant is significantly related to the shape of the F0 contour. A supplementary measurement was also made to see whether the slope of each contour bears meaningful information. Unlike the shapes themselves, slopes appear to be more useful for consonantal differentiation, although confirmation is required through further refined experiments.
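
As a rough illustration of how several contour measurements collapse into one tilt value, the sketch below follows the commonly cited definition from Taylor (2000): the mean of an amplitude tilt and a duration tilt computed from the rise and fall parts of an F0 event. The function name and the example numbers are assumptions, and the abstract does not say how the rise and fall parts were measured in this study.

```python
def tilt(a_rise, a_fall, d_rise, d_fall):
    """Tilt value in [-1, 1]: +1 is a pure rise, -1 a pure fall, 0 symmetric.

    a_rise, a_fall: F0 excursion of the rise and fall parts (e.g. in Hz).
    d_rise, d_fall: non-negative durations of the rise and fall parts
    (e.g. in seconds).
    """
    amp_sum = abs(a_rise) + abs(a_fall)
    dur_sum = d_rise + d_fall
    if amp_sum == 0 or dur_sum == 0:
        raise ValueError("contour has no rise or fall to parameterise")
    tilt_amp = (abs(a_rise) - abs(a_fall)) / amp_sum
    tilt_dur = (d_rise - d_fall) / dur_sum
    return (tilt_amp + tilt_dur) / 2


# Hypothetical tokens: a mostly falling contour and a mostly rising one.
print(tilt(a_rise=5.0, a_fall=25.0, d_rise=0.02, d_fall=0.10))
print(tilt(a_rise=25.0, a_fall=5.0, d_rise=0.10, d_fall=0.02))
```

Because the value is bounded and signed, tokens from different consonant types can be compared on a single shape scale with ordinary statistical tests.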

Language Specific Variations of Domain-initial Strengthening and its Implications on the Phonology-Phonetics Interface: with Particular Reference to English and Hamkyeong Korean

Kim, Sung-A
- Speech Sciences / v.11 no.3 / pp.7-21 / 2004
The present study aims to investigate the domain-initial strengthening phenomenon, which refers to the strengthening of articulatory gestures at the initial positions of prosodic domains. More specifically, this paper presents the results of an experimental study of initial syllables with onset consonants (initial-syllable vowels henceforth) of various prosodic domains in English and Hamkyeong Korean, a pitch accent dialect spoken in the northern part of North Korea. The durations of initial-syllable vowels are compared to those of second vowels in real-word tokens for both languages, controlling for both stress and segmental environment. Hamkyeong Korean, like English, turned out to strengthen domain-initial consonants. With regard to vowel durations, no significant prosodic effect was found in English. Hamkyeong Korean, on the other hand, showed significant differences between the durations of initial and non-initial vowels in the higher prosodic domains. The theoretical implications of the findings are as follows. The potentially universal phenomenon of initial strengthening is shown to be subject to language-specific variation in its implementation. More importantly, the distinct phonetics-phonology model (Pierrehumbert & Beckman, 1998; Keating, 1990; Cohn, 1993) is better equipped to account for the facts in the present study.

Compensation in VC and Word

Yun, Il-Sung
- Phonetics and Speech Sciences / v.2 no.3 / pp.81-89 / 2010
Korean and three other languages (English, Arabic, and Japanese) were compared with regard to compensatory movements in a VC (Vowel and Consonant) sequence and in the word. Korean data were collected from an experiment, and the other languages' data were taken from the literature. All the test words had the same syllabic structure, i.e., /CVCV(r)/, where C was an oral stop and the intervocalic consonants were either bilabial or alveolar stops. The present study found that (1) Korean is the most striking in the durational variations of segments (the vowel and the following hetero-syllabic consonant); (2) unlike the three languages that show a constant sum of VC, Korean yields a three-way distinction in the length of VC according to the type (lax unaspirated vs. tense unaspirated vs. tense aspirated) of the following stop consonant; (3) durational constancy is maintained up to the word level in the three languages, but Korean word duration varies as a function of the tenseness feature of the intervocalic consonants; (4) consonant duration proves to be what differentiates Korean most from the other languages. It is suggested that the durational difference between a lax consonant and its tense cognate(s), and the degree of compensation between V and C, are determined by the phonology of each language.

A Study On the Proportional Difference of Segments in Imitating Voice (모방발화에 나타나는 분절음의 비율연구)

Park, Ji-Hye;Shin, Ji-Young;Kang, Sun-Mee
- Proceedings of the KSPS conference / 2004.05a / pp.205-208 / 2004
The aim of this study is to analyse how the proportion of segment duration is adjusted in voice imitation. When a speaker imitates another person's voice, how far is his or her original proportion of segment duration adjusted, and how does this adjustment differ across segment types? In this study, I classified segments into consonants and vowels, and consonants into obstruents and sonorants. The results of the analysis are as follows: (1) Individual variation in the proportion of obstruents is not significant, and under imitation its distribution shows no typical pattern. (2) Vowels show individual variation in the proportion of segment duration even under imitation. (3) Nasals show the most distinct individual variation even under imitation, compared with vowels and obstruents. In further study, I should examine the quantitative and qualitative changes in liquids (among the sonorants) to find out which segment best captures a speaker's personal characteristics in the proportion of segment duration during voice imitation.

Acoustic Characteristics and Auditory Cues for Korean Lax vs. Tense Fricative Distinction (한국어 평마찰음과 경마찰음의 음향적 특성과 지각 단서 - 길이를 중심으로 -)

이경희;이봉원
- The Journal of the Acoustical Society of Korea / v.19 no.2 / pp.95-100 / 2000
The purpose of this paper is to identify the distinctive auditory cues of Korean lax and tense fricatives. Until now, research on their acoustic characteristics has been confined to simple experiments under restricted conditions. This paper therefore examines all of the acoustic characteristics of the lax and tense fricative consonants and shows how these characteristics can be used to differentiate them. The results are that (a) auditory cues are especially important when there is a large difference between acoustic characteristics, (b) the distinctive auditory cues of the lax and tense fricative consonants form a hierarchy, and (c) the hierarchy differs between CV and VCV.

A Study on Approximation-Synthesis of Transition Segment in Speech Signal (음성신호에서 천이구간의 근사합성에 관한 연구)

Lee See-Woo
- The Journal of the Korea Contents Association / v.5 no.3 / pp.167-173 / 2005
In a speech coding system that uses voiced and unvoiced excitation sources, speech quality is distorted when voiced and unvoiced consonants coexist in a frame. I therefore propose a TSIUVC (Transition Segment Including Unvoiced Consonant) extraction method that uses pitch pulses and the Zero Crossing Rate so that voiced and unvoiced consonants do not coexist in a frame. This paper also presents a TSIUVC approximate-synthesis method using frequency band division. As a result, this method obtains a high-quality approximation-synthesis waveform within the TSIUVC by using frequency information below 0.547 kHz and above 2.813 kHz. The TSIUVC extraction rate was 91% for female voices and 96.2% for male voices. This method can be applied to a new Voiced/Silence/TSIUVC speech coding scheme, to speech analysis, and to speech synthesis.
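
The Zero Crossing Rate named above as one of the two extraction cues can be sketched as follows. This shows the generic measure only, not the paper's TSIUVC extraction procedure; the frame length, sampling rate, and test signals are assumptions chosen for illustration.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in the frame whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# Assumed example: 20 ms frames at 8 kHz (160 samples).
sample_rate = 8000
n = 160
# A 100 Hz sinusoid stands in for a voiced, periodic segment.
voiced_like = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(n)]
# A sign-alternating sequence stands in for a noisy, unvoiced segment.
unvoiced_like = [(-1) ** t * 0.1 for t in range(n)]

print(zero_crossing_rate(voiced_like))
print(zero_crossing_rate(unvoiced_like))
```

Unvoiced consonants are noise-like and tend to produce a much higher rate than voiced speech, which is why the measure is a common cue for locating frames where the two kinds of sound meet.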

A Study on TSIUVC Approximate-Synthesis Method using Least Mean Square and Frequency Division (주파수 분할 및 최소 자승법을 이용한 TSIUVC 근사합성법에 관한 연구)

이시우
- Journal of Korea Multimedia Society / v.6 no.3 / pp.462-468 / 2003
In a speech coding system that uses voiced and unvoiced excitation sources, speech quality is distorted when voiced and unvoiced consonants coexist in a frame. I therefore propose a TSIUVC (Transition Segment Including Unvoiced Consonant) searching and extraction method so that voiced and unvoiced consonants do not coexist in a frame. This paper presents a new TSIUVC approximate-synthesis method using Least Mean Square and frequency band division. As a result, this method obtains high-quality approximation-synthesis waveforms within the TSIUVC by using frequency information below 0.547 kHz and above 2.813 kHz. Importantly, a low-distortion approximation-synthesis waveform can be produced within the TSIUVC even for the maximum error signal. This method can be applied to a new Voiced/Silence/TSIUVC speech coding scheme, to speech analysis, and to speech synthesis.

A Study on the Speech Recognition of Korean Phonemes Using Recurrent Neural Network Models (순환 신경망 모델을 이용한 한국어 음소의 음성인식에 대한 연구)

김기석;황희영
- The Transactions of the Korean Institute of Electrical Engineers / v.40 no.8 / pp.782-791 / 1991
In fields of pattern recognition such as speech recognition, several new techniques using Artificial Neural Network Models have been proposed and implemented. In particular, the Multilayer Perceptron Model has been shown to be effective in static speech pattern recognition. Speech, however, has dynamic or temporal characteristics, and the most important point in implementing continuous speech recognition systems with Artificial Neural Network Models is learning these dynamic characteristics, together with the distributed cues and contextual effects that result from them. The Recurrent Multilayer Perceptron Model is known to be able to learn sequences of patterns. This paper presents the results of applying the Recurrent Model, which is capable of learning the temporal characteristics of speech, to phoneme recognition. The test data consist of 144 Vowel + Consonant + Vowel speech chains made up of 4 Korean monophthongs and 9 Korean plosive consonants. The input parameters of the Artificial Neural Network model are the FFT coefficients, residual error, and zero crossing rates. The baseline model showed a recognition rate of 91% for vowels and 71% for plosive consonants for one male speaker. In various other experiments we obtained better recognition rates than with the existing multilayer perceptron model, showing that the recurrent model is better suited to speech recognition. The potential of Recurrent Models for speech recognition was further explored by varying the configuration of this baseline model.

Analysis of Eigenvalues of Covariance Matrices of Speech Signals in Frequency Domain (음성 신호의 주파수 영역에서의 공분산행렬의 고유값 분석)

Kim, Seonil
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.05a / pp.47-50 / 2015
Speech signals consist of consonant and vowel signals, but vowels last much longer than consonants. It can therefore be assumed that the correlations between signal blocks in a speech signal are very high. Each speech signal is divided into blocks of 128 samples, and an FFT is applied to each block. The low-frequency part of each FFT result is taken, the covariance matrix between the blocks of a speech signal is computed, and finally the eigenvalues of that matrix are obtained. The distribution of these eigenvalues across various speech files is examined. The differences between speech signals and noise signals from cars are also studied.
