• Title/Abstract/Keywords: Phonemes

A study of English vowel system

  • 이재영
    • 대한음성학회지:말소리
    • /
    • No. 38
    • /
    • pp.71-97
    • /
    • 1999
  • This paper surveys the vowel phonemes in a variety of English accents and proposes vowel systems for English. The accents covered are General American English, Northeastern American English, Western American English, Southern British English, Northern British English, Scottish English, Southern Irish English, Northern Irish English, Australian English, and New Zealand English. The proposed vowel systems reflect both the acoustic information of vowels and phonological aspects of English. The paper offers an Optimality Theory-based analysis of the English vowel systems that appeals to independently motivated constraints. Following Flemming (1995), it assumes that the vowel system in question is selected in the output as the optimal candidate under a given constraint ranking, an assumption that differs from the view that the vowel system is fixed in the input. The analysis explains why a specific vowel system is selected and why dialectal variation arises: the vowel system of a specific dialect results from optimal satisfaction of a given constraint ranking, and dialectal differences result from permutations of the same constraints. The constraint-based analysis accounts well for the similarities and differences among dialects with regard to the vowel system.
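
The selection mechanism this abstract describes, in which one candidate wins under a given constraint ranking and re-ranking the same constraints yields a different winner, can be sketched as a lexicographic comparison of violation counts. This is only an illustrative sketch: the constraint names, candidates, and violation counts below are hypothetical placeholders, not the constraints or inventories used in the paper.

```python
from typing import Dict, List

Violations = Dict[str, int]


def optimal_candidate(candidates: Dict[str, Violations], ranking: List[str]) -> str:
    """Return the candidate whose violation profile is lexicographically
    smallest when constraints are read from highest to lowest rank."""
    def profile(name: str) -> tuple:
        return tuple(candidates[name].get(constraint, 0) for constraint in ranking)
    return min(candidates, key=profile)


# Hypothetical candidates and violation counts, for illustration only.
candidates = {
    "small_inventory": {"KEEP_CONTRASTS": 2, "KEEP_DISTANCE": 0},
    "large_inventory": {"KEEP_CONTRASTS": 0, "KEEP_DISTANCE": 1},
}

# Two hypothetical dialects share the constraints but rank them differently.
dialect_a = optimal_candidate(candidates, ["KEEP_CONTRASTS", "KEEP_DISTANCE"])
dialect_b = optimal_candidate(candidates, ["KEEP_DISTANCE", "KEEP_CONTRASTS"])
print(dialect_a)  # large_inventory
print(dialect_b)  # small_inventory
```

Because the comparison is strictly lexicographic, a single violation of a higher-ranked constraint outweighs any number of violations of lower-ranked ones, which is why swapping the two constraints flips the winner.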

A Study on the Phoneme Based Analysis of Korean Initial Plosives Using Statistical Method and Perception Tests

  • 조철우;이우선;이규호;김종안;임광일;이태원
    • 한국음향학회지
    • /
    • Vol. 8, No. 5
    • /
    • pp.78-85
    • /
    • 1989
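  • This paper describes the statistical and perception-test methods used to estimate synthesis parameters for plosives, as part of research on rule-based synthesis of Korean. A cascade formant synthesizer was built and used as the synthesizer. The speech samples for the statistical analysis were 72 isolated CV monosyllables, formed from 9 initial plosives and 8 vowels, collected from a single speaker. The speech was analyzed in terms of parameter changes in the time and frequency domains, and changes in the formant parameters were examined independently through a parameter estimation method based on perception tests.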

A Study on Duration Length and Place of Feature Extraction for Phoneme Recognition

  • 김범국;정현열
    • 한국음향학회지
    • /
    • Vol. 13, No. 4
    • /
    • pp.32-39
    • /
    • 1994
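  • As a basic study toward building a Korean speech recognition system, phoneme recognition was performed on all Korean phonemes to find 1) the position that best represents the characteristics of each phoneme and 2) a suitable duration length for obtaining the highest recognition rate. For the recognition experiments, 21-dimensional cepstral coefficients were used as feature parameters, and speaker-dependent recognition experiments on three speakers were carried out using the Bayes decision rule. The results showed that the optimal feature extraction position giving the highest recognition rate was 10~50 ms for vowels, 40~100 ms for fricatives and affricates, 10~50 ms for nasals and liquids, and 10~50 ms for plosives. In addition, for recognition over all 35 phonemes, a duration length of about 60~70 ms was found to be sufficient for obtaining the highest recognition rate.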

Optimizing Multiple Pronunciation Dictionary Based on a Confusability Measure for Non-native Speech Recognition

  • 김민아;오유리;김홍국;이연우;조성의;이성로
    • 대한음성학회지:말소리
    • /
    • No. 65
    • /
    • pp.93-103
    • /
    • 2008
  • In this paper, we propose a method for optimizing a multiple pronunciation dictionary used for modeling pronunciation variations of non-native speech. The proposed method removes some confusable pronunciation variants in the dictionary, resulting in a reduced dictionary size and less decoding time for automatic speech recognition (ASR). To this end, a confusability measure is first defined based on the Levenshtein distance between two different pronunciation variants. Then, the number of phonemes for each pronunciation variant is incorporated into the confusability measure to compensate for ASR errors due to words of a shorter length. We investigate the effect of the proposed method on ASR performance, where Korean is selected as the target language and Korean utterances spoken by Chinese native speakers are considered as non-native speech. It is shown from the experiments that an ASR system using the multiple pronunciation dictionary optimized by the proposed method can provide a relative average word error rate reduction of 6.25%, with 11.67% less ASR decoding time, as compared with that using a multiple pronunciation dictionary without the optimization.
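
The Levenshtein distance that this abstract names as the basis of its confusability measure can be sketched over phoneme sequences as follows. The abstract does not give the exact form of the measure or say how the number of phonemes enters it, so the length normalization in `normalized_distance` is an assumption made for illustration, and the phoneme sequences are hypothetical.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Minimum number of insertions, deletions, and substitutions
    needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def normalized_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed normalization: distance divided by the longer length, so one
    edit counts for more in a short variant than in a long one."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# Hypothetical pronunciation variants written as phoneme lists.
variant_1 = ["k", "a", "m"]
variant_2 = ["k", "a", "n"]
print(levenshtein(variant_1, variant_2))  # 1
print(normalized_distance(variant_1, variant_2))
```

Under this assumed normalization, a smaller value means two variants are closer and therefore more likely to be confused; any pruning threshold would have to be tuned on recognition data.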

Decoding Brain States during Auditory Perception by Supervising Unsupervised Learning

  • Porbadnigk, Anne K.;Gornitz, Nico;Kloft, Marius;Muller, Klaus-Robert
    • Journal of Computing Science and Engineering
    • /
    • Vol. 7, No. 2
    • /
    • pp.112-121
    • /
    • 2013
  • Recent years have seen a rise of interest in using electroencephalography-based brain-computer interfacing methodology to investigate non-medical questions, beyond the purpose of communication and control. One of these novel applications is to examine how signal quality is processed neurally, which is of particular interest for industry, besides providing neuroscientific insights. As in most behavioral experiments in the neurosciences, the assessment of a given stimulus by a subject is required. Based on an EEG study on the speech quality of phonemes, we first discuss the information contained in the neural correlate of this judgement. Typically, this is done by analyzing the data along behavioral responses/labels. However, participants in such complex experiments often guess at the threshold of perception. This leads to labels that are only partly correct, and oftentimes random, which is a problematic scenario for supervised learning. Therefore, we propose a novel supervised-unsupervised learning scheme, which aims to differentiate true labels from random ones in a data-driven way. We show that this approach provides a crisper view of the brain states that experimenters are looking for, besides discovering additional brain states to which the classical analysis is blind.

Improved First-Phoneme Searches Using an Extended Burrows-Wheeler Transform

  • 김성환;조환규
    • 정보과학회 컴퓨팅의 실제 논문지
    • /
    • Vol. 20, No. 12
    • /
    • pp.682-687
    • /
    • 2014
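  • Hangul initial-consonant (choseong) queries are an important feature provided to improve user convenience on interfaces where input is constrained and errors are frequent, such as navigation systems and mobile devices. This paper proposes a time- and space-efficient data structure for initial-consonant query search: Hangul strings are decomposed into jamo (letter) units, rearranged, and converted into circular strings, which are then indexed using an extended Burrows-Wheeler transform. Experiments confirm that the method handles a wider variety of query forms while using less space than existing techniques, and that search speed improves in particular as the query gets shorter and its proportion of initial consonants gets higher.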
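
To make the notion of an initial-consonant query concrete, the sketch below extracts the initial consonant of each precomposed Hangul syllable with the standard Unicode code-point arithmetic and matches mixed queries by a naive linear scan. It does not implement the extended Burrows-Wheeler index the paper proposes; the function names and the sample place names are illustrative assumptions.

```python
# The 19 initial consonants, written as the compatibility jamo users type.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
HANGUL_FIRST = 0xAC00            # first precomposed syllable, "가"
HANGUL_LAST = 0xD7A3             # last precomposed syllable, "힣"
SYLLABLES_PER_INITIAL = 21 * 28  # 21 medial vowels x 28 finals (incl. none)


def initial_of(ch: str) -> str:
    """Return the initial consonant of a Hangul syllable, or ch unchanged."""
    code = ord(ch)
    if HANGUL_FIRST <= code <= HANGUL_LAST:
        return CHOSEONG[(code - HANGUL_FIRST) // SYLLABLES_PER_INITIAL]
    return ch


def char_matches(q: str, c: str) -> bool:
    """A query character matches itself, or, if it is a bare initial
    consonant, any syllable that starts with that consonant."""
    return q == c or (q in CHOSEONG and initial_of(c) == q)


def find_matches(query: str, names: list) -> list:
    """Naive scan: keep every name containing the query at some offset."""
    hits = []
    for name in names:
        for start in range(len(name) - len(query) + 1):
            if all(char_matches(q, name[start + i]) for i, q in enumerate(query)):
                hits.append(name)
                break
    return hits


places = ["서울", "부산", "수원"]
print(find_matches("ㅅㅇ", places))  # ['서울', '수원']
print(find_matches("서ㅇ", places))  # ['서울']
```

The scan touches every name for every query, which is the cost an index is meant to avoid; the sketch only shows what counts as a match.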

A Parallel Speech Recognition Model on Distributed Memory Multiprocessors

  • 정상화;김형순;박민욱;황병한
    • 한국음향학회지
    • /
    • Vol. 18, No. 5
    • /
    • pp.44-51
    • /
    • 1999
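  • This paper proposes an effective parallel computational model for the integrated processing of speech and natural language. The phoneme model uses context-dependent phonemes based on the continuous Hidden Markov Model (HMM), and the language model is based on a knowledge base. To construct the knowledge base, a memory-based parsing technique is used that employs a hierarchical semantic network and parallel marker-passing as its inference mechanism. The parallel speech recognition algorithm was implemented on a multi-Transputer system with a distributed-memory MIMD (Multiple Instruction Multiple Data) architecture. In the experiments, the knowledge-base-based speech recognition system achieved a higher recognition rate than a word-network-based system, and recognition performance improved further when code-phoneme statistics were used. Speedup experiments also confirmed the feasibility of a real-time parallel speech recognition system.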

A Study on the Speech Recognition of Korean Phonemes Using Recurrent Neural Network Models

  • 김기석;황희영
    • 대한전기학회논문지
    • /
    • Vol. 40, No. 8
    • /
    • pp.782-791
    • /
    • 1991
  • In fields of pattern recognition such as speech recognition, several new techniques using Artificial Neural Network Models have been proposed and implemented. In particular, the Multilayer Perceptron Model has been shown to be effective in static speech pattern recognition. But speech has dynamic or temporal characteristics, and the most important point in implementing speech recognition systems using Artificial Neural Network Models for continuous speech is the learning of dynamic characteristics and of the distributed cues and contextual effects that result from temporal characteristics. The Recurrent Multilayer Perceptron Model is known to be able to learn sequences of patterns. In this paper, the results of applying the Recurrent Model, which has the potential to learn temporal characteristics of speech, to phoneme recognition are presented. The test data consist of 144 Vowel + Consonant + Vowel speech chains made up of 4 Korean monophthongs and 9 Korean plosive consonants. The input parameters of the Artificial Neural Network model are the FFT coefficients, residual error, and zero crossing rates. The baseline model showed a recognition rate of 91% for vowels and 71% for plosive consonants for one male speaker. Various other experiments yielded better recognition rates than the existing multilayer perceptron model, showing the recurrent model to be better suited to speech recognition. The possibility of using Recurrent Models for speech recognition was further explored by changing the configuration of this baseline model.

Comparison of Acoustic Characteristics of Vowel and Stops in 3, 4 year-old Normal Hearing Children According to Parents' Deafness: Preliminary Study

  • 홍지숙;강영애;김재옥
    • 말소리와 음성과학
    • /
    • Vol. 7, No. 1
    • /
    • pp.67-77
    • /
    • 2015
  • The purpose of this study was to investigate how deaf parents influence the speech sounds of their normal-hearing children. Twenty-four normal-hearing children of deaf adults (CODA) and of normal-hearing parents (NORMAL), aged 3 to 4, participated in the study. The F1, F2, and vowel triangle area for 7 vowels and the voice onset times (VOTs) and closure durations for 9 stops were measured. The results are as follows. First, the F1 and F2 for all vowels were higher and the vowel triangle area was larger in CODA than in NORMAL, although the differences were not statistically significant. Second, VOTs in $C_{stop}V$ for $/t^*/$ and in $VC_{stop}V$ for $/t^*/$, $/t^h/$, and $/k^h/$ were longer in CODA than in NORMAL. Most stops in CODA appeared to have longer VOTs. Third, the manner and place of articulation of stops did not make a difference between CODA and NORMAL in VOTs and closure durations. CODA do not demonstrate the speech characteristics of deaf people; however, they seem to speak differently from NORMAL, which suggests that CODA might be influenced in some way by the different linguistic environment created by deaf parents.
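
One common way to compute a vowel triangle area is to treat three corner vowels as points in the (F1, F2) plane and apply the shoelace formula. The sketch below assumes the corners are /a/, /i/, and /u/, which the abstract does not specify, and the formant values are invented for illustration rather than taken from the study.

```python
from typing import Tuple

Point = Tuple[float, float]  # (F1, F2) in Hz


def triangle_area(p1: Point, p2: Point, p3: Point) -> float:
    """Area of the triangle with the given vertices (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0


# Hypothetical mean formants for one speaker group, in Hz.
formants = {
    "a": (900.0, 1500.0),
    "i": (350.0, 2800.0),
    "u": (400.0, 1000.0),
}

area = triangle_area(formants["a"], formants["i"], formants["u"])
print(area)  # area in Hz^2
```

A larger area indicates that the corner vowels are spread further apart in formant space, which is the sense in which the abstract compares the two groups.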

An ERP Study of the Perception of English High Front Vowels by Native Speakers of Korean and English

  • 윤영도
    • 말소리와 음성과학
    • /
    • Vol. 5, No. 3
    • /
    • pp.21-29
    • /
    • 2013
  • The mismatch negativity (MMN) is known to be a fronto-centrally negative component of the auditory event-related potentials (ERP). Näätänen et al. (1997) and Winkler et al. (1999) argue that the MMN acts as a cue to phoneme perception in the ERP paradigm. In this study, a perception experiment based on an ERP paradigm was conducted to examine how Korean and American English speakers perceive the American English high front vowels. The study found that the MMN obtained from both Korean and American English speakers appeared at around the same time after they heard the F1s of English high front vowels. However, when the same groups heard English words containing those vowels, the American English listeners' MMN appeared slightly earlier than the Korean listeners' MMN. These findings suggest that non-speech sounds, such as the F1s of vowels, may be processed similarly across speakers of different languages, whereas phonemes are processed differently: a native-language phoneme is processed faster than a non-native-language phoneme.