• Title/Abstract/Keywords: coarticulation

40 search results, processing time 0.021 s

Automatic Coarticulation Detection for Continuous Sign Language Recognition

  • 양희덕;이성환
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 36, No. 1 / pp. 82-91 / 2009
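  • Sign language spotting is the task of detecting and recognizing meaningful sign words in a continuous stream of hand motions. Because hand movements and hand shapes vary widely, spotting signs in a signed sentence is not an easy problem. In particular, meaningful signs and non-sign hand motions occur in random order in natural signed sentences. This paper proposes an adaptive threshold model based on a CRF (Conditional Random Field). The proposed model acts as an adaptive threshold that distinguishes the sign gestures defined in the sign vocabulary from non-sign hand motions. In addition, to improve spotting and recognition performance, a hand-shape-based sign verifier, a short-sign detector, and a subsign reasoner were applied to the proposed system. In experiments, the proposed method achieved a spotting rate of 88% on continuous sign data and a recognition rate of 94% on pre-segmented sign data, whereas a CRF model without the adaptive threshold model, short-sign detector, hand-shape-based sign verifier, and subsign reasoner achieved a spotting rate of 74% on continuous sign data and a recognition rate of 90% on pre-segmented sign data.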
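
The role of an adaptive threshold described in the abstract above can be illustrated with a minimal sketch. It is not the authors' CRF model: the function name `spot_sign`, the score dictionary, and the idea that the threshold model yields a single comparable score per segment are all assumptions made for illustration.

```python
from typing import Dict, Optional


def spot_sign(sign_scores: Dict[str, float], threshold_score: float) -> Optional[str]:
    """Return the best-scoring sign label, or None for a non-sign motion.

    sign_scores: hypothetical per-sign scores for one motion segment.
    threshold_score: hypothetical score of the threshold (non-sign) model
    for the same segment; it changes per segment, hence "adaptive".
    """
    if not sign_scores:
        return None
    best_label = max(sign_scores, key=sign_scores.get)
    # Accept the sign only when it beats the threshold model's score.
    if sign_scores[best_label] > threshold_score:
        return best_label
    return None
```

With scores `{"hello": 0.6, "thanks": 0.3}`, a threshold score of 0.5 accepts `"hello"`, while a threshold score of 0.7 rejects the segment as a non-sign motion.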

Improvement of automatic phoneme labeling system using context-dependent demiphone unit

  • 박순철;김봉완;이용주
    • 대한음성학회지:말소리 / No. 37 / pp. 23-48 / 1999
  • To improve the performance of an automatic labelling system, a context-dependent demiphone unit is proposed. A phone is divided into two parts: a left demiphone that accounts for left-side coarticulation and a right demiphone that copes with the right-side context. The demiphone unit allows better training of the transitions between phones. In this paper, a phone shorter than 120 msec is split into two demiphones, and a phone longer than 120 msec is divided into three parts. To evaluate the system, a database of 452 phonetically balanced words (PBW) is used for training and testing the phoneme models. In the experiment, the system using the proposed demiphone unit, compared with the one using the old demiphone unit, improved by 3.83% (to 71.63%) within 10 ms of the true boundary and by 2.20% (to 86.41%) within 20 ms of the true boundary.
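
The length-dependent splitting rule in the abstract above can be sketched as follows. Only the 120 msec threshold comes from the abstract. The equal-length parts, the label "center" for the middle part, and the treatment of a phone of exactly 120 msec as the three-part case are assumptions, since the abstract does not specify them.

```python
from typing import List, Tuple

SPLIT_THRESHOLD_MS = 120.0  # threshold stated in the abstract


def split_phone(start_ms: float, end_ms: float) -> List[Tuple[str, float, float]]:
    """Split one phone segment into labelled sub-units.

    Assumptions (not from the abstract): parts are equal in length,
    the middle part is called "center", and a phone of exactly
    120 ms falls into the three-part case.
    """
    duration = end_ms - start_ms
    if duration <= 0:
        raise ValueError("end_ms must be greater than start_ms")
    if duration < SPLIT_THRESHOLD_MS:
        names = ["left", "right"]
    else:
        names = ["left", "center", "right"]
    step = duration / len(names)
    # Each tuple is (part name, part start in ms, part end in ms).
    return [(name, start_ms + i * step, start_ms + (i + 1) * step)
            for i, name in enumerate(names)]
```

For example, a 100 ms phone yields a left and a right demiphone of 50 ms each, while a 150 ms phone yields three 50 ms parts.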

Implementation of Text-to-Audio Visual Speech Synthesis Using Key Frames of Face Images

  • 김명곤;김진영;백성준
    • 대한음성학회지:말소리 / No. 43 / pp. 73-88 / 2002
  • This paper presents a lip-synch algorithm for natural facial synthesis, based on a key-frame method using an RBF (radial basis function). For lip synthesis, we derive viseme range parameters from the phonemes and their duration information produced by the text-to-speech (TTS) system, and we extract the viseme information corresponding to each phoneme from the AV DB. We apply a dominance function to reflect the coarticulation phenomenon and bilinear interpolation to reduce calculation time. Lip-synch is then performed by playing the synthesized images, obtained by interpolating between phonemes, together with the TTS speech sound.
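
The abstract above does not give the form of its dominance function. The sketch below assumes a negative-exponential dominance function in the style of the Cohen and Massaro coarticulation model, where each viseme's influence decays with distance from its center time and the lip parameter is the dominance-weighted average of the viseme targets. The names `dominance` and `blended_parameter` and all parameter values are hypothetical.

```python
import math
from typing import List, Tuple

# (target lip parameter, center time in s, magnitude alpha, decay rate theta)
Viseme = Tuple[float, float, float, float]


def dominance(t: float, center: float, alpha: float, theta: float) -> float:
    """Influence of one viseme at time t; decays away from its center."""
    return alpha * math.exp(-theta * abs(t - center))


def blended_parameter(t: float, visemes: List[Viseme]) -> float:
    """Dominance-weighted average of the viseme targets at time t."""
    weights = [dominance(t, center, alpha, theta)
               for (_target, center, alpha, theta) in visemes]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all dominance weights are zero")
    weighted = sum(w * v[0] for w, v in zip(weights, visemes))
    return weighted / total
```

Halfway between two visemes with equal magnitude and decay rate, the blended value is the mean of their targets; nearer to one viseme's center, that viseme's target dominates.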

Analysis of Error Patterns in Korean Connected Digit Telephone Speech Recognition

  • 김민성;정성윤;손종목;배건성;김상훈
    • 대한음성학회지:말소리 / No. 46 / pp. 77-86 / 2003
  • Channel distortion and coarticulation effects in Korean connected digit telephone speech make it difficult to achieve high connected digit recognition performance in the telephone environment. In this paper, as basic research toward improving the recognition performance for Korean connected digit telephone speech, recognition error patterns are investigated and analyzed. The Korean connected digit telephone speech database released by SiTEC and the HTK system are used for the recognition experiments. The DWFBA and MRTCN methods are used for feature extraction and channel compensation, respectively. Experimental results are discussed along with our findings.

Ortho-phonic Alphabet Creation by the Musical Theory and its Segmental Algorithm

  • 진용옥;안정근
    • 음성과학 / Vol. 8, No. 2 / pp. 49-59 / 2001
  • Phoneme segmentation is a very difficult problem in speech processing because a segmentation algorithm has to cope with many kinds of allophones and coarticulation. As a result, systems for speech recognition and voice retrieval have a complex structure. To address this, we discuss the possibility of a new segmentation algorithm based on the tripartition method of subtracting or adding one third (삼분손익) used for the twelve-pitch temperament (12 율려), first proposed by Prof. T. S. Han. It is close to both oriental and western musical theory. He has also suggested 3 consonant and 3 vowel phonemes in Hunminjungum (훈민정음), invented by King Sejong in the 15th century. In this paper, we propose naming this unit the ortho-phonic phoneme (OPP/정음소), which carries the meaning of 'absoluteness and independence'. OPP is also applicable to other languages, for example through IPA. Finally, we find that this algorithm is consistently applicable to languages in general and is very useful for building speech recognition and retrieval systems.

On Tensity of Korean Fricatives (Electropalatographic Study)

  • Baik, Woon-Il
    • 음성과학 / Vol. 4, No. 1 / pp. 135-145 / 1998
  • An electropalatographic (EPG) study was conducted to investigate the articulatory characteristics that determine the distinction between the Korean lax fricative [s] and tense fricative [s']. The study also tested whether an increase in the degree of tensity (lax fricative [s] < tense fricative [s']) induces a decrease in coarticulatory vocalic effects. The results indicated that the increase in the tensity of Korean fricatives is closely related to a narrower groove width (wider contact at the place of articulation), a forward shift in the place of articulation, and a longer constriction duration (longer maintenance of the manner of articulation). It was also found that coarticulatory vocalic effects on Korean fricatives are affected by Recasens' two rules of constraint (1983): spatial and temporal constraints.

Locus equation - as a phonetic descriptor for place articulation in Arabic.

  • Kassem Wahba
    • 대한음성학회: Conference Proceedings / 대한음성학회 Conference Proceedings, October 1996 / pp. 206-206 / 1996
  • Previous studies of American English (e.g. Sussman 1991, 1993, 1994) CVC coarticulation with initial consonants representing the labial, alveolar, and velar places showed a linear relationship fitted to data points formed by plotting the onsets of the F2 transition along the y-axis and their corresponding midvowel points along the x-axis. The present study extends the locus equation metric to include the following places of articulation: uvular, pharyngeal, laryngeal, and emphatics. The question of interest is whether the locus equation can serve as a phonetic descriptor for the place of articulation in Arabic. Five male native speakers of Colloquial Egyptian Arabic (CEA) read a list of 204 CVC and CVCC words containing eight different places of articulation and eight vowels. Averages of the formant patterns (F1, F2, F3) at onsets, midpoints, and offsets were calculated from wide-band spectrograms obtained with the Kay spectrograph (model 7029) and plotted as locus equations. A summary of the acoustic properties of the places of articulation of CEA will be presented in the frames bVC and CVb. Strong linear regression relationships were found for every place of articulation.
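
A locus equation as described in the abstract above is a straight line, F2_onset = k · F2_mid + c, fitted to (midvowel F2, onset F2) pairs for one place of articulation. The sketch below fits that line by ordinary least squares. The function name `fit_locus_equation` is hypothetical, and any values passed to it here are illustrative, not data from the study.

```python
from typing import Sequence, Tuple


def fit_locus_equation(f2_mid: Sequence[float],
                       f2_onset: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of F2_onset = slope * F2_mid + intercept.

    f2_mid: F2 at the vowel midpoint (x-axis), in Hz.
    f2_onset: F2 at the transition onset (y-axis), in Hz.
    """
    n = len(f2_mid)
    if n != len(f2_onset) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(f2_mid) / n
    mean_y = sum(f2_onset) / n
    sxx = sum((x - mean_x) ** 2 for x in f2_mid)
    if sxx == 0:
        raise ValueError("F2 midpoint values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(f2_mid, f2_onset))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

One such fit would be computed per place of articulation, so that the resulting slope and intercept pairs can be compared across places.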

On the Center Pitch Estimation by using the Spectrum Leakage Phenomenon for the Noise Corrupted Speech Signals

  • 강동규;배명진;안수길
    • 한국음향학회지 / Vol. 10, No. 1 / pp. 37-46 / 1991
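  • The pitch estimation algorithms proposed so far have difficulty detecting a wide pitch range regardless of the speaker's sex or age. Because of the physical limits of the articulatory organs, the pitch distribution is generally concentrated around a center pitch. If this center pitch is applied to the main pitch detection procedure, the processing becomes simpler and the accuracy can improve. This paper proposes an algorithm that accurately detects the center pitch by using the spectrum leakage phenomenon.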

A Study on Korean Connected Digit Recognizer Based on Semi-syllable and Post-processing

  • 정재부;정훈;정익주
    • 음성과학 / Vol. 8, No. 4 / pp. 1-15 / 2001
  • This paper describes the effect of a new recognition unit based on the semi-syllable, together with its post-processing method. A semi-syllable recognition unit captures the coarticulation effects of Korean connected digits. An existing semi-syllable method builds a complete connected digit sequence by restricting the next models to those derived from the currently recognized models. This paper instead uses a new post-processing method, which recognizes the isolated digit words making up the digit sequence from the digit combinations that can arise from the currently recognized semi-syllable sequence. This method gives a higher accuracy rate than the existing method. The new post-processing provides two advantages: 1) it corrects mis-recognized semi-syllable units, and 2) it lets speakers say each digit without regard to its duration.

Analysis of Error Patterns in Korean Connected Digit Telephone Speech Recognition

  • 김민성;정성윤;손종목;배건성;김상훈
    • 대한음성학회: Conference Proceedings / 대한음성학회 Conference Proceedings, May 2003 / pp. 115-118 / 2003
  • Channel distortion and coarticulation effects make connected digit telephone speech difficult to recognize and degrade recognition performance in the telephone environment. In this paper, as basic research toward improving the recognition performance for Korean connected digit telephone speech, error patterns are investigated and analyzed. The telephone digit speech database released by SITEC and the HTK system are used for the recognition experiments. The DWFBA and MRTCN methods are used for feature extraction and channel compensation, respectively. Experimental results are discussed along with our findings.
