• Title/Summary/Keyword: Phonemes

Search Result 227, Processing Time 0.03 seconds

Implementation of Text-to-Audio Visual Speech Synthesis Using Key Frames of Face Images (키프레임 얼굴영상을 이용한 시청각음성합성 시스템 구현)

  • Kim MyoungGon;Kim JinYoung;Baek SeongJoon
    • MALSORI / no.43 / pp.73-88 / 2002
  • This paper presents a lip-sync algorithm for natural facial synthesis, based on a key-frame method using radial basis functions (RBF). For lip synthesis, we derive viseme range parameters from the phoneme and duration information produced by the text-to-speech (TTS) system, and we extract the viseme information corresponding to each phoneme from the AV DB. We apply a dominance function to reflect the coarticulation phenomenon and bilinear interpolation to reduce calculation time. Lip-sync is then performed by playing the synthesized images, obtained by interpolation between phonemes, together with the TTS speech sound.


A STUDY ON THE SIMULATED ANNEALING OF SELF ORGANIZED MAP ALGORITHM FOR KOREAN PHONEME RECOGNITION

  • Kang, Myung-Kwang;Ann, Tae-Ock;Kim, Lee-Hyung;Kim, Soon-Hyob
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.407-410 / 1994
  • In this paper, we describe a new unsupervised learning algorithm, SASOM. It addresses a defect of the conventional SOM, in which the state of the network cannot converge to the minimum point. The proposed algorithm uses an objective function that evaluates the state of the network during learning and adjusts the learning rate flexibly according to that evaluation. We implement simulated annealing, applied to the conventional network, using the objective function and the learning rate. As a result, the proposed algorithm can make the state of the network converge to the global minimum. Using two-dimensional input vectors with a uniform distribution, we graphically compared the ordering ability of SOM with that of SASOM. We carried out recognition experiments with the new algorithm on all Korean phonemes and some continuous speech.


On Improving the Effects of Varying the Window Length on Speech Energy Computation (음성 에너지계산에서 창함수-길이 변화영향의 개선에 관한 연구)

  • Bae, Myung-Jin;Ann, Sou-Guil
    • The Journal of the Acoustical Society of Korea / v.9 no.2 / pp.34-41 / 1990
  • The energy parameter is widely used in the pre-processing of speech signals because it represents phoneme characteristics well. However, the energy parameter is affected by the window length used during extraction. In this paper, the window-length effects are studied in detail, and we propose a new energy extraction algorithm that reduces them. The energy contours obtained with this algorithm represent the characteristics of speech phonemes well. The algorithm requires only one subtraction, one addition, and two comparison operations per speech sample.


A Study on the Simple Algorithm for Discrimination of Voiced Sounds (유성음 구간 검출을 위한 간단한 알고리즘에 관한 연구)

  • 장규철;우수영;박용규;유창동
    • The Journal of the Acoustical Society of Korea / v.21 no.8 / pp.727-734 / 2002
  • This paper proposes a simple algorithm for discriminating voiced sounds in speech. In addition to low-frequency energy and zero-crossing rate (ZCR), both of which have been widely used for identifying voiced sounds, the proposed algorithm incorporates pitch variation to improve the discrimination rate. An evaluation on the TIMIT corpus shows a 13% improvement in the discrimination of voiced phonemes over the traditional algorithm using only energy and ZCR.
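
The energy and ZCR baseline mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it uses plain short-time energy rather than low-frequency energy, omits the pitch-variation feature, and the function names, frame length, and thresholds (`energy_min`, `zcr_max`) are assumptions chosen only for illustration.

```python
import math


def frame_features(samples, frame_len):
    """Return a list of (energy, zcr) tuples, one per non-overlapping frame."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean of squared samples in the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        features.append((energy, crossings / (frame_len - 1)))
    return features


def is_voiced(energy, zcr, energy_min=0.01, zcr_max=0.25):
    """Hypothetical rule: voiced frames have high energy and a low ZCR."""
    return energy >= energy_min and zcr <= zcr_max


# Synthetic input (made up for this sketch): a 100 Hz tone sampled at 8 kHz
# stands in for a voiced frame; a weak sign-alternating signal stands in for
# an unvoiced, noise-like frame.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 100 * n / rate) for n in range(400)]
hiss = [0.01 * (1 if n % 2 == 0 else -1) for n in range(400)]

labels = [is_voiced(e, z) for e, z in frame_features(tone + hiss, 400)]
```

With these synthetic frames, the tone frame passes both thresholds and the hiss frame fails them. Real speech would need overlapping frames and thresholds tuned on labeled data.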

Phonological Awareness Activities Using Story Books : Effects on Reading, Self-Concept, and Learning Motivation in an After-School Program for 1st and 2nd Grade Low Income Children (동화를 이용한 음운인식활동이 저소득층 초등 방과후 교실 1, 2 학년 아동의 읽기, 학습동기 및 자아개념에 미치는 영향)

  • Lee, Jeehyun;Kim, Youjung;Lee, Jung A
    • Korean Journal of Child Studies / v.27 no.5 / pp.123-141 / 2006
  • The phonemic awareness program consisted of 45 activities, built around a storybook, that emphasized various speech sounds and letter names. The subjects were thirty low-income 1st and 2nd grade children (15 in the experimental group and 15 in the control group) attending an after-school program in Seoul. Pre- and post-tests assessed the children's reading, self-concept, and learning motivation. Over a 12-week period, the experimental group had rich opportunities to work with and discuss sounds, syllables, phonemes, and the names of the Korean alphabet letters during storybook reading, games, and play, while the control group was provided with worksheets, subject tutoring, and homework guidance. Results showed that the phonemic activities were an effective and useful way to enhance children's reading ability, self-concept, and learning motivation.


An Optimization of Speech Database in Corpus-based Speech Synthesis System (코퍼스기반 음성합성기의 데이터베이스 최적화 방안)

  • Jang Kyung-Ae;Chung Min-Hwa
    • Proceedings of the KSPS conference / 2002.11a / pp.209-213 / 2002
  • This paper describes a method for reducing the DB of a corpus-based speech synthesizer for Korean without degrading speech quality. First, we propose that the frequency of every unit in the reduced DB should reflect the frequency of units in the Korean language, so the target population of every unit is set proportional to its frequency in a large Korean corpus (780K sentences, 45M phonemes). Second, the instances used frequently during synthesis should also be maintained in the reduced DB. Finally, we propose that the frequency of every instance be reflected in the clustering criterion and used as a criterion for selecting representative instances. The evaluation shows that the proposed methods yield better quality than conventional methods.


An Implementation of Speaker Verification System Based on Continuants and Multilayer Perceptrons

  • Lee, Tae-Seung;Park, Sung-Won;Lim, Sang-Seok;Hwang, Byong-Won
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.216-219 / 2003
  • Among the techniques that protect private information by adopting biometrics, speaker verification is expected to be widely used because it is convenient to use and inexpensive to implement. Speaker verification should achieve a high degree of reliability in verification, flexibility in speech text usage, and efficiency in verification system complexity. Continuants have excellent speaker-discriminant power and a modest number of phonemes in the category, and multilayer perceptrons (MLPs) have superior recognition ability and fast operation speed. Consequently, the two provide viable ways for a speaker verification system to obtain the above properties. This paper implements a system to which continuants and MLPs are applied, and evaluates the system using a Korean speech database. The results of the experiment prove that continuants and MLPs enable the system to acquire the three properties.


Phonetic Transcription Rules and Quantitative Analysis of Phoneme Distribution in French

  • Bae, Hee-Sook;Yun, Young-Sun;Oh, Yung-Hwan
    • Speech Sciences / v.9 no.1 / pp.149-171 / 2002
  • After establishing rules for phonetic transcription in French, we perform a quantitative analysis of a given text, Waiting for Godot. Analyzing the text by investigating the influence of phoneme distribution is very interesting from a phonostylistic point of view. Since phonetic transcription rules are useful for automating transcription, the rules are carefully established in this paper. From the results of the phonetic transcription, we can investigate the distribution of individual phonemes and of the different phoneme groups between dialogues and scenery indications for various characters.


Auditory Images of Japanese /p/ by Koreans (일본어 /p/의 청각인상 연구)

  • Lee, Jae-Kang
    • Speech Sciences / v.11 no.3 / pp.83-93 / 2004
  • The objectives of this study are to analyze Korean speakers' pronunciations of various Japanese /p/ patterns and to provide desirable pronunciation models. This is part of ongoing research that aims to propose a useful method of teaching the Japanese pronunciation of /p/ to Koreans. The experimental data consist of /p/ phonemes in word-initial, word-medial, and 'yoon' positions. In Japanese, a yoon must be written in small size after a letter, and it forms a syllable only with the preceding letter. There were 22 different phoneme positions. They were pronounced by 48 students majoring in Japanese (24 females and 24 males), who were in their twenties and were raised in Daejeon and its vicinity. The individual pronunciations were collected and digitized into 528 files. The results show that Koreans pronounced the Japanese phoneme /p/ in a variety of ways, according to the auditory environments in which the phoneme was tested: as [ph] in word-initial position, [pp] or [ph] in word-medial position, and [ph] in 'yoon', unlike native speakers, who pronounced Japanese /p/ as [ph] in word-initial position, [pp] in word-medial position, and [pp] or [ph] in 'yoon'.


Consonant Confusions Matrices in Adults with Dysarthria Associated with Cerebral Palsy (뇌성마비로 인한 마비말장애 성인의 자음 오류 분석)

  • Lee, Youngmee;Sung, JeeEun;Sim, HyunSub
    • Phonetics and Speech Sciences / v.5 no.1 / pp.47-54 / 2013
  • The aim of this study was to analyze consonant articulation errors produced by 90 speakers with cerebral palsy (CP). Phonetic transcriptions were made for 37 single-word utterances containing 70 phonemes: 48 initial consonants and 22 final consonants. Errors of substitution, omission, and distortion were analyzed using a confusion matrix paradigm that visualizes error patterns. Results showed that substitution errors in initial and final consonants were most frequent, followed by omission and distortion. Consonant omission occurred more frequently on final consonants. In both initial and final consonants, within-place errors were more prominent than within-manner errors. The current results suggest that consonant confusion matrices for dysarthric speech may provide useful information for evaluating speech intelligibility and for developing automatic speech recognition systems for adults with CP-associated dysarthria.
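
The confusion matrix paradigm mentioned in this abstract can be sketched as a simple tally of (target, produced) consonant pairs. This is only an illustration under stated assumptions: the consonant labels, the `OMITTED` marker, the helper names, and the sample pairs are all made up and are not taken from the study's data or procedure.

```python
from collections import Counter

OMITTED = "<omitted>"  # hypothetical marker for a consonant that was dropped


def confusion_matrix(pairs):
    """Build a nested dict: matrix[target][produced] -> count."""
    counts = Counter(pairs)
    targets = sorted({t for t, _ in pairs})
    produced = sorted({p for _, p in pairs})
    return {t: {p: counts[(t, p)] for p in produced} for t in targets}


def error_rate(matrix, target):
    """Share of productions of `target` that differ from the target itself."""
    row = matrix[target]
    total = sum(row.values())
    return (total - row.get(target, 0)) / total


# Made-up transcription pairs: (target consonant, consonant actually produced).
pairs = [
    ("p", "p"), ("p", "t"),
    ("t", "t"), ("t", "t"),
    ("k", "t"), ("k", OMITTED),
]
matrix = confusion_matrix(pairs)
```

Each row of `matrix` shows how one target consonant was realized, so substitutions appear as off-diagonal counts and omissions appear in the `OMITTED` column. Distortions would need an extra label per production, which this sketch leaves out.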