• Title/Summary/Keyword: vowel recognition


An Implementation of Crossword Game using Speech Recognition and Synthesis System (음성인식 및 합성을 이용한 십자말 게임의 구현)

  • Kim Dong-Ju;Youn Jeh-Seon;Lee Young-Ju;Kim Dong-Hwan;Hong Kwang-Seok
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.29-32 / 2001
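  • In this paper, we implemented a crossword game using a speech recognizer and a speech synthesizer built in our laboratory. The game uses about 600 words drawn from traditional idioms (고사성어) and is designed so that dictionaries for other domains can be added. Every stage of the game, including starting and playing, is operated by voice, and supplementary information is delivered by speech synthesis (TTS). The arrangement of the words used in the crossword is selected randomly each time, and the speech recognizer is speaker-independent and based on VCCV (Vowel + Consonant + Consonant + Vowel) units. The explanation of the selected clue is shown as text and, at the same time, output as speech by the TTS system.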


An Implementation of Word Relay Game using Speech Recognition (음성인식 끝말 이어가기 게임의 구현)

  • Kim Dong-Hwan;Youn Jeh-Seon;Hong Kwang-Seok
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2000.12a / pp.177-180 / 2000
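  • Speech recognition has recently been commercialized at a rapid pace. However, because of the shortage of speech recognition applications and the performance problems of speech recognition systems, use by the general public remains limited. In this paper, we implemented a speech-recognition word relay game using a variable-vocabulary speech recognizer built in our laboratory. The variable-vocabulary recognizer is speaker-independent and based on VCCV (Vowel + Consonant + Consonant + Vowel) units. For the word relay game, we built the dictionary from a subset of the words extracted from text containing about 5 million eojeols, and when many words began with the same syllable, we limited their number. Although the game implemented in this study uses a limited word dictionary, we expect that, once recognizer performance improves and a complete dictionary is built, this work will contribute greatly to the development and wider use of speech-based language learning tools and games.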

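The word relay rule and the cap on words sharing a first syllable, as described in the abstract above, can be illustrated with a minimal Python sketch. The function names (`build_dictionary`, `is_valid_move`), the cap value, and the sample words are assumptions made for illustration and are not taken from the paper; the speech recognizer itself is outside the sketch. It also assumes precomposed Hangul, so that each syllable is a single character.

```python
from collections import defaultdict


def build_dictionary(words, max_per_syllable=3):
    """Group words by first syllable, keeping at most max_per_syllable each.

    The cap mirrors the abstract's limit on words that start with the same
    syllable; the default of 3 is an arbitrary assumption.
    """
    by_first = defaultdict(list)
    for word in words:
        if len(word) >= 2 and len(by_first[word[0]]) < max_per_syllable:
            by_first[word[0]].append(word)
    return dict(by_first)


def is_valid_move(previous, candidate, dictionary, used):
    """Check one move of the word relay game.

    A move is valid if the candidate starts with the last syllable of the
    previous word, is in the dictionary, and has not been used before.
    """
    if not candidate or candidate in used:
        return False
    if candidate[0] != previous[-1]:
        return False
    return candidate in dictionary.get(candidate[0], [])


# Hypothetical sample vocabulary; four words start with the same syllable.
words = ["사과", "과자", "자동차", "차도", "과일", "과학", "과목"]
dictionary = build_dictionary(words, max_per_syllable=3)
```

With the cap set to 3, the fourth word beginning with the same syllable is left out of the dictionary, so it is rejected as a move even though it follows the relay rule.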

Lip-Synch System Optimization Using Class Dependent SCHMM (클래스 종속 반연속 HMM을 이용한 립싱크 시스템 최적화)

  • Lee, Sung-Hee;Park, Jun-Ho;Ko, Han-Seok
    • The Journal of the Acoustical Society of Korea / v.25 no.7 / pp.312-318 / 2006
  • The conventional lip-synch system uses a two-step process: speech segmentation followed by recognition. However, the difficulty of the segmentation procedure and the inaccuracy it introduces into the training data set lead to significant performance degradation. To cope with this, a connected vowel recognition method using the Head-Body-Tail (HBT) model is proposed. The HBT model, which is appropriate for relatively small vocabulary tasks, reflects the co-articulation effect efficiently. Moreover, the 7 vowels are merged into 3 classes with similar lip shapes, and the system is optimized by employing a class-dependent SCHMM structure. Additionally, at both ends of each word, where variation is large, an 8-component Gaussian mixture model is used directly to improve the representation ability. Although the proposed method performs similarly to the CHMM based on the HBT structure, the number of parameters is reduced by 33.92%. This reduction makes the method computationally efficient and enables real-time operation.

Korean LVCSR for Broadcast News Speech

  • Lee, Gang-Seong
    • The Journal of the Acoustical Society of Korea / v.20 no.2E / pp.3-8 / 2001
  • In this paper, we examine a Korean large vocabulary continuous speech recognition (LVCSR) system for broadcast news speech. A combined vowel-and-implosive unit is included in the phone set together with other short phone units in order to obtain a longer-unit acoustic model. The effect of this unit is compared with conventional phone units. The dictionary units for language processing are automatically extracted from eojeols appearing in transcriptions. Triphone models are used for acoustic modeling and a trigram model is used for language modeling. Of the three major speaker groups in news broadcasts (anchors, journalists, and other people who are being interviewed), the speech of anchors and journalists, which has a lot of noise, was used for testing and recognition.


Online Korean character recognition using letter spotting method (자소 탐색 방법에 의한 온라인 한글 필기 인식)

  • 조범준
    • The Journal of Korean Institute of Communications and Information Sciences / v.21 no.6 / pp.1379-1389 / 1996
  • A Hangul character always consists of consonants, a vowel, and consonants, in that order. Exploiting this property, this paper proposes an approach that designs a model for spotting each letter in Hangul and then recognizes characters based on the spotting results. The network model consists of a set of HMMs. The letter search is carried out by the Viterbi algorithm, while character recognition is performed by searching the lattice of letter hypotheses. Experimental results show that, despite the simple recognition architecture, the performance is quite high, reaching 87.47% for discrete regular characters. In particular, the approach produces highly plausible segmentation of the letters within characters.

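The abstract above names the Viterbi algorithm for the letter search. As a rough illustration only, the following Python sketch shows a generic Viterbi decoder for a discrete HMM. It is not the paper's network model: the two states, the stroke symbols, and all probabilities are hypothetical toy values chosen for the example.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_log_prob, best_path) for a discrete HMM.

    All probabilities used here must be non-zero, because the sketch takes
    their logarithms without any smoothing.
    """
    # best[t][s] = (log prob of the best path ending in s at time t, predecessor)
    first = observations[0]
    best = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s][first]), None)
        for s in states
    }]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        best.append(column)
    # Backtrack from the best final state to recover the state sequence.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return best[-1][last][0], path


# Toy model (all values are assumptions): two letter classes emitting stroke types.
states = ["consonant", "vowel"]
start_p = {"consonant": 0.8, "vowel": 0.2}
trans_p = {
    "consonant": {"consonant": 0.3, "vowel": 0.7},
    "vowel": {"consonant": 0.6, "vowel": 0.4},
}
emit_p = {
    "consonant": {"curve": 0.7, "line": 0.3},
    "vowel": {"curve": 0.2, "line": 0.8},
}
log_prob, path = viterbi(["curve", "line", "curve"], states, start_p, trans_p, emit_p)
```

For this toy input the decoded sequence is consonant, vowel, consonant, which matches the ordering of letters in a Hangul character noted in the abstract.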

A Study on the Highly Accurate Korean Character Recognition Algorithm, by analyzing Vowel and Consonant Models - Selection of candidates using pattern matching method and discriminating similar characters by structural analysis - (자. 모 해석적 모델에 의한 고정도 한글 인식 알고리즘에 관한 연구 - 패턴정합법에 기초한 후보문자 선정 및 구조해석적인 방법에 의한 유사문자 판별 -)

  • 강선미;김봉석;김덕진
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.7 / pp.24-30 / 1993
  • In this paper, a new method is proposed to distinguish a character from similar characters that are selected by a pattern matching method in Korean character recognition. This new method, which combines the merits of previously suggested methods, can place the correct character in the candidate set and discriminate it correctly from the others. To evaluate the performance of this algorithm, we used 15 different laser printer fonts and obtained a recognition rate of about 97%.


Monophone and Biphone Compound Unit for Korean Vocabulary Speech Recognition (한국어 어휘 인식을 위한 혼합형 음성 인식 단위)

  • 이기정;이상운;홍재근
    • Journal of the Korea Computer Industry Society / v.2 no.6 / pp.867-874 / 2001
  • In this paper, considering the pronunciation characteristics of Korean, we suggest recognition units that shorten the recognition time and reflect the coarticulation effect at the same time. These units are composed of monophone and biphone units. Monophone units are applied to the vowels, which have stable characteristics. Biphones are used for the consonants, which vary according to the adjacent vowel. In a word recognition experiment on the PBW445 database, the compound units achieve comparable recognition accuracy with a 57% speed-up compared with triphone units, and better recognition accuracy at similar speed. In addition, memory size can be reduced because fewer units are needed.

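The idea of mixing monophone vowels with vowel-dependent consonant biphones, as described in the abstract above, can be sketched in Python as follows. The abstract does not say which adjacent vowel conditions each consonant, so this sketch assumes the following vowel is preferred and the preceding vowel is used otherwise. The romanized phone labels, the vowel set, and the `l-p` / `p+r` context notation are likewise assumptions for illustration.

```python
# Hypothetical romanized vowel set used only for this example.
VOWELS = {"a", "e", "i", "o", "u"}


def to_compound_units(phones):
    """Map a phone sequence to mixed monophone/biphone units.

    Vowels stay context-independent (monophones). Each consonant becomes a
    biphone tied to an adjacent vowel: the following vowel if there is one,
    otherwise the preceding vowel. This choice of context is an assumption.
    """
    units = []
    for i, phone in enumerate(phones):
        if phone in VOWELS:
            units.append(phone)
        elif i + 1 < len(phones) and phones[i + 1] in VOWELS:
            units.append(f"{phone}+{phones[i + 1]}")
        elif i > 0 and phones[i - 1] in VOWELS:
            units.append(f"{phones[i - 1]}-{phone}")
        else:
            # No adjacent vowel: fall back to a plain monophone.
            units.append(phone)
    return units


units = to_compound_units(["h", "a", "n", "g", "u", "k"])
```

Because every unit depends on at most one neighbor, the unit inventory stays far smaller than a full triphone set, which is consistent with the speed and memory benefits the abstract reports.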

Monosyllable Speech Recognition through Facial Movement Analysis (안면 움직임 분석을 통한 단음절 음성인식)

  • Kang, Dong-Won;Seo, Jeong-Woo;Choi, Jin-Seung;Choi, Jae-Bong;Tack, Gye-Rae
    • The Transactions of The Korean Institute of Electrical Engineers / v.63 no.6 / pp.813-819 / 2014
  • The purpose of this study was to extract accurate parameters of facial movement features using a 3-D motion capture system, for speech recognition through lip-reading. Instead of features obtained from a traditional camera image, the 3-D motion system was used to obtain quantitative data on actual facial movements and to analyze 11 variables that exhibit particular patterns, such as nose, lip, jaw, and cheek movements, in monosyllable vocalizations. Fourteen subjects, all in their 20s, were asked to vocalize 11 types of Korean vowel monosyllables three times each with 36 reflective markers on their faces. The facial movement data obtained were converted into 11 parameters and presented as patterns for each monosyllable vocalization. The parameter patterns were then trained and recognized for each monosyllable using a speech recognition algorithm based on the Hidden Markov Model (HMM) and the Viterbi algorithm. The recognition accuracy for the 11 monosyllables was 97.2%, which suggests that voice recognition of the Korean language through quantitative facial movement analysis is possible.

EEG based Vowel Feature Extraction for Speech Recognition System using International Phonetic Alphabet (EEG기반 언어 인식 시스템을 위한 국제음성기호를 이용한 모음 특징 추출 연구)

  • Lee, Tae-Ju;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.1 / pp.90-95 / 2014
  • Research on brain-computer interfaces, a new kind of interface that connects humans to machines, has been carried out to implement user-assistance devices for controlling wheelchairs or entering characters. Recent studies include several attempts to implement speech recognition systems based on brain waves and to achieve silent communication. In this paper, we studied how to extract vowel features based on the International Phonetic Alphabet (IPA), as a foundational step toward implementing a speech recognition system based on electroencephalogram (EEG). We conducted a two-step experiment with three healthy male subjects: the first step was speech imagery of a single vowel, and the second was imagery of two successive vowels. Of the 64 acquired channels, we selected 32, including those over the frontal lobe, which is related to thinking, and the temporal lobe, which is related to speech function. Eigenvalues of the signal were used as the feature vector, and a support vector machine (SVM) was used for classification. The first step showed that a feature vector of order higher than 10 is needed to analyze the EEG signal of speech; with an 11th-order feature vector, the highest average classification rate was 95.63% (between /a/ and /o/) and the lowest was 86.85% (between /a/ and /u/). In the second step, we studied the difference in speech imagery signals between single vowels and two successive vowels.

Comparison of Feature Performance in Off-line Handwritten Korean Alphabet Recognition (오프라인 필기체 한글 자소 인식에 있어서 특징성능의 비교)

  • Ko, Tae-Seog;Kim, Jong-Ryeol;Chung, Kyu-Sik
    • Korean Journal of Cognitive Science / v.7 no.1 / pp.57-74 / 1996
  • This paper presents a comparison of the recognition performance of the features used in recent handwritten Korean character recognition. This research aims to provide a basis for feature selection in order to improve not only the recognition rate but also the efficiency of the recognition system. To compare feature performance, we analyzed the characteristics of those features and then classified them into three types: global feature (image transformation) type, statistical feature type, and local/topological feature type. For each type, we selected four or five features that seem most suitable for representing the characteristics of the Korean alphabet, and performed recognition experiments for the first consonant, horizontal vowel, and vertical vowel of a Korean character, respectively. The classifier used in our experiments is a multi-layer perceptron with one hidden layer, trained with the backpropagation algorithm. The training and test data in the experiments are taken from 30 sets of PE92. Experimental results show that 1) local/topological features outperform the other two feature types in terms of recognition rate, and 2) mesh and projection features in the statistical feature type, Walsh and DCT features in the global feature type, and gradient and concavity features in the local/topological feature type outperform the others within each type.

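Two of the statistical features named in the abstract above, mesh and projection features, can be sketched in Python using their common textbook definitions. The paper's exact definitions and grid sizes may differ; the function names, the 4x4 sample image, and the 2x2 grid are assumptions for illustration. The sketch also assumes a binary image whose height and width are divisible by the grid dimensions.

```python
def projection_features(image):
    """Row sums followed by column sums of a binary image.

    The image is a list of equal-length rows of 0/1 pixel values.
    """
    row_sums = [sum(row) for row in image]
    col_sums = [sum(col) for col in zip(*image)]
    return row_sums + col_sums


def mesh_features(image, cells_y, cells_x):
    """Count foreground pixels in each cell of a cells_y x cells_x grid.

    Assumes the image height and width are divisible by cells_y and cells_x.
    Cells are listed row by row, left to right.
    """
    height, width = len(image), len(image[0])
    cell_h, cell_w = height // cells_y, width // cells_x
    features = []
    for gy in range(cells_y):
        for gx in range(cells_x):
            features.append(sum(
                image[y][x]
                for y in range(gy * cell_h, (gy + 1) * cell_h)
                for x in range(gx * cell_w, (gx + 1) * cell_w)
            ))
    return features


# Hypothetical 4x4 binary glyph used only to exercise the functions.
image = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 1],
]
projection = projection_features(image)
mesh = mesh_features(image, 2, 2)
```

Feature vectors like these would then be fed to a classifier such as the multi-layer perceptron mentioned in the abstract; that stage is not shown here.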