• Title/Summary/Keyword: recognition-rate

Ambiguity Types of the Homonymic & Heterographic Units for Improving Korean Voice Recognition System - a Preliminary Research (한국어 음성인식 시스템 향상을 위한 동음이철 단위의 중의성 유형 분류)

  • Yoon, Ae-Sun;Kang, Mi-Young
    • Speech Sciences / v.15 no.4 / pp.67-81 / 2008
  • The accuracy of P2G (phoneme-to-grapheme) conversion is one of the important factors determining the quality of unlimited voice recognition (VR) systems. Few studies, however, have sought to reduce the ambiguity of a phoneme string that can be segmented into a variety of linguistic units (i.e., morphemes, words, eo-jeols) and thus transformed into more than one grapheme string. This paper is a preliminary study toward building a large knowledge base of such homonymic and heterographic units (HHUs), which will provide unlimited Korean VR systems with more accurate P2G information. It analyzes the two main factors that generate HHUs: (1) boundary determination of the prosodic unit and (2) its segmentation into linguistic units. The linguistic characteristics that determine the variable boundaries of a prosodic unit are investigated, and the ambiguity types of HHUs are classified according to their morphological and syntactic structures as well as the phonological rules governing them.

Face Recognition by Using Principal Component Analysis and Fixed-Point Independent Component Analysis (주요성분분석과 고정점 알고리즘 독립성분분석에 의한 얼굴인식)

  • Cho, Yong-Hyun
    • Journal of the Korean Society of Industry Convergence / v.8 no.3 / pp.143-148 / 2005
  • This paper presents a hybrid method for face recognition using principal component analysis (PCA) and fixed-point independent component analysis (FP-ICA). PCA is used to whiten the data, which reduces the effect of second-order statistics on the nonlinearities. FP-ICA is then applied to extract statistically independent features from the face images. The proposed method has been applied to recognizing 20 face images (10 persons × 2 scenes) of 324×243 pixels from the Yale face database. Three distances (city-block, Euclidean, and negative angle) are used as measures when matching the probe images to the nearest gallery images. The experimental results show that the proposed method has superior recognition performance (speed and rate). The negative angle gave a relatively more accurate similarity measure than the city-block or Euclidean distance.
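
The three matching measures named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the feature vectors are plain Python lists standing in for FP-ICA features, and "negative angle" is assumed here to mean the negative cosine of the angle between two vectors.

```python
import math


def city_block(a, b):
    """City-block (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Euclidean (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_angle(a, b):
    """Negative cosine of the angle between a and b (assumed definition).

    Smaller values mean more similar vectors, so it can be minimized
    like the two distances above.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return -dot / (norm_a * norm_b)


def nearest_gallery(probe, gallery, distance):
    """Return the label of the gallery vector closest to the probe."""
    return min(gallery, key=lambda label: distance(probe, gallery[label]))


# Hypothetical two-person gallery with 2-D feature vectors.
gallery = {"person_a": [1.0, 0.0], "person_b": [0.0, 1.0]}
probe = [0.9, 0.2]
for measure in (city_block, euclidean, negative_angle):
    print(measure.__name__, nearest_gallery(probe, gallery, measure))
```

Because the negative angle ignores vector length, it is insensitive to overall feature scaling, which is one plausible reason it could behave differently from the other two measures.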

Exploiting Chaotic Feature Vector for Dynamic Textures Recognition

  • Wang, Yong;Hu, Shiqiang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.11 / pp.4137-4152 / 2014
  • This paper investigates the ability of a chaotic feature vector to describe dynamic textures. First, a chaotic feature and other features are calculated from each pixel intensity series. These features are then combined into a chaotic feature vector, so that a video is modeled as a feature vector matrix. Next, using the bag-of-words framework, we explore the representation ability of the proposed chaotic feature vector. Finally, we investigate the recognition rates of different combinations of chaotic features. Experimental results show the merit of the chaotic feature vector for representing pixel intensity series.

Histogram Equalization Using Background Speakers' Utterances for Speaker Identification (화자 식별에서의 배경화자데이터를 이용한 히스토그램 등화 기법)

  • Kim, Myung-Jae;Yang, Il-Ho;So, Byung-Min;Kim, Min-Seok;Yu, Ha-Jin
    • Phonetics and Speech Sciences / v.4 no.2 / pp.79-86 / 2012
  • In this paper, we propose a novel approach to improving histogram equalization for speaker identification. Our method collects all speech features of the universal background model (UBM) training data to form a reference distribution. The ranks of the feature vectors are computed in the sorted list of the pooled UBM training data and test data, and these ranks are used to perform order-based histogram equalization. The proposed method improves the accuracy of the speaker recognition system for short utterances. We evaluate the proposed system on four speech databases and compare it with cepstral mean normalization (CMN), mean and variance normalization (MVN), and histogram equalization (HEQ). Our system achieved a relative error rate reduction of 33.3% over the baseline system.
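
The rank-based idea in this abstract can be illustrated for a single feature dimension. The sketch below rests on several assumptions that the abstract does not confirm: the function name is hypothetical, the target distribution is taken to be a standard normal (a common choice in histogram equalization), and the rank-to-CDF conversion uses a simple midpoint rule.

```python
from bisect import bisect_right
from statistics import NormalDist


def order_based_heq(test_values, background_values):
    """Equalize one feature dimension using ranks in a pooled sorted list.

    Each test value is ranked among the background (UBM-like) data plus
    the test data, the rank is converted to an empirical CDF value, and
    that value is mapped through the inverse CDF of a standard normal
    (assumed target distribution).
    """
    pooled = sorted(list(background_values) + list(test_values))
    n = len(pooled)
    target = NormalDist()  # mean 0, standard deviation 1
    equalized = []
    for value in test_values:
        rank = bisect_right(pooled, value)  # 1..n, since value is in pooled
        cdf = (rank - 0.5) / n              # strictly inside (0, 1)
        equalized.append(target.inv_cdf(cdf))
    return equalized


# Hypothetical background data and a short test utterance.
background = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(order_based_heq([0.0, 10.0], background))
```

Pooling background data with the test data gives a short utterance many more points to rank against, which is the intuition behind the reported gain on short utterances.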

A Study On The Automatic Discrimination Of The Korean Alveolar Stops (한국어 파열음의 자동 인식에 대한 연구 : 한국어 치경 파열음의 자동 분류에 관한 연구)

  • Choi, Yun-Seok;Kim, Ki-Seok;Hwang, Hee-Yeung
    • Proceedings of the KIEE Conference / 1987.11a / pp.330-333 / 1987
  • This paper studies the automatic discrimination of the Korean alveolar stops. For automatic speech recognition of Korean, it is necessary to discriminate aspirated and tense plosives, because Korean speakers distinguish among aspirated, tense, and lax plosives. To find acoustic cues for the automatic recognition of [ㄲ, ㄸ, ㅃ], we conducted experiments on discriminating [ㄷ, ㄸ, ㅌ]. As acoustic parameters, we used temporal cues such as voice onset time (VOT) and silence duration, and energy cues such as the ratio of high-frequency to low-frequency energy. The experiments use VCV speech data, where each V is one of the 8 simple vowels and C is one of the 3 alveolar stops. On 192 speech samples, the recognition rate was about 82%-95%.

Registration Error Compensation for Face Recognition Using Eigenface (Eigenface를 이용한 얼굴인식에서의 영상등록 오차 보정)

  • Moon Ji-Hye;Lee Byung-Uk
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.5C / pp.364-370 / 2005
  • The first step of face recognition is to align an input face picture with the database images. We propose a new algorithm for removing registration error in eigenspace. Our algorithm can correct for translation, rotation, and scale changes. Linear matrix modeling of the registration error enables us to compensate for subpixel errors in eigenspace. After calculating the derivative of a weighting vector in eigenspace, we can obtain the amount of translation or rotation without a time-consuming search. We verify that the correction enhances the recognition rate dramatically.

Detection and Recognition of Vehicle Brake Lights using an R-Filtering (R-필터링을 이용한 자동차 브레이크등 검출과 인식)

  • Jung, Min-Chul
    • Journal of the Semiconductor & Display Technology / v.10 no.4 / pp.95-100 / 2011
  • This paper proposes a new method for detecting and recognizing vehicle brake lights using R-filtering. First, the proposed method applies R-filtering to the first input image and then to the second in order to detect brake lights. Second, it counts the number of red pixels and computes the mean value in each R-filtered image. The paper defines difference rates between the red-pixel counts and between the mean values of the two images. By analyzing these difference rates, the method can recognize whether the brake lights are on or off and whether the vehicle ahead is being approached. The proposed method is implemented in C on an embedded Linux system for high-speed, real-time image processing. Experimental results show that the proposed algorithm is quite successful.
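
The counting and difference-rate steps in this abstract can be sketched as below. The abstract does not give the R-filter rule or the exact difference-rate formula, so everything specific here is an assumption for illustration: the thresholds `min_red` and `margin`, the relative-change formula, and all function names are hypothetical (the paper's own implementation is in C).

```python
def r_filter(image, min_red=150, margin=50):
    """Keep the red values of pixels that look strongly red.

    Assumed rule: red channel at least `min_red` and exceeding both
    green and blue by at least `margin`. `image` is a list of rows of
    (r, g, b) tuples.
    """
    return [
        r
        for row in image
        for (r, g, b) in row
        if r >= min_red and r - max(g, b) >= margin
    ]


def red_stats(image):
    """Return (number of red pixels, mean red value) after R-filtering."""
    reds = r_filter(image)
    count = len(reds)
    mean = sum(reds) / count if count else 0.0
    return count, mean


def difference_rate(first, second):
    """Relative change from the first frame to the second (assumed formula)."""
    return (second - first) / first if first else 0.0


# Two hypothetical 1x2 frames: more and brighter red pixels in the second.
frame1 = [[(200, 20, 20), (90, 90, 90)]]
frame2 = [[(250, 30, 30), (250, 40, 40)]]
count1, mean1 = red_stats(frame1)
count2, mean2 = red_stats(frame2)
print(difference_rate(count1, count2), difference_rate(mean1, mean2))
```

Under these assumptions, a rise in the mean red value would suggest the lights turning on, while a rise in the red-pixel count would suggest the vehicle ahead getting closer; the decision thresholds themselves are not stated in the abstract.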

CHMM Modeling using LMS Algorithm for Continuous Speech Recognition Improvement (연속 음성 인식 향상을 위해 LMS 알고리즘을 이용한 CHMM 모델링)

  • Ahn, Chan-Shik;Oh, Sang-Yeob
    • Journal of Digital Convergence / v.10 no.11 / pp.377-382 / 2012
  • This paper proposes an echo-noise-robust CHMM learning model that uses an echo cancellation average estimator LMS algorithm so that it can adapt to changing echo noise. To improve continuous speech recognition performance, CHMM models were constructed using the echo noise cancellation average estimator LMS algorithm. As a result, the SNR of the speech obtained after removing the changing environmental noise improved by 1.93 dB on average, and the recognition rate improved by 2.1%.
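
For readers unfamiliar with LMS, the sketch below shows the standard least-mean-squares adaptive filter used for echo cancellation. It is the textbook algorithm, not the "average estimator" variant of the paper, whose details the abstract does not give; the function name, tap count, and step size are illustrative assumptions.

```python
def lms_filter(reference, observed, num_taps=4, step_size=0.05):
    """Standard LMS adaptive filter.

    `reference` is the signal that causes the echo and `observed` is the
    signal containing the echo. The filter learns weights that predict
    the echo from recent reference samples; the prediction error is the
    echo-cancelled output.
    """
    weights = [0.0] * num_taps
    errors = []
    for n in range(len(observed)):
        # Most recent `num_taps` reference samples, zero-padded at the start.
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        estimate = sum(w * xi for w, xi in zip(weights, x))
        error = observed[n] - estimate
        # LMS update: move the weights along the instantaneous gradient.
        weights = [w + step_size * error * xi for w, xi in zip(weights, x)]
        errors.append(error)
    return weights, errors
```

Since the weights are updated at every sample, the filter can track an echo path that changes over time, which is the property the abstract relies on for changing echo noise.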

Research of Gesture Recognition Technology Based on GMM and SVM Hybrid Model Using EPIC Sensor (EPIC 센서를 이용한 GMM, SVM 기반 동작인식기법에 관한 연구)

  • CHEN, CUI;Kim, Young-Chul
    • Proceedings of the Korea Contents Association Conference / 2016.05a / pp.11-12 / 2016
  • The support vector machine (SVM) is a powerful machine-learning method that performs better than traditional methods in multi-dimensional nonlinear pattern classification. To address the low efficiency of SVM training on large sample sets, this paper proposes combining the SVM with statistical parameters of the GMM-UBM (Gaussian mixture model-universal background model). This is very effective in solving the large-sample problem in SVM training. Experiments were carried out on four special dynamic hand gestures using EPIC sensors. The results show that the improved dynamic hand gesture recognition system achieves a recognition rate of up to 96.75%.

A Study on Korean Connected Digit Recognizer Based on Semi-syllable and Post-processing (반음절기반의 한국어 연속숫자음인식과 그 후처리에 대한 연구)

  • Jeong, Jae-Boo;Chung, Hoon;Chung, Ik-Joo
    • Speech Sciences / v.8 no.4 / pp.1-15 / 2001
  • This paper describes the effect of a new recognition unit based on the semi-syllable, together with its post-processing method. A semi-syllable-based recognition unit captures the coarticulation effects in Korean connected digits. An existing semi-syllable method constrains the next models, derived from the currently recognized models, in order to form a complete connected digit sequence. This paper instead uses a new method to form the complete sequence: the new post-processing method recognizes isolated digit words, including digit sequences, from the digit combinations that can arise from the currently recognized semi-syllable sequence. This method gives a higher accuracy rate than the existing method. The new post-processing provides two advantages: (1) it corrects misrecognized semi-syllable units, and (2) speakers can say each digit without regard to its duration.
