• Title/Summary/Keyword: recognition-rate


A Facial Feature Area Extraction Method for Improving Face Recognition Rate in Camera Image (일반 카메라 영상에서의 얼굴 인식률 향상을 위한 얼굴 특징 영역 추출 방법)

  • Kim, Seong-Hoon;Han, Gi-Tae
    • KIPS Transactions on Software and Data Engineering, v.5 no.5, pp.251-260, 2016
  • Face recognition is a technology that extracts features from a facial image, learns the features through various algorithms, and recognizes a person by comparing the learned data with the features of a new facial image. Improving the face recognition rate requires various processing methods. In the training stage, features must be extracted from a facial image, and linear discriminant analysis (LDA) is the method mainly used for this. LDA represents a facial image as a point in a high-dimensional space and extracts the facial features that distinguish a person by analyzing the class information and the distribution of the points. Because the position of a point is determined by the pixel values of the facial image, LDA may extract incorrect facial features if the image includes unnecessary or frequently changing areas. In particular, when a camera image is used for face recognition, the size of the face varies with the distance between the face and the camera, which lowers the recognition rate. To solve this problem, this paper detects the facial area in a camera image, removes unnecessary areas using the facial feature area calculated via a Gabor filter, and normalizes the size of the facial area. Facial features are extracted through LDA from the normalized facial image and learned by an artificial neural network for face recognition. As a result, the face recognition rate improved by approximately 13% compared to the existing face recognition method that includes unnecessary areas.

A Study on the Automatic Recognition of Korean Basic Spoken Digit Using Energy of Special Bandwidth (특정 대역 에너지를 이용한 한국어 기본 수자 음성의 백동 인식에 관한 연구)

  • Han, Hee;Kim, Soon-Hyob;Park, Kyu-Tae
    • Journal of the Korean Institute of Telematics and Electronics, v.19 no.3, pp.5-12, 1982
  • Recognition of Korean basic spoken digits is performed by using the energy ratio of special bandwidths of basic vowels in logical combination with a zero-crossing rate and an energy parameter. In the recognition experiments, the speech signal of the spoken digits is filtered by a lowpass filter with a cutoff frequency of 10 kHz and then sampled at a rate of 20 kHz. For speech signal processing, we used four FIR digital filters with filter lengths of 61, 120, 25, and 25, respectively. The filters are designed using the Remez exchange algorithm [13],[14]. As a result, a recognition rate of 92% is obtained for the three speakers.
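
The three parameters named in this abstract (zero-crossing rate, energy, and band energy ratio) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names are hypothetical, the band-filtered frame is assumed to come from an FIR filter that is not shown, and the abstract does not give the thresholds or the decision logic that combines the parameters.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def energy(frame):
    """Short-time energy: sum of squared samples in the frame."""
    return sum(x * x for x in frame)


def band_energy_ratio(band_frame, full_frame):
    """Energy of a band-filtered frame relative to the unfiltered frame.

    band_frame is assumed to be the output of a bandpass FIR filter
    applied to full_frame (the filter itself is not part of this sketch).
    """
    total = energy(full_frame)
    return energy(band_frame) / total if total > 0 else 0.0
```

A frame that alternates in sign on every sample gives a zero-crossing rate of 1.0, while a frame with constant sign gives 0.0.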


A Study on VQ/HMM using Nonlinear Clustering and Smoothing Method (비선형 집단화와 완화기법을 이용한 VQ/HMM에 관한 연구)

  • 정희석;강철호
    • The Journal of the Acoustical Society of Korea, v.18 no.3, pp.35-42, 1999
  • In this paper, a modified clustering algorithm is proposed to improve the discrimination of the discrete HMM (Hidden Markov Model); it increases the recognition rate by 2.16% compared with the original HMM using the K-means or LBG algorithm. In addition, to prevent the recognition rate from decreasing because of insufficient training data in HMM training, a modified probabilistic smoothing method is proposed, which increases the recognition rate by 3.07% for the speaker-independent case. In the experiment applying the two proposed algorithms together, the average recognition rate increases by 4.66% for the speaker-independent case compared with that of the original VQ/HMM.


An Enhanced Text-Prompt Speaker Recognition Using DTW (DTW를 이용한 향상된 문맥 제시형 화자인식)

  • 신유식;서광석;김종교
    • The Journal of the Acoustical Society of Korea, v.18 no.1, pp.86-91, 1999
  • This paper presents a text-prompt method to overcome the weaknesses of text-dependent and text-independent speaker recognition. An enhanced dynamic time warping (DTW) algorithm is applied for speaker recognition. For real-time processing, we use a simple end-point detection algorithm that does not increase computational complexity. The test shows that the weighted cepstrum is the most suitable of the various speech parameters for speaker recognition. In the experimental results of the proposed algorithm for three prompt words, the speaker identification error rate is 0.02%; for speaker verification, when the threshold is set properly, the false rejection rate is 1.89%, the false acceptance rate is 0.77%, and the total verification error rate is 0.97%.
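
For readers unfamiliar with DTW, the sketch below shows the classic dynamic time warping distance between two sequences of feature vectors. It is the textbook form only; the abstract does not describe what the paper's enhancement changes, so nothing here should be read as the authors' algorithm. The function name and the use of Euclidean distance between frames are assumptions made for illustration.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of equal-dimension vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local distance between the two frames (Euclidean, by assumption)
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths
            cost[i][j] = d + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

In a verification setting, a distance like this would be compared with a threshold to accept or reject a claimed speaker; the threshold value is not given in the abstract.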


An Implementation of Security System Using Speaker Recognition Algorithm (화자인식 알고리즘을 이용한 보안 시스템 구축)

  • Shin, You-Shik;Park, Kee-Young;Kim, Chong-Kyo
    • Journal of the Korean Institute of Telematics and Electronics T, v.36T no.4, pp.17-23, 1999
  • This paper describes a security system using a text-independent speaker recognition algorithm. The security system is based on a PIC16F84 and a sound card. The speaker recognition algorithm applies a k-means based model and uses the weighted cepstrum as the speech feature. In the experimental results, the recognition rate is 100% for the training data and 99% for the non-training data. In addition, for 5 registered persons, the false rejection rate is 1%, the false acceptance rate is 0%, and the mean verification error rate is 0.5%.
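
The reported mean verification error rate of 0.5% is consistent with the arithmetic mean of the false rejection rate (1%) and the false acceptance rate (0%). The sketch below computes the three figures from trial counts. The function name and the trial counts in the usage note are hypothetical, and treating the mean error as a simple average of the two rates is an assumption, since the abstract does not define it.

```python
def verification_rates(false_rejections, genuine_trials,
                       false_acceptances, impostor_trials):
    """Return (FRR, FAR, mean error rate), all in percent.

    FRR: share of genuine-speaker trials that were rejected.
    FAR: share of impostor trials that were accepted.
    The mean error is taken as the arithmetic mean of FRR and FAR
    (an assumption; the source does not state its definition).
    """
    frr = 100.0 * false_rejections / genuine_trials
    far = 100.0 * false_acceptances / impostor_trials
    return frr, far, (frr + far) / 2.0
```

With hypothetical counts of 1 false rejection in 100 genuine trials and 0 false acceptances in 100 impostor trials, the function returns 1.0, 0.0, and 0.5, matching the percentages in the abstract.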


CROSS-LANGUAGE SPEECH PERCEPTION BY KOREAN AND POLISH.

  • Paradowska, Anna
    • Proceedings of the KSPS conference, 2000.07a, pp.178-178, 2000
  • This paper is concerned with adults' foreign language acquisition and investigates the relationship between the phonetic system of the mother tongue (L1) and the perception of the foreign language (L2), here Polish and Korean. The questions intended to help define this relationship are 1) how Poles perceive Korean vowels, 2) how Koreans perceive Polish vowels, and 3) how Koreans perceive Korean vowels pronounced by Poles. In order to identify L2 vowels, the listeners try to fit them into the categories of their own language (L1). On the one hand, vowels that are the same in both languages, and those articulated where no other vowel is articulated, have the best rate of recognition. For example, /i/ is a front close vowel in both languages, and neither language has any other front close vowel. Therefore, the vowels /i/ (and /a/) have the best rate of recognition in all three experiments. On the other hand, vowels that are unfamiliar to the listeners do not seem to have the worst rate of recognition. The vowels with the worst rate of recognition are those that are similar to, but not quite the same as, those of L1. This research supports the claim that "equivalence classification prevents L2 learners from producing similar L2 phones, but not new L2 phones, authentically" (Flege, 1987). Polish speakers can pronounce unfamiliar L2 vowels "more authentically" than those similar to L1 vowels. However, the difference is not significant, and this subject requires further research (different data, more informants).


Development of Recognition and Reaction Time Prediction Model in Road Signs using Negative Binomial Regression (음이항회귀식을 이용한 도로표지의 인지반응시간 추정모형 개발)

  • Park, Hyung-Jin;Lee, Ki-Young;Kim, Jung-Young
    • Journal of the Ergonomics Society of Korea, v.25 no.4, pp.23-33, 2006
  • The purpose of this study is to determine an economical standard for road signs by verifying how drivers' recognition and reaction time differ according to the space rate of the letters on the signs. To this end, indoor simulations were conducted to confirm the difference in recognition and reaction time for six sign targets with different space rates. In addition, a negative binomial regression model was used to find the main factors that could lower the rate of misreading. According to this model, increasing the legibility of a sign requires not only simply enlarging the sign but also suitably matching the letters to the sign. The results of this study verify the importance of the space rate in road signs and can be used as an effective method for determining the standard of road signs.

The research on the MEMS device improvement which is necessary for the noise environment in the speech recognition rate improvement (잡음 환경에서 음성 인식률 향상에 필요한 MEMS 장치 개발에 관한 연구)

  • Yang, Ki-Woong;Lee, Hyung-keun
    • Journal of the Korea Institute of Information and Communication Engineering, v.22 no.12, pp.1659-1666, 2018
  • When the input is a mixture of voice and other sound, the speech recognition rate drops because of the noise. To overcome the limits of software (S/W) processing, this study improves the speech recognition rate by improving the MEMS device, which is the hardware (H/W) component. The MEMS microphone is a device for voice input that is implemented and used in various shapes. Conventional MEMS microphones generally perform well, but in a special environment such as a noisy one, their processing performance deteriorates because voice and other sound are mixed. To overcome these problems, we developed a newly designed MEMS device that can detect the voice characteristics of the initial input device.

Oral health education on recognition and their prevalence of dental caries comparative analysis of some primary school pupils' in Buckjeju-gun (북제주군 일부 초등학교 학동들의 구강보건교육에 관련된 인식도 및 영구치 우식경험도 비교 평가)

  • Kim, Youn-Hwa
    • Journal of Korean society of Dental Hygiene, v.2 no.1, pp.1-19, 2002
  • This study was conducted alongside continuous dental hygiene education for primary school pupils over five years, from 1997 through 2001, based on data obtained from a 1997 survey on primary school pupils' recognition of dental hygiene education and their permanent dental health capacity. The following results were drawn through comparative analysis of the data obtained during the survey period. Approximately 70.77% of the examinees had decayed, missing, or filled (DMF) teeth in 2001, suggesting that dental hygiene education was effective compared with the DMF rate of 92.1% in 1997. Pupils' knowledge and recognition of dental hygiene and management improved, and their eating habits and consciousness changed. Comparative analysis of annual DMF showed that the DMF rate, DMFT index, and DT rate decreased every year, suggesting an improved dental health capacity. Grade-level analysis revealed that the DMFT index and DT rate decreased every year during the survey period, suggesting that pupils' dental management and consciousness improved. The DMF rate increased more significantly with grade level in 2001 than in 1997. There was no difference in DMF rate between primary school grades in 1997; in 2001, however, an increase of approximately 10% in DMF rate was observed in higher grades.


Noise Elimination Using Improved MFCC and Gaussian Noise Deviation Estimation

  • Sang-Yeob, Oh
    • Journal of the Korea Society of Computer and Information, v.28 no.1, pp.87-92, 2023
  • With the continuous development of speech recognition systems, the recognition rate for speech has improved rapidly, but such systems still cannot recognize the voice accurately when various voices are mixed with noise in the use environment. To increase the vocabulary recognition rate when processing speech with environmental noise, the noise must be removed. Even in the existing HMM, CHMM, GMM, and DNN approaches applied with AI models, unexpected noise occurs or quantization noise is added to the digital signal. When this happens, the source signal is altered or corrupted, which lowers the recognition rate. To solve this problem, the MFCC was improved so that the features of the speech signal are extracted efficiently for each voice frame. To remove the noise from the speech signal, the noise removal method using a Gaussian model with noise deviation estimation was improved and applied. The performance of the proposed model was evaluated using a cross-correlation coefficient to assess the accuracy of the speech. As a result of evaluating the recognition rate of the proposed method, it was confirmed that the difference in the average value of the correlation coefficient was improved by 0.53 dB.
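
The evaluation metric mentioned in this abstract can be illustrated with a plain Pearson correlation coefficient between two equal-length signals, for example a clean reference and a denoised output. This is a sketch under assumptions: the abstract does not say which form of cross-correlation coefficient the paper uses or how it is converted to dB, and the function name is hypothetical.

```python
import math


def correlation_coefficient(x, y):
    """Pearson correlation between two equal-length sample sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Covariance and variances (unnormalized; the 1/N factors cancel)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal has no defined correlation
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 indicates that the processed signal closely follows the reference, and a value near 0 indicates little linear relationship.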