• Title, Summary, Keyword: recognition

A Study on the Optimal Mahalanobis Distance for Speech Recognition

  • Lee, Chang-Young
    • Speech Sciences / v.13 no.4 / pp.177-186 / 2006
  • To improve feature vector classification and thereby reduce the error rate of speaker-independent speech recognition, we employ the Mahalanobis distance as the similarity measure between feature vectors. The metric matrix of the Mahalanobis distance is assumed to be diagonal to reduce memory and computation time. We propose that the diagonal elements be given in terms of the variations of the feature vector components. Geometrically, this prescription tends to redistribute the data set into the shape of a hypersphere in the feature vector space. The idea is applied to speech recognition by a hidden Markov model with fuzzy vector quantization. The results show that recognition improves with an appropriate choice of the relevant adjustable parameter. The Viterbi score difference between the two winners in the recognition test behaves in accord with the recognition error rate.

  • PDF
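As a rough illustration of the diagonal Mahalanobis distance described in the abstract by Lee above, the sketch below weights each squared component difference by the inverse of that component's variance. The exponent `alpha` is a hypothetical stand-in for the "adjustable parameter" the abstract mentions; the abstract does not give the exact form of the diagonal elements, so `1 / variance**alpha` is an assumption made only for illustration.

```python
from statistics import pvariance


def diagonal_metric(vectors, alpha=1.0):
    """Per-component weights 1 / variance**alpha (assumed form, not from the paper).

    alpha=1 gives the usual diagonal Mahalanobis weights; alpha=0 gives
    plain Euclidean distance. Components with zero variance are not handled.
    """
    dims = range(len(vectors[0]))
    variances = [pvariance([v[i] for v in vectors]) for i in dims]
    return [1.0 / (var ** alpha) for var in variances]


def mahalanobis_diag(x, y, weights):
    """Distance between x and y under a diagonal metric matrix."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)) ** 0.5


# Toy data: the second component varies ten times more than the first.
data = [(0.0, 0.0), (2.0, 20.0), (4.0, 40.0)]
weights = diagonal_metric(data, alpha=1.0)
print(mahalanobis_diag(data[0], data[1], weights))
```

With `alpha=1.0` both components contribute equally to the distance even though their raw scales differ, which is the "hypersphere" effect the abstract describes.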

Performance Comparison of the Speech Enhancement Methods for Noisy Speech Recognition (잡음음성인식을 위한 음성개선 방식들의 성능 비교)

  • Chung, Yong-Joo
    • Phonetics and Speech Sciences / v.1 no.2 / pp.9-14 / 2009
  • Speech enhancement methods can generally be classified into a few categories, and they have usually been compared with each other in terms of speech quality. For speech enhancement methods to be used successfully in speech recognition systems, they must also be compared in terms of speech recognition accuracy. In this paper, we compare the speech recognition performance of several representative speech enhancement algorithms that are widely cited in the literature and widely used. We also compare the performance of speech enhancement methods with other noise-robust speech recognition methods such as PMC to verify the usefulness of speech enhancement approaches in noise-robust speech recognition systems.

  • PDF

Smart Phone Road Signs Recognition Model Using Image Segmentation Algorithm

  • Huang, Ying;Song, Jeong-Young
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / pp.887-890 / 2012
  • Image recognition is one of the most important research directions in pattern recognition. Image-based automatic road identification technology is widely used today, and intelligent systems have become a trend of the times. This paper studies the theory of the image segmentation algorithm and its application in a road sign recognition system. Using image processing techniques, we systematically study the three main parts of the automatic road sign recognition algorithm, namely image segmentation, character segmentation, and image and character recognition, and develop corresponding algorithms. The experimental results show that a road sign recognition model built on the image segmentation algorithm can be used effectively in smartphone systems and applications.

  • PDF

Face Recognition Based on PCA on Wavelet Subband of Average-Half-Face

  • Satone, M.P.;Kharate, G.K.
    • Journal of Information Processing Systems / v.8 no.3 / pp.483-494 / 2012
  • Many recent events, such as terrorist attacks, have exposed defects in the most sophisticated security systems. It is therefore necessary to improve security data systems based on bodily or behavioral characteristics, often called biometrics. With the growing interest in human and computer interfaces and in biometric identification, human face recognition has become an active research area. Face recognition appears to offer several advantages over other biometric methods. Principal Component Analysis (PCA) has been widely adopted for face recognition algorithms. However, PCA has limitations such as poor discriminatory power and a large computational load. This paper proposes a novel algorithm for face recognition that uses a mid-band frequency component of partial information for the PCA representation. Because the human face has even symmetry, half of a face is sufficient for face recognition. This partial information saves storage and computation time. Compared with the traditional use of PCA, the proposed method gives better recognition accuracy and discriminatory power. Furthermore, the proposed method significantly reduces the computational load and storage.
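The half-face idea in the abstract above can be sketched as follows. This is a minimal illustration that assumes the average-half-face is formed by averaging the left half of an image with the mirrored right half; the function name `average_half_face` and the toy pixel values are hypothetical, and the wavelet subband and PCA stages of the paper are not shown.

```python
def average_half_face(image):
    """Average the left half of each row with the mirrored right half.

    `image` is a list of rows of pixel intensities. For an odd width the
    centre column is dropped. Assumes the face is centred in the image.
    """
    half = len(image[0]) // 2
    result = []
    for row in image:
        left = row[:half]
        mirrored_right = row[::-1][:half]  # right half, flipped left-to-right
        result.append([(a + b) / 2 for a, b in zip(left, mirrored_right)])
    return result


# Toy 2x4 "image": the output has half as many columns as the input.
face = [[10, 20, 30, 40], [50, 60, 70, 80]]
half_face = average_half_face(face)
print(half_face)  # [[25.0, 25.0], [65.0, 65.0]]
```

Halving the number of columns is what yields the storage and computation savings the abstract refers to, since every later stage operates on half as many pixels.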

Real-Time Implementation of Wireless Remote Control of Mobile Robot Based-on Speech Recognition Command (음성명령에 의한 모바일로봇의 실시간 무선원격 제어 실현)

  • Shim, Byoung-Kyun;Han, Sung-Hyun
    • Journal of The Korean Society of Manufacturing Technology Engineers / v.20 no.2 / pp.207-213 / 2011
  • In this paper, we present a study on the real-time implementation of a mobile robot to which an interactive voice recognition technique is applied. The speech command is uttered as a sentence of connected words and issued through the wireless remote control system. We implement an automatic distance speech command recognition system for interactive voice-enabled services. We construct a baseline automatic speech command recognition system in which the acoustic models are trained on speech utterances recorded through a microphone. To improve the performance of the baseline system, the acoustic models are adapted to the spectral characteristics of speech for different microphones and to the environmental mismatch between cross talking and distance speech. We illustrate the performance of the developed speech recognition system by experiments. The results show that the proposed speech recognition system achieves an average recognition rate of about 95% or higher.

Multimodal Face Biometrics by Using Convolutional Neural Networks

  • Tiong, Leslie Ching Ow;Kim, Seong Tae;Ro, Yong Man
    • Journal of Korea Multimedia Society / v.20 no.2 / pp.170-178 / 2017
  • Biometric recognition is a challenging topic that demands high recognition accuracy. Most existing methods rely on a single biometric source to achieve recognition. Recognition accuracy in biometrics is affected by several sources of variability, including illumination and appearance variations. In this paper, we propose a new multimodal biometric recognition method using a convolutional neural network. We focus on multimodal biometrics from the face and periocular regions. Through experiments, we demonstrate that a deep learning framework using facial multimodal biometric features helps achieve high recognition performance.

Implementation of Human and Computer Interface for Detecting Human Emotion Using Neural Network (인간의 감정 인식을 위한 신경회로망 기반의 휴먼과 컴퓨터 인터페이스 구현)

  • Cho, Ki-Ho;Choi, Ho-Jin;Jung, Seul
    • Journal of Institute of Control, Robotics and Systems / v.13 no.9 / pp.825-831 / 2007
  • In this paper, an interface between a human and a computer is presented. The human and computer interface (HCI) serves as another area of human and machine interfaces. The HCI methods we use are voice recognition and image recognition for detecting human emotional states. The idea is that the computer recognizes the present emotional state of the human operator and amuses him or her in various ways, such as playing music, searching the web, and talking. For the image recognition process, the human face is captured, and the eye and mouth regions are selected from the facial image for recognition. To train on images of the mouth, we use a Hopfield network. The results show 88% to 92% recognition of the emotion. For voice recognition, a neural network shows 80% to 98% recognition of the voice.

The Pattern Recognition Methods for Emotion Recognition with Speech Signal (음성신호를 이용한 감성인식에서의 패턴인식 방법)

  • Park, Chang-Hyun;Sim, Kwee-Bo
    • Journal of Institute of Control, Robotics and Systems / v.12 no.3 / pp.284-288 / 2006
  • In this paper, we apply several pattern recognition algorithms to an emotion recognition system based on speech signals and compare the results. First, emotional speech databases are needed, and the speech features for emotion recognition are determined in the database analysis step. Second, recognition algorithms are applied to these speech features. The algorithms we try are an artificial neural network, Bayesian learning, Principal Component Analysis, and the LBG algorithm. The performance gap among these methods is then presented in the experimental results section. Emotion recognition is not yet a mature technique: the selection of emotion features and of a suitable classification method are both still open to dispute. We therefore hope this paper will serve as a reference in those disputes.

Music Recognition Using Audio Fingerprint: A Survey (오디오 Fingerprint를 이용한 음악인식 연구 동향)

  • Lee, Dong-Hyun;Lim, Min-Kyu;Kim, Ji-Hwan
    • Phonetics and Speech Sciences / v.4 no.1 / pp.77-87 / 2012
  • Interest in music recognition has grown dramatically since NHN and Daum released their mobile applications for music recognition in 2010. Music recognition methods based on audio analysis fall into two categories: music recognition using audio fingerprints and Query-by-Singing/Humming (QBSH). While music recognition using audio fingerprints takes music as its input, QBSH takes a user-hummed melody. In this paper, research trends are described for music recognition using audio fingerprints, focusing on two methods: one based on fingerprint generation using the energy difference between consecutive bands, and the other based on hash key generation between peak points. Details presented in the representative papers for each method are introduced.
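The first fingerprinting approach named in the abstract above, which derives bits from energy differences between consecutive bands, can be sketched roughly as below. The sketch assumes per-frame band energies have already been computed (the spectral analysis is omitted) and assumes a sign-of-difference rule taken across both neighbouring bands and neighbouring frames; the function names and toy values are hypothetical, and the surveyed papers may differ in detail.

```python
def fingerprint_bits(band_energies):
    """Turn per-frame band energies into fingerprint bits.

    band_energies[n][m] is the energy of band m in frame n. A bit is 1 when
    the energy difference between bands m and m+1 grew from frame n-1 to n.
    """
    bits = []
    for prev, cur in zip(band_energies, band_energies[1:]):
        frame_bits = []
        for m in range(len(cur) - 1):
            diff = (cur[m] - cur[m + 1]) - (prev[m] - prev[m + 1])
            frame_bits.append(1 if diff > 0 else 0)
        bits.append(frame_bits)
    return bits


def hamming_distance(bits_a, bits_b):
    """Count differing bits between two equally shaped fingerprints."""
    return sum(
        a != b
        for row_a, row_b in zip(bits_a, bits_b)
        for a, b in zip(row_a, row_b)
    )


# Toy input: 3 frames with 3 bands each gives 2 frames of 2 bits.
energies = [[5.0, 3.0, 4.0], [6.0, 2.0, 5.0], [1.0, 2.0, 1.0]]
bits = fingerprint_bits(energies)
print(bits)  # [[1, 0], [0, 1]]
```

A query clip would typically be matched by finding the stored fingerprint with the smallest `hamming_distance`, since a bit-level comparison tolerates some distortion in the audio.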

Recognition of English Calling Cards by Using Projection Method and Enhanced RBF Network

  • Kim, Kwang-Baek
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.4 / pp.474-479 / 2003
  • In this paper, we propose a novel method for the recognition of English calling cards using the projection method and an enhanced RBF (Radial Basis Function) network. The recognition of calling cards consists of an extraction phase for character areas and a recognition phase for the extracted characters. In the extraction phase, noise is first removed from the calling card images, and the feature areas containing character strings are separated from the images using the horizontal smearing method and the 8-directional contour tracking method. Then, using the image projection method, the feature areas are split into areas of individual characters. We also propose an enhanced RBF network that organizes the middle layer effectively by using an enhanced ART1 neural network, which adjusts the vigilance threshold dynamically according to the homogeneity between patterns. In the recognition phase, the proposed neural network is applied to recognize individual characters. Our experimental results show that the proposed recognition algorithm has a higher recognition success rate and a faster learning time than existing neural network based recognition.
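The projection step mentioned in the abstract above, which splits a text area into individual characters, can be sketched as a vertical projection profile. This is a minimal illustration that assumes a binarized image in which 1 marks ink and assumes characters are separated by at least one empty column; the function name `split_by_projection` and the toy image are hypothetical and not taken from the paper.

```python
def split_by_projection(binary_image):
    """Return (start, end) column ranges of characters, end exclusive.

    The vertical projection counts ink pixels per column; runs of columns
    with a non-zero count are treated as one character each.
    """
    width = len(binary_image[0])
    profile = [sum(row[x] for row in binary_image) for x in range(width)]
    segments, start = [], None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x  # a character begins
        elif count == 0 and start is not None:
            segments.append((start, x))  # a character ends at a blank column
            start = None
    if start is not None:
        segments.append((start, width))  # character touching the right edge
    return segments


# Toy 2x7 image containing three "characters" separated by blank columns.
image = [
    [1, 1, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 0, 1],
]
print(split_by_projection(image))  # [(0, 2), (4, 5), (6, 7)]
```

Each returned column range can then be cropped and passed to a character classifier, which is the role the enhanced RBF network plays in the abstract.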