Search | Korea Science

Multi-layer Speech Processing System for Point-Of-Interest Recognition in the Car Navigation System (차량용 항법장치에서의 관심지 인식을 위한 다단계 음성 처리 시스템)

Bhang, Ki-Duck;Kang, Chul-Ho
- Journal of Korea Multimedia Society / v.12 no.1 / pp.16-25 / 2009
In the car environment, where safety is the first priority, a large-vocabulary isolated-word recognition system for the POI domain is required as the optimal HMI technique. A telematics terminal has highly limited processing time and memory capacity, so general speech recognition methods cannot process more than 100,000 words on the terminal. We therefore propose a phoneme recognizer using a phonetic GMM, together with a PDM Levenshtein distance in a multi-layer architecture, for POI recognition on the telematics terminal. With the proposed methods, we obtained high performance on a telematics terminal with low processing speed and small memory capacity: a maximum recognition rate of 94.8% in an indoor environment and 92.4% in car navigation environments.
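The matching step described in this abstract can be illustrated with a minimal sketch, assuming the plain (unweighted) Levenshtein distance rather than the paper's PDM variant, whose details are not given here. The phoneme symbols, the toy lexicon, and the function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two sequences; insert, delete, substitute cost 1 each."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def best_match(recognized, lexicon):
    """Return the lexicon entry whose phoneme sequence is closest to the input."""
    return min(lexicon, key=lambda name: levenshtein(recognized, lexicon[name]))


# Hypothetical toy lexicon: POI name -> phoneme sequence (illustrative only).
LEXICON = {
    "seoul station": ["s", "eo", "u", "l", "y", "eo", "k"],
    "city hall": ["s", "i", "ch", "eo", "ng"],
}

if __name__ == "__main__":
    # One phoneme is misrecognized ("r" instead of "l").
    print(best_match(["s", "eo", "u", "r", "y", "eo", "k"], LEXICON))
```

A linear scan like this would be too slow for a vocabulary of more than 100,000 words, which is presumably why the abstract describes a multi-layer architecture.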

A study on object recognition using morphological shape decomposition

Ahn, Chang-Sun;Eum, Kyoung-Bae
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 1999.05a / pp.185-191 / 1999
Mathematical morphology, which is based on set theory, has been applied to various areas of image processing. Pitas proposed an object recognition algorithm using Morphological Shape Decomposition (MSD) and a new representation scheme called Morphological Shape Representation (MSR). Pitas's algorithm is a simple and adequate approach for recognizing objects that are rotated in 45-degree units with respect to the model object. However, this recognition scheme fails for arbitrary rotations. This disadvantage may be compensated for by defining small angle increments, but that solution may greatly increase computational complexity, because a smaller step requires more rotations. In this paper, we propose a new method for object recognition based on MSD. The first step of our method decomposes a binary shape into a union of simple binary shapes; a new tree structure is then constructed that can represent the relations among the binary shapes in an object. Finally, we obtain feature information invariant to rotation, translation, and scaling from the tree and calculate matching scores using an efficient matching measure. Because our method does not need to rotate the object to be tested, it could be more efficient than Pitas's. MSR has an intricate structure, so calculating matching scores may be difficult even for a slightly complex object. Our tree has a simpler structure than MSR, which makes the matching score easier to calculate. We experimented with 20 test images that are scaled, rotated, and translated versions of five kinds of automobile images. The simulation using octagonal structuring elements shows a 95% correct recognition rate. Experimental results using approximated circular structuring elements are also examined, and the effect of noise on the MSR scheme is considered.

LSTM RNN-based Korean Speech Recognition System Using CTC (CTC를 이용한 LSTM RNN 기반 한국어 음성인식 시스템)

Lee, Donghyun;Lim, Minkyu;Park, Hosung;Kim, Ji-Hwan
- Journal of Digital Contents Society / v.18 no.1 / pp.93-99 / 2017
A hybrid approach using a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) has shown great improvement in speech recognition accuracy. Training an acoustic model with the hybrid approach requires forced alignment of the HMM state sequence from a Gaussian Mixture Model (GMM)-Hidden Markov Model (HMM), and training the GMM-HMM takes considerable computation time. This paper proposes an end-to-end approach for LSTM RNN-based Korean speech recognition to improve learning speed. A Connectionist Temporal Classification (CTC) algorithm is used to implement this approach. The proposed method showed almost equal recognition performance, while learning was 1.27 times faster.
https://doi.org/10.9728/dcs.2017.18.1.93
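CTC removes the need for frame-level forced alignment because the network emits one label (or a blank) per frame and the output sequence is recovered by collapsing. A minimal sketch of that collapsing rule, as used in greedy (best-path) decoding, is shown here. The blank symbol `_` and the function name are assumptions for illustration, and the paper's own decoder is not described in the abstract.

```python
from itertools import groupby

BLANK = "_"  # assumed blank symbol; real systems usually reserve an index


def ctc_collapse(frame_labels, blank=BLANK):
    """Apply the CTC collapsing rule to a per-frame label sequence.

    1. Merge consecutive repeated labels.
    2. Drop blank labels.
    """
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]


if __name__ == "__main__":
    frames = list("__hh_e_ll_l_oo_")
    print("".join(ctc_collapse(frames)))
```

The blank between the two `l` runs is what lets a genuinely doubled letter survive the merge step.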

A Fast Recognition System of Gothic-Hangul using the Contour Tracing (윤곽선 추적에 의한 고딕체 한글의 신속인식에 관한 연구)

정주성;김춘석;박충규
- The Transactions of the Korean Institute of Electrical Engineers / v.37 no.8 / pp.579-587 / 1988
Conventional methods for automatic recognition of Korean characters consist of thinning, segmentation of connected fundamental phonemes, and recognition of each fundamental character. These methods, however, require thinning, which is complex and time-consuming. In addition, several noise components affect character recognition more severely than when no thinning is applied. This paper describes a method for extracting the feature components of the fundamental characters of Gothic Korean letters without thinning. We regard the line components of the contour that describes the character's external boundary as the feature components. Each line component includes a directional code, a length, and a start point in the image. Each fundamental character is represented by a string of directional codes, so the recognition process reduces to string pattern matching. We use Gothic Hangul in the experiment. The recognition rate is 92%.
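The line-component representation in this abstract can be sketched roughly as follows, assuming an 8-direction Freeman chain code over an already traced contour. The abstract does not specify the paper's actual direction coding or tracing procedure, so the code table and function names here are hypothetical.

```python
from itertools import groupby

# Assumed Freeman 8-direction codes for a step (dx, dy), with y pointing up:
# 0 = east, then counter-clockwise in 45-degree increments.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def line_components(contour):
    """Group a traced contour into (direction code, length, start point) runs."""
    steps = []
    for (x0, y0), (x1, y1) in zip(contour, contour[1:]):
        steps.append((DIRECTIONS[(x1 - x0, y1 - y0)], (x0, y0)))
    components = []
    for code, run in groupby(steps, key=lambda step: step[0]):
        run = list(run)
        components.append((code, len(run), run[0][1]))
    return components


def code_string(components):
    """Represent a shape by the string of its directional codes."""
    return "".join(str(code) for code, _, _ in components)


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2),
              (1, 2), (0, 2), (0, 1), (0, 0)]
    print(code_string(line_components(square)))
```

With such strings in hand, recognition can compare an input string against stored templates using ordinary string matching.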

Facial Recognition Algorithm Based on Edge Detection and Discrete Wavelet Transform

Chang, Min-Hyuk;Oh, Mi-Suk;Lim, Chun-Hwan;Ahmad, Muhammad-Bilal;Park, Jong-An
- Transactions on Control, Automation and Systems Engineering / v.3 no.4 / pp.283-288 / 2001
In this paper, we propose a method for extracting the facial characteristics of a person in an image. Given a pair of gray-level sample images taken with and without the person, the person's face is segmented from the image. Noise in the input images is removed with Gaussian filters, and edge maps of the two input images are computed. A binary edge differential image is obtained from the difference of the two edge maps. A face detection mask is made by applying erosion followed by dilation to this binary edge differential image, and the mask is used to extract the person from the two input image sequences. Facial features are extracted from the segmented image, and the discrete wavelet transform (DWT) is used for recognition. To extract facial features such as the eyebrows, eyes, nose, and mouth, an edge detector is applied to the segmented face image. The eye area and the center of the face are found from the horizontal and vertical components of the edge map of the segmented image. Other facial features are obtained from the edge information of the image. The characteristic vectors are extracted from the DWT of the segmented face image, normalized between +1 and -1, and used as input vectors for the neural network. Simulation results show a recognition rate of 100% on the learned system and about 92% on the test images.
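The abstract does not say how the characteristic vectors are normalized between +1 and -1. One common choice is min-max scaling, sketched here as an assumption; the function name is hypothetical.

```python
def normalize_to_unit_range(vector):
    """Linearly rescale a vector so its minimum maps to -1 and its maximum to +1."""
    lo, hi = min(vector), max(vector)
    if hi == lo:
        # A constant vector has no spread; map every element to 0.
        return [0.0 for _ in vector]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in vector]


if __name__ == "__main__":
    print(normalize_to_unit_range([0, 5, 10]))
```

Scaling inputs into a symmetric range like this is a typical preprocessing step before feeding features to a neural network.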

Object Recognition Method for Industrial Intelligent Robot (산업용 지능형 로봇의 물체 인식 방법)

Kim, Kye Kyung;Kang, Sang Seung;Kim, Joong Bae;Lee, Jae Yeon;Do, Hyun Min;Choi, Taeyong;Kyung, Jin Ho
- Journal of the Korean Society for Precision Engineering / v.30 no.9 / pp.901-908 / 2013
Industrial intelligent robots that use vision sensors have attracted interest for factory automation. 2D and 3D vision sensors have been used to recognize objects and to estimate object pose for packaging parts into a complete whole. However, this is not a trivial task because of illumination and the variety of object types. Illumination distorts the object image, which lowers recognition reliability. This paper proposes a recognition method for objects with complex shapes. An accurate object region is detected from a combined binary image obtained using a DoG filter and local adaptive binarization. The object is recognized using a neural network trained on object classes subdivided by object type and rotation angle. A predefined shape model of the object and the maximal slope are used to estimate the object pose. The performance was evaluated on the ETRI database, and a recognition rate of 96% was obtained.
https://doi.org/10.7736/KSPE.2013.30.9.901

Quantization Based Speaker Normalization for DHMM Speech Recognition System (DHMM 음성 인식 시스템을 위한 양자화 기반의 화자 정규화)

신옥근
- The Journal of the Acoustical Society of Korea / v.22 no.4 / pp.299-307 / 2003
Many studies have addressed speaker normalization, which aims to minimize the effect of a speaker's vocal tract length on the recognition performance of speaker-independent speech recognition systems. In this paper, we propose a simple linear-warping speaker normalization method based on a vector quantizer, motivated by the observation that a vector quantizer can be used successfully for speaker verification. We first generate an optimal codebook that serves as the basis for speaker normalization, and then extract the warping factor of an unknown speaker by comparing the speaker's feature vectors with the codebook. Finally, the extracted warping factor is used to linearly warp the Mel-scale filter bank used in MFCC calculation. To test the proposed method, a series of recognition experiments was conducted on a discrete HMM with thirteen monosyllabic Korean number utterances. The results showed that the word error rate can be reduced by about 29%, and that the proposed warping factor extraction method is useful because it is simpler than other line-search warping methods.

Face Recognition Using a Phase Difference for Images (영상의 위상 차를 이용한 얼굴인식)

Kim, Seon-Jong;Koo, Tak-Mo;Sung, Hyo-Kyung;Choi, Heung-Moon
- Journal of the Korean Institute of Telematics and Electronics S / v.35S no.6 / pp.81-87 / 1998
This paper proposes an efficient face recognition system that uses the phase difference between face images. We use the Karhunen-Loeve transform for image compression and reconstruction, and obtain the phase difference from the normalized inner product of the two compressed images. The proposed system is rotation- and light-invariant because it uses the normalized phase difference, and somewhat shift-invariant because it applies the cosine function. The proposed system also allows faster recognition than the conventional system, as well as incremental training. Simulations are conducted on the ORL images of 40 persons, each with 10 facial images, and the results show that the proposed system recognizes faster than a conventional recognizer using a convolution network at the same recognition error rate of 8%.
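The normalized inner product mentioned in this abstract can be read as the cosine of the angle between two feature vectors. A minimal sketch under that reading follows; the compressed-image vectors are assumed to be plain lists of numbers, and the function name is hypothetical.

```python
import math


def phase_difference(u, v):
    """Angle (radians) between two vectors, from their normalized inner product."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_theta = dot / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


if __name__ == "__main__":
    # Scaling one vector (for example, a uniform brightness change)
    # leaves the angle essentially unchanged.
    print(phase_difference([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(phase_difference([1.0, 0.0], [0.0, 1.0]))
```

Because both vectors are divided by their norms, a uniform intensity scaling cancels out, which is consistent with the light invariance the abstract attributes to normalization.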

Implementation of Real-time Recognition System for Continuous Korean Sign Language(KSL) mixed with Korean Manual Alphabet(KMA) (지문자를 포함한 연속된 한글 수화의 실시간 인식 시스템 구현)

Lee, Chan-Su;Kim, Jong-Sung;Park, Gyu-Tae;Jang, Won;Bien, Zeung-Nam
- Journal of the Korean Institute of Telematics and Electronics C / v.35C no.6 / pp.76-87 / 1998
This paper deals with a system that continuously recognizes dynamic hand gestures, Korean Sign Language (KSL), mixed with static hand gestures, the Korean Manual Alphabet (KMA). Recognizing continuous hand gestures is very difficult because there are no explicit tokens indicating the beginning and end of signs and because each gesture is complex. In this paper, a state automaton is used to segment sequential signs into individual ones, and basic elements of KSL and KMA, consisting of 14 hand directions, 23 hand postures, and 14 hand orientations, are used to recognize complex gestures with expandability in mind. Using a pair of CyberGlove and Polhemus sensor, the system recognizes 131 Korean signs and 31 KMA letters in real time, with a recognition rate of 94.3% for KSL (excluding no-recognition cases) and 96.7% for KMA.

Face Recognition Using Fisherface Algorithm and Fixed Graph Matching (Fisherface 알고리즘과 Fixed Graph Matching을 이용한 얼굴 인식)

Lee, Hyeong-Ji;Jeong, Jae-Ho
- Journal of the Institute of Electronics Engineers of Korea SP / v.38 no.6 / pp.608-616 / 2001
This paper proposes a face recognition technique that effectively combines fixed graph matching (FGM) and the Fisherface algorithm. EGM, a dynamic link architecture, uses not only the face shape but also the gray-level information of the image, and the Fisherface algorithm, a class-specific method, is robust to variations such as lighting direction and facial expression. In the proposed technique, which adopts both methods, a linear projection per node of an image graph reduces the dimensionality of the labeled graph vector and provides a feature space that can be used effectively for classification. Compared with conventional EGM, the proposed approach obtained satisfactory results in terms of recognition speed. In particular, in hold-out experiments on the Yale Face Databases and Olivetti Research Laboratory (ORL) Databases, it achieved an average recognition rate of 90.1%, higher than the conventional methods.
