• Title/Summary/Keyword: Recognition Improve

Multimodal Emotion Recognition using Face Image and Speech (얼굴영상과 음성을 이용한 멀티모달 감정인식)

  • Lee, Hyeon Gu;Kim, Dong Ju
    • Journal of Korea Society of Digital Industry and Information Management, v.8 no.1, pp.29-40, 2012
  • Endowing a machine with emotional intelligence is a challenging research issue of growing importance to those working in human-computer interaction. Emotion recognition technology therefore plays an important role in this research area, allowing more natural and more human-like communication between human and computer. In this paper, we propose a multimodal emotion recognition system that uses face and speech to improve recognition performance. For face-based emotion recognition, a distance measure is calculated using 2D-PCA of the MCS-LBP image and a nearest neighbor classifier; for speech-based emotion recognition, a likelihood measure is obtained with a Gaussian mixture model based on pitch and mel-frequency cepstral coefficient features. The individual matching scores obtained from face and speech are combined using a weighted-summation operation, and the fused score is used to classify the human emotion. In the experiments, the proposed method improved recognition accuracy by about 11.25% to 19.75% compared with the uni-modal approaches. These results confirm that the proposed approach achieves a significant performance improvement.
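
The weighted-summation fusion described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the normalization scheme or the weights, so the min-max normalization, the `face_weight` parameter, and all function names here are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale a {label: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {label: 0.5 for label in scores}
    return {label: (value - lo) / (hi - lo) for label, value in scores.items()}


def fuse_scores(face_distances, speech_log_likelihoods, face_weight=0.5):
    """Combine per-emotion face distances and speech likelihoods by weighted sum.

    Assumption: a smaller face distance is better, so it is flipped into a
    similarity before fusion. A larger speech likelihood is better as-is.
    """
    face_similarity = {
        label: 1.0 - value
        for label, value in min_max_normalize(face_distances).items()
    }
    speech_similarity = min_max_normalize(speech_log_likelihoods)
    fused = {
        label: face_weight * face_similarity[label]
        + (1.0 - face_weight) * speech_similarity[label]
        for label in face_similarity
    }
    best_label = max(fused, key=fused.get)
    return best_label, fused


# Hypothetical scores for three emotions.
face = {"happy": 0.2, "sad": 0.9, "angry": 0.5}
speech = {"happy": -10.0, "sad": -12.0, "angry": -9.0}
label, fused = fuse_scores(face, speech, face_weight=0.5)
```

Changing `face_weight` shifts how much each modality contributes; in practice such a weight would be tuned on held-out data.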

A Study on the Recognition of Daegu Citizen on Fire Safety (대구시민의 소방안전의식에 관한 연구)

  • 방창훈;최영상
    • Fire Science and Engineering, v.18 no.3, pp.25-29, 2004
  • The purpose of this study is to supply basic data that can improve Daegu citizens' recognition of fire safety by investigating their current recognition of it. To this end, 553 Daegu citizens were interviewed from Oct. 2 to 15, 2003. The results are as follows. Citizens thought that our society's recognition of fire safety was low, and the Daegu subway fire accident had a very large effect on citizens' recognition of fire safety. Citizens did not trust the fire-fighting facilities installed in buildings. The methods chosen for raising citizens' recognition of fire safety were reinforced public information through TV, newspapers, and similar media (30.7%), establishment and operation of a fire safety experience center (24.6%), and reinforced fire safety education in schools (20.8%).

Neural Network Approach to Sensor Fusion System for Improving the Recognition Performance of 3D Objects (3차원 물체의 인식 성능 향상을 위한 감각 융합 신경망 시스템)

  • Dong Sung Soo;Lee Chong Ho;Kim Ji Kyoung
    • The Transactions of the Korean Institute of Electrical Engineers D, v.54 no.3, pp.156-165, 2005
  • Human beings recognize the physical world by integrating a great variety of sensory inputs, the information acquired by their own actions, and their knowledge of the world, using a hierarchical, parallel-distributed mechanism. In this paper, the authors propose a sensor fusion system that can recognize multiple 3D objects from 2D projection images and tactile information. The proposed system focuses on improving the recognition performance for 3D objects. Unlike conventional object recognition systems that use an image sensor alone, the proposed method uses tactile sensors in addition to a visual sensor. A neural network is used to fuse the two sensory signals. Tactile signals are obtained from the reaction force of the pressure sensors at the fingertips when unknown objects are grasped by a four-fingered robot hand. The experiment evaluates the recognition rate and the number of learning iterations for various objects. The merits of the proposed system are not only its high learning performance but also its reliability: the tactile information allows it to recognize various objects even when the visual sensory signals are degraded. The experimental results show that the proposed system can improve the recognition rate and reduce learning time. These results verify the effectiveness of the proposed sensor fusion system as a recognition scheme for 3D objects.

Phoneme Similarity Error Correction System using Bhattacharyya Distance Measurement Method (바타챠랴 거리 측정법을 이용한 음소 유사율 오류 보정 개선 시스템)

  • Ahn, Chan-Shik;Oh, Sang-Yeob
    • Journal of the Korea Society of Computer and Information, v.15 no.6, pp.73-80, 2010
  • Vocabulary recognition systems return inaccurate vocabulary and confuse similar phonemes, which reduces the recognition rate. A method for recognizing similar phonemes that would otherwise go unrecognized, together with an efficient feature extraction process, is required. This paper therefore proposes a phoneme likelihood error correction system based on Bhattacharyya distance measurement of phoneme features. Phoneme likelihoods are obtained from monophone training data using an HMM feature extraction method, and similar phonemes are resolved to the accurate phoneme using the Bhattacharyya distance measurement, which is effective in improving the recognition rate. In a performance comparison with systems based on Euclidean distance measurement and dynamic time warping (DTW), the reported results are a recognition improvement of 1.2% and a recognition rate of 97.91%.
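
For reference, the Bhattacharyya distance between two Gaussian phoneme models can be computed as in the sketch below. It assumes diagonal covariances, which the abstract does not state, and the helper `most_similar_phoneme` and its model dictionary are hypothetical names used only for illustration.

```python
import math


def bhattacharyya_distance(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    With diagonal covariances the dimensions are independent, so the
    distance is the sum of the per-dimension distances.
    """
    distance = 0.0
    for m_a, v_a, m_b, v_b in zip(mean_a, var_a, mean_b, var_b):
        avg_var = (v_a + v_b) / 2.0
        # Term driven by the separation of the means.
        distance += 0.125 * (m_a - m_b) ** 2 / avg_var
        # Term driven by the mismatch of the variances.
        distance += 0.5 * math.log(avg_var / math.sqrt(v_a * v_b))
    return distance


def most_similar_phoneme(models, query_mean, query_var):
    """Return the label of the model closest to the query (hypothetical helper).

    `models` maps a phoneme label to a (mean, variance) pair of lists.
    """
    return min(
        models,
        key=lambda label: bhattacharyya_distance(
            models[label][0], models[label][1], query_mean, query_var
        ),
    )


# Hypothetical two-dimensional phoneme models.
models = {
    "b": ([0.0, 1.0], [1.0, 1.0]),
    "p": ([0.5, 1.2], [1.0, 2.0]),
    "m": ([3.0, -1.0], [0.5, 0.5]),
}
closest = most_similar_phoneme(models, [0.4, 1.1], [1.0, 1.8])
```

Unlike the Euclidean distance between means, this measure also accounts for how much the two distributions overlap, which is why it can separate phonemes whose means are close but whose variances differ.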

Gaussian Model Optimization using Configuration Thread Control In CHMM Vocabulary Recognition (CHMM 어휘 인식에서 형상 형성 제어를 이용한 가우시안 모델 최적화)

  • Ahn, Chan-Shik;Oh, Sang-Yeob
    • Journal of Digital Convergence, v.10 no.7, pp.167-172, 2012
  • In vocabulary recognition, an HMM (Hidden Markov Model) that models observations with a discrete probability distribution has the advantage of low computational complexity, but it has the disadvantages of a relatively low recognition rate and the need for a sophisticated smoothing process. To improve on these, a CHMM (Continuous Hidden Markov Model), which uses Gaussian mixtures as a continuous probability density, is proposed for the optimization of the library system. This paper provides a model that optimizes the Gaussian mixture model in CHMM vocabulary recognition using configuration thread control. Applying the proposed system gave a recognition rate of 98.1% in vocabulary recognition.
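
To make the contrast with a discrete HMM concrete, the sketch below evaluates the kind of continuous emission density a CHMM state uses: a Gaussian mixture over a feature vector. It assumes diagonal covariances and does not reproduce the paper's optimization or its configuration thread control; all names and parameter values are illustrative.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, weights, means, variances):
    """Log of the mixture density sum_k w_k * N(x; mean_k, var_k).

    Uses the log-sum-exp trick so that very small component
    likelihoods do not underflow to zero.
    """
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Hypothetical two-component mixture for a single HMM state.
weights = [0.6, 0.4]
means = [[0.0, 0.0], [2.0, 1.0]]
variances = [[1.0, 1.0], [0.5, 2.0]]
score = gmm_log_likelihood([0.3, -0.1], weights, means, variances)
```

Because the density is defined for any real-valued feature vector, no vector quantization or smoothing of unseen symbols is needed, at the cost of evaluating every mixture component per frame.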

Transformer-based transfer learning and multi-task learning for improving the performance of speech emotion recognition (음성감정인식 성능 향상을 위한 트랜스포머 기반 전이학습 및 다중작업학습)

  • Park, Sunchan;Kim, Hyung Soon
    • The Journal of the Acoustical Society of Korea, v.40 no.5, pp.515-522, 2021
  • It is hard to prepare sufficient training data for speech emotion recognition due to the difficulty of emotion labeling. In this paper, we apply transfer learning with large-scale training data for speech recognition on a transformer-based model to improve the performance of speech emotion recognition. In addition, we propose a method to utilize context information without decoding by multi-task learning with speech recognition. According to the speech emotion recognition experiments using the IEMOCAP dataset, our model achieves a weighted accuracy of 70.6% and an unweighted accuracy of 71.6%, which shows that the proposed method is effective in improving the performance of speech emotion recognition.
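
The two metrics reported here can be computed as in the sketch below. It assumes the convention common in the speech emotion recognition literature, where weighted accuracy is the overall fraction of correct utterances and unweighted accuracy is the mean of the per-class recalls; the abstract itself does not define them, and the labels are made up for illustration.

```python
from collections import Counter


def weighted_and_unweighted_accuracy(true_labels, predicted_labels):
    """Return (weighted accuracy, unweighted accuracy) for a labeled test set."""
    total_per_class = Counter(true_labels)
    correct_per_class = Counter(
        truth
        for truth, guess in zip(true_labels, predicted_labels)
        if truth == guess
    )
    # Weighted accuracy: every utterance counts equally.
    weighted = sum(correct_per_class.values()) / len(true_labels)
    # Unweighted accuracy: every class counts equally (mean per-class recall).
    unweighted = sum(
        correct_per_class[label] / count
        for label, count in total_per_class.items()
    ) / len(total_per_class)
    return weighted, unweighted


# Hypothetical imbalanced test set: four neutral and two angry utterances.
truth = ["neutral"] * 4 + ["angry"] * 2
guess = ["neutral", "neutral", "neutral", "angry", "angry", "neutral"]
wa, ua = weighted_and_unweighted_accuracy(truth, guess)
```

The two numbers diverge when classes are imbalanced, which is why papers on IEMOCAP usually report both.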

Combining Machine Learning Techniques with Terrestrial Laser Scanning for Automatic Building Material Recognition

  • Yuan, Liang;Guo, Jingjing;Wang, Qian
    • International conference on construction engineering and project management, 2020.12a, pp.361-370, 2020
  • Automatic building material recognition has been a popular research interest over the past decade because it is useful for construction management and facility management. Currently, the extensively used methods for automatic material recognition are mainly based on 2D images. A terrestrial laser scanner (TLS) with a built-in camera can generate a set of coloured laser scan data that contains not only the visual features of building materials but also other attributes such as material reflectance and surface roughness. With more characteristics provided, laser scan data have the potential to improve the accuracy of building material recognition. Therefore, this research aims to develop a TLS-based building material recognition method by combining machine learning techniques. The developed method uses material reflectance, HSV colour values, and surface roughness as the features for material recognition. A database containing the laser scan data of common building materials was created and used for model training and validation with machine learning techniques. Different machine learning algorithms were compared, and the best algorithm showed an average recognition accuracy of 96.5%, which demonstrated the feasibility of the developed method.

Creation of a Voice Recognition-Based English Aided Learning Platform

  • Hui Xu
    • Journal of Information Processing Systems, v.20 no.4, pp.491-500, 2024
  • To address the poor quality of information input in online spoken English teaching, the study creates an English teaching assistance model based on the dynamic time warping (DTW) recognition algorithm and automated voice recognition technology. To improve the algorithm's efficiency, the study modifies the time-domain properties of the speech signal during the pre-processing stage and reduces the algorithm's computational effort and storage requirements. Finally, a simulation experiment is used to evaluate the model's efficacy in application. The findings show that the revised DTW model achieves recognition rates above 95% for all phonetic symbols and performs best on voiced consonant recognition, with rates of 98.5%, 98.8%, and 98.7% across the three tests, respectively. The enhanced DTW voice recognition model is also more efficient and requires less time for training and testing. In the KS value analysis, the DTW model's KS value of 0.63 is the highest among the models analyzed. Among the comparative models, the model also has the lowest curve position for both test functions. This shows that the upgraded DTW model has superior voice recognition capabilities, which could significantly improve online English education and lead to better teaching outcomes.
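
For readers unfamiliar with DTW, the textbook dynamic-programming form is sketched below. This is not the study's revised model, whose modifications the abstract does not detail; it aligns two one-dimensional sequences, whereas a speech system would align sequences of feature vectors with a vector distance.

```python
def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] is the best alignment cost of seq_a[:i] against seq_b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_b[j - 1] is matched again (stretch seq_b)
                cost[i][j - 1],      # seq_a[i - 1] is matched again (stretch seq_a)
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# A slower rendition of the same contour aligns at zero cost.
template = [1.0, 2.0, 3.0]
slow_utterance = [1.0, 2.0, 2.0, 3.0]
same_shape_cost = dtw_distance(template, slow_utterance)
```

The full table needs O(n·m) time and memory, which is the kind of computational and storage cost the abstract says the revised model reduces.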

Retrieve System for Performance support of Vocabulary Clustering Model In Continuous Vocabulary Recognition System (연속 어휘 인식 시스템에서 어휘 클러스터링 모델의 성능 지원을 위한 검색 시스템)

  • Oh, Sang Yeob
    • Journal of Digital Convergence, v.10 no.9, pp.339-344, 2012
  • Established continuous vocabulary recognition systems have improved the recognition rate by using a decision-tree-based tying modeling method. However, because the system model cannot support retrieval of phoneme data, it is hard to secure accuracy. To address this problem, we remodeled the system so that it can retrieve the probabilistic model from the continuous vocabulary clustering model at the phoneme unit. In the system performance evaluation, the proposed system showed a recognition rate of 95.88%.

Telephone Speech Recognition with Data-Driven Selective Temporal Filtering based on Principal Component Analysis

  • Jung Sun Gyun;Son Jong Mok;Bae Keun Sung
    • Proceedings of the IEEK Conference, 2004.08c, pp.764-767, 2004
  • The performance of a speech recognition system is generally degraded in a telephone environment because of distortions caused by background noise and various channel characteristics. In this paper, data-driven temporal filters are investigated to improve the performance of a specific recognition task such as telephone speech. Three different temporal filtering methods are presented with recognition results for Korean connected-digit telephone speech. Filter coefficients are derived from the cepstral domain feature vectors using principal component analysis.
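
One common way to derive a temporal filter with principal component analysis is sketched below. It is an assumption about the general technique, not the paper's three methods: the time trajectory of one cepstral coefficient is cut into overlapping windows, and the leading eigenvector of their covariance is used as FIR filter coefficients. The function names, the window length, and the synthetic trajectory are all hypothetical.

```python
import math


def sliding_windows(trajectory, length):
    """All overlapping windows of the given length over one feature trajectory."""
    return [trajectory[i:i + length] for i in range(len(trajectory) - length + 1)]


def covariance_matrix(rows):
    """Covariance of equal-length rows, with each column mean removed."""
    count, dim = len(rows), len(rows[0])
    means = [sum(row[k] for row in rows) / count for k in range(dim)]
    centered = [[row[k] - means[k] for k in range(dim)] for row in rows]
    return [
        [sum(c[a] * c[b] for c in centered) / count for b in range(dim)]
        for a in range(dim)
    ]


def principal_component(matrix, iterations=200):
    """Leading eigenvector of a symmetric matrix, found by power iteration."""
    dim = len(matrix)
    vec = [1.0 / (k + 1) for k in range(dim)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        nxt = [sum(matrix[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break
        vec = [v / norm for v in nxt]
    return vec


def derive_temporal_filter(trajectory, length):
    """Filter coefficients = first principal component of the trajectory windows."""
    return principal_component(covariance_matrix(sliding_windows(trajectory, length)))


def apply_fir(trajectory, coefficients):
    """Filter a trajectory by taking the dot product with each window."""
    return [
        sum(c * x for c, x in zip(coefficients, window))
        for window in sliding_windows(trajectory, len(coefficients))
    ]


# Synthetic trajectory standing in for one cepstral coefficient over 50 frames.
trajectory = [math.sin(0.3 * t) for t in range(50)]
coefficients = derive_temporal_filter(trajectory, length=5)
filtered = apply_fir(trajectory, coefficients)
```

In a real system the windows would be pooled over the training utterances for each cepstral coefficient, so that the filter captures the dominant temporal variation of the task's data.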
