Search | Korea Science

The Effect of FIR Filtering and Spectral Tilt on Speech Recognition with MFCC (FIR 필터링과 스펙트럼 기울이기가 MFCC를 사용하는 음성인식에 미치는 효과)

Lee, Chang-Young
- The Journal of the Korea institute of electronic communication sciences, v.5 no.4, pp.363-371, 2010
In an effort to enhance the quality of feature vector classification and thereby reduce the recognition error rate in speaker-independent speech recognition, we study the effect of applying a spectral tilt to the Fourier magnitude spectrum en route to the extraction of MFCC. The effect of FIR filtering of the speech signal on recognition is also investigated in parallel. The proposed methods are evaluated in two independent ways: by the Fisher discriminant objective function and by speech recognition tests using a hidden Markov model with fuzzy vector quantization. With an appropriate choice of the tilt factor, the recognition error rate shows a relative improvement of about 10% over the conventional method.
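The Fisher discriminant objective named in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact form used, so the version below is an assumption: the textbook one-dimensional, two-class ratio of squared mean separation to summed within-class scatter. The function name `fisher_ratio` and the sample values are hypothetical.

```python
from statistics import fmean


def fisher_ratio(class_a, class_b):
    """Two-class, one-dimensional Fisher criterion (illustrative assumption).

    Returns (difference of class means)^2 divided by the sum of the
    within-class scatters. Larger values mean better-separated classes.
    """
    mean_a = fmean(class_a)
    mean_b = fmean(class_b)
    # Within-class scatter: sum of squared deviations from each class mean.
    scatter_a = sum((x - mean_a) ** 2 for x in class_a)
    scatter_b = sum((x - mean_b) ** 2 for x in class_b)
    within = scatter_a + scatter_b
    if within == 0:
        raise ValueError("within-class scatter is zero; ratio is undefined")
    return (mean_a - mean_b) ** 2 / within


if __name__ == "__main__":
    # Hypothetical values of one feature for two classes.
    print(fisher_ratio([1.0, 2.0, 3.0], [7.0, 8.0, 9.0]))
```

A feature transformation that raises this ratio pulls the class means apart relative to the spread inside each class, which is the sense in which such a criterion can score feature quality without running a full recognizer.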

An Emotion Recognition Method using Facial Expression and Speech Signal (얼굴표정과 음성을 이용한 감정인식)

고현주;이대종;전명근
- Journal of KIISE: Software and Applications, v.31 no.6, pp.799-807, 2004
In this paper, we present an emotion recognition method using facial images and the speech signal. Six basic human emotions (happiness, sadness, anger, surprise, fear, and dislike) are investigated. Emotion recognition from the facial expression uses a multi-resolution analysis based on the discrete wavelet transform, after which the feature vectors are extracted by linear discriminant analysis. Emotion recognition from the speech signal, on the other hand, runs the recognition algorithm independently on each wavelet subband and obtains the final result through a multi-decision-making scheme.

Extraction of User Preference for Video Stimuli Using EEG-Based User Responses

Moon, Jinyoung;Kim, Youngrae;Lee, Hyungjik;Bae, Changseok;Yoon, Wan Chul
- ETRI Journal, v.35 no.6, pp.1105-1114, 2013
Owing to the large number of video programs available, a method for accessing preferred videos efficiently through personalized video summaries and clips is needed. The automatic recognition of user states when viewing a video is essential for extracting meaningful video segments. Although there have been many studies on emotion recognition using various user responses, electroencephalogram (EEG)-based research on preference recognition of videos is at a very early stage. This paper proposes classification models based on linear and nonlinear classifiers using EEG features of band power (BP) values and asymmetry scores for four preference classes. The quadratic-discriminant-analysis-based model using BP features achieves a classification accuracy of 97.39% ($\pm 0.73\%$), and the models based on the other nonlinear classifiers using the BP features achieve an accuracy of over 96%, which is superior to that of previous work that addressed only binary preference classification. The result indicates that the proposed approach is accurate enough to be employed in personalized video segmentation.
https://doi.org/10.4218/etrij.13.0113.0194

Implementation of HMM Based Speech Recognizer with Medium Vocabulary Size Using TMS320C6201 DSP (TMS320C6201 DSP를 이용한 HMM 기반의 음성인식기 구현)

Jung, Sung-Yun;Son, Jong-Mok;Bae, Keun-Sung
- The Journal of the Acoustical Society of Korea, v.25 no.1E, pp.20-24, 2006
In this paper, we focus on the real-time implementation of a medium-vocabulary speech recognition system intended for use in a mobile phone. First, we developed a PC-based variable-vocabulary word recognizer, keeping the program memory and the total size of the acoustic models as small as possible. To reduce the memory size of the acoustic models, linear discriminant analysis was applied in the feature selection process and phonetic tied mixtures were applied in training the HMMs. In addition, a state-based Gaussian selection method with real-time cepstral normalization was used to reduce the computational load and make recognition more robust. We then verified the real-time operation of the implemented recognition system on the TMS320C6201 EVM board. The implemented system uses about 610 kbytes of memory, including both program memory and data memory. The recognition rate was 95.86% for ETRI 445DB, and 96.4%, 97.92%, and 87.04% for three name databases collected through mobile phones.

Discrimination of a Pleasant and an Unpleasant State by Autoregressive Models from EEG Signals (EEG신호의 시계열분석에 의한 쾌, 불쾌 감성분류에 관한 연구)

Im, Seong-Sik;Kim, Jin-Ho;Kim, Chi-Yong
- Journal of the Ergonomics Society of Korea, v.17 no.1, pp.67-77, 1998
The objective of this study is to extract information from electroencephalogram (EEG) signals with which mental states can be discriminated. Seven university students participated in this study. Ten stimuli based on the IAPS (International Affective Picture System) were presented at random according to the experimental schedule. Eight-channel EEG signals ($O_1$, $O_2$, $F_3$, $F_4$, $F_7$, $F_8$, $FP_1$, and $FP_2$) were recorded at a sampling rate of 204.8 Hz during the visual stimuli and analyzed. After the ten stimuli had been presented sequentially in random order, each subject rated each stimulus on a scale from -5 to 5, with 5 for the best stimulus and -5 for the worst. For each subject, only the EEG signals for the highest- and lowest-rated stimuli were selected for analysis. The EEG signals were transformed into feature objects based on scalar autoregressive model coefficients, and these features were classified by discriminant analysis for each channel. The best classification accuracy, 85.7%, was obtained in $O_1$ and $O_2$ for visual stimuli. This study could be extended to establish an algorithm that quantifies and classifies emotions evoked by visual stimuli using autoregressive models.
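Turning a signal into autoregressive coefficients, as this abstract describes, can be sketched as follows. The abstract does not say how the coefficients were estimated or which model order was used, so this is an assumption: a Yule-Walker estimate solved with the Levinson-Durbin recursion. The function names, the order, and the sample signal are hypothetical.

```python
def autocorrelation(signal, max_lag):
    """Biased sample autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / n
        for lag in range(max_lag + 1)
    ]


def ar_coefficients(signal, order):
    """Estimate a[0..order-1] in x[t] ~ sum_j a[j] * x[t-1-j].

    Solves the Yule-Walker equations with the Levinson-Durbin recursion.
    """
    r = autocorrelation(signal, order)
    if r[0] == 0:
        raise ValueError("signal has zero variance")
    coeffs = []
    error = r[0]  # prediction error power of the current model
    for m in range(1, order + 1):
        # Reflection coefficient for extending the model from order m-1 to m.
        acc = r[m] - sum(coeffs[j] * r[m - 1 - j] for j in range(m - 1))
        k = acc / error
        coeffs = [coeffs[j] - k * coeffs[m - 2 - j] for j in range(m - 1)] + [k]
        error *= 1.0 - k * k
    return coeffs


if __name__ == "__main__":
    # Hypothetical short signal; a real EEG channel would be far longer.
    samples = [0.0, 1.0, 0.5, -0.4, -0.9, -0.2, 0.6, 0.8, 0.1, -0.7]
    print(ar_coefficients(samples, order=2))
```

The resulting coefficient list is a fixed-length feature vector per channel, which is the kind of input a discriminant-analysis classifier expects.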

Parallel Model Feature Extraction to Improve Performance of a BCI System (BCI 시스템의 성능 개선을 위한 병렬 모델 특징 추출)

Chum, Pharino;Park, Seung-Min;Sim, Kwee-Bo
- Journal of Institute of Control, Robotics and Systems, v.19 no.11, pp.1022-1028, 2013
It is well known that, with the CSP (Common Spatial Pattern) algorithm, an EEG (electroencephalography) signal can be linearly projected onto spaces that optimize the discrimination between two patterns. Sharing the disadvantages of linear time-invariant systems, CSP suffers from the non-stationary nature of EEGs, which causes classification performance in a BCI (Brain-Computer Interface) system to drop significantly between training data and test data. The authors suggest a simple idea based on a parallel model of CSP filters to improve the performance of BCI systems. The model was tested with a simple CSP algorithm (without any elaborate regularizing methods) and a perceptron learning algorithm as the classifier to determine the improvement of the system. The simulation showed that the parallel model could improve classification performance by over 10% compared to conventional CSP methods.
https://doi.org/10.5302/J.ICROS.2013.13.1930

Implementation of HMM-Based Speech Recognizer Using TMS320C6711 DSP

Bae Hyojoon;Jung Sungyun;Son Jongmok;Kwon Hongseok;Kim Siho;Bae Keunsung
- Proceedings of the IEEK Conference, summer, pp.391-394, 2004
This paper focuses on the DSP implementation of an HMM-based speech recognizer that can handle a vocabulary of several hundred words as well as speaker independence. First, we develop an HMM-based speech recognition system on the PC that operates on a frame basis, with parallel processing of feature extraction and Viterbi decoding to make the processing delay as small as possible. Techniques such as linear discriminant analysis, state-based Gaussian selection, and the phonetic tied mixture model are employed to reduce the computational burden and memory size. The system is then optimized and compiled for the TMS320C6711 DSP for real-time operation. The implemented system uses 486 kbytes of memory for data and acoustic models, and 24.5 kbytes for program code. A maximum required time of 29.2 ms to process a 32 ms frame of speech validates the real-time operation of the implemented system.

Feature extraction based on DWT and GA for Gesture Recognition of EPIC Sensor Signals (EPIC 센서 신호의 제스처 인식을 위한 이산 웨이블릿 변환과 유전자 알고리즘 기반 특징 추출)

Ji, Sang-Hun;Yang, Hyung-Jeong;Kim, Soo-Hyung;Kim, Young-Chul
- Proceedings of the Korea Information Processing Society Conference, 2016.04a, pp.612-615, 2016
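This paper proposes a gesture classification system that applies the discrete wavelet transform (DWT), linear discriminant analysis (LDA), and a support vector machine (SVM) to motion signals acquired through an EPIC (Electric Potential Integrated Circuit) sensor. The DWT is applied to the EPIC sensor signal to obtain the wavelet coefficients, namely the approximation coefficients and the detail coefficients, and feature parameters are then extracted from each set of wavelet coefficients. These feature parameters are the best-performing ones selected by a genetic algorithm (GA) from among 14 statistical feature-extraction parameters. LDA is applied to the feature parameters extracted from the wavelet coefficients to reduce their dimensionality, and the result is used for SVM training and classification. In experiments classifying EPIC sensor signals for four gestures, the proposed method achieved a classification rate of 99.75%, higher than the 97% obtained by HMM classification of the raw signals.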
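The first stage of this pipeline, splitting a signal into approximation and detail coefficients and summarizing each with statistics, can be sketched as below. The abstract names neither the wavelet nor the 14 statistical parameters, so everything specific here is an assumption: a single-level Haar transform and three illustrative statistics (mean, standard deviation, energy). The function names and sample signal are hypothetical, and the GA, LDA, and SVM stages are not shown.

```python
import math
from statistics import fmean, pstdev


def haar_dwt(signal):
    """Single-level orthonormal Haar DWT; returns (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    # Pairwise scaled sums give the low-pass band, scaled differences the high-pass band.
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def band_features(coeffs):
    """Illustrative statistics for one coefficient band (not the paper's 14)."""
    return {
        "mean": fmean(coeffs),
        "std": pstdev(coeffs),
        "energy": sum(c * c for c in coeffs),
    }


if __name__ == "__main__":
    samples = [4.0, 4.0, 2.0, 0.0, 1.0, 3.0, 5.0, 5.0]  # hypothetical sensor samples
    approx, detail = haar_dwt(samples)
    features = {"approx": band_features(approx), "detail": band_features(detail)}
    print(features)
```

Because the Haar transform used here is orthonormal, the energies of the two bands add up to the energy of the input, which makes per-band energy a convenient sanity check on the decomposition.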
https://doi.org/10.3745/PKIPS.y2016m04a.612

Multivariate Analysis of EEG Signal using Intervention Models (개입모형을 이용한 EEG 신호의 다변량 분석에 관한 연구)

Im, Seong-Sik;Kim, Jin-Ho;Kim, Chi-Yong;Hwang, Min-Cheol
- Journal of the Ergonomics Society of Korea, v.18 no.1, pp.13-24, 1999
The objective of the study is to discriminate EEG (electroencephalogram) signals according to emotional changes. Emotion was evoked by a series of auditory stimuli selected from the natural sounds in a sound-effects collection on compact disc. Seventeen university students participated and experienced positive or negative emotions from six auditory stimuli, with an intermission between stimuli. Temporal EEG ($T_3$, $T_4$, $T_5$, and $T_6$) was recorded at the same time, and a subjective test was performed on an eleven-point scale after the experiment. The EEG signals for the stimuli with the maximum and minimum scores among the six were analyzed for discrimination of emotion. The EEG signals were transformed into feature objects based on scalar intervention model coefficients, with the auditory stimulus treated as the intervention variable. The features were classified by discriminant analysis for each channel. The best classification accuracy, 91.2%, was obtained in $T_4$ for auditory stimuli. This study could be extended to establish an algorithm that quantifies and classifies emotions evoked by auditory stimuli using time-series models.

Using GAs to Support Feature Weighting and Instance Selection in CBR for CRM

Ahn, Hyun-Chul;Kim, Kyoung-Jae;Han, In-Goo
- Proceedings of the Korea Intelligent Information System Society Conference, 2005.11a, pp.516-525, 2005
Case-based reasoning (CBR) has been widely used in various areas due to its convenience and strength in complex problem solving. Generally, in order to obtain successful results from CBR, effective retrieval of useful prior cases for the given problem is essential. However, designing a good matching and retrieval mechanism for CBR systems is still a controversial research issue. Most prior studies have tried to optimize either the weights of the features or the selection of appropriate instances, and these two approaches have so far been applied independently. Simultaneous optimization of these components may lead to better performance than in naive models. In particular, there have been few attempts to simultaneously optimize the weights of the features and the selection of the instances for CBR. Here we suggest a simultaneous optimization model of these components using a genetic algorithm (GA). We apply it to a customer classification model that uses demographic characteristics of customers as inputs to predict their buying behavior for a specific product. Experimental results show that simultaneously optimized CBR may improve the classification accuracy and outperform various optimized models of CBR as well as other classification models, including logistic regression, multiple discriminant analysis, artificial neural networks, and support vector machines.
