• Title/Abstract/Keywords: speech feature extraction


Performance Improvement of Speech Recognizer in Noisy Environments Based on Auditory Modeling (청각 구조를 이용한 잡음 음성의 인식 성능 향상)

  • Jung, Ho-Young;Kim, Do-Yeong;Un, Chong-Kwan;Lee, Soo-Young
    • The Journal of the Acoustical Society of Korea / Vol. 14, No. 5 / pp. 51-57 / 1995
  • In this paper, we study a noise-robust feature extraction method for speech signals based on auditory modeling. The auditory model consists of a basilar membrane model, a hair cell model, and a spectrum output stage. The basilar membrane model describes the response of the membrane to the vibration caused by the speech wave and is represented as a band-pass filter bank. The hair cell model describes the neural transduction driven by displacements of the basilar membrane. It responds adaptively to relative values of the input and plays an important role in noise robustness. The spectrum output stage constructs a mean rate spectrum from the average firing rate of each channel, and feature vectors are extracted from this mean rate spectrum. Simulation results show that auditory-based feature extraction improves speech recognition performance in noisy environments compared with other feature extraction methods.


Segmentation of Continuous Speech based on PCA of Feature Vectors (주요고유성분분석을 이용한 연속음성의 세그멘테이션)

  • 신옥근
    • The Journal of the Acoustical Society of Korea / Vol. 19, No. 2 / pp. 40-45 / 2000
  • In speech corpus generation and speech recognition, it is sometimes necessary to segment the input speech data without any prior knowledge. One way to accomplish this kind of segmentation, often called blind segmentation or acoustic segmentation, is to find the boundaries that minimize the Euclidean distances among the feature vectors within each segment. However, this metric alone is prone to errors because the feature vectors fluctuate and vary within a segment. In this paper, we introduce principal component analysis to take the trend of the feature vectors into account, so that the proposed distance measure is the distance between the feature vectors and their projections onto the principal components. The proposed distance measure is applied in the LBDP (level building dynamic programming) algorithm in a continuous speech segmentation experiment. The results were promising: the deletion rate fell by 3-6% compared with the pure Euclidean measure.

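The difference between the two distance measures described in the abstract above can be illustrated with a minimal sketch. It assumes a single principal axis estimated by power iteration; the function names (`principal_axis`, `euclidean_distortion`, `pca_distortion`) are hypothetical, and the paper's own formulation inside LBDP may differ.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _center(vectors):
    """Return (mean, list of mean-subtracted vectors)."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(dim)]
    return mean, [[x - m for x, m in zip(v, mean)] for v in vectors]

def principal_axis(centered, iterations=100):
    """Unit leading principal axis of centered vectors, by power iteration."""
    dim = len(centered[0])
    # Start from the longest centered vector (an assumption of this sketch).
    axis = max(centered, key=lambda c: _dot(c, c))
    norm = math.sqrt(_dot(axis, axis))
    if norm == 0.0:
        # Degenerate segment: all vectors coincide, so any unit axis will do.
        return [1.0] + [0.0] * (dim - 1)
    axis = [a / norm for a in axis]
    for _ in range(iterations):
        # Apply the scatter matrix: S @ axis = sum_k (c_k . axis) * c_k
        nxt = [0.0] * dim
        for c in centered:
            w = _dot(c, axis)
            for i in range(dim):
                nxt[i] += w * c[i]
        norm = math.sqrt(_dot(nxt, nxt))
        if norm == 0.0:
            break
        axis = [a / norm for a in nxt]
    return axis

def euclidean_distortion(segment):
    """Sum of distances from each feature vector to the segment mean."""
    _, centered = _center(segment)
    return sum(math.sqrt(_dot(c, c)) for c in centered)

def pca_distortion(segment):
    """Sum of distances from each vector to its projection on the principal axis."""
    _, centered = _center(segment)
    axis = principal_axis(centered)
    total = 0.0
    for c in centered:
        w = _dot(c, axis)
        residual = [ci - w * ai for ci, ai in zip(c, axis)]
        total += math.sqrt(_dot(residual, residual))
    return total
```

For a segment whose feature vectors drift steadily along one direction, `pca_distortion` stays near zero while `euclidean_distortion` grows with the drift, which is the kind of within-segment trend the abstract describes.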

Feature analysis of deaf students' English language by frequency (청각장애학생의 영어 발성 주파수별 특징 분석)

  • Lee, Gun-Min;Park, Hye Jung
    • Journal of the Korean Data and Information Science Society / Vol. 25, No. 4 / pp. 819-828 / 2014
  • In this paper, we analyze the characteristics of the English vocalization of deaf students and present basic data for developing personalized English learning aids that reflect those characteristics. We visited special schools for the hearing-impaired in Seoul and Daegu and recorded the students' English vocalization. We analyzed the data with Praat, a professional voice analysis program. The voice features of the deaf students' English vocalization were extracted and then compared with those of non-deaf students.

A study on pitch detection for RUI emotion classification based on voice (RUI용 음성신호기반의 감정분류를 위한 피치검출기에 관한 연구)

  • Byun, Sung-Woo;Lee, Seok-Pil
    • Proceedings of the Korean Society of Broadcast Engineers Conference / Korean Society of Broadcast Engineers 2015 Summer Conference / pp. 421-424 / 2015
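  • As computer technology has advanced and computer use has become widespread, much research has been conducted on human interfaces. In human interfaces, emotion recognition is an important technology for interaction between computers and people. To raise classification accuracy in emotion recognition, it is important to extract feature vectors accurately. In this paper, for accurate pitch detection, we separated the speech signal into speech and non-speech segments, and we performed and compared pitch detection using two preprocessing techniques from the speech processing field, a low-pass filter and voiced-sound extraction, and one post-processing technique, smoothing. As a result, voiced-sound extraction (preprocessing) and smoothing (post-processing) raised pitch detection accuracy, whereas the low-pass filter lowered it.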

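A pitch detector with a voicing gate and a smoothing stage, as outlined in the abstract above, might look like the following minimal sketch. The abstract does not name the pitch estimator or the smoothing method, so autocorrelation peak picking, an energy threshold as a stand-in for voiced-sound extraction, and a median filter are all assumptions; the names `autocorr_pitch`, `median_smooth`, `energy_floor`, `fmin`, and `fmax` are hypothetical.

```python
def autocorr_pitch(frame, sample_rate, fmin=60.0, fmax=400.0, energy_floor=1e-4):
    """Estimate F0 in Hz for one frame; return 0.0 for low-energy frames."""
    # Crude voicing gate (assumption): skip frames with very low mean energy.
    if sum(x * x for x in frame) / len(frame) < energy_floor:
        return 0.0
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else 0.0

def median_smooth(track, width=3):
    """Median-filter a pitch track to suppress isolated outliers."""
    half = width // 2
    out = []
    for i in range(len(track)):
        window = sorted(track[max(0, i - half): i + half + 1])
        out.append(window[len(window) // 2])
    return out
```

A single octave error such as the `200` in `[100, 101, 200, 102, 103]` is removed by `median_smooth`, which is the kind of post-processing gain the abstract reports.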

Speaker Independent Recognition Algorithm based on Parameter Extraction by MFCC applied Wiener Filter Method (위너필터법이 적용된 MFCC의 파라미터 추출에 기초한 화자독립 인식알고리즘)

  • Choi, Jae-Seung
    • Journal of the Korea Institute of Information and Communication Engineering / Vol. 21, No. 6 / pp. 1149-1154 / 2017
  • To obtain good recognition performance from a speech recognition system under background noise, it is very important to select appropriate speech feature parameters. The feature parameter used in this paper is the Mel frequency cepstral coefficient (MFCC), which reflects human auditory characteristics, combined with the Wiener filter method. That is, the proposed feature parameter is extracted from the clean speech signal obtained after removing the background noise. The proposed method implements speaker recognition by feeding the modified MFCC feature parameter into a multi-layer perceptron network. In the experiments, speaker-independent recognition was performed using 14th-order MFCC feature parameters. The average speaker-independent recognition rate for noisy speech with added white noise is 94.48%, which is an effective result. Compared with existing methods, the proposed speaker recognition performs better when the modified MFCC feature parameter is used.

Feature Parameter Extraction and Speech Recognition Using Matrix Factorization (Matrix Factorization을 이용한 음성 특징 파라미터 추출 및 인식)

  • Lee Kwang-Seok;Hur Kang-In
    • Journal of the Korea Institute of Information and Communication Engineering / Vol. 10, No. 7 / pp. 1307-1311 / 2006
  • In this paper, we propose a new speech feature parameter that uses matrix factorization to obtain part-based features of the speech spectrum. The proposed parameter is an effective dimension-reduced representation obtained from multi-dimensional feature data through matrix factorization under the constraint that all matrix elements are non-negative. The reduced feature data represent part-based features of the input data. We verify the usefulness of the NMF (Non-negative Matrix Factorization) algorithm for speech feature extraction by applying NMF to Mel-scaled filter bank outputs. The recognition experiments confirm that the proposed feature parameter outperforms the commonly used MFCC (Mel-Frequency Cepstral Coefficient) in recognition performance.
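The non-negative factorization step mentioned in the abstract above can be sketched as follows. The abstract does not say which NMF update rule was used, so this sketch assumes the widely used multiplicative updates for the squared-error objective; `nmf`, `reconstruction_error`, and the layout of `v` (rows as Mel bands, columns as frames) are hypothetical choices for illustration.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def nmf(v, rank, iterations=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix v into w @ h with non-negative w and h."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(rows)]
    h = [[rng.random() + eps for _ in range(cols)] for _ in range(rank)]
    for _ in range(iterations):
        # Update h: h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(cols)]
             for i in range(rank)]
        # Update w: w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(rows)]
    return w, h

def reconstruction_error(v, w, h):
    """Frobenius norm of v - w @ h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, approx)
               for a, b in zip(ra, rb)) ** 0.5
```

Under these assumptions, the columns of `h` would serve as the reduced feature vectors, one per frame, and the columns of `w` as the part-based spectral bases.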

An Emotion Recognition Method using Facial Expression and Speech Signal (얼굴표정과 음성을 이용한 감정인식)

  • 고현주;이대종;전명근
    • Journal of KIISE:Software and Applications / Vol. 31, No. 6 / pp. 799-807 / 2004
  • In this paper, we present an emotion recognition method that uses facial images and speech signals. Six basic human emotions are investigated: happiness, sadness, anger, surprise, fear, and dislike. Emotion recognition from facial expression uses a multi-resolution analysis based on the discrete wavelet transform, and the feature vectors are then extracted by linear discriminant analysis. Emotion recognition from the speech signal, on the other hand, runs the recognition algorithm independently on each wavelet subband, and the final result is obtained through a multi-decision-making scheme.

Extraction and Analysis of Voice Feature Parameter of Chungbuk News Announcers (충북방송 뉴스 진행자의 음성적 특징 추출 및 분석)

  • Kim, Bong-Hyun;Lee, Se-Hwan;Ka, Min-Kyoung;Cho, Dong-Uk;J.Bae, Young-Lae
    • Proceedings of the Korea Information Processing Society Conference / Korea Information Processing Society 2009 Fall Conference / pp. 363-364 / 2009
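  • As the broadcasting industry develops technically and structurally, and as audience expectations rise and the cultural industry changes rapidly, broadcasting continues to grow enormously in modern society. Amid these changes, the level of the audience and the direction of its change remain a constant focus of interest, and it is the broadcast presenter's role to grasp them and lead the broadcast smoothly. In this paper, we therefore collected the voices of news presenters at the three broadcasters in the Chungbuk region, applied a range of voice analysis elements, and, based on the resulting values, carried out an experiment to extract characteristic information about the presenters' voices. In particular, to analyze the influence that can be conveyed through the voice, we applied voice analysis elements such as pitch, jitter, shimmer, stability, and the spectrogram, and compared and analyzed the results.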
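Two of the voice analysis elements named above, jitter and shimmer, can be computed from per-cycle measurements as in the sketch below. It assumes the common "local" definitions (mean absolute difference between consecutive cycles, divided by the mean); the abstract does not state which variant was used, and the function names are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))

def local_jitter(periods):
    """Cycle-to-cycle variation of pitch periods (seconds or samples)."""
    return relative_perturbation(periods)

def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle variation of per-cycle peak amplitudes."""
    return relative_perturbation(peak_amplitudes)
```

Both values are ratios, so they are often reported as percentages; a perfectly periodic voice gives zero for each.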

Visual Features and Shape Extraction of Voice Analysis Elements for Heart Diseases Diagnosis (심장 질환 진단을 위한 음성분석학적 요소의 시각 특징 및 형태 추출)

  • Kim, Bong-Hyun;Lee, Se-Hwan;Park, Sun-Ae;Ka, Min-Kyoung;Oh, Won-Geun
    • Proceedings of the Korea Information Processing Society Conference / Korea Information Processing Society 2007 Spring Conference / pp. 405-408 / 2007
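  • As interest in health care and health maintenance grows, adult-onset and chronic diseases have become serious risk factors in an aging society that seeks a better quality of life. Heart disease in particular is threatening enough to rank among the three leading causes of death, and it is one of the non-communicable chronic diseases. As with any disease, however, prevention through early diagnosis matters most. In this paper, we therefore acquire the voice signals of heart disease patients and seek to identify their association with heart disease by extracting and analyzing a range of voice analysis elements. To this end, we verify a first experiment on existing voice analysis elements and carry out a second experiment on additional elements, and we visualize each analysis element together with the morphological features of the voice in order to present techniques for diagnosing heart disease conveniently.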

A Study on Error Correction Using Phoneme Similarity in Post-Processing of Speech Recognition (음성인식 후처리에서 음소 유사율을 이용한 오류보정에 관한 연구)

  • Han, Dong-Jo;Choi, Ki-Ho
    • The Journal of The Korea Institute of Intelligent Transport Systems / Vol. 6, No. 3 / pp. 77-86 / 2007
  • Recently, systems with speech recognition interfaces, such as telematics terminals, have been under development. However, speech recognition still produces many errors, and error correction is being actively studied. This paper proposes an error correction method for the post-processing stage of speech recognition based on the features of Korean phonemes. To support this algorithm, we use a phoneme similarity that reflects those features. The phoneme similarity is trained on data by mono-phoneme and uses MFCC and LPC to extract the features of each Korean phoneme. In addition, it uses the Bhattacharyya distance to measure the similarity between one phoneme and another. Using the phoneme similarity, errors in eo-jeol that could not be morphologically analyzed can be corrected, after which syllable recovery and morphological analysis are performed again. The experimental results show improvements of 7.5% and 5.3% for MFCC and LPC, respectively.

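The Bhattacharyya distance mentioned in the abstract above has a closed form when each phoneme is modeled as a Gaussian. The sketch below assumes diagonal-covariance Gaussians over the feature dimensions (for example MFCC or LPC coefficients); the abstract does not state the phoneme model, and the function names are hypothetical.

```python
import math

def bhattacharyya_diag(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians."""
    total = 0.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        v = 0.5 * (v1 + v2)                              # averaged variance
        total += 0.125 * (m1 - m2) ** 2 / v              # mean-separation term
        total += 0.5 * math.log(v / math.sqrt(v1 * v2))  # variance-mismatch term
    return total

def phoneme_similarity(mean1, var1, mean2, var2):
    """Map the distance to a similarity in (0, 1] via the Bhattacharyya coefficient."""
    return math.exp(-bhattacharyya_diag(mean1, var1, mean2, var2))
```

Identical models give a distance of 0 and a similarity of 1, and the similarity falls toward 0 as the phoneme models move apart.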