• Title/Abstract/Keywords: LPC Coefficients

79 results, processing time 0.025 s

A Study on the Text-Dependent Speaker Recognition System Using a Robust Matching Process

  • 이한구;이기성
    • Korean Institute of Electrical Engineers: Conference Proceedings / Proceedings of the Korean Institute of Electrical Engineers 2002 Joint Fall Conference, Information and Control Section / pp.605-608 / 2002
  • This paper studies a text-dependent speaker recognition system that uses a robust matching process. Feature histograms of LPC cepstral coefficients are used for matching. The matching process uses a mixture network with penalty scores, and similarity values are obtained by comparing the probability and shape of two feature histograms. Experimental results are presented to show the effectiveness of the proposed algorithm.

A Study on EMG Functional Recognition Using Reduced-Connection Network

  • 조정호;최윤호
    • Korean Society of Medical and Biological Engineering: Journal of Biomedical Engineering Research / Vol. 11, No. 2 / pp.249-256 / 1990
  • In this study, LPC cepstrum coefficients extracted from an AR model of the EMG signal are used as the feature vector, and a reduced-connection network, which has fewer connections between nodes, is constructed to classify and recognize EMG functional classes. The proposed network reduces learning time and improves system stability. It is therefore shown that the proposed network is appropriate for recognizing the function of EMG signals.

IMPLEMENTATION OF REAL TIME RELP VOCODER ON THE TMS320C25 DSP CHIP

  • Kwon, Kee-Hyeon;Chong, Jong-Wha
    • Acoustical Society of Korea: Conference Proceedings / Acoustical Society of Korea 1994 FIFTH WESTERN PACIFIC REGIONAL ACOUSTICS CONFERENCE SEOUL KOREA / pp.957-962 / 1994
  • A real-time RELP vocoder is implemented on the TMS320C25 DSP chip. The system is an IBM-PC add-on board composed of an analog input/output unit, a DSP unit, a memory unit, an IBM-PC interface unit, and supporting assembly software. The speech analyzer and synthesizer are implemented in DSP assembly software. On this board, speech parameters such as LPC coefficients, baseband residuals, and signal gains are extracted by the autocorrelation method and an inverse filter, and speech is synthesized by the spectral folding method and a direct-form synthesis filter. A real-time 9.6 kbps RELP vocoder is then simulated by downloading the program into the DSP program RAM.
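
The autocorrelation method named in this abstract is commonly solved with the Levinson-Durbin recursion. The following is a minimal Python sketch of that textbook recursion, not the paper's TMS320C25 assembly code; the function names `autocorrelation` and `levinson_durbin` are illustrative assumptions, and the predictor convention assumed is x[n] ≈ Σ a[k]·x[n-k].

```python
def autocorrelation(frame, max_lag):
    """Short-time autocorrelation r[0..max_lag] of one (already windowed) frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, err): a[k-1] is the predictor coefficient a_k and err is the
    final prediction-error energy.
    """
    a = [0.0] * (order + 1)          # a[0] is unused; a[1..order] are a_1..a_p
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:               # degenerate frame (e.g. silence): stop early
            break
        # Reflection coefficient k_i for this recursion step.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err
```

A caller would typically window a frame, compute `autocorrelation(frame, order)`, and pass the result to `levinson_durbin`; the inverse (analysis) filter that yields the residual then uses the returned coefficients.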

A Study on Effective Feature Parameters Comparison for Speaker Recognition

  • 박태선;김상진;문광;한민수
    • Korean Society of Phonetic Sciences and Speech Technology: Conference Proceedings / Korean Society of Phonetic Sciences and Speech Technology May 2003 Conference Proceedings / pp.145-148 / 2003
  • In this paper, we carry out a comparative study of various feature parameters for effective speaker recognition: LPC, LPCC, MFCC, log area ratio, reflection coefficients, inverse sine, and delta parameters. We also apply cepstral liftering and cepstral mean subtraction to check their usefulness. Our recognition system is HMM-based and uses a four-connected-Korean-digit speech database. The experimental results should help in selecting the most effective parameter for speaker recognition.

Speaker-Dependent Emotion Recognition For Audio Document Indexing

  • Hung LE Xuan;QUENOT Georges;CASTELLI Eric
    • Institute of Electronics Engineers of Korea: Conference Proceedings / Institute of Electronics Engineers of Korea 2004 ICEIC The International Conference on Electronics Informations and Communications / pp.92-96 / 2004
  • Emotion research is currently of great interest in speech processing as well as in human-machine interaction. In recent years, more and more work on emotion synthesis and emotion recognition has been developed for different purposes, with each approach using its own methods and its own parameters measured on the speech signal. In this paper, we propose using a short-time parameter, MFCC (Mel-Frequency Cepstrum Coefficients), and a simple but efficient classification method, Vector Quantization (VQ), for speaker-dependent emotion recognition. Many other features (energy, pitch, zero crossing, phonetic rate, LPC...) and their derivatives are also tested and combined with the MFCCs to find the best combination. Other models, GMM and HMM (discrete and continuous Hidden Markov Models), are studied as well, in the hope that continuous distributions and the temporal behaviour of this feature set will improve the quality of emotion recognition. The maximum accuracy in recognizing five different emotions exceeds 88% using only MFCCs with the VQ model. This simple but efficient approach gives a result much better than that obtained on the same database in human evaluation by listening and judging, without permission to go back or to compare sentences [8], and it compares favourably with other approaches.
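
A VQ classifier of the kind this abstract describes is usually built from one codebook per class, with an utterance assigned to the class whose codebook gives the lowest average quantization distortion. The sketch below illustrates that general scheme only; the codebook training step is omitted, and the function names and the class labels in the usage note are hypothetical, not taken from the paper.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def average_distortion(frames, codebook):
    """Mean distance from each frame to its nearest codeword."""
    total = sum(min(squared_distance(f, c) for c in codebook) for f in frames)
    return total / len(frames)


def classify(frames, codebooks):
    """Return the label whose codebook quantizes the frames with least distortion.

    `codebooks` maps a class label to a list of codewords (assumed to have been
    trained beforehand, e.g. with k-means/LBG on that class's feature vectors).
    """
    return min(codebooks, key=lambda label: average_distortion(frames, codebooks[label]))
```

For example, with two toy codebooks `{"class_a": [[0, 0], [1, 1]], "class_b": [[5, 5], [6, 6]]}`, frames near the origin are assigned to `"class_a"`.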

A Method for Improvement of Split Vector Quantization of the ISF Parameters Using Adaptive Extended Codebook

  • 임종하;정규혁;홍기봉;이인성
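    • Journal of the Acoustical Society of Korea / Vol. 30, No. 1 / pp.1-8 / 2011
  • This paper proposes an algorithm that uses the ordering property of ISF coefficients to compensate for the weakness of the split vector quantizer and improve ISF quantization performance, and uses it to design an ISF quantizer for a wideband speech coder. For wideband codecs of order 16 or higher, ISF coefficients are quantized with a split vector quantizer to reduce computation and memory use. A split quantizer, however, cannot fully exploit the correlation between ISF coefficients. To overcome this weakness, the proposed algorithm uses the ordering property of the ISF coefficients, which makes it possible to identify the redundant codebook entries (codebook redundancy) of each subvector. These redundant entries are replaced by an adaptive extended codebook built from the ordering property, the ISF prediction process, and interpolation of the existing codebook, which improves quantizer performance. The proposed algorithm reassigns about 17% of the codebook indices, which go unused in the conventional split quantizer, to the adaptive extended codebook, and achieves a gain of about 2 bits in terms of spectral distortion over the ISF quantizer of the standardized AMR-WB codec.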

A Study on Classification of Four Emotions Using EEG

  • 강동기;김동준;김흥환;고한우
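    • Korean Society for Emotion and Sensibility: Conference Proceedings / Korean Society for Emotion and Sensibility 2001 Fall Conference Proceedings / pp.87-90 / 2001
  • In this study, emotion classification experiments were carried out with three EEG parameters to find the parameter best suited to an emotion evaluation system. The EEG parameters were linear predictor coefficients and the band-wise cross-correlation coefficients of the FFT spectrum and of the AR spectrum; the emotions were relaxation, joy, sadness, and irritation. EEG data were collected from four students in a university drama club, using electrode positions Fp1, Fp2, F3, F4, T3, T4, P3, P4, O1, and O2. After preprocessing, feature parameters were extracted from the collected EEG data and fed to a neural network used as the pattern classifier. In the classification experiments, linear predictor coefficients performed better than the other two parameters.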

Korean Digit Recognition Using Cepstrum Coefficients and Frequency Sensitive Competitive Learning

  • 이수혁;조성원;최경삼
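    • Korean Institute of Electrical Engineers: Conference Proceedings / Proceedings of the Korean Institute of Electrical Engineers 1994 Fall Conference, Society Headquarters / pp.329-331 / 1994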
  • In this paper, we present a speaker-dependent Korean isolated digit recognition system. In the preprocessing step, LPC cepstral coefficients are extracted from the speech signal and used as the input to a Frequency Sensitive Competitive Learning (FSCL) neural network. Postprocessing is based on the winning-neuron histogram. Experimental results indicate that commercial auto-dial telephones are feasible.
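
LPC cepstral coefficients, used as features in several entries on this page, can be obtained directly from the LPC coefficients with a standard recursion instead of an explicit FFT and logarithm. The sketch below assumes the all-pole model H(z) = G / (1 − Σ a_k z^(−k)) and omits the gain term c_0; the function name `lpc_to_cepstrum` is an illustrative assumption and is not taken from any of the listed papers.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients a_1..a_p (a[0] holds a_1) to cepstral
    coefficients c_1..c_n_ceps for H(z) = G / (1 - sum_k a_k z^-k).

    Recursion: c_n = a_n + sum_{k=1}^{n-1} (k/n) * c_k * a_{n-k},
    with a_n taken as 0 for n > p.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)             # c[0] (gain term) is left unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        # Only terms with 1 <= n - k <= p contribute.
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]
```

For a single-pole model with a_1 = 0.5 the recursion reproduces the closed form c_n = 0.5^n / n, which gives a quick sanity check.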

Isolated Word Recognition Using Modified Dynamic Averaging Method

  • 정의봉;고영혁;이종악
    • Journal of the Acoustical Society of Korea / Vol. 10, No. 2 / pp.23-28 / 1991
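  • This paper studies speaker-dependent isolated word recognition. We propose a DTW speech recognition system that builds its reference patterns with a modified dynamic linear averaging method. All 57 city names were chosen as the recognition vocabulary, and 12th-order LPC cepstrum coefficients were used as features. Besides the modified dynamic linear averaging method, experiments were run on the same data and under the same conditions with the causal, dynamic averaging, linear averaging, and clustering methods. DTW recognition with the modified dynamic linear averaging method gave the best recognition rate, 97.6%.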
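
The DTW matching step that this abstract relies on can be sketched as follows. This is the basic symmetric dynamic-programming form with no slope constraints or path-length normalization, given only as an illustration; it does not reproduce the paper's modified dynamic linear averaging of reference patterns, and the names `dtw_distance` and `recognize` are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors (e.g. cepstral frames)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def dtw_distance(ref, test):
    """Accumulated cost of the best time alignment between two frame sequences."""
    n, m = len(ref), len(test)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(ref[i - 1], test[j - 1])
            # Allowed predecessors: vertical, horizontal, and diagonal steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(test, templates):
    """Return the word whose reference pattern is closest to `test` under DTW.

    `templates` maps each vocabulary word to one reference frame sequence.
    """
    return min(templates, key=lambda word: dtw_distance(templates[word], test))
```

In a setup like the one described, each frame would be a 12-dimensional LPC cepstrum vector and `templates` would hold one reference pattern per vocabulary word.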

Speaker Identification Based on Incremental Learning Neural Network

  • Heo, Kwang-Seung;Sim, Kwee-Bo
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 5, No. 1 / pp.76-82 / 2005
  • Speech signals carry various speaker-specific features, which are extracted through speech signal processing and used by a speaker identification system to identify the speaker. In this paper, we propose a speaker identification system that uses incremental learning based on a neural network. The speech signal recorded through a microphone is divided into frames of 1024 samples. Energy is used to separate the speech signal into voiced and unvoiced segments. The extracted 12th-order LPC cepstrum coefficients are used as input data for the neural network, which identifies the speakers. The network is an MLP with 12 input nodes, 8 hidden nodes, and 4 output nodes. The number of output nodes corresponds to the number of identified speakers; the first output node is excited for the first speaker. Incremental learning begins when a new speaker is identified. It is a learning algorithm in which the already learned weights are retained and only the new weights created by adding the new speaker are trained, which overcomes a shortcoming of the neural network. The network repeats the learning whenever a new speaker is entered, and its architecture is extended with the number of speakers. The system can therefore learn without a limit on the number of speakers.