• Title/Summary/Keyword: Speaker detection (화자 검출)


Pitch Period Detection Algorithm Using Modified AMDF (변형된 AMDF를 이용한 피치 주기 검출 알고리즘)

  • Seo Hyun-Soo;Bae Sang-Bum;Kim Nam-Ho
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.1 / pp.23-28 / 2006
  • The pitch period, an important factor in speech signal processing, is used in applications such as speech recognition, speaker identification, and speech analysis and synthesis, and many pitch detection algorithms have therefore been studied. The AMDF (average magnitude difference function), one such algorithm, takes the time interval between valley points as the pitch period. Selecting the valley points, however, increases the complexity of the algorithm. This paper proposes a simple algorithm that applies a rotation transform to the AMDF so that the global minimum valley point gives the pitch period of the speech signal, and compares it with existing methods through simulation.
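As a rough illustration of the baseline idea in this abstract, the sketch below computes a plain AMDF and takes the lag with the global minimum as the pitch period. It is not the paper's rotation-transform variant. The function names, the lag search range, and the synthetic test frame are assumptions made for the example.

```python
import math


def amdf(signal, lag):
    """Average magnitude difference between the signal and a copy delayed by `lag`."""
    count = len(signal) - lag
    return sum(abs(signal[i] - signal[i + lag]) for i in range(count)) / count


def estimate_pitch_period(signal, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] (in samples) with the smallest AMDF value."""
    return min(range(min_lag, max_lag + 1), key=lambda lag: amdf(signal, lag))


# Synthetic voiced-like frame (assumption): fundamental period of 80 samples
# plus a second harmonic, 400 samples long.
frame = [
    math.sin(2 * math.pi * n / 80) + 0.5 * math.sin(4 * math.pi * n / 80)
    for n in range(400)
]
period = estimate_pitch_period(frame, min_lag=20, max_lag=150)
```

The search range is kept below twice the true period here, because a perfectly periodic signal also has AMDF valleys at integer multiples of the period.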

Pitch Detection Using Variable Bandwidth LPF (가변 대역폭 LPF를 이용한 피치 검출)

  • Keum, Hong;Baek, Guem-Ran;Bae, Myung-Jin;Jang, Ho-Sung
    • The Journal of the Acoustical Society of Korea / v.13 no.5 / pp.77-82 / 1994
  • In speech signal processing, it is very important to detect the pitch accurately. Although various pitch detection methods have been developed, it remains difficult to extract the pitch accurately across a wide range of speakers and utterances. We therefore propose a new pitch detection algorithm based on G-peak extraction. The method detects the pitch period of voiced signals by finding the MZCI (maximum zero-crossing interval) of the G-peak, which is defined as the cut-off bandwidth rate of the LPF (low-pass filter). The algorithm is robust, with a gross error rate of 3.63% even in a 0 dB SNR environment; the gross error rate for clean speech is only 0.18%. It can also carry out the entire process at high speed.


A Study on Pitch Period Detection of Speech Signal Using Modified AMDF (변형된 AMDF를 이용한 음성 신호의 피치 주기 검출에 관한 연구)

  • Seo, Hyun-Soo;Bae, Sang-Bum;Kim, Nam-Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.1 / pp.515-519 / 2005
  • The pitch period, an important factor in speech signal processing, is used in applications such as speech recognition, speaker identification, and speech analysis and synthesis, and many pitch detection algorithms have therefore been studied. The AMDF, one such algorithm, takes the time interval between valley points as the pitch period. Selecting the valley points, however, increases the complexity of the algorithm. This paper proposes a simple algorithm based on a modified AMDF that detects the global minimum valley point as the pitch period of the speech signal, and compares it with existing methods through simulation.


Design and Implementation of a Real-Time Lipreading System Using PCA & HMM (PCA와 HMM을 이용한 실시간 립리딩 시스템의 설계 및 구현)

  • Lee chi-geun;Lee eun-suk;Jung sung-tae;Lee sang-seol
    • Journal of Korea Multimedia Society / v.7 no.11 / pp.1597-1609 / 2004
  • Many lipreading systems have been proposed to compensate for the drop in speech recognition rate in noisy environments. Previous lipreading systems work only under specific conditions such as artificial lighting and a predefined background color. In this paper, we propose a real-time lipreading system that allows the speaker to move and relaxes the restrictions on color and lighting conditions. The proposed system extracts the face and lip regions, together with the essential visual information, in real time from an input video sequence captured with a common PC camera, and recognizes uttered words from that visual information in real time. It uses a hue histogram model to extract the face and lip regions, the mean shift algorithm to track the face of a moving speaker, PCA (Principal Component Analysis) to extract the visual information for learning and testing, and an HMM (Hidden Markov Model) as the recognition algorithm. The experimental results show that the system achieves a recognition rate of 90% for speaker-dependent lipreading and, when combined with audio speech recognition, increases the speech recognition rate up to 40~85% depending on the noise level.


A Crowd Noise Reduction Model for Speech Signal processing (음성 신호처리를 위한 군중잡음 제거 모델)

  • 안용운;김중환;김상철
    • Proceedings of the Korean Information Science Society Conference / 2002.10d / pp.502-504 / 2002
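  • When voice calls or speaker recognition take place in an environment with crowd noise, colored noise such as plosive or fricative sounds is added to the speech and distorts the original signal. Processing such distorted speech therefore requires a crowd-noise removal step. This paper analyzes the characteristics of crowd noise and, based on the results, proposes a model that can effectively remove only the crowd noise during speech signal processing. The proposed model consists of a time-domain stage that detects silent intervals and removes fricative and plosive sounds, and a frequency-domain stage that builds a noise average and uses it in spectral subtraction to remove the crowd noise.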
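The frequency-domain stage can be pictured with a minimal magnitude spectral subtraction sketch. It assumes the magnitude spectra have already been computed elsewhere and that the silent frames are already known. The function names, the spectral floor, and the numbers are illustrative and do not come from the paper.

```python
def average_noise_spectrum(silent_frames):
    """Average the magnitude spectra of frames judged to be silence (noise only)."""
    count = len(silent_frames)
    bins = len(silent_frames[0])
    return [sum(frame[k] for frame in silent_frames) / count for k in range(bins)]


def spectral_subtract(noisy_magnitude, noise_average, floor=0.0):
    """Subtract the noise average bin by bin, clamping each result at `floor`."""
    return [max(m - n, floor) for m, n in zip(noisy_magnitude, noise_average)]


# Hypothetical 3-bin magnitude spectra: two silent frames and one noisy speech frame.
noise_average = average_noise_spectrum([[1.0, 2.0, 0.5], [3.0, 2.0, 0.5]])
cleaned = spectral_subtract([5.0, 1.0, 2.5], noise_average)
```

Clamping at a floor keeps a bin from going negative when the noise estimate exceeds the observed magnitude, as happens in the second bin here.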


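Field Performance Evaluation and Environment Analysis of Audio-Visual HRI Components in Commercial Service (시청각 기반 HRI 컴포넌트 상용화 서비스 현장 성능 평가 및 환경분석)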

  • Ji, Su-Yeong;Kim, Hye-Jin;Kim, Do-Hyeong;Yun, Ho-Seop
    • Information and Communications Magazine / v.25 no.4 / pp.16-21 / 2008
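  • This article reports performance evaluation results, obtained at commercial service sites, for the representative HRI technologies (face detection, speaker gender classification, and sound source tracking) that are most realistically applicable at the commercialization stage of intelligent service robots. It analyzes the sites to provide guidelines for users, and proposes an environment analysis based on the performance evaluation. The aims are to establish user-robot HRI criteria for optimal commercial service, to identify robot service needs through application to a public robot platform, and to maximize product planning capability.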

Robust Endpoint Detection Algorithm For Speaker Verification (화자인식을 위한 강인한 끝점 검출 알고리즘)

  • Jung Dae Sung;Kim Jung Gon;Kim Hyung Soon
    • Proceedings of the KSPS conference / 2003.05a / pp.137-140 / 2003
  • In this paper, we propose a robust endpoint detection algorithm for speaker verification. The proposed algorithm uses energy and cepstral distance parameters, and when the estimated signal-to-noise ratio (SNR) is low it replaces the detected endpoints with the endpoints of voiced speech. Experimental results show that the proposed algorithm is superior to an energy-based endpoint detection algorithm.
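For context, the energy-based baseline that the abstract compares against can be sketched as follows. This is a generic short-time energy detector, not the proposed algorithm, which also uses cepstral distance and an SNR estimate. The frame length, threshold, and test signal are assumptions made for the example.

```python
def frame_energies(samples, frame_length):
    """Sum of squared samples for each non-overlapping frame."""
    return [
        sum(s * s for s in samples[start:start + frame_length])
        for start in range(0, len(samples) - frame_length + 1, frame_length)
    ]


def detect_endpoints(samples, frame_length, threshold):
    """Return (start, end) sample indices of the active region, or None if no frame is active."""
    energies = frame_energies(samples, frame_length)
    active = [index for index, energy in enumerate(energies) if energy > threshold]
    if not active:
        return None
    return active[0] * frame_length, (active[-1] + 1) * frame_length


# Hypothetical signal: silence, a burst of activity, then silence again.
signal = [0.0] * 40 + [1.0, -1.0] * 20 + [0.0] * 40
endpoints = detect_endpoints(signal, frame_length=10, threshold=0.5)
```

A detector of this kind depends entirely on the energy threshold, so background noise at low SNR can easily push the endpoints outward.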


A Study on the Postprocessing In Keyword Spotting (Keyword spotting에서의 후처리 과정에 관한 연구)

  • 송화전
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.249-252 / 1994
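  • Keyword spotting is a field of speech recognition in which a computer takes human speech as input, determines whether it contains any of one or more predefined words, and identifies that word. One way to reduce the recognition errors of a keyword spotting system is to add a postprocessing stage that removes falsely detected keywords. This paper examines postprocessing that uses several kinds of information, such as the ratio of the keyword model likelihood to the filler model likelihood over the region detected as a keyword, the likelihood of the second-best keyword, and the length of the region in which the endpoint exists, and compares their performance through recognition experiments. In a speaker-independent keyword spotting experiment with six department names as keywords, the baseline system achieved a keyword recognition rate of 95.0% on isolated-word and sentence-form speech. With the four postprocessing methods examined in this paper, varying the keyword rejection ratio from 0% to 5% improved the keyword recognition rate to between 95.3% and 97.1%. Considering both performance and computational cost, the method using the length of the endpoint region performed best.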
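One of the cues mentioned above, the ratio of keyword to filler likelihood, can be illustrated with a small rejection sketch that works in the log domain. The detection records, field names, keywords, and threshold are all hypothetical; the abstract does not give the paper's actual decision rule.

```python
def accept_keyword(keyword_log_likelihood, filler_log_likelihood, threshold):
    """Accept a detection when the log-likelihood ratio reaches the threshold."""
    return keyword_log_likelihood - filler_log_likelihood >= threshold


def postprocess(detections, threshold):
    """Keep only the detections whose keyword model clearly beats the filler model."""
    return [
        detection["keyword"]
        for detection in detections
        if accept_keyword(detection["keyword_ll"], detection["filler_ll"], threshold)
    ]


# Hypothetical detections: log-likelihoods of the keyword and filler models
# over the same detected region.
detections = [
    {"keyword": "sales", "keyword_ll": -100.0, "filler_ll": -120.0},
    {"keyword": "accounting", "keyword_ll": -100.0, "filler_ll": -102.0},
]
kept = postprocess(detections, threshold=5.0)
```

Raising the threshold rejects more false alarms at the cost of also rejecting more correct detections, which is the trade-off behind sweeping the rejection ratio.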


The Study on Speaker Change Verification Using SNR based weighted KL distance (SNR 기반 가중 KL 거리를 활용한 화자 변화 검증에 관한 연구)

  • Cho, Joon-Beom;Lee, Ji-eun;Lee, Kyong-Rok
    • Journal of Convergence for Information Technology / v.7 no.6 / pp.159-166 / 2017
  • In this paper, we report experiments aimed at improving the verification performance of speaker change detection on broadcast news. The approach enhances the noisy input speech and applies a KL distance $D_s$ that uses an SNR-based weighting function $w_m$. The baseline system (Experiment 0) is a speaker change verification system using the GMM-UBM based KL distance $D$. Experiment 1 adds enhancement of the noisy input speech using MMSE Log-STSA. Experiment 2 applies the new KL distance $D_s$ to the system of Experiment 1. Experiments were conducted under the condition of 0% MDR in order to avoid missing any speaker change. The FAR of Experiment 0 was 71.5%. The FAR of Experiment 1 was 67.3%, which is 4.2 percentage points lower than that of Experiment 0. The FAR of Experiment 2 was 60.7%, which is 10.8 percentage points lower than that of Experiment 0.
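To make the notion of a KL distance concrete, the sketch below computes a symmetric KL distance between two diagonal-covariance Gaussians, with an optional per-dimension weight. This is a simplified stand-in: the abstract does not define the GMM-UBM based $D$, the weighted $D_s$, or the form of $w_m$, so the `weights` argument and all names here are assumptions.

```python
import math


def kl_gaussian(mean_p, var_p, mean_q, var_q):
    """KL divergence KL(p || q) between two univariate Gaussians."""
    return 0.5 * (
        math.log(var_q / var_p) + (var_p + (mean_p - mean_q) ** 2) / var_q - 1.0
    )


def symmetric_kl(means_p, vars_p, means_q, vars_q, weights=None):
    """Weighted sum over dimensions of KL(p || q) + KL(q || p)."""
    if weights is None:
        weights = [1.0] * len(means_p)  # unweighted case
    return sum(
        w * (kl_gaussian(mp, vp, mq, vq) + kl_gaussian(mq, vq, mp, vp))
        for w, mp, vp, mq, vq in zip(weights, means_p, vars_p, means_q, vars_q)
    )


# Two hypothetical one-dimensional segment models with unit variance.
distance = symmetric_kl([0.0], [1.0], [2.0], [1.0])
```

In a change detector of this kind, a distance above a tuned threshold between the models of adjacent segments would be taken as evidence of a speaker change.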

Frequency Domain Acoustic Echo Suppression Based on Boundary Condition (주파수 영역에서 구간조건을 이용한 음향학적 반향 제거)

  • Lee, Kyu-Ho;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.5 / pp.162-166 / 2009
  • In this paper, we propose a novel acoustic echo cancellation (AEC) algorithm in which the suppression parameter of a parametric Wiener filter (PWF) is applied differently according to the condition of the relevant period. The PWF uses the suppression parameter to compensate for uncertainty in the estimate of the acoustic echo signal. The existing PWF method, which uses a fixed suppression parameter, distorts the near-end signal during double-talk. To solve this problem, a boundary condition is devised from the decisions of the double-talk detection (DTD) algorithm and the voice activity detector (VAD). The boundary condition makes it possible to treat single-talk and double-talk differently. Experimental results show that the proposed approach is effective for acoustic echo cancellation.
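The role of the suppression parameter can be sketched with a generic Wiener-style gain for one frequency bin, where the parameter is switched on a double-talk flag. The gain formula is a common textbook form, and the two parameter values are hypothetical; the paper's exact PWF and boundary condition are not given in the abstract.

```python
def pwf_gain(mic_power, echo_power_estimate, suppression):
    """Wiener-style gain for one bin; a larger `suppression` removes more echo."""
    near_end_power = max(mic_power - echo_power_estimate, 0.0)
    denominator = near_end_power + suppression * echo_power_estimate
    if denominator == 0.0:
        return 1.0  # nothing to suppress in an empty bin
    return near_end_power / denominator


def select_suppression(double_talk, single_talk_value=4.0, double_talk_value=1.0):
    """Use gentler suppression during double-talk to limit near-end distortion."""
    return double_talk_value if double_talk else single_talk_value


# Hypothetical bin powers: microphone power 10, estimated echo power 2.
gain_double_talk = pwf_gain(10.0, 2.0, select_suppression(double_talk=True))
gain_single_talk = pwf_gain(10.0, 2.0, select_suppression(double_talk=False))
```

With these numbers the gain is higher during double-talk, so the near-end speech is attenuated less than in the single-talk case.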