• Title/Abstract/Keywords: Voice signal

Search results: 433 items; processing time: 0.031 s

Voice Activity Detection Based on Signal Energy and Entropy-difference in Noisy Environments

  • 하동경;조석제;진강규;신옥근
    • Journal of Advanced Marine Engineering and Technology
    • /
    • Vol. 32, No. 5
    • /
    • pp.768-774
    • /
    • 2008
  • In many areas of speech signal processing, such as automatic speech recognition and packet-based voice communication, voice activity detection (VAD) plays an important role in the performance of the overall system. In this paper, we present a new feature parameter for VAD: the product of the signal energy and the difference between two types of entropy. To this end, we first define a Mel filter-bank based entropy and calculate its difference from the conventional entropy in the frequency domain. The difference is then multiplied by the spectral energy of the signal to yield the final feature parameter, which we call PEED (product of energy and entropy difference). Experiments verified that the proposed VAD parameter is more efficient than the conventional spectral entropy based parameter across various SNRs and noisy environments.
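The feature described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' definition: the abstract gives neither the filter-bank design nor the sign or normalisation of the entropy difference, so the equal-width bands (a stand-in for a Mel filter bank), the division by the log of the bin count, and the function names are all hypothetical.

```python
import math


def normalized_entropy(values):
    """Shannon entropy of non-negative values, scaled to [0, 1].

    Assumption: dividing by log(len(values)) makes entropies computed over
    different numbers of bins comparable. The abstract does not specify this.
    """
    total = sum(values)
    if total <= 0 or len(values) < 2:
        return 0.0
    probs = [v / total for v in values if v > 0]
    h = -sum(p * math.log(p) for p in probs)
    return h / math.log(len(values))


def band_energies(power_spectrum, n_bands):
    """Sum adjacent bins into equal-width bands.

    Assumption: a crude stand-in for a Mel filter bank, which would use
    overlapping triangular filters spaced on the mel scale.
    """
    n = len(power_spectrum)
    edges = [round(i * n / n_bands) for i in range(n_bands + 1)]
    return [sum(power_spectrum[edges[i]:edges[i + 1]]) for i in range(n_bands)]


def peed_like_feature(power_spectrum, n_bands=4):
    """Energy times (fine-resolution entropy minus band entropy).

    Assumption: the sign convention of the difference is chosen arbitrarily.
    """
    energy = sum(power_spectrum)
    fine = normalized_entropy(power_spectrum)
    coarse = normalized_entropy(band_energies(power_spectrum, n_bands))
    return energy * (fine - coarse)


# Hypothetical frames: a flat (noise-like) and a peaked (speech-like) spectrum.
noise_like = [1.0] * 16
speech_like = [8.0, 6.0, 1.0, 0.5] + [0.1] * 12
print(peed_like_feature(noise_like), peed_like_feature(speech_like))
```

With these choices a flat spectrum scores close to zero because both normalised entropies are close to one, while a spectrum with concentrated low-frequency energy scores higher. A detector would then compare the per-frame value against a threshold.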

Spectral Shape Invariant Real-time Voice Change System

  • 김원구
    • 한국지능시스템학회논문지
    • /
    • Vol. 15, No. 1
    • /
    • pp.48-52
    • /
    • 2005
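  • This paper proposes a real-time voice conversion method that transforms speech into a mechanical-sounding voice while preserving its spectral shape. To this end, LPC analysis and synthesis are used so that the spectrum of the converted speech is preserved while the pitch of the synthesized speech can be changed freely. To make the converted speech sound more natural, a gain matching method is applied to the excitation signal generator. Voice conversion experiments were conducted to evaluate the proposed method. The results show that the original speech is converted into a mechanical voice from which the original speaker's identity is difficult to determine, and that the meaning of the converted speech is conveyed accurately even under severe pitch changes. To confirm that it can run in real time, the proposed system was implemented on a TI TMS320C6711DSK board.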
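The LPC analysis step mentioned in this abstract is commonly done with the autocorrelation method and the Levinson-Durbin recursion. The sketch below shows only that textbook step, assuming the convention A(z) = 1 + a1·z^-1 + ... + ap·z^-p. It is not the paper's implementation, and the function names are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, err), where a[0] == 1.0 and the prediction of x[n] is
    -sum(a[j] * x[n - j] for j in 1..order); err is the residual energy.
    """
    if r[0] <= 0:
        raise ValueError("frame has no energy")
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


# Autocorrelation of an ideal first-order process with coefficient 0.5.
coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, residual)
```

In a voice changer of the kind the abstract describes, the resulting all-pole filter 1/A(z) would carry the spectral envelope, and a new excitation with a different pitch would be fed through it. That synthesis stage and the gain matching are not shown here.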

Voice Recognition-Based on Adaptive MFCC and Deep Learning for Embedded Systems

  • 배현수;이호진;이석규
    • 제어로봇시스템학회논문지
    • /
    • Vol. 22, No. 10
    • /
    • pp.797-802
    • /
    • 2016
  • This paper proposes a novel voice recognition method based on an adaptive MFCC and deep learning for embedded systems. To enhance the recognition ratio of the proposed voice recognizer, ambient noise mixed into the voice signal has to be eliminated. However, noise filtering processes, which may damage voice data, diminish the recognition ratio. In this paper, a filter is designed for the frequency range of the voice signal, and imposed weights are used to reduce data deterioration. In addition, a deep learning algorithm, which does not require a database in the recognition algorithm, has been adapted for embedded systems, which inherently require small amounts of memory. The experimental results suggest that the proposed deep learning algorithm and HMM voice recognizer, utilizing the proposed adaptive MFCC algorithm, achieve a better recognition ratio than conventional MFCC algorithms in a noisy environment.

A Study of Voice Signal Capture for Communication in the AFV

  • 김석봉;이성태
    • 한국군사과학기술학회지
    • /
    • Vol. 6, No. 1
    • /
    • pp.81-90
    • /
    • 2003
  • In the military communication environment, it is very difficult to obtain a clear voice signal because of the high noise level. The purpose of this study is to find the best spot on the body for picking up the vocal cord signal by measuring skin- or bone-conducted vibrations at different body positions in a noisy environment. The experiments showed that measuring the sound signal inside the ear is the best way to capture the voice originating from the vocal cords, and that this method can prevent interference from noise. This study provides an effective voice communication method for high-noise environments and is applicable to military purposes.

Voice Conversion Using Low Dimensional Vector Mapping

  • 이기승;도원;윤대희
    • 전자공학회논문지S
    • /
    • Vol. 35S, No. 4
    • /
    • pp.118-127
    • /
    • 1998
  • In this paper, we propose a voice personality transformation method that makes one person's voice sound like another person's voice. To transform the voice personality, the vocal tract transfer function is used as a transformation parameter. Compared with previous methods, the proposed method can obtain high-quality transformed speech with low computational complexity. Conversion between the vocal tract transfer functions is implemented by a linear mapping based on soft clustering. In this process, the mean LPC cepstrum coefficients and the mean-removed LPC cepstrum, modeled by a low dimensional vector, are used as transformation parameters. To evaluate the performance of the proposed method, mapping rules are generated from 61 Korean words uttered by two male speakers and one female speaker. These rules are then applied to 9 sentences uttered by the same speakers, and objective evaluation and subjective listening tests are performed on the transformed speech.

A Study on Stable Motion Control of Humanoid Robot with 24 Joints Based on Voice Command

  • Lee, Woo-Song;Kim, Min-Seong;Bae, Ho-Young;Jung, Yang-Keun;Jung, Young-Hwa;Shin, Gi-Soo;Park, In-Man;Han, Sung-Hyun
    • 한국산업융합학회 논문집
    • /
    • Vol. 21, No. 1
    • /
    • pp.17-27
    • /
    • 2018
  • We propose a new approach to controlling biped robot motion, based on iterative learning of voice commands, for the implementation of a smart factory. Real-time processing of the speech signal is very important for high-speed and precise automatic voice recognition. Recently, voice recognition has been used for intelligent robot control, artificial life, wireless communication, and IoT applications. To extract valuable information from the speech signal, make decisions on the process, and obtain results, the data need to be manipulated and analyzed. The basic method used for extracting the features of the voice signal is to find the Mel-frequency cepstral coefficients. Mel-frequency cepstral coefficients collectively represent the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency. The reliability of voice commands for controlling the biped robot's motion is illustrated by computer simulation and by experiments on a biped walking robot with 24 joints.
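The definition of Mel-frequency cepstral coefficients given in this abstract (a cosine transform of a log power spectrum on a mel scale) can be illustrated with a minimal sketch. It assumes the mel filter-bank energies of one frame have already been computed, uses the common 2595·log10(1 + f/700) mel formula, and applies an unscaled DCT-II. The function names and the choice of 13 coefficients are illustrative assumptions, not taken from the paper.

```python
import math


def hz_to_mel(f_hz):
    """Common mel-scale mapping (one of several variants in use)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def dct2(values, n_coeffs):
    """Unscaled DCT-II: c[k] = sum_i v[i] * cos(pi * k * (i + 0.5) / N)."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]


def mfcc_from_mel_energies(mel_energies, n_coeffs=13, floor=1e-10):
    """Log-compress the mel band energies, then decorrelate with a DCT."""
    log_energies = [math.log(max(e, floor)) for e in mel_energies]
    return dct2(log_energies, min(n_coeffs, len(log_energies)))


# Hypothetical mel band energies for a single frame.
frame_energies = [4.0, 9.0, 6.0, 2.0, 1.0, 0.5, 0.25, 0.1]
print(mfcc_from_mel_energies(frame_energies, n_coeffs=4))
```

The zeroth coefficient tracks overall log energy, and the higher ones describe the shape of the spectral envelope, which is why they are widely used as recognition features.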

Voice Activity Detection Algorithm using Wavelet Band Entropy Ensemble Analysis in Car Noisy Environments

  • 이기현;이윤정;김명남
    • 한국멀티미디어학회논문지
    • /
    • Vol. 16, No. 9
    • /
    • pp.1005-1017
    • /
    • 2013
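  • Voice activity detection is the process of distinguishing speech segments from non-speech segments in a signal in which speech and noise are mixed, and it is a very important step in signal processing for speech enhancement. Although there have been many studies on voice activity detection, they have not performed well in low signal-to-noise ratio environments or in noise environments that vary strongly over time, such as car noise. This paper proposes a new voice activity detection algorithm that uses ensemble variance based on wavelet band entropy together with a soft thresholding technique. To evaluate the proposed algorithm, experiments were conducted in car noise environments at various signal-to-noise ratios, and the results confirmed the superior performance of the proposed method.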
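Soft thresholding, one of the two ingredients named in this abstract, has a standard definition: values within the threshold of zero are set to zero, and the rest are shrunk toward zero by the threshold. The sketch below shows only that operator. How the paper chooses the threshold and combines it with the wavelet band entropy ensemble variance is not stated in the abstract, so the sample values here are hypothetical.

```python
def soft_threshold(x, t):
    """Standard soft-thresholding operator: sign(x) * max(|x| - t, 0)."""
    if t < 0:
        raise ValueError("threshold must be non-negative")
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def soft_threshold_all(values, t):
    """Apply soft thresholding to every element of a sequence."""
    return [soft_threshold(v, t) for v in values]


# Hypothetical per-frame feature values and a threshold of 1.0.
print(soft_threshold_all([3.0, -3.0, 0.5, -0.2], 1.0))
```

Unlike hard thresholding, the output is continuous at the threshold, which tends to avoid abrupt switching between speech and non-speech decisions on borderline frames.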

Development of Advanced Voice Recorder Control System

  • 장중식
    • 한국시뮬레이션학회:학술대회논문집
    • /
    • Proceedings of the 한국시뮬레이션학회 1999 Fall Conference
    • /
    • pp.272-277
    • /
    • 1999
  • The need for voice recording devices has grown with the use of voice signal ICs designed with LSI/VLSI. The voice recorder control unit developed here has low power dissipation, is portable, and is convenient to use with a voice source. However, Korean voice recorders lag far behind foreign products on the market in performance and size. We therefore used a Chua circuit to improve voice quality, after minimizing the power supply device and circuitry by designing the voice recording device around a low-power-dissipation power circuit.

A Proposal for Effect Analysis Techniques of Kidney Hand Acupuncture through Face Image and Voice Signal Measurement

  • 김봉현;조동욱
    • 한국통신학회논문지
    • /
    • Vol. 37, No. 3C
    • /
    • pp.217-223
    • /
    • 2012
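  • This paper proposes a technique for analyzing the effect of hand acupuncture stimulation at the point corresponding to the kidney by applying techniques that measure changes in face images and voice signals. To this end, face images and voice recordings were collected before and after kidney hand acupuncture stimulation. In the image analysis experiment, the color change of the chin region, which is associated with the kidney, was measured. In the voice signal analysis experiment, changes in the first formant frequency bandwidth and the shimmer value, the voice analysis elements associated with the kidney, were measured. The experiments showed that the blackness of the chin region, the first formant frequency bandwidth, and the shimmer measurements decreased following kidney hand acupuncture stimulation. Finally, the paper aims to objectively demonstrate the effect of kidney hand acupuncture, as measured by the face image and voice signal techniques, through a statistical significance analysis of the experimental results.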
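Shimmer, one of the voice measures named in this abstract, is usually reported as local shimmer: the mean absolute difference between the peak amplitudes of consecutive glottal periods, divided by the mean peak amplitude. The sketch below assumes that definition and assumes the per-period peak amplitudes have already been extracted. The abstract does not say which shimmer variant or analysis tool the authors used, so the function name and sample values are hypothetical.

```python
def local_shimmer(peak_amplitudes):
    """Local shimmer as a ratio (multiply by 100 for a percentage)."""
    if len(peak_amplitudes) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a)
             for a, b in zip(peak_amplitudes, peak_amplitudes[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_amp = sum(peak_amplitudes) / len(peak_amplitudes)
    if mean_amp == 0:
        raise ValueError("mean amplitude is zero")
    return mean_diff / mean_amp


# Hypothetical peak amplitudes before and after a stimulus.
before = [0.80, 0.90, 0.78, 0.92, 0.81]
after = [0.84, 0.86, 0.83, 0.87, 0.85]
print(local_shimmer(before), local_shimmer(after))
```

A lower value means the amplitude is steadier from one period to the next, which is the direction of change the abstract reports.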

A Fast Motion Estimation Algorithm using Adaptive Search According to Importance of Search Ranges

  • 김태환;김종남;정신일
    • 한국멀티미디어학회논문지
    • /
    • Vol. 18, No. 4
    • /
    • pp.437-442
    • /
    • 2015
  • Voice activity detection, in which voice activity is separated from a noisy speech signal, is a very important process for speech enhancement. Over the past few years, many studies have been made on voice activity detection, but they perform poorly in low signal-to-noise ratio environments or under rapidly changing noise such as car noise. This paper proposes a new voice activity detection algorithm using ensemble variance based on wavelet band entropy and a soft thresholding method. We conducted experiments in car noise environments at various signal-to-noise ratios to evaluate the proposed algorithm and confirmed its performance.