Search | Korea Science

Voice Activity Detection Method Using Psycho-Acoustic Model Based on Speech Energy Maximization in Noisy Environments (잡음 환경에서 심리음향모델 기반 음성 에너지 최대화를 이용한 음성 검출 방법)

Choi, Gab-Keun;Kim, Soon-Hyob
- The Journal of the Acoustical Society of Korea / v.28 no.5 / pp.447-453 / 2009
This paper introduces a method for detecting voice and its exact endpoints at low SNR by maximizing voice energy. Conventional VAD (Voice Activity Detection) algorithms estimate the noise level, so they tend to detect endpoints inaccurately. Moreover, because they use a relatively long analysis range to reflect temporal changes in the noise, their computational load is too high for practical application. This paper introduces the SEM-VAD (Speech Energy Maximization-Voice Activity Detection) method, which uses psychoacoustic Bark-scale filter banks to maximize voice energy within frames. Stable threshold values are obtained in various noise environments (SNR 15 dB, 10 dB, 5 dB, 0 dB). In voice detection tests in a noisy car environment, the PHR (Pause Hit Rate) was 100% in every noise environment, and the FAR (False Alarm Rate) was 0% at SNR 15 dB and 10 dB, 5.6% at SNR 5 dB, and 9.5% at SNR 0 dB.
https://doi.org/10.7776/ASK.2009.28.5.447

Frame Reliability Weighting for Robust Speech Recognition (프레임 신뢰도 가중에 의한 강인한 음성인식)

조훈영;김락용;오영환
- The Journal of the Acoustical Society of Korea / v.21 no.3 / pp.323-329 / 2002
This paper proposes a frame reliability weighting method to compensate for time-selective noise, which occurs at random positions in a speech signal and contaminates only certain parts of it. Speech frames have different degrees of reliability, and the reliability is proportional to the SNR (signal-to-noise ratio). While it is feasible to estimate the frame SNR from the noise information in non-speech intervals under stationary noise, it is difficult to obtain the noise spectrum of a time-selective noise. Therefore, we use statistical models of clean speech to estimate the frame reliability. The proposed MFR (model-based frame reliability) approximates frame SNR values using filterbank energy vectors obtained by the inverse transformation of the input MFCC (mel-frequency cepstral coefficient) vectors and the mean vectors of a reference model. Experiments on various burst noises showed that the proposed method represents the frame reliability effectively. Recognition performance improved when the MFR values were used as weighting factors in the likelihood calculation step.

Incorporation of IMM-based Feature Compensation and Uncertainty Decoding (IMM 기반 특징 보상 기법과 불확실성 디코딩의 결합)

Kang, Shin-Jae;Han, Chang-Woo;Kwon, Ki-Soo;Kim, Nam-Soo
- The Journal of Korean Institute of Communications and Information Sciences / v.37 no.6C / pp.492-496 / 2012
This paper presents a decoding technique for speech recognition that uses uncertainty information from a feature compensation method to improve recognition performance in low-SNR conditions. Traditional feature compensation algorithms have difficulty estimating clean feature parameters in adverse environments because they focus on point estimation of the desired features. Point estimation degrades speech recognition performance when incorrectly estimated features enter the decoder. In this paper, we apply the uncertainty information from a well-known feature compensation method, IMM, to the recognition engine. The applied technique shows better performance on the Aurora-2 DB.
https://doi.org/10.7840/KICS.2012.37.6C.492

Improved Timing Synchronization Using Phase Difference between Subcarriers in OFDMA Uplink Systems (OFDMA 상향 링크 시스템에서 부반송파간 위상 회전 정보를 이용한 개선된 시간 동기 추정 알고리즘)

Lee, Sung-Eun;Hong, Dae-Sik
- Journal of the Institute of Electronics Engineers of Korea TC / v.46 no.2 / pp.46-52 / 2009
In this paper, a timing estimator based on the principle of the best linear unbiased estimator (BLUE) is proposed for OFDMA uplink systems. The proposed timing estimator exploits the phase information of the differential correlation between adjacent subcarriers. The differential correlation extracts the information about the timing offset and mitigates the signal distortion caused by the frequency selectivity of the channel. Compared with conventional methods, the proposed estimator estimates the timing more accurately and is hardly affected by the distortion caused by the frequency selectivity of the channel. Simulation results confirm that the proposed estimator shows a small error mean and a relatively small error variance. The performance of the estimator is also evaluated in terms of SNR loss: simulations show that the SNR loss caused by estimation errors is less than 0.4 dB for SNR values between 0 and 20 dB. This suggests that the proposed estimator is suitable for the timing synchronization of multiple users in OFDMA uplink systems.

Improved generalized cross correlation-phase transform based time delay estimation by frequency domain autocorrelation (주파수영역 자기상관에 의한 위상 변환 일반 상호 상관 시간 지연 추정기 성능 개선)

Lim, Jun-Seok;Cheong, MyoungJun;Kim, Seongil
- The Journal of the Acoustical Society of Korea / v.37 no.5 / pp.271-275 / 2018
There are several methods for estimating the time delay between the signals arriving at two sensors. Among them, the GCC-PHAT (Generalized Cross Correlation-Phase Transform) method, which estimates the relative delay from signal whitening and the cross-correlation between the two sensor inputs, is a traditionally well-known method that achieves stable performance. In this paper, we identify a part of GCC-PHAT in which the periodicity can be improved, and we apply the autocorrelation method, which is widely used to improve periodicity. Comparing the proposed method with GCC-PHAT, we show that the proposed method improves the mean square error performance by 5 dB to 15 dB at SNRs above 0 dB for a white Gaussian signal source, and by up to 15 dB at SNRs above 2 dB for a colored signal source.
https://doi.org/10.7776/ASK.2018.37.5.271
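
The baseline GCC-PHAT step named in the abstract above (whitening followed by cross-correlation) can be sketched as follows. This is a minimal illustration of conventional GCC-PHAT only, not the paper's proposed frequency-domain autocorrelation improvement, whose details the abstract does not give. The function names `dft` and `gcc_phat_delay`, the `max_lag` parameter, and the small regularization constant are assumptions made for this sketch; a naive DFT is used to keep the example dependency-free.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for short demo signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(x, y, max_lag):
    """Estimate the integer delay (in samples) of y relative to x.

    Hypothetical helper: a positive result means y lags x.
    """
    n = len(x) + len(y)  # zero-pad so the circular correlation does not wrap
    X = dft(list(x) + [0.0] * (n - len(x)))
    Y = dft(list(y) + [0.0] * (n - len(y)))
    # Cross-power spectrum; its phase carries the delay information.
    cross = [b * a.conjugate() for a, b in zip(X, Y)]
    # PHAT weighting: keep only the phase (the "whitening" step).
    # The 1e-12 term is an assumed guard against division by zero.
    whitened = [c / (abs(c) + 1e-12) for c in cross]
    cc = dft(whitened, inverse=True)
    # Pick the lag with the largest correlation peak within +/- max_lag.
    return max(range(-max_lag, max_lag + 1), key=lambda lag: cc[lag % n].real)


# Usage: a white Gaussian source delayed by 5 samples at the second sensor.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(32)]
delayed = [0.0] * 5 + source
estimated = gcc_phat_delay(source, delayed, max_lag=10)
```

Because the whitening discards magnitude and keeps only phase, the correlation peak is sharp for a noise-free delay; with noise present, the peak broadens, which is the regime the abstract's mean square error comparison addresses.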

Robust Endpoint Detection for Bimodal System in Noisy Environments (잡음환경에서의 바이모달 시스템을 위한 견실한 끝점검출)

오현화;권홍석;손종목;진성일;배건성
- Journal of the Institute of Electronics Engineers of Korea CI / v.40 no.5 / pp.289-297 / 2003
The performance of a bimodal system is affected by the accuracy of endpoint detection in the input signal as well as by the performance of the speech recognition or lipreading system. In this paper, we propose an endpoint detection method that detects endpoints from the audio and video signals separately and uses the signal-to-noise ratio (SNR) estimated from the input audio signal to select the endpoints that are reliable against acoustic noise. In other words, the endpoints are taken from the audio signal when the SNR is high and from the video signal when the SNR is low. Experimental results show that the bimodal system using the proposed endpoint detector achieves satisfactory recognition rates, especially when the acoustic environment is quite noisy.

An Adaptive Wind Noise Reduction Method Based on a priori SNR Estimation for Speech Enhancement (음성 강화를 위한 a priori SNR 추정기반 적응 바람소리 저감 방법)

Seo, Ji-Hun;Lee, Seok-Pil
- The Transactions of The Korean Institute of Electrical Engineers / v.64 no.12 / pp.1756-1760 / 2015
This paper focuses on an a priori signal-to-noise ratio (SNR) estimation method for speech enhancement. Much research on speech enhancement has used ambient noise cancellation methods. The spectral subtraction (SS) method, which is widely used in noise reduction, involves a trade-off between performance and signal distortion. The need is therefore growing for adaptive methods, such as those based on an estimated a priori SNR, that can deliver high performance with low distortion. The decision-directed (DD) approach is used to determine the a priori SNR in noisy speech signals. The a priori SNR is estimated using only the magnitude components and consequently follows the a posteriori SNR with a one-frame delay. We propose a modified a priori SNR estimator and a weighted rational transfer function for speech enhancement in the presence of wind noise. Experimental results show that the proposed estimator achieves better Perceptual Evaluation of Speech Quality (PESQ, ITU-T P.862) scores than conventional DD approach-based systems and other noise reduction methods.
https://doi.org/10.5370/KIEE.2015.64.12.1756
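
The conventional decision-directed (DD) estimator that the abstract above uses as its baseline can be sketched as follows. This is the textbook DD recursion paired with a Wiener gain, not the paper's modified estimator or its weighted rational transfer function, neither of which the abstract specifies. The function name `decision_directed_snr`, the smoothing factor `alpha=0.98`, the choice of a Wiener gain, and the assumption of a known, fixed per-bin noise power are all illustrative assumptions.

```python
def decision_directed_snr(noisy_power, noise_power, alpha=0.98):
    """Conventional decision-directed a priori SNR estimate (sketch).

    noisy_power: list of frames, each a list of per-bin noisy power |Y|^2.
    noise_power: list of per-bin noise power (assumed known and stationary).
    Returns a list of frames of per-bin a priori SNR estimates.
    """
    prev_clean = [0.0] * len(noise_power)  # previous frame's clean-power estimate
    result = []
    for frame in noisy_power:
        xi_frame = []
        clean = []
        for k, (p_noisy, p_noise) in enumerate(zip(frame, noise_power)):
            gamma = p_noisy / p_noise  # a posteriori SNR
            # Weighted mix of the previous frame's clean estimate and the
            # current instantaneous SNR (floored at zero).
            xi = alpha * prev_clean[k] / p_noise \
                + (1.0 - alpha) * max(gamma - 1.0, 0.0)
            gain = xi / (1.0 + xi)  # Wiener gain (an assumed choice of gain rule)
            clean.append(gain * gain * p_noisy)
            xi_frame.append(xi)
        prev_clean = clean
        result.append(xi_frame)
    return result


# Usage: one frequency bin, unit noise power, constant noisy power of 11
# (instantaneous SNR of 10) over two frames.
xi = decision_directed_snr([[11.0], [11.0]], [1.0])
```

With `alpha` close to 1, the estimate is dominated by the previous frame's clean-speech estimate, so it rises only gradually after a speech onset; this is the frame-delay behavior the abstract describes.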

Reconstruction of 3D Ultrasound Image from Freehand Scanned Frames Using Lateral Correlation Functions (측면거리 상관함수를 이용한 수동주사 초음파 영상 프레임들로부터의 3차원 영상 재구성)

이준호;김남철;김상현
- The Journal of Korean Institute of Communications and Information Sciences / v.26 no.8B / pp.1152-1160 / 2001
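This paper proposes a method for estimating the inter-frame distance of consecutive frames in order to reconstruct a 3D ultrasound image from consecutive 2D frames obtained by freehand scanning. Because consecutive frames obtained by freehand scanning have non-uniform inter-frame distances, reconstructing them directly into a 3D image yields an image that differs from the actual shape of the human organ. To estimate the inter-frame distance, the proposed algorithm obtains a lateral correlation function for each block of every frame and, assuming that the lateral correlation functions are isotropic in the plane formed by the probe's axis of travel and the array axis of the ultrasound sensor, estimates the inter-frame distance between the blocks of adjacent frames. The estimated distance between adjacent frames is obtained by averaging the block-wise inter-frame distance estimates within the frame. In experiments, the estimated inter-frame distance curve of the proposed algorithm was closer to the actual inter-frame distance curve than that of the method using a reference travel-distance correlation function, and the proposed method also gave better results than the conventional method in an SNR comparison. In addition, the 3D image reconstructed with the proposed algorithm resembled the original image more closely than the one reconstructed with the conventional algorithm.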

Enhancement of Signal to Noise Ratio for High Frequency Square-Wave Injection Sensorless Drive with Regulation of Induced High Frequency Current Ripple (고주파 전류 맥동 제어를 통한 신호 주입 센서리스 방법의 신호 대 잡음 비(SNR) 개선)

Kim, DongOuk;Kwon, Yong-Cheol;Sul, Seung-Ki
- Proceedings of the KIPE Conference / 2013.11a / pp.167-168 / 2013
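In signal-injection sensorless drives, distortion of the injected voltage caused by inverter nonlinearity degrades the SNR of the current signal information. This in turn causes errors in the process of estimating the rotor position. This paper analyzes the effect of inverter nonlinearity on the injected voltage and, to improve the SNR of the current signal information, proposes a method that applies an injection voltage which regulates the magnitude of the high-frequency current ripple to a constant value. The performance of the proposed method was verified through experiments.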

An Improved Speech Absence Probability Estimation based on Environmental Noise Classification (환경잡음분류 기반의 향상된 음성부재확률 추정)

Son, Young-Ho;Park, Yun-Sik;An, Hong-Sub;Lee, Sang-Min
- The Journal of the Acoustical Society of Korea / v.30 no.7 / pp.383-389 / 2011
In this paper, we propose an improved speech absence probability estimation algorithm that applies environmental noise classification for speech enhancement. In the conventional approach, the a priori probability of speech absence needed for the speech absence probability is derived from the microphone input signal and the noise signal using a threshold on the estimated a posteriori SNR. Unlike the conventional fixed threshold and smoothing parameter, the proposed algorithm estimates the speech absence probability using a noise classification algorithm based on a Gaussian mixture model, so that the optimal parameters can be applied to each noise type. The performance of the proposed enhancement algorithm is evaluated by ITU-T P.862 PESQ (perceptual evaluation of speech quality) and a composite measure under various noise environments. The results verify that the proposed algorithm outperforms the conventional speech absence probability estimation algorithm.
https://doi.org/10.7776/ASK.2011.30.7.383
