Search | Korea Science

Voice Activity Detection in Noisy Environment using Speech Energy Maximization and Silence Feature Normalization (음성 에너지 최대화와 묵음 특징 정규화를 이용한 잡음 환경에 강인한 음성 검출)

Ahn, Chan-Shik;Choi, Ki-Ho
- Journal of Digital Convergence / v.11 no.6 / pp.169-174 / 2013
In speech recognition, performance degrades because of the mismatch between the model training and recognition environments. Silence feature normalization is used as a way to reduce this environmental mismatch. At low signal-to-noise ratios, however, existing silence feature normalization raises the energy level of the silence interval, so the accuracy of voice/non-voice classification falls and recognition performance degrades. This paper proposes a robust speech detection method for noisy environments that uses silence feature normalization and speech energy maximization. At high signal-to-noise ratios, the proposed method maximizes speech energy to obtain features that are less affected by noise; at low signal-to-noise ratios, it uses the cepstral feature distribution characteristics of voice and non-voice to improve recognition performance. In recognition experiments, recognition performance improved compared with the conventional method.
https://doi.org/10.14400/JDPM.2013.11.6.169

Performance Improvement of Speech Recognition Using Context and Usage Pattern Information (문맥 및 사용 패턴 정보를 이용한 음성인식의 성능 개선)

Song, Won-Moon;Kim, Myung-Won
- The KIPS Transactions:PartB / v.13B no.5 s.108 / pp.553-560 / 2006
Speech recognition has recently been investigated with the aim of producing more reliable results in noisy environments, either by integrating diverse sources of information at the result-derivation level or by producing new results through post-processing of the prior recognition results. In this paper we propose a method that uses the user's usage patterns and context information in speech command recognition for personal mobile devices to improve recognition accuracy in a noisy environment. Sequential usage (or speech) patterns preceding the currently spoken command are used to adjust the base recognition results. For the context information, we use the relevance between the device function currently in use and the spoken command. Our experimental results show that the proposed method achieves an error correction rate of about 50% over the base recognition system, which demonstrates the feasibility of the proposed method.
https://doi.org/10.3745/KIPSTB.2006.13B.5.553 KSCI
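
The adjustment described in this abstract can be pictured as a rescoring step applied to the base recognizer's candidate list. The sketch below is only an illustration under stated assumptions: the function name `rescore`, the multiplicative combination, the weights, and all the numbers are hypothetical and are not taken from the paper.

```python
def rescore(base_scores, prev_command, transition_probs, context_relevance,
            w_usage=1.0, w_context=1.0):
    """Combine base recognizer scores with usage-pattern and context terms.

    Hypothetical inputs:
      base_scores       -- command -> score from the base recognizer
      transition_probs  -- (previous command, command) -> probability
                           estimated from the user's command history
      context_relevance -- command -> relevance to the device function in use
    """
    floor = 1e-6  # keeps unseen commands from being zeroed out entirely
    adjusted = {}
    for command, score in base_scores.items():
        usage = transition_probs.get((prev_command, command), floor)
        context = context_relevance.get(command, floor)
        adjusted[command] = score * (usage ** w_usage) * (context ** w_context)
    total = sum(adjusted.values())
    # Normalize so the adjusted scores sum to 1.
    return {command: value / total for command, value in adjusted.items()}


# Noise made the base recognizer slightly prefer "camera" over "call".
base = {"call": 0.40, "camera": 0.45, "calendar": 0.15}
transitions = {("contacts", "call"): 0.7,
               ("contacts", "camera"): 0.1,
               ("contacts", "calendar"): 0.2}
relevance = {"call": 0.8, "camera": 0.1, "calendar": 0.1}

result = rescore(base, "contacts", transitions, relevance)
best = max(result, key=result.get)
print(best)
```

In this toy case the usage and context terms outweigh the small margin in the base scores, so the top candidate changes from "camera" to "call".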

Performance Assessment of Speech Recognizer using Lombard Speech (롬바드 음성을 이용한 음성인식기의 성능 평가)

Jung, Sung-Yun;Chung, Hyun-Yeol;Kim, Kyung-Tae
- The Journal of the Acoustical Society of Korea / v.13 no.5 / pp.59-68 / 1994
This paper describes a performance assessment test, and an analysis of its results, for a Korean speech recognizer operating on speech affected by the Lombard effect in a noisy environment, as basic research on performance assessment. In the assessment test, standard speech data were first manipulated to resemble speech uttered in a noisy environment, and performance assessment tests were then carried out over the assessment items (noise type, SNR) in two ways: one with Lombard-effect-received speech (LES), the other with speech that had not received the effect (NLES). With a recognition rate of 90% set as the recognition limit, the limit was reached at 10 dB SNR with LES but at 30 dB with NLES. This 20 dB SNR difference indicates that the Lombard effect should be considered in real-world assessment tests. The type of noise did not affect recognizer performance in our tests. ANOVA, applied to the evaluation of several kinds of recognizers, showed that every assessment item affecting recognition performance could be quantified.
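
The SNR assessment item in this abstract can be illustrated by mixing noise into a clean signal at a chosen SNR. The sketch below is a generic illustration, not the paper's procedure: it does not model the Lombard effect itself, and the function names and the synthetic sine-wave "speech" are assumptions.

```python
import math
import random


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals target_snr_db, then add it."""
    p_speech = sum(x * x for x in speech) / len(speech)
    p_noise = sum(x * x for x in noise) / len(noise)
    target_noise_power = p_speech / (10 ** (target_snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    return [s + gain * n for s, n in zip(speech, noise)]


def measure_snr_db(speech, noisy):
    """Measure the SNR in dB from the clean signal and the noisy mixture."""
    p_speech = sum(x * x for x in speech)
    p_noise = sum((y - x) ** 2 for x, y in zip(speech, noisy))
    return 10 * math.log10(p_speech / p_noise)


random.seed(0)
speech = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]  # stand-in for speech
noise = [random.gauss(0.0, 1.0) for _ in range(1000)]

noisy_10 = mix_at_snr(speech, noise, 10)  # the SNR at which LES reached the 90% limit
noisy_30 = mix_at_snr(speech, noise, 30)  # the SNR at which NLES reached the 90% limit
print(round(measure_snr_db(speech, noisy_10), 3), round(measure_snr_db(speech, noisy_30), 3))
```

A 20 dB difference in SNR corresponds to a factor of 100 in noise power, which is why the gap between the two conditions is substantial.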

A Study on Speech Recognition in a Running Automobile (주행중인 자동차 환경에서의 음성인식 연구)

양진우;김순협
- The Journal of the Acoustical Society of Korea / v.19 no.5 / pp.3-8 / 2000
In this paper, we study the design and implementation of a robust speech recognition system for a noisy car environment. The reference pattern used in the system is DMS (Dynamic Multi-Section). Two separate acoustic models are proposed, one for speech in a car moving below 80 km/h and one for speech in a car moving above 80 km/h; the system selects between them automatically according to the car noise environment. PLP (Perceptual Linear Predictive) features of order 13 are used for the feature vector, and OSDP (One-Stage Dynamic Programming) is used for decoding. The system also provides a function for editing the phone book for voice dialing. The system yields a recognition rate of 89.75% for male speakers in SI (speaker-independent) mode in a car running on a cement-surfaced expressway at over 80 km/h with a vocabulary of 33 words. It also yields a recognition rate of 92.29% for male speakers in SI mode in a car running on a paved expressway at over 80 km/h.

The Voice Quality Improvement by Bone Conduction Feedback Compensation in Mobile Phone (골전도 피드백 보상에 의한 휴대전화 음질 향상)

Park, Hyung-Woo;Lim, Won-Seok;Bae, Myung-Jin
- The Journal of the Acoustical Society of Korea / v.31 no.6 / pp.359-366 / 2012
Today, people are exposed to various noisy environments, such as buses, subways, and crowded supermarkets. The noise problem is becoming more serious as people want to use portable audio equipment and mobile phones even under these noisy conditions. To exchange information freely, users set the volume about 15 dB higher than the surrounding noise, so that it reaches almost 110 dB. That sound level can cause noise-induced hearing loss in users and becomes an additional noise source for others. A bone-conduction system can be a solution for reducing noise and enhancing the voice signal of a mobile phone. In this paper, we propose a way of cancelling noise and enhancing the speech signal of mobile phones by installing a bone-conduction feedback system in ordinary mobile phones. With this system, we can reduce environmental noise and enhance the voice quality of mobile phones. Using this method, we can enhance the signal by around 17 dB.
https://doi.org/10.7776/ASK.2012.31.6.359 KSCI

Improved speech enhancement of multi-channel Wiener filter using adjustment of principal subspace vector (다채널 위너 필터의 주성분 부공간 벡터 보정을 통한 잡음 제거 성능 개선)

Kim, Gibak
- The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.490-496 / 2020
We present a method to improve the performance of the multi-channel Wiener filter in noisy environments. When building a subspace-based multi-channel Wiener filter for a single target source, the target speech component can be effectively estimated in the principal subspace of the speech correlation matrix. The speech correlation matrix can be estimated by subtracting the noise correlation matrix from the signal correlation matrix, based on the assumption that the cross-correlation between speech and interfering noise is negligible compared with the speech correlation. However, this assumption is not valid in the presence of strong interfering noise, and a significant error can be induced in the principal subspace accordingly. In this paper, we propose to adjust the principal subspace vector using the speech presence probability and the steering vector for the desired speech source. The multi-channel speech presence probability is derived in the principal subspace and applied to adjust the principal subspace vector. Simulation results show that the proposed method improves the performance of the multi-channel Wiener filter in noisy environments.
https://doi.org/10.7776/ASK.2020.39.5.490 KSCI
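
The subtraction step in this abstract can be written out as follows. The notation is an assumption made for illustration (the abstract gives no symbols): the multi-channel observation is $\mathbf{y} = \mathbf{s} + \mathbf{n}$, with $\mathbf{R}_y = E[\mathbf{y}\mathbf{y}^H]$, $\mathbf{R}_s = E[\mathbf{s}\mathbf{s}^H]$, $\mathbf{R}_n = E[\mathbf{n}\mathbf{n}^H]$, and $\mathbf{R}_{sn} = E[\mathbf{s}\mathbf{n}^H]$.

```latex
\mathbf{R}_y = \mathbf{R}_s + \mathbf{R}_n + \mathbf{R}_{sn} + \mathbf{R}_{sn}^{H},
\qquad
\hat{\mathbf{R}}_s = \mathbf{R}_y - \mathbf{R}_n
\quad \text{if } \mathbf{R}_{sn} \approx \mathbf{0}.
```

When the interfering noise is strong, the cross terms are no longer negligible, so $\hat{\mathbf{R}}_s$ and its principal eigenvector carry the error that the abstract describes.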

Design and Implementation of the Security System for the Moving Object Detection (이동물체 검출을 위한 보안 시스템의 설계 및 구현)

안용학;안일영
- Convergence Security Journal / v.2 no.1 / pp.77-86 / 2002
In this paper, we propose a segmentation algorithm that can reliably separate moving objects from a noisy background in an image sequence received from a camera at a fixed position. Image segmentation is one of the most difficult processes in image processing, and adaptation to changes in the environment must be considered to increase accuracy. The proposed algorithm consists of four steps: generating the difference image between the input image and the reference image, removing background noise by applying background noise modeling to the difference image histogram, selecting candidate initial regions using local maxima of the difference image, and gradually expanding the connected regions, region by region, using shape information. The test results show that the proposed algorithm can detect moving objects such as intruders very effectively in a noisy environment.
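
The difference-image and region-expansion steps named in this abstract can be sketched in a heavily simplified form. Everything below is an assumption made for illustration: a fixed threshold stands in for the histogram-based background noise modeling, a plain 4-connected flood fill stands in for the shape-guided expansion, and the local-maxima seed selection is omitted.

```python
def difference_image(frame, reference):
    """Absolute per-pixel difference between the input and reference images."""
    return [[abs(a - b) for a, b in zip(f_row, r_row)]
            for f_row, r_row in zip(frame, reference)]


def grow_regions(diff, threshold):
    """Label 4-connected regions of pixels whose difference exceeds `threshold`."""
    rows, cols = len(diff), len(diff[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if diff[r][c] <= threshold or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            stack = [(r, c)]
            while stack:  # expand the region pixel by pixel
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and diff[ny][nx] > threshold and not labels[ny][nx]):
                        labels[ny][nx] = count
                        stack.append((ny, nx))
    return labels, count


reference = [[10] * 4 for _ in range(4)]
frame = [row[:] for row in reference]
for y in (1, 2):
    for x in (1, 2):
        frame[y][x] = 200   # a 2x2 moving object
frame[0][3] = 14            # small background noise, below the threshold

diff = difference_image(frame, reference)
labels, count = grow_regions(diff, threshold=20)
print(count)
```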

Method for Spectral Enhancement by Binary Mask for Speech Recognition Enhancement Under Noise Environment (잡음환경에서 음성인식 성능향상을 위한 바이너리 마스크를 이용한 스펙트럼 향상 방법)

Choi, Gab-Keun;Kim, Soon-Hyob
- The Journal of the Acoustical Society of Korea / v.29 no.7 / pp.468-474 / 2010
The major obstacle to practical use of speech recognition is distortion caused by ambient and channel noise. In general, ambient noise degrades performance and restricts where recognition can be used. DSR (Distributed Speech Recognition) based speech recognition has the same problem. Various noise cancelling algorithms have been applied to solve it, but spectral loss and residual noise caused by incorrect noise estimation in low-SNR environments lower the recognition rate. This paper proposes a speech enhancement method that uses MMSE-STSA for noise cancelling and an ideal binary mask to compensate for the damaged spectrum. In experiments in noisy environments (SNR 15 dB to 0 dB), the proposed method showed better spectral results and recognition performance.
https://doi.org/10.7776/ASK.2010.29.7.468 KSCI
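
For readers unfamiliar with the term, an ideal binary mask keeps a time-frequency cell when its local SNR exceeds a threshold and discards it otherwise. The sketch below shows only that textbook definition, assuming the speech and noise powers are known separately; how the paper combines the mask with MMSE-STSA is not stated in the abstract, and the function names and toy values are hypothetical.

```python
import math


def ideal_binary_mask(speech_power, noise_power, threshold_db=0.0):
    """Return a 0/1 mask with 1 where the local SNR exceeds threshold_db."""
    eps = 1e-12  # guards against log of zero and division by zero
    mask = []
    for s_row, n_row in zip(speech_power, noise_power):
        row = []
        for s, n in zip(s_row, n_row):
            local_snr_db = 10 * math.log10((s + eps) / (n + eps))
            row.append(1 if local_snr_db > threshold_db else 0)
        mask.append(row)
    return mask


def apply_mask(spectrum, mask):
    """Zero out the cells of `spectrum` that the mask marks as noise-dominated."""
    return [[v * m for v, m in zip(v_row, m_row)]
            for v_row, m_row in zip(spectrum, mask)]


# Rows are frames, columns are frequency bins (toy values).
speech_power = [[4.0, 0.1], [0.5, 9.0]]
noise_power = [[1.0, 1.0], [1.0, 1.0]]
noisy_power = [[5.0, 1.1], [1.5, 10.0]]

mask = ideal_binary_mask(speech_power, noise_power)
masked = apply_mask(noisy_power, mask)
print(mask, masked)
```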

Voice Activity Detection Method Using Psycho-Acoustic Model Based on Speech Energy Maximization in Noisy Environments (잡음 환경에서 심리음향모델 기반 음성 에너지 최대화를 이용한 음성 검출 방법)

Choi, Gab-Keun;Kim, Soon-Hyob
- The Journal of the Acoustical Society of Korea / v.28 no.5 / pp.447-453 / 2009
This paper introduces a method for detecting voice and the exact end point at low SNR by maximizing voice energy. Conventional VAD (Voice Activity Detection) algorithms estimate the noise level, so they tend to detect the end point inaccurately. Moreover, because they use a relatively long analysis range to reflect temporal changes in the noise, the computing load is too high for practical application. This paper introduces the SEM-VAD (Speech Energy Maximization-Voice Activity Detection) method, which uses psycho-acoustic Bark-scale filter banks to maximize voice energy within frames. Stable threshold values are obtained in various noise environments (SNR 15 dB, 10 dB, 5 dB, 0 dB). In a voice detection test in a noisy car environment, the PHR (Pause Hit Rate) was 100% in every noise environment, and the FAR (False Alarm Rate) was 0% at SNR 15 dB and 10 dB, 5.6% at SNR 5 dB, and 9.5% at SNR 0 dB.
https://doi.org/10.7776/ASK.2009.28.5.447 KSCI
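
As a point of comparison, the simplest form of frame-based voice detection thresholds the log energy of each frame. The sketch below shows that plain baseline only; it is not SEM-VAD, it uses no Bark-scale filter bank, and the frame length, threshold, and synthetic signal are assumptions.

```python
import math


def frame_log_energy(signal, frame_len):
    """Mean-square energy of each non-overlapping frame, in dB."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        energies.append(10 * math.log10(energy + 1e-12))
    return energies


def detect_voice(signal, frame_len=100, threshold_db=-20.0):
    """Mark a frame as voice when its log energy exceeds the threshold."""
    return [e > threshold_db for e in frame_log_energy(signal, frame_len)]


silence = [0.001] * 200                                       # about -60 dB
tone = [math.sin(2 * math.pi * t / 20) for t in range(200)]   # about -3 dB
decisions = detect_voice(silence + tone + silence)
print(decisions)
```

A fixed energy threshold like this one breaks down as the SNR falls, which is the weakness that noise-robust methods such as the one in this abstract aim to address.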

Multidimensional Adaptive Noise Cancellation of Stress ECG Signal

Gautam, Alka;Lee, Young-Dong;Chung, Wan-Young
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.05a / pp.285-288 / 2008
In a ubiquitous computing environment, the biological signal ECG (electrocardiogram) is usually recorded together with noise components. An adaptive interference (or noise) canceller adaptively filters the noise reference input so that it maximally matches the noise or interference in the primary (signal plus noise) input and subtracts it out, thereby adaptively eliminating unwanted interference from the ECG signal. A measured stress ECG (or exercise ECG) signal has three major noise components: baseline wander noise, motion artifact noise, and EMG (electromyogram) noise. These noises not only distort the signal but also cause incorrect diagnoses when ECG data are analyzed. Motion artifact and EMG noise behave like wide-band signals, and their spectra overlap considerably with the ECG spectrum. Here, a multidimensional adaptive method is used for filtering, which is more effective at improving the signal-to-noise ratio.
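
The adaptive noise canceller described in this abstract can be illustrated with a standard single-reference LMS filter. This is a simplified stand-in, not the paper's multidimensional method: the tap count, step size, and synthetic signals (a sine wave in place of an ECG, Gaussian noise as the reference) are assumptions.

```python
import math
import random


def lms_cancel(primary, reference, num_taps=4, step=0.005):
    """Remove from `primary` the component linearly predictable from `reference`."""
    weights = [0.0] * num_taps
    output = []
    for n in range(len(primary)):
        # Most recent reference samples, newest first, zero-padded at the start.
        window = [reference[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        noise_estimate = sum(w * x for w, x in zip(weights, window))
        error = primary[n] - noise_estimate  # the cleaned output sample
        weights = [w + 2 * step * error * x for w, x in zip(weights, window)]
        output.append(error)
    return output


def mean_sq(a, b):
    """Mean squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(1)
n_samples = 5000
clean = [math.sin(2 * math.pi * t / 200) for t in range(n_samples)]
reference = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
# The noise reaching the primary input is a filtered copy of the reference.
noise = [0.8 * reference[t] + (0.3 * reference[t - 1] if t > 0 else 0.0)
         for t in range(n_samples)]
primary = [c + v for c, v in zip(clean, noise)]

cleaned = lms_cancel(primary, reference)
before = mean_sq(primary[-1000:], clean[-1000:])
after = mean_sq(cleaned[-1000:], clean[-1000:])
print(before, after)
```

Once the weights have converged, the residual error against the clean signal over the last 1000 samples is much smaller than the original noise power.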

