Search | Korea Science

Study of the Noise Processing to Technique Speech Recognition System (음성인식 시스템에서의 잡음 제거 개선에 관한 연구)

이창윤;이영훈
- Journal of the Korea Society of Computer and Information / v.7 no.2 / pp.73-78 / 2002
This paper studies noise processing techniques for a speech recognition system. A method combining SNR normalization with RAS is considered as the noise processing method, and the performance of the speech recognition system can be further improved by using other noise processing techniques. The experimental recognition system is an embedded system built on a general-purpose digital signal processor (TMS320C31). The recognition word set consists of 60 command words for an office environment and for computer commands. The simulation uses colored noise typical of a general environment. The experimental results show a maximum recognition rate of 94.61% on this word set when SNR normalization is combined with spectral subtraction.
PDF

Speech Recognition in the Noisy Environments using Hybrid Method of Spectral Subtraction and Noise Masking (스펙트럼 차감법과 잡음 마스킹의 hybrid 방식을 이용한 잡음환경에서의 음성인식)

권영욱
- Proceedings of the Acoustical Society of Korea Conference / 1998.06e / pp.343-346 / 1998
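To improve speech recognition performance in noisy environments, this paper uses a hybrid of spectral subtraction and noise masking. To overcome the mismatch caused by residual noise left after spectral subtraction, noise masking with a target noise level is applied in place of the flooring factor used in conventional spectral subtraction. This method overcomes the weakness of conventional noise masking, which yields no improvement at low SNR, and at the same time alleviates the residual noise problem of spectral subtraction. In particular, applying smoothing in the time and frequency domains further reduced some of the noise that still remained after the hybrid method, yielding a further improvement in recognition performance.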
PDF
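As an illustration of the contrast drawn in the abstract above, the following minimal sketch compares conventional magnitude spectral subtraction with a flooring factor against a variant that masks the subtracted spectrum up to a fixed target noise level. The function names, the parameter values (`beta`, `target_level`), the choice of flooring against the noise estimate, and the toy spectra are all assumptions made for illustration; none of them are taken from the paper.

```python
def spectral_subtract_floor(noisy_mag, noise_mag, beta=0.01):
    """Conventional spectral subtraction with a flooring factor.

    Each bin is clamped from below at beta * noise estimate. This is one
    common convention (assumed here); the floor varies with the noise.
    """
    return [max(y - n, beta * n) for y, n in zip(noisy_mag, noise_mag)]


def spectral_subtract_mask(noisy_mag, noise_mag, target_level=0.5):
    """Hybrid sketch: subtract, then mask with a target noise level.

    Bins that fall below target_level after subtraction are raised to it,
    so residual noise is hidden under a known, constant level.
    """
    return [max(y - n, target_level) for y, n in zip(noisy_mag, noise_mag)]


# Toy magnitude spectrum for a single frame (hypothetical values).
noisy = [4.0, 1.2, 0.9, 6.0]
noise = [1.0, 1.0, 1.0, 1.0]

floored = spectral_subtract_floor(noisy, noise)
masked = spectral_subtract_mask(noisy, noise)
```

In this toy frame the two strong bins pass through both variants unchanged, while the weak bins end up near zero with the flooring factor and at the constant `target_level` with masking.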

Detection of Left Ventricular Contours Based on Elliptic Approximation and ML Estimate in Angiographic Images

Om, Kyong-Sik;Chung, Jae-Ho
- Journal of Electrical Engineering and Information Science / v.1 no.2 / pp.9-14 / 1996
The goal of this research is to provide a practical algorithm for outlining the left ventricular cavity in digital subtraction angiography. The proposed algorithm is based on elliptic approximation and ML (Maximum Likelihood) estimation, and it produces good results in terms of execution time, robustness against noise, accuracy, and the range of positions of the ROI (Region Of Interest).
PDF

A study on deep neural speech enhancement in drone noise environment (드론 소음 환경에서 심층 신경망 기반 음성 향상 기법 적용에 관한 연구)

Kim, Jimin;Jung, Jaehee;Yeo, Chaneun;Kim, Wooil
- The Journal of the Acoustical Society of Korea / v.41 no.3 / pp.342-350 / 2022
In this paper, actual drone noise samples are collected for speech processing in disaster environments to build a noise-corrupted speech database, and speech enhancement performance is evaluated by applying spectral subtraction and mask-based speech enhancement techniques. To improve the performance of VoiceFilter (VF), an existing deep neural network-based speech enhancement model, we apply the Self-Attention operation and use the estimated noise information as input to the Attention model. Compared to the existing VF model, the experimental results show improvements of 3.77%, 1.66% and 0.32% in Source to Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI), respectively. When the model is trained with a 75% mix of speech data with drone sounds collected from the Internet, the relative performance drops for SDR, PESQ, and STOI are 3.18%, 2.79% and 0.96%, respectively, compared to using only actual drone noise. This confirms that data similar to real data can be collected and used effectively to train speech enhancement models in environments where real data is difficult to obtain.
https://doi.org/10.7776/ASK.2022.41.3.342 Cite PDF KSCI

Material Decomposition through Weighted Image Subtraction in Dual-energy Spectral Mammography with an Energy-resolved Photon-counting Detector using Monte Carlo Simulation (몬테카를로 시뮬레이션을 이용한 광자계수검출기 기반 이중에너지 스펙트럼 유방촬영에서 가중 영상 감산법을 통한 물질분리)

Eom, Jisoo;Kang, Sooncheol;Lee, Seungwan
- Journal of radiological science and technology / v.40 no.3 / pp.443-451 / 2017
Mammography is commonly used for early breast cancer screening. However, mammographic images, which depend on the physical properties of breast components, provide only limited information about whether a lesion is malignant or benign. Although a dual-energy subtraction technique can decompose a certain material from a mixture, it increases radiation dose and degrades the accuracy of material decomposition. In this study, we simulated a breast phantom using attenuation characteristics, and we proposed a technique that enables accurate material decomposition by applying weighting factors to dual-energy mammography based on a photon-counting detector, using a Monte Carlo simulation tool. We also evaluated the contrast and noise of the simulated breast images to validate the proposed technique. As a result, the contrast for a malignant tumor with the dual-energy weighted subtraction technique was 0.98 and 1.06 times that of general mammography and the dual-energy subtraction technique, respectively, that is, roughly the same. However, the contrast between malignant and benign tumors increased dramatically, by a factor of 13.54, because of the low contrast of the benign tumor. Therefore, the proposed technique can increase the material decomposition accuracy for malignant tumors and improve the diagnostic accuracy of mammography.
https://doi.org/10.17946/JRST.2017.40.3.12 Cite PDF KSCI
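The idea of weighted dual-energy subtraction in the abstract above can be sketched with the textbook two-material Beer-Lambert model: taking the log signal at each energy and subtracting with a weight equal to the ratio of one material's attenuation coefficients cancels that material. The attenuation coefficients, thicknesses, and function names below are hypothetical, and the paper's actual weighting factors and photon-counting detector model are not reproduced here.

```python
import math


def log_signal(mu_a, t_a, mu_b, t_b, i0=1.0):
    """Negative log transmission through two materials (Beer-Lambert)."""
    intensity = i0 * math.exp(-(mu_a * t_a + mu_b * t_b))
    return -math.log(intensity / i0)


def weighted_subtraction(log_high, log_low, w):
    """Weighted log subtraction: S = L_high - w * L_low."""
    return log_high - w * log_low


# Hypothetical linear attenuation coefficients in 1/cm (assumed values).
MU_A_LOW, MU_A_HIGH = 0.80, 0.50   # material a (e.g. the lesion)
MU_B_LOW, MU_B_HIGH = 0.60, 0.45   # material b (background to cancel)

# Weight chosen so that the material-b terms cancel in the subtraction.
w = MU_B_HIGH / MU_B_LOW


def subtracted(t_a, t_b):
    """Subtracted signal for thicknesses t_a and t_b (in cm)."""
    low = log_signal(MU_A_LOW, t_a, MU_B_LOW, t_b)
    high = log_signal(MU_A_HIGH, t_a, MU_B_HIGH, t_b)
    return weighted_subtraction(high, low, w)


s_thin = subtracted(1.0, 2.0)    # 2 cm of material b
s_thick = subtracted(1.0, 5.0)   # 5 cm of material b, same material a
```

With this choice of `w`, `s_thin` and `s_thick` agree up to rounding, because the subtracted signal depends only on the thickness of material a.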

Preprocessing Technique for Improvement of Speech Recognition in a Car (차량에서의 음성인식율 향상을 위한 전처리 기법)

Kim, Hyun-Tae;Park, Jang-Sik
- The Journal of the Korea Contents Association / v.9 no.1 / pp.139-146 / 2009
This paper addresses a modified spectral subtraction scheme suited to speech recognition in low signal-to-noise ratio (SNR) environments, such as an automatic speech recognition (ASR) system in a car. Conventional spectral subtraction schemes rely on the SNR: attenuation is imposed on the parts of the spectrum that appear to have low SNR, and the parts with high SNR are accentuated. While this assumption is adequate for high-SNR environments, it is grossly inadequate for low-SNR scenarios such as the car environment. The proposed method targets low-SNR noisy environments specifically by using a weighting function to enhance the speech-dominant regions of the speech spectrum. Experimental results using in-car voice commands show that the proposed method outperforms conventional methods.
https://doi.org/10.5392/JKCA.2009.9.1.139 Cite PDF

Performance Improvements for Silence Feature Normalization Method by Using Filter Bank Energy Subtraction (필터 뱅크 에너지 차감을 이용한 묵음 특징 정규화 방법의 성능 향상)

Shen, Guanghu;Choi, Sook-Nam;Chung, Hyun-Yeol
- The Journal of Korean Institute of Communications and Information Sciences / v.35 no.7C / pp.604-610 / 2010
In this paper, we propose the FSFN (Filter bank sub-band energy subtraction based CLSFN) method to improve the recognition performance of the existing CLSFN (Cepstral distance and Log-energy based Silence Feature Normalization). The proposed FSFN reduces the energy of noise components in the filter bank sub-band domain when extracting features from speech data. This yields enhanced cepstral features, which in turn improve the accuracy of speech/silence classification. It can therefore be expected to perform better than the existing CLSFN. Experiments on the Aurora 2.0 DB showed that the proposed FSFN method improves the average word accuracy by 2% compared with the conventional CLSFN method, and that FSFN combined with CMVN (Cepstral Mean and Variance Normalization) showed the best recognition performance among the methods compared.
PDF KSCI
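The abstract above combines FSFN with CMVN. As a reminder of what CMVN does, here is a minimal per-utterance sketch that normalizes each feature dimension to zero mean and unit variance. It is a generic illustration rather than the paper's implementation; the function name, the `eps` guard, and the toy feature values are assumptions.

```python
import math


def cmvn(frames, eps=1e-10):
    """Cepstral mean and variance normalization over one utterance.

    frames: list of feature vectors (one list of floats per frame).
    Returns new frames with each dimension shifted to zero mean and
    scaled to unit variance; eps guards against division by zero.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for f in frames
    ]


# Toy 2-dimensional features for three frames (hypothetical values).
features = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = cmvn(features)
```

After normalization both dimensions have the same scale, so a constant channel offset or gain in the cepstral features no longer shifts them relative to the training data.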

Adaptive Noise Canceller for Speech Enhancement Using 2-D Binary Mask (2차원 이진 마스크를 이용한 적응형 음성향상 잡음 제거기)

Lee, Gihyoun;Lee, Jyung Hyun;Cho, Jin-Ho;Kim, Myoung Nam
- Journal of Korea Multimedia Society / v.19 no.7 / pp.1127-1136 / 2016
Speech enhancement algorithms play an important role in numerous speech signal processing applications. Over the last few decades, many speech enhancement algorithms have been studied, based on spectral subtraction, the Wiener filter, the subspace method, and others. They enhance speech well, but their performance can deteriorate under specific noises or in low-SNR environments. In this paper, a new speech enhancement algorithm based on an adaptive noise canceller is proposed. The proposed algorithm improves the performance of adaptive noise cancelling by using a 2-D binary mask. Objective experimental measures confirm that the proposed algorithm is useful and performs better than recently proposed speech enhancement algorithms.
https://doi.org/10.9717/kmms.2016.19.7.1127 Cite PDF KSCI KPUBS HTML
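To make the notion of a 2-D binary mask concrete, the sketch below builds a mask over a time-frequency grid by keeping cells where speech power exceeds noise power by a threshold, then applies it to a mixture. This is the generic ideal-binary-mask construction, which assumes speech and noise power are known separately; how the paper estimates the mask inside its adaptive noise canceller is not shown in the abstract, and all names and values here are hypothetical.

```python
def binary_mask(speech_power, noise_power, threshold_db=0.0):
    """Return a 2-D mask of 0/1 values over a time-frequency grid.

    A cell is 1 when speech power exceeds noise power by more than
    threshold_db decibels, otherwise 0.
    """
    factor = 10.0 ** (threshold_db / 10.0)
    return [
        [1 if s > n * factor else 0 for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


def apply_mask(mixture, mask):
    """Zero out the time-frequency cells that the mask rejects."""
    return [
        [x * m for x, m in zip(x_row, m_row)]
        for x_row, m_row in zip(mixture, mask)
    ]


# Toy 2x2 grids: rows are frames, columns are frequency bins (assumed).
speech = [[4.0, 0.1], [0.5, 9.0]]
noise = [[1.0, 1.0], [1.0, 1.0]]
mixture = [[5.0, 1.1], [1.5, 10.0]]

mask = binary_mask(speech, noise)
enhanced = apply_mask(mixture, mask)
```

Only the speech-dominant cells survive, which is why binary masking tends to help most where noise and speech occupy different parts of the time-frequency plane.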

Multiple Camera-based Person Correspondence using Color Distribution and Context Information of Human Body (색상 분포 및 인체의 상황정보를 활용한 다중카메라 기반의 사람 대응)

Chae, Hyun-Uk;Seo, Dong-Wook;Kang, Suk-Ju;Jo, Kang-Hyun
- Journal of Institute of Control, Robotics and Systems / v.15 no.9 / pp.939-945 / 2009
In this paper, we propose a method for establishing person correspondence across multiple cameras in structured spaces. Correspondence plays an important role in using a multiple-camera system. The proposed method solves the correspondence problem in three main steps. First, moving objects are detected by background subtraction using a multiple background model. The temporal difference is used at the same time to reduce noise caused by temporal changes. When more than two people are detected, the detected regions are given separate labels so that each label represents an individual person. Second, the detected region is segmented into features for correspondence according to a criterion based on the color distribution and context information of the human body. The segmented region is represented as a set of blobs. Each blob is described by a Gaussian probability distribution, so a person model is generated from the blobs as a Gaussian Mixture Model (GMM). Finally, the GMM of each person from one camera is matched against the models of people from the other cameras by maximum likelihood. From these results, we identify the same person in different views. The experiment was performed under three scenarios, and the performance was verified both qualitatively and quantitatively.
https://doi.org/10.5302/J.ICROS.2009.15.9.939 Cite PDF KSCI
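The final matching step in the abstract above, choosing the person model with maximum likelihood, can be sketched as follows. For brevity each person is modeled as a mixture of one-dimensional Gaussians over a scalar color feature; the paper's actual features, blob construction, and model parameters are not given in the abstract, so every name and number here is an assumption.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def gmm_log_likelihood(samples, gmm):
    """Log-likelihood of samples under a GMM.

    gmm: list of (weight, mean, variance) tuples, weights summing to 1.
    A tiny constant keeps log() defined if a density underflows to 0.
    """
    total = 0.0
    for x in samples:
        p = sum(w * gaussian_pdf(x, m, v) for w, m, v in gmm)
        total += math.log(p + 1e-300)
    return total


def match_person(samples, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(samples, models[name]))


# Hypothetical person models built from one camera (two blobs each).
models = {
    "person_A": [(0.5, 0.10, 0.01), (0.5, 0.60, 0.01)],
    "person_B": [(0.5, 0.30, 0.01), (0.5, 0.90, 0.01)],
}

# Hypothetical color features observed for one person in another camera.
observed = [0.12, 0.58, 0.11, 0.61]
best = match_person(observed, models)
```

Here the observed features cluster near the two blob means of `person_A`, so that model is selected.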

A Study on the Improvement of Isolated Word Recognition for Telephone Speech (전화음성의 격리단어인식 개선에 관한 연구)

Do, Sam-Joo;Un, Chong-Kwan
- The Journal of the Acoustical Society of Korea / v.9 no.4 / pp.66-76 / 1990
In this work, the effect of telephone channel noise and distortion on speech recognition is studied, and methods to improve the recognition rate are proposed. Computer simulation is done using 100-word test data, which were made by pronouncing 100 phonetically balanced Korean isolated words ten times each in a speaker-dependent mode. First, a spectral subtraction method is suggested to improve noisy speech recognition. Then, the effect of bandwidth limiting and channel distortion is studied. It has been found that bandwidth limiting and amplitude distortion lower the recognition rate significantly, but phase distortion has little effect. To reduce the channel effect, we modify the reference pattern according to some training data. When both channel noise and distortion exist, the recognition rate without the proposed method is merely 7.7~26.4%, but with the proposed method it increases drastically to 76.2~92.3%.
PDF

Search Result 155, Processing Time 0.021 seconds
