• 제목/요약/키워드: Recognition Enhancement

검색결과 361건 처리시간 0.03초

잡음음성인식을 위한 음성개선 방식들의 성능 비교 (Performance Comparison of the Speech Enhancement Methods for Noisy Speech Recognition)

  • 정용주
    • 말소리와 음성과학
    • /
    • 제1권2호
    • /
    • pp.9-14
    • /
    • 2009
  • Speech enhancement methods can be generally classified into a few categories and they have been usually compared with each other in terms of speech quality. For the successful use of speech enhancement methods in speech recognition systems, performance comparisons in terms of speech recognition accuracy are necessary. In this paper, we compared the speech recognition performance of some of the representative speech enhancement algorithms which are popularly cited in the literature and used widely. We also compared the performance of speech enhancement methods with other noise robust speech recognition methods like PMC to verify the usefulness of speech enhancement approaches in noise robust speech recognition systems.

  • PDF

독립 성분 분석과 스펙트럼 향상에 의한 잡음 환경에서의 음성인식 (Speech Recognition in Noise Environment by Independent Component Analysis and Spectral Enhancement)

  • 최승호
    • 대한음성학회지:말소리
    • /
    • 제48호
    • /
    • pp.81-91
    • /
    • 2003
  • In this paper, we propose a speech recognition method based on independent component analysis (ICA) and spectral enhancement techniques. While ICA tris to separate speech signal from noisy speech using multiple channels, some noise remains by its algorithmic limitations. Spectral enhancement techniques can compensate for lack of ICA's signal separation ability. From the speech recognition experiments with instantaneous and convolved mixing environments, we show that the proposed approach gives much improved recognition accuracies than conventional methods.

  • PDF

자동 음성 인식기를 위한 단채널 음질 향상 알고리즘의 성능 분석 (Performance Analysis of a Class of Single Channel Speech Enhancement Algorithms for Automatic Speech Recognition)

  • 송명석;이창헌;이석필;강홍구
    • The Journal of the Acoustical Society of Korea
    • /
    • 제29권2E호
    • /
    • pp.86-99
    • /
    • 2010
  • This paper analyzes the performance of various single channel speech enhancement algorithms when they are applied to automatic speech recognition (ASR) systems as a preprocessor. The functional modules of speech enhancement systems are first divided into four major modules such as a gain estimator, a noise power spectrum estimator, a priori signal to noise ratio (SNR) estimator, and a speech absence probability (SAP) estimator. We investigate the relationship between speech recognition accuracy and the roles of each module. Simulation results show that the Wiener filter outperforms other gain functions such as minimum mean square error-short time spectral amplitude (MMSE-STSA) and minimum mean square error-log spectral amplitude (MMSE-LSA) estimators when a perfect noise estimator is applied. When the performance of the noise estimator degrades, however, MMSE methods including the decision directed module to estimate a priori SNR and the SAP estimation module helps to improve the performance of the enhancement algorithm for speech recognition systems.

음질 개선을 통한 음성의 인식 (Speech Recognition through Speech Enhancement)

  • 조준희;이기성
    • 대한전기학회:학술대회논문집
    • /
    • 대한전기학회 2003년도 학술회의 논문집 정보 및 제어부문 B
    • /
    • pp.511-514
    • /
    • 2003
  • The human being uses speech signals to exchange information. When background noise is present, speech recognizers experience performance degradations. Speech recognition through speech enhancement in the noisy environment was studied. Histogram method as a reliable noise estimation approach for spectral subtraction was introduced using MFCC method. The experiment results show the effectiveness of the proposed algorithm.

  • PDF

음질향상 기법과 모델보상 방식을 결합한 강인한 음성인식 방식 (A Robust Speech Recognition Method Combining the Model Compensation Method with the Speech Enhancement Algorithm)

  • 김희근;정용주;배건성
    • 음성과학
    • /
    • 제14권2호
    • /
    • pp.115-126
    • /
    • 2007
  • There have been many research efforts to improve the performance of the speech recognizer in noisy conditions. Among them, the model compensation method and the speech enhancement approach have been used widely. In this paper, we propose to combine the two different approaches to further enhance the recognition rates in the noisy speech recognition. For the speech enhancement, the minimum mean square error-short time spectral amplitude (MMSE-STSA) has been adopted and the parallel model combination (PMC) and Jacobian adaptation (JA) have been used as the model compensation approaches. From the experimental results, we could find that the hybrid approach that applies the model compensation methods to the enhanced speech produce better results than just using only one of the two approaches.

  • PDF

MMSE-STSA 기반의 음성개선 기법에서 잡음 및 신호 전력 추정에 사용되는 파라미터 값의 변화에 따른 잡음음성의 인식성능 분석 (Performance Analysis of Noisy Speech Recognition Depending on Parameters for Noise and Signal Power Estimation in MMSE-STSA Based Speech Enhancement)

  • 박철호;배건성
    • 대한음성학회지:말소리
    • /
    • 제57호
    • /
    • pp.153-164
    • /
    • 2006
  • The MMSE-STSA based speech enhancement algorithm is widely used as a preprocessing for noise robust speech recognition. It weighs the gain of each spectral bin of the noisy speech using the estimate of noise and signal power spectrum. In this paper, we investigate the influence of parameters used to estimate the speech signal and noise power in MMSE-STSA upon the recognition performance of noisy speech. For experiments, we use the Aurora2 DB which contains noisy speech with subway, babble, car, and exhibition noises. The HTK-based continuous HMM system is constructed for recognition experiments. Experimental results are presented and discussed with our findings.

  • PDF

이기종 음성 인식 시스템에 독립적으로 적용 가능한 특징 보상 기반의 음성 향상 기법 (Speech Enhancement Based on Feature Compensation for Independently Applying to Different Types of Speech Recognition Systems)

  • 김우일
    • 한국정보통신학회논문지
    • /
    • 제18권10호
    • /
    • pp.2367-2374
    • /
    • 2014
  • 본 논문에서는 이기종 음성 인식 시스템에 독립적으로 적용할 수 있는 음성 향상 기법을 제안한다. 잡음 환경 음성 인식에 효과적인 것으로 알려져 있는 특징 보상 기법이 효과적으로 적용되기 위해서는 특징 추출 기법와 음향 모델이 음성 인식 시스템과 일치해야 한다. 상용화된 음성 인식 시스템에 부가적으로 전처리 기법을 적용하는 상황과 같이, 음성 인식 시스템에 대한 정보가 알려져 있지 않은 상황에서는 기존의 특징 보상 기법을 적용하기가 어렵다. 본 논문에서는 기존의 PCGMM 기반의 특징 보상 기법에서 얻어지는 이득을 이용하는 음성 향상 기술을 제안한다. 실험 결과에서는 본 논문에서 제안하는 기법이 미지의 (Unknown) 음성 인식 시스템 적용 환경에서 기존의 전처리 기법에 비해 다양한 잡음 및 SNR 조건에서 월등한 인식 성능을 나타내는 것을 확인한다.

Representative Batch Normalization for Scene Text Recognition

  • Sun, Yajie;Cao, Xiaoling;Sun, Yingying
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제16권7호
    • /
    • pp.2390-2406
    • /
    • 2022
  • Scene text recognition has important application value and attracted the interest of plenty of researchers. At present, many methods have achieved good results, but most of the existing approaches attempt to improve the performance of scene text recognition from the image level. They have a good effect on reading regular scene texts. However, there are still many obstacles to recognizing text on low-quality images such as curved, occlusion, and blur. This exacerbates the difficulty of feature extraction because the image quality is uneven. In addition, the results of model testing are highly dependent on training data, so there is still room for improvement in scene text recognition methods. In this work, we present a natural scene text recognizer to improve the recognition performance from the feature level, which contains feature representation and feature enhancement. In terms of feature representation, we propose an efficient feature extractor combined with Representative Batch Normalization and ResNet. It reduces the dependence of the model on training data and improves the feature representation ability of different instances. In terms of feature enhancement, we use a feature enhancement network to expand the receptive field of feature maps, so that feature maps contain rich feature information. Enhanced feature representation capability helps to improve the recognition performance of the model. We conducted experiments on 7 benchmarks, which shows that this method is highly competitive in recognizing both regular and irregular texts. The method achieved top1 recognition accuracy on four benchmarks of IC03, IC13, IC15, and SVTP.

잡음환경에서 음성인식 성능향상을 위한 바이너리 마스크를 이용한 스펙트럼 향상 방법 (Method for Spectral Enhancement by Binary Mask for Speech Recognition Enhancement Under Noise Environment)

  • 최갑근;김순협
    • 한국음향학회지
    • /
    • 제29권7호
    • /
    • pp.468-474
    • /
    • 2010
  • 음성인식의 실용화에 가장 저해되는 요소는 배경잡음과 채널잡음에 의한 왜곡이다. 일반적으로 배경잡음은 음성인식 시스템의 성능을 저하시키고 이로 인해 사용 장소의 제약을 받게 한다. DSR (Distributed Speech Recognition) 기반의 음성인식 역시 이와 같은 문제로 성능 향상에 어려움을 겪고 있다. 이러한 문제를 해결하기 위해 다양한 잡음제거 알고리듬이 사용되고 있으나 낮은 SNR환경에서 부정확한 잡음추정으로 발생하는 스펙트럼 손상과 잔존 잡음은 음성인식기의 인식환경과 학습 환경의 불일치를 만들게 되어 인식률을 저하시키는 원인이 된다. 본 논문에서는 이와 같은 문제를 해결하기 위해 잡음제거 알고리듬으로 MMSE-STSA 방법을 사용하였고 손상된 스펙트럼을 보상하기 위해 Ideal Binary Mask를 이용하였다. 잡음환경 (SNR 15 ~ 0 dB)에 따른 실험결과 제안된 방법을 사용했을 때 향상된 스펙트럼을 얻을 수 있었고 향상된 인식성능을 확인했다.

The Effect of the Speech Enhancement Algorithm for Sensorineural Hearing Impaired Listeners

  • Kim, Dong-Wook;Lee, Young-Woo;Lee, Jong-Shill;Chee, Young-Joon;Lee, Sang-Min;Kim, In-Young;Kim, Sun-I.
    • 대한의용생체공학회:의공학회지
    • /
    • 제28권6호
    • /
    • pp.732-743
    • /
    • 2007
  • Background noise is one of the major complaints of not only hearing impaired persons but also normal listeners. This paper describes the results of two experiments in which speech recognition performance was determined for listeners with normal hearing and sensorineural hearing loss in noise environment. First, we compared speech enhancement algorithms by evaluation speech recognition ability in various speech-to-noise ratios and types of noise. Next, speech enhancement algorithms by reducing background noise were presented and evaluated to improve speech intelligibility for sensorineural hearing impairment listeners. We tested three noise reduction methods using single-microphone, such as spectrum subtraction and companding, Wiener filter method, and maximum likelihood envelop estimation. Their responses in background noise were investigated and compared with those by the speech enhancement algorithm that presented in this paper. The methods improved speech recognition test score for the sensorineural hearing impaired listeners, but not for normal listeners. The results suggest the speech enhancement algorithm with the loudness compression can improve speech intelligibility for listeners with sensorineural hearing loss.