• Title/Summary/Keyword: speech enhancement algorithm (음성 강조 알고리즘)

Nonlinear Speech Enhancement Method for Reducing the Amount of Speech Distortion According to Speech Statistics Model (음성 통계 모형에 따른 음성 왜곡량 감소를 위한 비선형 음성강조법)

  • Choi, Jae-Seung
    • The Journal of the Korea institute of electronic communication sciences / v.16 no.3 / pp.465-470 / 2021
  • Speech recognition in real environments, where speech is mixed with noise, requires robust technology that degrades neither recognition performance nor speech quality. As such technology develops, applications are needed that achieve a stable, high recognition rate even in noisy environments whose noise resembles the human speech spectrum. This paper therefore proposes a speech enhancement algorithm that suppresses noise using the MMSE-STSA estimator, a short-time spectral amplitude method based on the minimum mean-square error. The algorithm is an effective nonlinear speech enhancement method for single-channel input, offers high noise suppression performance, and reduces speech distortion by relying on a statistical model of speech. Its effectiveness is verified experimentally by comparing the input and output speech waveforms.
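
As a rough illustration of the kind of gain rule an MMSE short-time spectral amplitude estimator applies per frequency bin, the sketch below evaluates the textbook Ephraim-Malah gain. The abstract does not state the formula, so the gain expression, the function names, and the series-based Bessel helper are assumptions made for illustration, not the paper's implementation.

```python
import math

def bessel_i0_i1(x, tol=1e-16, max_terms=500):
    """Modified Bessel functions I0(x) and I1(x) by power series (moderate x >= 0)."""
    half = x / 2.0
    term0 = 1.0    # (x/2)^(2k) / (k!)^2 at k = 0
    term1 = half   # (x/2)^(2k+1) / (k! (k+1)!) at k = 0
    i0, i1 = term0, term1
    for k in range(1, max_terms):
        term0 *= half * half / (k * k)
        term1 *= half * half / (k * (k + 1))
        i0 += term0
        i1 += term1
        if term0 <= tol * i0 and term1 <= tol * i1:
            break
    return i0, i1

def mmse_stsa_gain(xi, gamma):
    """Spectral gain for one bin.

    xi    : a priori SNR (linear scale, > 0), assumed to be estimated elsewhere
    gamma : a posteriori SNR (linear scale, > 0)
    """
    v = xi / (1.0 + xi) * gamma
    i0, i1 = bessel_i0_i1(v / 2.0)
    return (math.sqrt(math.pi) / 2.0) * (math.sqrt(v) / gamma) \
        * math.exp(-v / 2.0) * ((1.0 + v) * i0 + v * i1)

if __name__ == "__main__":
    # The enhanced amplitude of a bin would be gain * noisy amplitude.
    print(round(mmse_stsa_gain(1.0, 2.0), 3))
```

At high SNR the gain approaches the Wiener gain xi / (1 + xi). For large arguments, exponentially scaled Bessel routines such as `scipy.special.i0e` and `scipy.special.i1e` would be a more robust choice than the plain series used here.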

Noisy Speech Enhancement by Restoration of DFT Components Using Neural Network (신경회로망을 이용한 DFT 성분 복원에 의한 음성강조)

  • Choi, Jae-Seung
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.5 / pp.1078-1084 / 2010
  • This paper presents a speech enhancement system that restores the amplitude and phase components of the discrete Fourier transform (DFT) using a neural network trained with the back-propagation algorithm. The neural network is first trained on the DFT amplitude and phase components of noisy speech; the proposed system then uses it to enhance speech degraded by white noise. Experimental results demonstrate that speech degraded by white noise is enhanced by the proposed system, whose network inputs are the DFT amplitude and phase components. Spectral distortion measurements confirm that the proposed system is effective for white noise.
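
The processing chain described above (DFT, split into amplitude and phase, restoration, inverse DFT) can be sketched as follows. The `restore` callable stands in for the trained neural network, which the abstract does not specify; it defaults to an identity mapping, and all names here are hypothetical.

```python
import cmath
import math

def dft(frame):
    """Naive O(n^2) discrete Fourier transform of a real frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def enhance_frame(noisy_frame, restore=lambda amplitude, phase: (amplitude, phase)):
    """Decompose a frame into DFT amplitude and phase, restore both, resynthesize."""
    spectrum = dft(noisy_frame)
    amplitude = [abs(c) for c in spectrum]
    phase = [cmath.phase(c) for c in spectrum]
    # Placeholder: a trained network would map noisy components to restored ones here.
    amplitude, phase = restore(amplitude, phase)
    restored = [cmath.rect(a, p) for a, p in zip(amplitude, phase)]
    return idft(restored)
```

With the identity placeholder the output reproduces the input frame, which is a convenient check that the analysis and synthesis steps are consistent before a model is plugged in.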

Speech Enhancement System by Discrete Fourier Transform Using Back-propagation Algorithm (오차역전파알고리즘을 사용한 이산푸리에변환에 의한 음성강조 시스템)

  • Choi, Jae-Seung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.05a / pp.254-257 / 2010
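  • This paper proposes a speech enhancement system that uses a neural network to restore the amplitude and phase components of the discrete Fourier transform. After the neural network is trained on the DFT amplitude and phase components of noisy speech, the proposed system enhances speech degraded by background noise. Experimental results demonstrate that speech degraded by background noise is enhanced by the proposed neural-network system, and an evaluation using the spectral distortion measure confirms that the system is effective for such speech.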

Noise Suppression of Speech Signal using TDNN for each Frequency Band (주파수대역별 TDNN을 이용한 음성신호의 잡음억제)

  • Choi, Jae Seung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2009.05a / pp.341-344 / 2009
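  • This paper aims to enhance speech by removing noise from noisy speech signals using a time-delay neural network (TDNN), a neural network that incorporates temporal structure. Each frame's FFT amplitude components are first classified as voiced or unvoiced. For unvoiced sections, speech is enhanced by taking a moving average in each frame. For voiced sections, each frame's FFT amplitude components are split into low, middle, and high bands, and each band is fed to its own TDNN (low-band, mid-band, or high-band) for training, which yields the final FFT amplitude components. Experiments on the Aurora2 database confirm the effectiveness of this noise-removal algorithm, which restores the FFT amplitude components, for several kinds of noise.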
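
The per-frame routing described above might look like the sketch below. The band edges (equal thirds), the moving-average width, and the `band_models` callables standing in for the three trained TDNNs are all assumptions, since the abstract gives none of these details. A real TDNN would also see several delayed frames at once, which this single-frame sketch omits.

```python
def moving_average(values, width=3):
    """Centered moving average; edge samples average over the neighbors that exist."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def split_bands(amplitudes):
    """Split one frame's FFT amplitudes into low, mid, and high bands (assumed thirds)."""
    n = len(amplitudes)
    a, b = n // 3, 2 * n // 3
    return amplitudes[:a], amplitudes[a:b], amplitudes[b:]

def process_frame(amplitudes, voiced, band_models):
    """Unvoiced frames are smoothed; voiced frames go through one model per band."""
    if not voiced:
        return moving_average(amplitudes)
    restored = []
    for band, model in zip(split_bands(amplitudes), band_models):
        restored.extend(model(band))  # hypothetical per-band TDNN
    return restored
```

For example, `process_frame(frame, voiced=True, band_models=[low_net, mid_net, high_net])` would return the concatenated band outputs, where the three `*_net` callables are whatever trained models are available.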

Acoustic Masking Effect That Can Be Occurred by Speech Contrast Enhancement in Hearing Aids (보청기에서 음성 대비 강조에 의해 발생할 수 있는 마스킹 현상)

  • Jeon, Y.Y.;Yang, D.G.;Bang, D.H.;Kil, S.K.;Lee, S.M.
    • Journal of rehabilitation welfare engineering & assistive technology / v.1 no.1 / pp.21-28 / 2007
  • Most hearing aids use amplification algorithms to compensate for hearing loss, noise and feedback reduction algorithms, and contrast enhancement algorithms to improve speech perception. However, an acoustic masking effect occurs between formants if contrast is enhanced excessively. To confirm the masking effect in speech, the experiment comprised six tests: a pure tone test, a speech reception test, a word recognition test, a pure tone masking test, a formant pure tone masking test, and a speech masking test. LLR was introduced for objective evaluation. Comparing normal-hearing and hearing-impaired subjects, more masking occurred in hearing-impaired subjects with pure tones, and in the speech masking test speech reception was also lower in hearing-impaired subjects. This means that the acoustic masking effect, rather than distortion, influences speech perception. The characteristics of the masking effect should therefore be checked before a hearing aid is worn and applied to the fitting curve.

Segmental Corrective Training for HMM Parameter Estimation in Speech Recognition (음성인식 시스템의 HMM 파라메터 추정을 위한 분절단위 교정 학습)

  • 김회린;이황수
    • The Journal of the Acoustical Society of Korea / v.12 no.2E / pp.5-11 / 1993
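  • This paper proposes a modified corrective training method that uses segmental information to estimate HMM parameters. The modified method corrects the HMM parameters with the segmental K-means algorithm instead of the forward-backward algorithm used in conventional corrective training. The approach exploits the fact that the segmental K-means algorithm emphasizes state-level information that shares common statistical characteristics within the speech signal. In speaker-dependent phoneme and word recognition experiments, the proposed algorithm achieved a higher recognition rate than conventional corrective training with less computation. This shows that state-level information is important in HMM corrective training.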

Noise Reduction Algorithm in Speech by Wiener Filter (위너필터에 의한 음성 중의 잡음제거 알고리즘)

  • Choi, Jae-Seung
    • The Journal of the Korea institute of electronic communication sciences / v.8 no.9 / pp.1293-1298 / 2013
  • This paper proposes a noise reduction algorithm that uses a Wiener filter to remove noise components from noisy speech and thereby improve the speech signal. The algorithm first removes the white-noise spectrum from the noisy signal in each frame using a noise reshaping and reduction method. It then enhances the speech signal with a Wiener filter based on linear predictive coding analysis. The experiments use speech and noise data from a Japanese male speaker. Spectral distortion (SD) measurements confirm that the algorithm is effective for speech contaminated by white noise; the maximum improvement in output SD over the previous Wiener filter was 4.94 dB for white noise.
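
For orientation, the sketch below shows a generic per-bin spectral Wiener gain and one common way to compute a spectral distortion figure in dB. Neither is taken from the paper: the paper's filter is built on linear predictive coding analysis and its exact SD definition is not given in the abstract, so both functions and their names are illustrative assumptions.

```python
import math

def wiener_gain(noisy_power, noise_power, floor=0.0):
    """Per-bin gain H = max(1 - Pn / Py, floor) from noisy and estimated noise power."""
    return [max(1.0 - pn / py, floor) if py > 0 else floor
            for py, pn in zip(noisy_power, noise_power)]

def spectral_distortion_db(ref_amplitude, test_amplitude, eps=1e-12):
    """RMS difference between two log-amplitude spectra, in dB (assumed definition)."""
    diffs = [20.0 * math.log10((r + eps) / (t + eps))
             for r, t in zip(ref_amplitude, test_amplitude)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

Under this definition a lower SD between the clean and the enhanced spectrum means less distortion, so an improvement is the drop in SD relative to a baseline filter.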

A study on the algorithm for speech recognition (음성인식을 위한 알고리즘에 관한 연구)

  • Kim, Sun-Chul;Lee, Jung-Woo;Cho, Kyu-Ok;Park, Jae-Gyun;Oh, Yong Taek
    • Proceedings of the KIEE Conference / 2008.07a / pp.2255-2256 / 2008
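  • Two representative approaches to designing a speech recognition system are LPC (linear predictive coding), which models the characteristics of the human vocal tract, and MFCC (mel-frequency cepstral coefficients), which takes auditory characteristics into account. This paper extracts feature parameters with MFCC and plots the work performed at each stage using MATLAB. In MFCC extraction, preprocessing first converts the analog speech signal to a digital signal, minimizes the noise portion, and emphasizes the speech portion. Windowing then removes discontinuities in the speech, and the FFT transforms the signal from the time domain to the frequency domain. The transformed signal passes through a filter bank, which reduces many complex signals to a few simple ones, and the mel cepstrum finally yields the feature parameters.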
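
The extraction stages listed above (pre-emphasis, windowing, FFT, filter bank, mel cepstrum) can be sketched for a single frame as follows. This is a plain-Python illustration rather than the authors' MATLAB code; the pre-emphasis coefficient, filter count, number of coefficients, and the particular mel formula are common defaults assumed here, not values from the paper.

```python
import cmath
import math

def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter that boosts the speech band."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]

def hamming(n):
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def power_spectrum(frame):
    """One-sided power spectrum via a naive DFT (an FFT would be used in practice)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2 / n
            for k in range(n // 2 + 1)]

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale up to the Nyquist frequency."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):
            row[k] = (k - lo) / (mid - lo)   # rising slope
        for k in range(mid, hi):
            row[k] = (hi - k) / (hi - mid)   # falling slope
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, num_filters=12, num_coeffs=8):
    n = len(frame)
    windowed = [s * w for s, w in zip(pre_emphasis(frame), hamming(n))]
    power = power_spectrum(windowed)
    log_energies = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                    for row in mel_filterbank(num_filters, n, sample_rate)]
    # A DCT-II of the log filter-bank energies gives the mel cepstrum.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_energies))
            for c in range(num_coeffs)]
```

A call such as `mfcc(frame, 8000)` on a 64-sample frame returns eight coefficients; a real front end would repeat this over overlapping frames.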

Noise Suppression Algorithm using Neural Network based Amplitude and Phase Spectrum (진폭 및 위상스펙트럼이 도입된 신경회로망에 의한 잡음억제 알고리즘)

  • Choi, Jae-Seung
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.4 / pp.652-657 / 2009
  • This paper proposes an adaptive noise suppression system based on a human auditory model to enhance speech degraded by various background noises. The system classifies each frame as voiced, unvoiced, or silent, applies an adaptive auditory process, and then reduces the noise in the speech signal using a neural network that incorporates amplitude and phase components. Signal-to-noise ratio measurements confirm that the proposed system is effective for speech degraded by various noises.
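
A signal-to-noise ratio evaluation of this kind is often computed against a clean reference, as in the minimal sketch below. The abstract does not say which SNR variant the paper uses, so this whole-signal definition and the function name are assumptions.

```python
import math

def snr_db(clean, processed):
    """SNR in dB, treating (clean - processed) as the residual noise."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((s - p) ** 2 for s, p in zip(clean, processed))
    if noise_power == 0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)
```

The improvement from a suppression system would then be `snr_db(clean, enhanced) - snr_db(clean, noisy)`.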

A Study on Numeral Speech Recognition Using Integration of Speech and Visual Parameters under Noisy Environments (잡음환경에서 음성-영상 정보의 통합 처리를 사용한 숫자음 인식에 관한 연구)

  • Lee, Sang-Won;Park, In-Jung
    • Journal of the Institute of Electronics Engineers of Korea CI / v.38 no.3 / pp.61-67 / 2001
  • This paper proposes a method that applies the LP algorithm to images for speech recognition, using both speech and image information to recognize Korean numeral speech. The input speech signal is pre-emphasized with a parameter value of 0.95 and analyzed for B th LP coefficients using a Hamming window, autocorrelation, and the Levinson-Durbin algorithm. A gray image signal is likewise analyzed for 2-dimensional LP coefficients using autocorrelation and the Levinson-Durbin algorithm. These parameters are the inputs to a neural network trained with the back-propagation algorithm. The recognition experiment was carried out at each noise level, and three numeral utterances, '3', '5', and '9', were enhanced. Recognizing speech with 2-dimensional LP parameters thus yields a high recognition rate, a small parameter size, and a simple algorithm that needs no additional feature extraction.
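
The one-dimensional speech analysis described above (pre-emphasis at 0.95, Hamming window, autocorrelation, Levinson-Durbin) can be sketched as below. The LP order is left as a parameter because the abstract's value is not legible, and the function names are hypothetical; the 2-dimensional image analysis is not covered.

```python
import math

def pre_emphasize(x, alpha=0.95):
    """y[n] = x[n] - alpha * x[n-1], with alpha = 0.95 as stated in the abstract."""
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]

def hamming(n):
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def autocorrelation(x, max_lag):
    return [sum(x[n] * x[n + k] for n in range(len(x) - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations for a[1..order], where x[n] ~ sum_k a[k] * x[n-k].

    Returns (coefficients, final prediction error). Assumes r[0] > 0.
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error                      # reflection coefficient
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= (1.0 - k * k)
    return a[1:], error

def lp_coefficients(frame, order):
    """Pre-emphasize, window, and run Levinson-Durbin on one speech frame."""
    emphasized = pre_emphasize(frame)
    windowed = [s * w for s, w in zip(emphasized, hamming(len(frame)))]
    return levinson_durbin(autocorrelation(windowed, order), order)[0]
```

As a sanity check, the autocorrelation sequence `[1.0, 0.5, 0.25]` of a first-order process gives coefficients close to `[0.5, 0.0]`, since a single tap already predicts it.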
