• Title/Summary/Keyword: Noise speech data

Eigenvoice Adaptation of Classification Model for Binary Mask Estimation (Eigenvoice를 이용한 이진 마스크 분류 모델 적응 방법)

  • Kim, Gibak
    • Journal of Broadcast Engineering / v.20 no.1 / pp.164-170 / 2015
  • This paper deals with adapting the classification model used in the binary mask approach to noise suppression in noisy environments. Binary mask estimation is known to improve the intelligibility of noisy speech. However, building the classification model requires that the training data include the same type of noise as the test data. Eigenvoice adaptation is applied to a noise-independent classification model, and the adapted model is used as a noise-dependent model. Results are reported as hit rates and false alarm rates. The experiments confirmed that classification accuracy improves as the number of adaptation sentences increases.
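
The abstract does not define its two metrics, so the following is only a sketch under a common assumption: the hit rate is the fraction of target-dominated units (ideal mask = 1) that the estimated mask also labels 1, and the false alarm rate is the fraction of noise-dominated units (ideal mask = 0) that the estimated mask wrongly labels 1. The function name `hit_fa_rates` and the flat 0/1 list representation of a mask are illustrative choices, not taken from the paper.

```python
def hit_fa_rates(ideal_mask, estimated_mask):
    """Return (hit_rate, false_alarm_rate) for two equal-length 0/1 sequences.

    Assumed definitions (not stated in the abstract):
      hit rate         = correctly estimated 1s / number of ideal 1s
      false alarm rate = wrongly estimated 1s   / number of ideal 0s
    """
    if len(ideal_mask) != len(estimated_mask):
        raise ValueError("masks must have the same length")

    pairs = list(zip(ideal_mask, estimated_mask))
    hits = sum(1 for ideal, est in pairs if ideal == 1 and est == 1)
    false_alarms = sum(1 for ideal, est in pairs if ideal == 0 and est == 1)

    targets = sum(1 for ideal in ideal_mask if ideal == 1)
    non_targets = len(ideal_mask) - targets

    # Guard against masks that contain only 1s or only 0s.
    hit_rate = hits / targets if targets else 0.0
    fa_rate = false_alarms / non_targets if non_targets else 0.0
    return hit_rate, fa_rate
```

A real mask is a time-frequency matrix; flattening it to one sequence before calling the function gives the same counts.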

Korean vowel recognition in noise using auditory model

  • Shim, Jae-Seong;Lee, Jae-Hyuk;Yoon, Tae-Sung;Beack, Seung-Hwa;Park, Sang-Hui
    • Institute of Control, Robotics and Systems: Conference Proceedings / 1988.10b / pp.1037-1040 / 1988
  • In this study, we performed a recognition test on Korean vowels using a peripheral auditory model. For objective comparison, the same test was also performed using LPC cepstrum coefficients extracted from the same data. We then mixed the same speech data with measured amounts of Gaussian white noise and repeated the tests. The results verified that the auditory model adapts well to noise.
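
The abstract does not say how the noise level was controlled. One common way, shown here purely as an assumption, is to scale Gaussian white noise so that the mixture reaches a chosen signal-to-noise ratio (SNR) in dB. The function name `add_white_noise` and its parameters are hypothetical.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=0):
    """Return signal plus Gaussian white noise scaled to the requested SNR (dB).

    Sketch only: the SNR-based mixing rule is an assumption, not the
    procedure described in the paper.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]

    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = sum(n * n for n in noise) / len(noise)

    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for the
    # noise power that yields the requested SNR.
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / noise_power)

    return [x + scale * n for x, n in zip(signal, noise)]
```

Scaling by the measured power of the generated noise, rather than its nominal variance, makes the realized SNR match the target for that particular noise sample.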

Variable Quad Rate ADPCM for Efficient Speech Transmission and Real Time Implementation on DSP (효율적인 음성신호의 전송을 위한 4배속 가변 변환율 ADPCM기법 및 DSP를 이용한 실시간 구현)

  • 한경호
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.18 no.1 / pp.129-136 / 2004
  • In this paper, we propose a quad variable-rate ADPCM coding method for efficient speech transmission and implement it in real time on a TMS320C6711 DSP. A modified ADPCM with four variable coding rates, 16, 24, 32 and 40 kbps, is applied to windowed speech samples to transmit good-quality speech with a small number of data bits, and real-time encoding and decoding are implemented on the DSP. The zero-crossing rate (ZCR) is used to identify the influence of noise on the speech signal and to decide the rate-change threshold. For noise-dominant signals, low coding rates are applied to minimize the data bits; for signals with little noise, high coding rates are applied to enhance speech quality. Because silent periods take up more than half of the signal in most speech telecommunications, speech quality close to that of 40 kbps can be obtained at comparably low data rates, as shown by simulation and experiments. The TMS320C6711 DSK board has 128K of flash memory and a performance of 1333 MIPS, which meets the requirements for real-time implementation of the proposed coding algorithm.
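
The rate-selection idea can be sketched as follows: compute the ZCR of each window and map a higher ZCR (more noise-like) to a lower coding rate. The four rates come from the abstract; the threshold values, the linear mapping, and the function names are hypothetical, since the abstract does not give them.

```python
# Coding rates from the abstract, ordered from highest quality to lowest.
RATES_KBPS = (40, 32, 24, 16)

# Hypothetical ZCR thresholds; the paper's actual values are not given.
ZCR_THRESHOLDS = (0.1, 0.2, 0.3)


def zero_crossing_rate(window):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(window) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(window, window[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(window) - 1)


def select_rate_kbps(window):
    """Pick a coding rate: low ZCR gets a high rate, high ZCR a low rate."""
    zcr = zero_crossing_rate(window)
    for threshold, rate in zip(ZCR_THRESHOLDS, RATES_KBPS):
        if zcr < threshold:
            return rate
    return RATES_KBPS[-1]  # most noise-like windows get the lowest rate
```

In a real coder the thresholds would be tuned on speech data, and the selected rate would have to be signalled to the decoder alongside each window.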

A study on Voice Recognition using Model Adaptation HMM for Mobile Environment (모델적응 HMM을 이용한 모바일환경에서의 음성인식에 관한 연구)

  • Ahn, Jong-Young;Kim, Sang-Bum;Kim, Su-Hoon;Hur, Kang-In
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.11 no.3 / pp.175-179 / 2011
  • In this paper, we propose the MA (Model Adaptation) HMM, which uses speech enhancement and feature compensation. Normally, reference voice data do not take real noise into account. Rather than using estimated noise, this method uses noise data from real-life environments. We applied the contaminated data to build a recognition reference model suited to noisy environments. The MAHMM combines surrounding noise into the reference pattern when it is generated. Using the MAHMM, we improved the voice recognition rate in mobile environments.

Measuring Correlation between Mental Fatigues and Speech Features (정신피로와 음성특징과의 상관관계 측정)

  • Kim, Jungin;Kwon, Chulhong
    • Phonetics and Speech Sciences / v.6 no.2 / pp.3-8 / 2014
  • This paper examines how mental fatigue affects the human voice. To this end, we designed a monotonous task to increase the feeling of fatigue and a subjective questionnaire for rating fatigue. The questionnaire responses showed that the designed task was indeed monotonous. To investigate the statistical relationship between fatigue and speech features extracted from the collected speech data, a paired-samples t-test was used. The statistical analysis shows that the speech parameters most strongly related to fatigue are the first formant bandwidth, jitter, H1-H2, cepstral peak prominence, and harmonics-to-noise ratio. The experimental results indicate that the voice becomes breathier as mental fatigue progresses.

On a Multiband Nonuniform Sampling Technique with a Gaussian Noise Codebook for Speech Coding (가우시안 코드북을 갖는 다중대역 비균일 음성 표본화법)

  • Chung, Hyung-Goue;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea / v.16 no.6 / pp.110-114 / 1997
  • When nonuniform sampling is applied to a noisy speech signal, the required data rate increases to a level comparable to or higher than that of uniform sampling such as PCM. To solve this problem, we previously proposed a waveform coding method, multiband nonuniform waveform coding (MNWC), which applies nonuniform sampling to a band-separated speech signal [7]. However, its speech quality is worse than that of uniform sampling, because the high band is simply modeled as Gaussian noise at an average level. In this paper, to overcome this drawback, the high band is modeled as one of 16 codewords with different center frequencies. While maintaining high speech quality, with an average MOS of 3.16, the proposed method achieves a compression ratio 1.5 times higher than that of the conventional nonuniform sampling method (CNSM).

Development of English Speech Recognizer for Pronunciation Evaluation (발성 평가를 위한 영어 음성인식기의 개발)

  • Park Jeon Gue;Lee June-Jo;Kim Young-Chang;Hur Yongsoo;Rhee Seok-Chae;Lee Jong-Hyun
    • Proceedings of the KSPS conference / 2003.10a / pp.37-40 / 2003
  • This paper presents preliminary results of automatic pronunciation scoring for non-native English speakers and describes the development of an English speech recognizer for educational and evaluation purposes. The proposed speech recognizer, featuring two refined acoustic model sets, implements noise-robust data compensation, phonetic alignment, highly reliable rejection, keyword and phrase detection, an easy-to-use language modeling toolkit, and more. The recognizer achieves an average correlation of 0.725 between human raters and machine scores, using the YOUTH speech database for training and K-SEC for testing.

A Study on the Utilization of Computer Program for the Prediction of Room Acoustics (실내음향 예측을 위한 컴퓨터 프로그램 이용에 관한 연구)

  • 김선우;최형욱;한명호
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 1997.10a / pp.250-255 / 1997
  • Computer simulation and mock-up tests have recently been applied to practical room acoustic design to predict and evaluate acoustic characteristics. In this paper, sound field properties predicted and evaluated by computer simulation are compared with measured data. Simulated and measured data were compared and analyzed for Reverberation Time, Sound Pressure Level at various measuring positions and frequencies, Definition, Early Decay Time, and Speech Transmission Index.

Noise Cancellation System Based on Frequency Domain Adaptive Filter Using Modified DFT Pair

  • Nakanishi, Isao;Nakamura, Youichi;Itoh, Yoshio;Fukui, Yutaka
    • Proceedings of the IEEK Conference / 2000.07a / pp.225-228 / 2000
  • It is well known that a Frequency Domain Adaptive Filter (FDAF) converges faster than a Time Domain Adaptive Filter (TDAF), even when the input signal is colored, as a speech signal is. We have previously proposed an FDAF using the Modified Discrete Fourier Transform Pair (MDFTP), and its realization and effectiveness have been confirmed through computer simulations. In this paper, we apply the FDAF using the MDFTP to a noise cancellation system. The proposed system is based on the Adaptive Line Enhancer (ALE) and uses a single microphone; it is therefore suitable for portable electronic equipment. Moreover, we propose using the MDFT to detect the pitch of the speech, because the number of data points in the MDFT must be equal to the pitch. It is confirmed that the noise can be removed to near the level of SNR.

A study on the competitive learning algorithm for robust vector quantization to transmit speech signal (벡터 양자화를 위한 학습 알고리즘을 이용한 음성 전송 기술에 관한 연구)

  • Hong, Kang-You;Park, Sang-Hui
    • Proceedings of the KIEE Conference / 1999.07g / pp.3150-3152 / 1999
  • The efficient representation and encoding of signals with limited resources, e.g., finite storage capacity and restricted transmission bandwidth, is a fundamental problem in technical information processing systems. Under realistic circumstances, the encoding and communication of messages typically has to deal with different sources of noise and disturbance. In this paper, I propose a unifying approach to data compression by robust vector quantization, which explicitly deals with channel noise and random elimination of prototypes. The resulting algorithm is able to limit the detrimental effect of noise in a very general communication scenario. I also report a speech coding experiment based on robust vector quantization.
