• Title/Summary/Keyword: spectral envelope

Speech Recognition Using Noise Processing in Spectral Dimension (스펙트럴 차원의 잡음처리를 이용한 음성인식)

  • Lee, Gwang-seok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2009.10a / pp.738-741 / 2009
  • This research aims to improve speech recognition for noisy speech. Spectral subtraction and the recovery of valleys in the spectral envelope obtained from noisy speech are known to be effective for improving recognition. In this research, the averaged spectral envelope obtained from vowel spectra is used to emphasize the valleys. The vocalic spectral information in the lower frequency range is emphasized, and the spectrum obtained from consonants is left unchanged. In simulation, the emphasis coefficients are varied in the cepstral domain. The method is applied to the recognition of noisy digits, and recognition is improved.
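
The spectral subtraction step mentioned in this abstract can be sketched as follows. This is a minimal illustration of generic magnitude-domain spectral subtraction, not the paper's method: the function name, the over-subtraction factor `alpha`, and the spectral `floor` are assumptions, and the valley-emphasis step in the cepstral domain is not shown.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Subtract a noise magnitude estimate from a noisy magnitude spectrum.

    noisy_mag, noise_mag: per-bin magnitudes of equal length.
    alpha: over-subtraction factor (assumed parameter).
    floor: fraction of the noisy magnitude kept as a lower bound, so that
           no bin becomes negative (assumed parameter).
    """
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        diff = y - alpha * n
        cleaned.append(max(diff, floor * y))
    return cleaned


# Illustrative values only: four bins with a flat noise estimate.
noisy = [1.0, 0.5, 0.2, 0.8]
noise = [0.3, 0.3, 0.3, 0.3]
cleaned = spectral_subtract(noisy, noise)
print(cleaned)
```

The floor keeps bins where the noise estimate exceeds the observed magnitude from going negative; the third bin in the example is such a case.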

Enhanced Spectral Envelope Coding Scheme Using Inter-frame Correlation for G.729.1 (G.729.1 코더에서 프레임 간의 상호상관 관계를 이용한 개선된 스펙트럼 포락 코딩 방법)

  • Cho, Keun-Seok;Sung, Jong-Mo;Hahn, Min-Soo;Kim, Young-Il;Jeong, Sang-Bae
    • Phonetics and Speech Sciences / v.1 no.4 / pp.97-103 / 2009
  • This paper describes a new algorithm for encoding the spectral envelope in the time domain alias cancellation (TDAC) part of G.729.1. The TDAC part encodes the spectral envelope and the modified discrete cosine transform (MDCT) coefficients of the weighted code-excited linear predictive (CELP) coding error in the lower band and of the higher-band input signal. To reduce the bits allocated to spectral envelope coding, a new algorithm using sub-band correlation between adjacent frames is proposed. In addition, to improve the quality of decoded signals, two bit allocation strategies that reuse the bits saved by the proposed algorithm are proposed. The performance of the proposed algorithm is evaluated in terms of objective quality and bit reduction rates. Experimental results show that the proposed algorithm significantly improves sound quality.

Binary Power Amplifier with 2-Bit Sigma-Delta Modulation Method for EER Transmitter

  • Lim, Ji-Youn;Cheon, Sang-Hoon;Kim, Kyeong-Hak;Hong, Song-Cheol;Kim, Dong-Wook
    • ETRI Journal / v.30 no.3 / pp.377-382 / 2008
  • A novel power amplifier for a polar transmitter is proposed to achieve better spectral performance for a wideband envelope signal. In the proposed scheme, 2-bit sigma-delta ($\Sigma\Delta$) modulation of the envelope signal is introduced, and the power amplifier configuration is modified in a binary form to accommodate the 2-bit digitized envelope signals. The 2-bit $\Sigma\Delta$ modulator lowers the noise of the envelope signal by fine quantization and thus enhances the spectral property of the RF signal. The Ptolemy simulation results of the proposed structure show that the spectral noise is reduced by 10 dB in a full transmit band of the EDGE system. The dynamic range is also enhanced. Since the performance is improved without increasing the over-sampling ratio, this technique is best suited for wireless communication with high data rates.
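
As a rough illustration of the 2-bit sigma-delta idea in this abstract, the sketch below runs a first-order modulator with a four-level quantizer over an envelope normalized to [0, 1]. The modulator order, the uniform level spacing, and the `to_binary_cells` mapping onto two binary-weighted amplifier cells are assumptions made for illustration; the abstract does not specify them.

```python
import math

LEVELS = 4  # a 2-bit quantizer has four output levels


def quantize_2bit(v):
    """Return (code, level): the nearest of four uniform levels in [0, 1]."""
    code = int(math.floor(v * (LEVELS - 1) + 0.5))
    code = min(LEVELS - 1, max(0, code))
    return code, code / (LEVELS - 1)


def sigma_delta_2bit(envelope):
    """First-order sigma-delta modulation of an envelope normalized to [0, 1]."""
    integrator = 0.0
    previous_level = 0.0
    codes = []
    for x in envelope:
        # Accumulate the difference between the input and the fed-back output.
        integrator += x - previous_level
        code, previous_level = quantize_2bit(integrator)
        codes.append(code)
    return codes


def to_binary_cells(code):
    """Split a 2-bit code into (msb, lsb) on/off flags (hypothetical mapping
    onto two binary-weighted amplifier cells)."""
    return (code >> 1) & 1, code & 1


# A constant envelope of 0.5 is not one of the four levels, so the modulator
# alternates between neighbouring levels whose average tracks the input.
envelope = [0.5] * 300
codes = sigma_delta_2bit(envelope)
mean_level = sum(c / (LEVELS - 1) for c in codes) / len(codes)
print(round(mean_level, 3))
```

Because the quantization error is fed back through the integrator, the time average of the four-level output follows the input even though each individual sample is coarse.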

A Study on Spectral Envelope Modification using Triangular Filter (삼각필터를 이용한 Spectral 포락변경에 관한 연구)

  • 최성은;김동현;홍광석
    • Proceedings of the IEEK Conference / 2003.07e / pp.2415-2418 / 2003
  • In this paper, we present a new filter for adjusting formant information. In speech analysis, the spectral envelope carries information about the characteristics of speech, and formant information determines speech timbre. Therefore, if the formant position is adjusted, the resulting change in speech timbre can be verified. The presented filter, which is composed of triangular filters, performs this formant adjustment. Using this filter, we could place the formant frequency at a target position.

Modified Generic Mode Coding Scheme for Enhanced Sound Quality of G.718 SWB (G.718 초광대역 코덱의 음질 향상을 위한 개선된 Generic Mode Coding 방법)

  • Cho, Keun-Seok;Jeong, Sang-Bae
    • Phonetics and Speech Sciences / v.4 no.3 / pp.119-125 / 2012
  • This paper describes a new algorithm for encoding spectral shape and envelope in the generic mode of G.718 super-wide band (SWB). In the G.718 SWB coder, generic mode coding and sinusoidal enhancement are used for the quantization of modified discrete cosine transform (MDCT)-based parameters in the high frequency band. In the generic mode, the high frequency band is divided into sub-bands, and for every sub-band the best match under the selected similarity criteria is searched for in the coded and envelope-normalized wideband content. To improve the quantization scheme in the high frequency region of speech/audio signals, a modified generic mode that improves on the generic mode in G.718 SWB is proposed. The proposed generic mode uses perceptual vector quantization of spectral envelopes and an increased resolution for spectral copy. The performance of the proposed algorithm is evaluated in terms of objective quality. Experimental results show that the proposed algorithm significantly improves sound quality.

An Improved Detection Technique for Spread Spectrum Audio Watermarking with a Spectral Envelope Filter

  • Jung, Sa-Rah;Seok, Jong-Won;Hong, Jin-Woo
    • ETRI Journal / v.25 no.1 / pp.52-54 / 2003
  • We propose an improved algorithm for detecting audio watermarks based on a spread spectrum in the spectral domain. Since the energy of a watermark is much smaller than that of the cover audio data, pre-processing to reduce the effect of the cover data is needed to reliably extract watermarks. We introduce a spectral envelope filter as a pre-process that enhances detection performance by filtering out the intrinsic spectral character of the cover data. The proposed watermarking structure can be easily included in the compression system and can extract watermarks from partially decompressed spectral data. Our experimental results demonstrate that with a bit error rate of around 10 dB against general attacks, the proposed detection scheme works better than detectors without the spectral filter.

Combinatorial continuous non-stationary critical excitation in M.D.O.F structures using multi-peak envelope functions

  • Ghasemi, S. Hooman;Ashtari, P.
    • Earthquakes and Structures / v.7 no.6 / pp.895-908 / 2014
  • The main objective of critical excitation methods is to reveal the worst possible response of structures. This goal is accomplished by considering the uncertainties of ground motion, subject to appropriate constraints such as earthquake power and intensity limits. Like the majority of conventional critical excitation methods, the present study concentrates on the theoretical optimization aspect. However, previous studies on critical excitation lead to a discontinuous power spectral density (PSD). This paper introduces critical excitations that are properly continuous in the frequency domain. The main idea for generating such continuous excitations stems from the combination of two continuous functions. In addition, to provide a non-stationary model, this paper presents an envelope function which, unlike previous envelope functions, can properly cover the accelerograms of natural earthquakes based on multi-peak conditions. Finally, the proposed method is extended to multiple-degree-of-freedom (M.D.O.F) structures.

Artificial Bandwidth Extension Based on Harmonic Structure Extension and NMF (하모닉 구조 확장과 NMF 기반의 인공 대역 확장 기술)

  • Kim, Kijun;Park, Hochong
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.12 / pp.197-204 / 2013
  • In this paper, we propose a new method for artificial bandwidth extension of narrow-band signals in the frequency domain. In the proposed method, a narrow-band signal is decomposed into an excitation signal and a spectral envelope, which are extended independently in the frequency domain. The excitation signal is extended such that the low-band harmonic structure is maintained in the high band, and the spectral envelope is extended based on sub-band energy using NMF. Finally, the spectral phase is determined based on signal correlation between frames in the time domain, resulting in the final wide-band signal. The subjective evaluation verified that the wide-band signal generated by the proposed method has a higher quality than the original narrow-band signal.

Formant Synthesis of Haegeum Sounds Using Cepstral Envelope (캡스트럼 포락선을 이용한 해금 소리의 포만트 합성)

  • Hong, Yeon-Woo;Cho, Sang-Jin;Kim, Jong-Myon;Chong, Ui-Pil
    • The Journal of the Acoustical Society of Korea / v.28 no.6 / pp.526-533 / 2009
  • This paper proposes a formant synthesis method for Haegeum sounds using a cepstral envelope for spectral modeling. Spectral modeling synthesis (SMS) is a technique that models time-varying spectra as a combination of sinusoids (the "deterministic" part) and a time-varying filtered noise component (the "stochastic" part). SMS is appropriate for synthesizing sounds of string and wind instruments whose harmonics are evenly distributed over the whole frequency band. Formants extracted from the cepstral envelope are parameterized for the synthesis of sinusoids. A resonator designed by the Impulse Invariant Transform (IIT) is applied to synthesize the sinusoids, and the results are bandpass filtered to adjust magnitude. The noise is calculated by first generating the sinusoids with formant synthesis, subtracting them from the original sound, and then removing the remaining harmonics. Linear interpolation is used to model the noise. The synthesized sounds are made by summing sinusoids, and are shown to be similar to the original Haegeum sounds.
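
A formant resonator of the kind this abstract mentions can be sketched as a two-pole digital filter. The sketch assumes the usual impulse-invariant pole mapping z = exp(sT), which places the poles at radius exp(-pi*B/fs) and angle 2*pi*f/fs; the numerator is simplified to unit gain, and the centre frequency, bandwidth, and sampling rate are illustrative values, not taken from the paper.

```python
import cmath
import math


def resonator_coeffs(center_hz, bandwidth_hz, fs_hz):
    """Feedback coefficients (a1, a2) of y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)   # pole radius
    theta = 2.0 * math.pi * center_hz / fs_hz       # pole angle
    return 2.0 * r * math.cos(theta), -r * r


def resonate(x, a1, a2):
    """Run the two-pole resonator over the input samples."""
    y1 = y2 = 0.0
    out = []
    for sample in x:
        y = sample + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def gain_at(freq_hz, a1, a2, fs_hz):
    """Magnitude of the resonator's frequency response at freq_hz."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / fs_hz)
    return abs(1.0 / (1.0 - a1 * z_inv - a2 * z_inv * z_inv))


# Illustrative formant: 1000 Hz centre, 100 Hz bandwidth, 16 kHz sampling.
a1, a2 = resonator_coeffs(1000.0, 100.0, 16000.0)
impulse_response = resonate([1.0] + [0.0] * 63, a1, a2)
print(gain_at(1000.0, a1, a2, 16000.0) > gain_at(2000.0, a1, a2, 16000.0))
```

The gain peaks near the centre frequency and falls off on either side, which is why a separate magnitude adjustment (the bandpass stage in the abstract) is typically needed after the resonator.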

Voice transformation for HTS using correlation between fundamental frequency and vocal tract length (기본주파수와 성도길이의 상관관계를 이용한 HTS 음성합성기에서의 목소리 변환)

  • Yoo, Hyogeun;Kim, Younggwan;Suh, Youngjoo;Kim, Hoirin
    • Phonetics and Speech Sciences / v.9 no.1 / pp.41-47 / 2017
  • The main advantage of statistical parametric speech synthesis is its flexibility in changing voice characteristics. A personalized text-to-speech (TTS) system can be implemented by combining a speech synthesis system and a voice transformation system, and it is widely used in many application areas. It is known that the fundamental frequency and the spectral envelope of a speech signal can be modified independently to convert the voice characteristics. It is also important to maintain the naturalness of the transformed speech. In this paper, a speech synthesis system based on the Hidden Markov Model (HMM-based speech synthesis, HTS) using the STRAIGHT vocoder is constructed, and voice transformation is conducted by modifying the fundamental frequency and spectral envelope. The fundamental frequency is transformed by a scaling method, and the spectral envelope is transformed through a frequency warping method to control the speaker's vocal tract length. In particular, this study proposes a voice transformation method using the correlation between fundamental frequency and vocal tract length. Subjective evaluations were conducted to assess preference and mean opinion scores (MOS) for the naturalness of synthetic speech. Experimental results showed that the proposed voice transformation method achieved higher preference than baseline systems while maintaining the naturalness of the speech quality.
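
The two transformations named in this abstract, scaling the fundamental frequency and warping the frequency axis of the spectral envelope, can be sketched as below. The linear warping function, the factor `alpha`, and the function names are assumptions for illustration; the paper's actual warping function and its F0-to-vocal-tract-length correlation model are not reproduced here.

```python
def scale_f0(f0_hz, ratio):
    """Scale a fundamental frequency contour by a constant ratio."""
    return [f * ratio for f in f0_hz]


def warp_envelope(envelope, alpha):
    """Linearly warp the frequency axis of a sampled spectral envelope.

    Output bin k takes the value of the input envelope at position k / alpha,
    using linear interpolation. alpha > 1 moves envelope peaks to higher
    frequencies (a shorter vocal tract); alpha < 1 moves them lower.
    """
    n = len(envelope)
    warped = []
    for k in range(n):
        src = k / alpha
        lo = int(src)
        if lo >= n - 1:
            # Past the last input bin: hold the final value.
            warped.append(envelope[-1])
            continue
        frac = src - lo
        warped.append((1.0 - frac) * envelope[lo] + frac * envelope[lo + 1])
    return warped


# Illustrative envelope with a single peak at bin 2.
envelope = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
shifted = warp_envelope(envelope, 2.0)
print(shifted.index(max(shifted)))
print(scale_f0([100.0, 120.0], 1.5))
```

Keeping the two operations as separate functions reflects the abstract's point that F0 and the spectral envelope can be modified independently; a method that exploits their correlation would derive `alpha` from the F0 ratio rather than set it freely.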