Integrated Search | Korea Science

스펙트럴 차원의 잡음처리를 이용한 음성인식 (Speech Recognition Using Noise Processing in Spectral Dimension)

이광석
- 한국정보통신학회:학술대회논문집
- 한국해양정보통신학회 2009년도 추계학술대회
- pp.738-741
- 2009
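This study concerns improving speech recognition in noisy speech environments. We found that, for speech recognition, spectral subtraction and restoration of the valleys in the spectral envelope obtained from noisy speech are more effective. In this study, an averaged spectral envelope extracted from vowel spectra is used to emphasize the valleys. Vowel spectral information in the low-frequency region is emphasized, while the spectrum obtained from consonants is left unchanged. In the simulations, the emphasis coefficients are varied in the cepstral domain. The method was applied to the recognition of noisy spoken digits, and the recognition results improved.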
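As a rough illustration of the spectral subtraction step mentioned in the abstract, the following Python sketch subtracts a noise magnitude estimate from a noisy magnitude spectrum and applies a spectral floor. It is a generic textbook form, not the author's valley-emphasis method; the function name `spectral_subtract` and the parameters `alpha` and `floor` are assumptions made for this example.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.05):
    """Magnitude-domain spectral subtraction with a spectral floor.

    noisy_mag: magnitude spectrum of one noisy frame (list of floats)
    noise_mag: estimated noise magnitude spectrum (same length)
    alpha:     over-subtraction factor (hypothetical default)
    floor:     fraction of the noisy magnitude kept as a lower bound
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        # Subtract the scaled noise estimate, but never go below the floor.
        cleaned.append(max(y - alpha * n, floor * y))
    return cleaned


# Toy frame: two strong peaks on top of a flat noise level of 1.0.
noisy = [1.2, 5.0, 1.1, 0.9, 4.0, 1.0]
noise = [1.0] * 6
enhanced = spectral_subtract(noisy, noise)
```

The floor keeps bins where the noise estimate exceeds the observed magnitude from going negative, which would otherwise leave isolated spurious peaks in the enhanced spectrum.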

G.729.1 코더에서 프레임 간의 상호상관 관계를 이용한 개선된 스펙트럼 포락 코딩 방법 (Enhanced Spectral Envelope Coding Scheme Using Inter-frame Correlation for G.729.1)

조근석;성종모;한민수;김영일;정상배
- 말소리와 음성과학
- Vol. 1, No. 4
- pp.97-103
- 2009
This paper describes a new algorithm for encoding spectral envelope in the time domain alias cancellation (TDAC) part of G.729.1. The spectral envelope and modified discrete cosine transform (MDCT) coefficients of the weighted code-excited linear predictive (CELP) coding error in lower-band and the higher-band input signal are encoded in the TDAC part. In order to reduce allocation bits for spectral envelope coding, a new algorithm using sub-band correlation between adjacent frames is proposed. In addition, to improve the quality of decoded signals, two bit allocation strategies using reduced bits from the proposed algorithm are proposed. The performance of the proposed algorithm is evaluated in terms of objective quality and bit reduction rates. Experimental results show that the proposed algorithm increases the quality of sounds significantly.

Binary Power Amplifier with 2-Bit Sigma-Delta Modulation Method for EER Transmitter

Lim, Ji-Youn;Cheon, Sang-Hoon;Kim, Kyeong-Hak;Hong, Song-Cheol;Kim, Dong-Wook
- ETRI Journal
- Vol. 30, No. 3
- pp.377-382
- 2008
A novel power amplifier for a polar transmitter is proposed to achieve better spectral performance for a wideband envelope signal. In the proposed scheme, 2-bit sigma-delta ($\Sigma\Delta$) modulation of the envelope signal is introduced, and the power amplifier configuration is modified in a binary form to accommodate the 2-bit digitized envelope signals. The 2-bit $\Sigma\Delta$ modulator lowers the noise of the envelope signal by fine quantization and thus enhances the spectral property of the RF signal. The Ptolemy simulation results of the proposed structure show that the spectral noise is reduced by 10 dB in a full transmit band of the EDGE system. The dynamic range is also enhanced. Since the performance is improved without increasing the over-sampling ratio, this technique is best suited for wireless communication with high data rates.
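To make the idea of 2-bit sigma-delta modulation of an envelope more concrete, here is a minimal Python sketch of a first-order error-feedback modulator with a four-level quantizer. The modulator order, the normalization of the envelope to [0, 1], and the helper `pa_cells` (which maps each code to two binary-weighted amplifier cells) are assumptions for illustration; the abstract does not specify the authors' circuit at this level of detail.

```python
import math


def sigma_delta_2bit(envelope):
    """First-order error-feedback sigma-delta modulator with a 2-bit quantizer.

    envelope: samples normalized to [0, 1]
    returns:  list of codes 0..3; code / 3 is the quantized envelope level
    """
    codes = []
    err = 0.0  # quantization error of the previous sample
    for x in envelope:
        u = x - err                          # feed the previous error back
        code = min(3, max(0, round(u * 3)))  # nearest of the 4 levels
        err = code / 3 - u                   # error to be shaped next step
        codes.append(code)
    return codes


def pa_cells(code):
    """Map a 2-bit code to (msb_cell_on, lsb_cell_on) for a hypothetical
    binary-weighted amplifier with cells of relative size 2 and 1."""
    return bool(code & 2), bool(code & 1)


# Slowly varying test envelope, oversampled relative to its bandwidth.
env = [0.5 + 0.4 * math.sin(2 * math.pi * n / 64) for n in range(256)]
codes = sigma_delta_2bit(env)
```

Because each output equals the input plus the difference of two consecutive quantization errors, the quantization noise is pushed toward high frequencies while the running sum of the output tracks the running sum of the envelope.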

삼각필터를 이용한 Spectral 포락변경에 관한 연구 (A Study on Spectral Envelope Modification using Triangular Filter)

최성은;김동현;홍광석
- 대한전자공학회:학술대회논문집
- 대한전자공학회 2003년도 하계종합학술대회 논문집 Ⅳ
- pp.2415-2418
- 2003
In this paper, we present a new filter for adjusting formant information. The spectral envelope in speech analysis conveys the characteristics of speech, and formant information determines speech timbre. Thus, adjusting the formant positions changes the timbre of the speech. The proposed filter, which is composed of triangular filters, performs this adjustment. Using this filter, we could place the formant frequency at a target position.

G.718 초광대역 코덱의 음질 향상을 위한 개선된 Generic Mode Coding 방법 (Modified Generic Mode Coding Scheme for Enhanced Sound Quality of G.718 SWB)

조근석;정상배
- 말소리와 음성과학
- Vol. 4, No. 3
- pp.119-125
- 2012
This paper describes a new algorithm for encoding spectral shape and envelope in the generic mode of G.718 super-wide band (SWB). In the G.718 SWB coder, generic mode coding and sinusoidal enhancement are used for the quantization of modified discrete cosine transform (MDCT)-based parameters in the high frequency band. In the generic mode, the high frequency band is divided into sub-bands, and for every sub-band the most similar match according to the selected similarity criteria is searched from the coded and envelope-normalized wideband content. In order to improve the quantization scheme in the high frequency region of speech/audio signals, a modified generic mode that improves on the generic mode of G.718 SWB is proposed. In the proposed generic mode, perceptual vector quantization of spectral envelopes and increased resolution for spectral copy are used. The performance of the proposed algorithm is evaluated in terms of objective quality. Experimental results show that the proposed algorithm increases the quality of sounds significantly.
https://doi.org/10.13064/KSSS.2012.4.3.119

An Improved Detection Technique for Spread Spectrum Audio Watermarking with a Spectral Envelope Filter

Jung, Sa-Rah;Seok, Jong-Won;Hong, Jin-Woo
- ETRI Journal
- Vol. 25, No. 1
- pp.52-54
- 2003
We propose an improved algorithm for detecting audio watermarks based on a spread spectrum in the spectral domain. Since the energy of a watermark is much smaller than that of the cover audio data, pre-processing to reduce the effect of the cover data is needed to reliably extract watermarks. We introduce a spectral envelope filter as a pre-process that enhances detecting performance by filtering out the intrinsic spectral character of cover data. The proposed watermarking structure can be easily included in the compression system and can extract watermarks from partially decompressed spectral data. Our experimental results demonstrate that with a bit error rate of around 10 dB against general attacks, the proposed detecting scheme works better than detectors without the spectral filter.

Combinatorial continuous non-stationary critical excitation in M.D.O.F structures using multi-peak envelope functions

Ghasemi, S. Hooman;Ashtari, P.
- Earthquakes and Structures
- Vol. 7, No. 6
- pp.895-908
- 2014
The main objective of critical excitation methods is to reveal the worst possible response of structures. This goal is accomplished by considering the uncertainties of ground motion, subject to appropriate constraints such as earthquake power and intensity limit. Like the majority of conventional critical excitation methods, this study focuses on the theoretical optimization aspect. However, previous studies on critical excitation lead to a discontinuous power spectral density (PSD). This paper introduces critical excitations that are properly continuous in the frequency domain. The main idea for generating such continuous excitations stems from the combination of two continuous functions. In addition, in order to provide a non-stationary model, this paper presents an envelope function which, unlike previous envelope functions, can properly cover the accelerograms of natural earthquakes based on multi-peak conditions. Finally, the proposed method is extended to multiple-degree-of-freedom (M.D.O.F) structures.
https://doi.org/10.12989/eas.2014.7.6.895

하모닉 구조 확장과 NMF 기반의 인공 대역 확장 기술 (Artificial Bandwidth Extension Based on Harmonic Structure Extension and NMF)

김기준;박호종
- 전자공학회논문지
- Vol. 50, No. 12
- pp.197-204
- 2013
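This paper proposes a new artificial bandwidth extension technique that extends a narrowband signal to wideband in the frequency domain. The proposed technique separates the narrowband signal into an excitation signal and a spectral envelope component and extends each independently in the frequency domain. The excitation signal is extended so that the harmonic structure of the low band is maintained in the high band, and the spectral envelope is extended with an NMF method based on sub-band energies. Finally, the spectral phase is determined from the correlation between frames on the time axis to generate the final wideband signal. Subjective listening tests confirmed that the bandwidth-extended signal produced by the proposed method has better sound quality than the original narrowband signal.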
https://doi.org/10.5573/ieek.2013.50.12.197

캡스트럼 포락선을 이용한 해금 소리의 포만트 합성 (Formant Synthesis of Haegeum Sounds Using Cepstral Envelope)

홍연우;조상진;김종면;정의필
- 한국음향학회지
- Vol. 28, No. 6
- pp.526-533
- 2009
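This paper proposes a formant synthesis method using the cepstral envelope for spectral modeling of the haegeum, a traditional string instrument. Spectral modeling is a technique that synthesizes sound by analyzing the input signal as the sum of a sinusoidal component and a noise component, and it is effective for synthesizing the sounds of periodic string and wind instruments. The formants of the cepstral envelope were used as parameters for synthesizing the sinusoidal component. To synthesize the sinusoidal component, unlike conventional additive synthesis, resonators were designed with the IIT (Impulse Invariant Transform), and a band-pass filter was added to compensate for the magnitudes between harmonics. The noise component is obtained as the difference between the original sound and the synthesized sinusoidal component; removing the few significant harmonics remaining in it yields a complete noise component, whose frequency characteristics were parameterized based on linear interpolation. Finally, the synthesized noise and sinusoidal components were added to synthesize single haegeum notes, and the synthesized sound was very similar to the original.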
https://doi.org/10.7776/ASK.2009.28.6.526

기본주파수와 성도길이의 상관관계를 이용한 HTS 음성합성기에서의 목소리 변환 (Voice transformation for HTS using correlation between fundamental frequency and vocal tract length)

유효근;김영관;서영주;김회린
- 말소리와 음성과학
- Vol. 9, No. 1
- pp.41-47
- 2017
The main advantage of statistical parametric speech synthesis is its flexibility in changing voice characteristics. A personalized text-to-speech (TTS) system can be implemented by combining a speech synthesis system and a voice transformation system, and it is widely used in many application areas. It is known that the fundamental frequency and the spectral envelope of a speech signal can be independently modified to convert the voice characteristics. It is also important to maintain the naturalness of the transformed speech. In this paper, a speech synthesis system based on the Hidden Markov Model (HMM-based speech synthesis, HTS) using the STRAIGHT vocoder is constructed, and voice transformation is conducted by modifying the fundamental frequency and spectral envelope. The fundamental frequency is transformed by a scaling method, and the spectral envelope is transformed through a frequency warping method to control the speaker's vocal tract length. In particular, this study proposes a voice transformation method using the correlation between fundamental frequency and vocal tract length. Subjective evaluations were conducted to assess preference and mean opinion scores (MOS) for naturalness of synthetic speech. Experimental results showed that the proposed voice transformation method achieved higher preference than baseline systems while maintaining the naturalness of the speech quality.
https://doi.org/10.13064/KSSS.2017.9.1.041
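The two modifications named in the abstract above, F0 scaling and frequency warping of the spectral envelope, can be sketched in Python as follows. The bilinear (all-pass) warping function is one common choice and is assumed here; the abstract does not state which warping function the authors use, and the sketch does not model their proposed correlation between F0 and vocal tract length. The names `warp_frequency` and `scale_f0` and the values 1.2 and 0.1 are hypothetical.

```python
import math


def warp_frequency(omega, alpha):
    """Bilinear (all-pass) frequency warping on normalized frequency.

    omega: angular frequency in [0, pi]
    alpha: warping factor in (-1, 1); 0 leaves the axis unchanged
    """
    if not -1.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between -1 and 1")
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


def scale_f0(f0_contour, ratio):
    """Scale voiced F0 values by a constant ratio; 0.0 marks unvoiced frames."""
    return [f * ratio if f > 0.0 else 0.0 for f in f0_contour]


# Hypothetical settings: raise F0 by 20% and shift the envelope upward.
f0_out = scale_f0([0.0, 120.0, 125.0, 0.0], 1.2)
warped = [warp_frequency(k * math.pi / 8, 0.1) for k in range(9)]
```

With a positive `alpha`, interior frequencies map to higher values while 0 and pi stay fixed, so formant peaks in the envelope move upward, which is the direction usually associated with a shorter vocal tract.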
