A New Wideband Speech/Audio Coder Interoperable with ITU-T G.729/G.729E

Kim, Kyung-Tae;Lee, Min-Ki;Youn, Dae-Hee;

Journal of the Institute of Electronics Engineers of Korea SP

Vol. 45, No. 2 / pp. 81-89 / 2008 / pISSN 1229-6384

The Institute of Electronics and Information Engineers

Kim, Kyung-Tae (School of Electrical & Electronic Engineering, Yonsei University) ;
Lee, Min-Ki (School of Electrical & Electronic Engineering, Yonsei University) ;
Youn, Dae-Hee (School of Electrical & Electronic Engineering, Yonsei University)

Published: 2008-03-25

Abstract

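A wideband signal is one sampled at 16 kHz and band-limited to 50-7000 Hz; compared with telephone-band speech, it offers higher naturalness and intelligibility. Because of these properties, wideband coders can be used for video conferencing, digital AM broadcasting, and high-quality voice communication. This paper proposes a bandwidth-scalable wideband speech/audio coder with a split-band structure. The lower band is coded with either the 8 kbit/s ITU-T G.729 coder, which is widely used for telephone-band speech, or the 11.8 kbit/s ITU-T G.729 Annex E coder, whose higher bit rate also lets it handle audio signals. The higher band is coded with a parametric method based on an auditory model: a gammatone filterbank splits the input signal into critical bands, and each critical-band signal is then quantized. Because the lower-band and higher-band coders are independent of each other, the decoder can select either the telephone-band or the wideband synthesized signal according to the channel condition. Performance evaluation confirmed that the proposed coder delivers quality equivalent to ITU-T G.722.1 at 24 kbit/s for both speech and audio signals, at a lower bit rate and with a shorter delay.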
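As a rough illustration of the gammatone analysis stage, the following sketch computes a few gammatone impulse responses for the higher band. It is not the authors' implementation: the 4-7 kHz band edges, the number of bands, the filter order of 4, the 256-sample length, and all function names are assumptions chosen for illustration. The bandwidth formulas are the commonly used Glasberg and Moore ERB approximations.

```python
import math

def erb_hz(fc_hz):
    """Equivalent rectangular bandwidth at fc (Glasberg & Moore approximation)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_rate_inv(e):
    """Map an ERB-rate value back to frequency in Hz."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def higher_band_centres(n_bands=6, lo_hz=4000.0, hi_hz=7000.0):
    """Hypothetical centre frequencies, uniformly spaced on the ERB-rate scale."""
    e_lo, e_hi = erb_rate(lo_hz), erb_rate(hi_hz)
    step = (e_hi - e_lo) / (n_bands - 1)
    return [erb_rate_inv(e_lo + i * step) for i in range(n_bands)]

def gammatone_ir(fc_hz, fs_hz=16000, order=4, n_samples=256):
    """Sampled g(t) = t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t), peak-normalised."""
    b = 1.019 * erb_hz(fc_hz)  # bandwidth parameter (assumed standard scaling)
    ir = []
    for k in range(n_samples):
        t = k / fs_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        ir.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    peak = max(abs(x) for x in ir)
    return [x / peak for x in ir]

centres = higher_band_centres()
bank = [gammatone_ir(fc) for fc in centres]
```

Each entry of `bank` is one band's FIR approximation; convolving the input with it would give that band's signal, which a coder of this kind would then parameterize and quantize.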

Wideband speech, characterized by a bandwidth of about 7 kHz (50-7000 Hz), provides a substantial improvement in naturalness and intelligibility over telephone-band speech. Although it requires higher data rates, its applications now extend to audio and video conferencing, high-quality multimedia communications over mobile links or packet-switched networks, and digital AM broadcasting. In this paper, we present a new bandwidth-scalable coder for wideband speech and audio signals. The proposed coder splits the 8 kHz signal bandwidth into two narrower bands and applies a different coding scheme to each. The lower-band signal is coded using the ITU-T G.729/G.729E coder, and the higher-band signal is compressed using a new algorithm based on a gammatone filter bank with an invertible auditory model. Because of the split-band architecture and the completely independent coding schemes for each band, the decoder output can be selected to be narrowband or wideband according to the channel condition. Subjective tests showed that, for wideband speech and audio signals, the proposed coder at 14.2/18 kbit/s produces quality superior to that of ITU-T G.722.1 at 24 kbit/s, with a shorter algorithmic delay.
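The bit-rate figures can be cross-checked with simple arithmetic. The sketch below assumes that the 14.2 kbit/s mode pairs with the 8 kbit/s G.729 core and the 18 kbit/s mode with the 11.8 kbit/s G.729 Annex E core; the abstract does not state this pairing explicitly. The 10 ms frame used for the per-frame bit count is the G.729 frame length, and applying it to the higher band is likewise an assumption. All names are illustrative.

```python
# Rates in bit/s, kept as integers to avoid floating-point rounding.
CORE_RATE = {"G.729": 8000, "G.729E": 11800}    # standard core rates
TOTAL_RATE = {"G.729": 14200, "G.729E": 18000}  # assumed pairing with the 14.2/18 kbit/s modes
G7221_RATE = 24000                              # comparison coder, ITU-T G.722.1

def higher_band_rate(core):
    """Bit rate left for the higher band once the core coder is paid for."""
    return TOTAL_RATE[core] - CORE_RATE[core]

def bits_per_frame(rate_bps, frame_ms=10):
    """Bits per frame at the given rate (10 ms is the G.729 frame length)."""
    return rate_bps * frame_ms // 1000

for core in CORE_RATE:
    hb = higher_band_rate(core)
    print(core, "higher band:", hb, "bit/s,",
          bits_per_frame(hb), "bits per 10 ms,",
          "saving vs G.722.1:", G7221_RATE - TOTAL_RATE[core], "bit/s")
```

Under these assumptions both modes leave the same 6.2 kbit/s for the higher band, which is consistent with a higher-band coder that is independent of the choice of core.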

Keywords

G.729
