Research on Open Source Encoding Technology for MPEG Unified Speech and Audio Coding

Song, Jeongook; Lee, Joonil; Kang, Hong-Goo

doi:10.5573/ieek.2013.50.1.086

Journal of the Institute of Electronics and Information Engineers (전자공학회논문지)

Volume 50, Issue 1 / Pages 86-96 / 2013 / pISSN 2287-5026 / eISSN 2288-159X

The Institute of Electronics and Information Engineers (대한전자공학회)

MPEG 통합 음성/오디오 코덱을 위한 오픈 소스 부호화 기술에 관한 연구

Song, Jeongook (School of Electrical and Electronic Engineering, Yonsei University) ;
Lee, Joonil (LG Electronics Inc.) ;
Kang, Hong-Goo (School of Electrical and Electronic Engineering, Yonsei University)

송정욱 (연세대학교 전기전자공학과) ;
이준일 (LG 전자) ;
강홍구 (연세대학교 전기전자공학과)

Received: 2012.06.03
Published: 2013.01.25

https://doi.org/10.5573/ieek.2013.50.1.086

Abstract

Unified Speech and Audio Coding (USAC) is the best-performing speech/audio codec; its Final Draft International Standard (FDIS) was approved at the MPEG meeting in 2011. Because MPEG conventionally standardizes only the decoder, the encoder technologies are difficult to study. Furthermore, the Reference Model (RM) encoder performs extremely poorly. To address these problems, the open source project JAME proposes methods for improving the performance of the main encoder technologies in USAC. In particular, this paper introduces the following encoder modules: the signal classifier that selects between the two coders, the psychoacoustic model in the frequency domain, and the window transition technology. Finally, the results of the verification test for FDIS and a performance evaluation of the Common Encoder are presented.

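Unified Speech and Audio Coding (USAC) is the best-performing unified speech/audio codec, for which MPEG approved the Final Draft International Standard (FDIS) in 2011. Because MPEG traditionally standardizes only the decoder technology, studying encoder technology is not easy. Moreover, the encoder released as an example (Reference Model, RM) contains only the basic ideas, so using it results in very severe performance degradation. To minimize these problems, JAME, a project carried out on an open source basis, proposes methods for maximizing the performance of the core encoder technologies applied in USAC. This paper introduces the main encoder technologies: a signal classifier that lets the two coders operate selectively depending on the input signal, frequency-domain coding based on a psychoacoustic model, and transition window technology. It also presents the verification test results for FDIS and a performance evaluation of the Common Encoder.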

