• Title/Summary/Keyword: MPEG USAC

MPEG Audio New Standard: USAC Technology (MPEG 오디오 최신 표준: USAC 기술)

  • Lee, Tae-Jin;Kang, Kyeong-Ok;Kim, Whan-Woo
    • Journal of Broadcast Engineering / v.16 no.5 / pp.693-704 / 2011
  • As mobile devices become multi-functional and converge into a single platform, there is a strong need for a codec that can provide consistent quality for both speech and music content. MPEG-D USAC standardization started at the 82nd MPEG meeting with a CfP, and the Study on DIS was approved at the 96th MPEG meeting. MPEG-D USAC is a convergence of AMR-WB+ and HE-AAC V2 technology. Specifically, USAC uses three core codecs (AAC, ACELP, and TCX) for the low-frequency region, SBR for the high-frequency region, MPEG Surround for stereo information, and window transition technology to smooth transitions between the core coders. USAC can provide consistent sound quality for both speech and music content and can be applied to applications such as multimedia download to mobile devices, digital radio, mobile TV, and audio books.

MPEG-D USAC: Unified Speech and Audio Coding Technology (MPEG-D USAC: 통합 음성 오디오 부호화 기술)

  • Lee, Tae-Jin;Kang, Kyeong-Ok;Kim, Whan-Woo
    • The Journal of the Acoustical Society of Korea / v.28 no.7 / pp.589-598 / 2009
  • As mobile devices become multi-functional and converge into a single platform, there is a strong need for a codec that can provide consistent quality for both speech and music content. MPEG-D USAC standardization started at the 82nd MPEG meeting with a CfP, and WD3 was approved at the 88th MPEG meeting. MPEG-D USAC is a convergence of AMR-WB+ and HE-AAC V2 technology. Specifically, USAC uses three core codecs (AAC, ACELP, and TCX) for the low-frequency region, SBR for the high-frequency region, and the MPEG Surround tool for stereo information. USAC can provide consistent sound quality for both speech and music content and can be applied to applications such as multimedia download to mobile devices, digital radio, mobile TV, and audio books.

Adaptive TCX Windowing Technology for Unified Structure MPEG-D USAC

  • Lee, Tae-Jin;Beack, Seung-Kwon;Kang, Kyeong-Ok;Kim, Whan-Woo
    • ETRI Journal / v.34 no.3 / pp.474-477 / 2012
  • The MPEG-D unified speech and audio coding (USAC) standardization process was initiated by MPEG to develop an audio codec that can provide consistent quality for mixed speech and music content. The current USAC reference model structure consists of frequency domain (FD) and linear prediction domain (LPD) core modules and is controlled by a signal classifier tool. In this letter, we propose an LPD single-mode USAC structure using an adaptive windowing-based transform-coded excitation module. We tested our system using the official test items for all mono evaluation modes. The experimental results show that the objective and subjective performance of the proposed single-mode USAC system is better than that of the FD/LPD dual-mode USAC system.

Fixed-point Implementation of LPD Decoder in MPEG-D USAC (MPEG-D USAC : LPD 복호화기의 고정 소수점 알고리즘 구현)

  • Song, Eunwoo;Song, Jeongook;Kang, Hong-Goo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2012.07a / pp.254-256 / 2012
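  • In this paper, we propose a fixed-point algorithm for the Linear Prediction Domain (LPD) decoder module of the Unified Speech and Audio Coding (USAC) standard, which is in progress in the MPEG-D audio subgroup. The USAC coder combines two state-of-the-art speech and audio coders and performs well on both speech and audio signals. Ahead of the completion of the USAC standard and its deployment in services, we analyze the structural characteristics of the USAC LPD decoder, build a fixed-point algorithm for the LPD decoder aimed at Digital Signal Processor (DSP) implementation, and measure the complexity of the module. We also compare the performance of the fixed-point LPD decoder with that of the existing floating-point decoder, and address the complexity issues that arise from the two coding modes of the LPD decoder.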

Channel Expansion Technology in MPEG Audio (MPEG 오디오의 채널 확장 기술)

  • Pang, Hee-Suk
    • Journal of Broadcast Engineering / v.16 no.5 / pp.714-721 / 2011
  • MPEG audio uses the masking effect, high frequency component synthesis based on spectral band replication, and channel expansion based on parametric stereo for efficient compression of audio signals. In this paper, we present an overview of the state-of-the-art channel expansion technology in MPEG audio. We also present technical overviews and application examples to broadcasting services for HE-AAC v.2, MPEG Surround, spatial audio object coding (SAOC), and unified speech and audio coding (USAC) which are MPEG audio codecs based on the channel expansion technology.

A New MPEG Reference Model for Unified Speech and Audio Coding (통합 음성/오디오 부호화를 위한 새로운 MPEG 참조 모델)

  • Song, Jeong-Ook;Oh, Hyen-O;Kang, Hong-Goo
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.5 / pp.74-80 / 2010
  • Speech and audio codecs have been developed using different types of coding technology because their signals and applications have different characteristics. With the convergence of broadcasting and telecommunication systems, international standardization organizations such as 3GPP and ISO/IEC MPEG have tried to compress and transmit multimedia signals using unified codecs. MPEG recently initiated an activity to standardize USAC (Unified Speech and Audio Coding). However, the USAC RM (Reference Model) software has been problematic: it has a complex hierarchy, a large amount of unused source code, and a poor-quality encoder. To solve these problems, this paper introduces new RM software designed with an open source paradigm. It was presented at the MPEG meeting in April 2010, and the source code was released in June.

Research on Open Source Encoding Technology for MPEG Unified Speech and Audio Coding (MPEG 통합 음성/오디오 코덱을 위한 오픈 소스 부호화 기술에 관한 연구)

  • Song, Jeongook;Lee, Joonil;Kang, Hong-Goo
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.1 / pp.86-96 / 2013
  • Unified Speech and Audio Coding (USAC) is a speech/audio codec offering state-of-the-art quality; it was approved as a Final Draft International Standard (FDIS) at an MPEG meeting in 2011. Since MPEG conventionally standardizes only the decoder, it is not easy to study the encoder technologies. Furthermore, the Reference Model (RM) shows extremely poor performance. To address these problems, the open source project (JAME) proposes methods that improve the performance of the main encoder technologies in USAC. In particular, this paper introduces the following encoder modules: the signal classifier that selects between the two coders, the psychoacoustic model in the frequency domain, and window transition technology. Finally, the results of the verification test for FDIS and the performance of the Common Encoder are appended.

A Performance Evaluation of the MPEG USAC with Variable Core-Band Down-Sampling Ratio (가변 핵심 대역 하향 표본화 비를 가진 MPEG USAC 성능 평가)

  • Lee, Jae Hwa;Kim, Rin Chul
    • Journal of Broadcast Engineering / v.18 no.1 / pp.106-114 / 2013
  • This paper deals with the effect of the internal sampling frequency and the core-band down-sampling ratio on the overall performance of MPEG USAC. Here, the internal sampling frequency is the sampling frequency of the signal that is actually coded, and the core-band down-sampling ratio is the ratio of the width of the core band to that of the coded band. Performance was measured on 6 different test sound sources using the MUSHRA test with 10 subjects. The experiments showed that a core-band down-sampling ratio of 1/3 or 1/4 can yield better performance than the conventional 1/2 ratio, especially at low rates.
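
The ratio defined in this abstract can be illustrated with a small calculation. The sketch below is not taken from the paper: the function name `core_band_width_hz` and the 24 kHz internal sampling frequency are hypothetical, and it assumes that the coded band spans 0 Hz up to the Nyquist frequency of the internally coded signal.

```python
from fractions import Fraction


def core_band_width_hz(internal_fs_hz, ratio):
    """Width of the core band in Hz for a core-band down-sampling ratio.

    Assumption (not stated in the abstract): the coded band runs from 0 Hz
    to the Nyquist frequency of the coded signal, internal_fs_hz / 2.
    """
    coded_band_hz = Fraction(internal_fs_hz, 2)
    # The ratio is defined as core band width over coded band width.
    return coded_band_hz * Fraction(ratio)


if __name__ == "__main__":
    internal_fs_hz = 24000  # hypothetical internal sampling frequency
    for ratio in (Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)):
        width = core_band_width_hz(internal_fs_hz, ratio)
        print(f"ratio {ratio}: core band width {float(width):.0f} Hz")
```

Under these assumptions, a smaller ratio leaves a narrower band for the core coder at the same internal sampling frequency.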

Performance Evaluation of the MPEG USAC According to the Spectral Band Replication Bandwidth (Spectral Band Replication 대역폭에 따른 MPEG USAC 부호화 성능 평가)

  • An, Kyung-Jun;Jung, Yoo-Sun;Beack, Seung-Kwon;Kang, Kyeong-Ok;Kim, Rin-Chul
    • Journal of Broadcast Engineering / v.16 no.5 / pp.705-713 / 2011
  • This paper deals with the effect of the SBR bandwidth on the overall performance of MPEG USAC. Here, the SBR bandwidth is the frequency region covered by the SBR codec; it is specified by bs_stop_freq, one of the SBR bitstream components. The performance of USAC with 5 different SBR bandwidths is compared subjectively using the MUSHRA test. In the comparison, the bit rate is confined to 14 to 24 kbps and only the LPD unit is selected for the core codec. The comparison shows that an SBR bandwidth that stretches up to 18 kHz or above gives better performance than the others.

Speech/Mixed Content Signal Classification Based on GMM Using MFCC (MFCC를 이용한 GMM 기반의 음성/혼합 신호 분류)

  • Kim, Ji-Eun;Lee, In-Sung
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.2 / pp.185-192 / 2013
  • In this paper, we propose a method to improve the classification of speech and mixed-content signals for the MPEG USAC (Unified Speech and Audio Coding) standard, using MFCC features with a Gaussian mixture model (GMM). The GMM probability model is used for effective pattern recognition, and the expectation maximization (EM) algorithm is used to extract the optimal GMM parameters. The proposed classification algorithm is divided into two main parts: the first extracts the optimal parameters for the GMM, and the second distinguishes between speech and mixed-content signals using MFCC feature parameters. The proposed classification algorithm shows better results than the conventionally implemented USAC scheme.
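
The second stage described in this abstract, deciding between speech and mixed content from a feature vector, can be sketched as a comparison of log-likelihoods under two GMMs. This is a minimal illustration, not the paper's implementation: it assumes diagonal covariances, skips the EM training stage, and uses hypothetical 2-dimensional model parameters (`SPEECH_GMM`, `MIXED_GMM`) in place of GMMs fitted to real MFCC vectors.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify_frame(x, speech_gmm, mixed_gmm):
    """Return "speech" or "mixed", whichever GMM scores the frame higher."""
    if gmm_log_likelihood(x, speech_gmm) >= gmm_log_likelihood(x, mixed_gmm):
        return "speech"
    return "mixed"


# Hypothetical 2-D models; in practice the parameters would come from EM
# training on MFCC vectors, which have more dimensions.
SPEECH_GMM = [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [1.0, 1.0], [1.0, 1.0])]
MIXED_GMM = [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 4.0], [1.0, 1.0])]

if __name__ == "__main__":
    print(classify_frame([0.2, -0.1], SPEECH_GMM, MIXED_GMM))
    print(classify_frame([5.5, 4.5], SPEECH_GMM, MIXED_GMM))
```

A real classifier would typically accumulate the log-likelihoods over several frames before switching coders; that smoothing step is left out here.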