Audio Mixer Algorithm for Enhancing Speech Quality of Multi-party Audio Telephony

  • 류상현 (Department of Radio Engineering, Kwangwoon University);
  • 김형국 (Department of Radio Engineering, Kwangwoon University)
  • Received : 2013.06.27
  • Accepted : 2013.08.28
  • Published : 2013.11.30

Abstract

Speech quality in multi-party audio telephony among two, three, or more participants is degraded by audio volume imbalance, audio volume saturation, and increased noise levels. To address these problems, this paper proposes an advanced audio mixing algorithm for a software-based multipoint control unit (MCU). The approach combines voice activity detection with gain control and consists of a set of algorithms that classify audio signals, estimate audio volumes, adjust gain factors, and mix the audio signals of all channels. The proposed audio mixing algorithm is computationally efficient, delivers high-quality speech, and is suitable for practical multi-party audio telephony.
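The abstract does not give the individual algorithms, so the following Python sketch only illustrates the general pipeline it names (classify, estimate volume, adjust gain, mix). It is not the authors' method. The energy-threshold detector, the RMS volume estimate, the peak normalization, and every name and constant (`VAD_THRESHOLD`, `TARGET_RMS`, `MAX_GAIN`, `mix_frames`) are assumptions chosen for illustration.

```python
import math

# All constants below are hypothetical values, not taken from the paper.
VAD_THRESHOLD = 0.01  # RMS level below which a frame is treated as non-speech
TARGET_RMS = 0.1      # common level every active channel is scaled toward
MAX_GAIN = 4.0        # cap so that quiet channels are not amplified without bound


def rms(frame):
    """Estimate the volume of one frame (samples in [-1.0, 1.0]) as its RMS."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def is_speech(frame, threshold=VAD_THRESHOLD):
    """Crude energy-based voice activity detector (a stand-in classifier)."""
    return rms(frame) >= threshold


def gain_for(frame, target=TARGET_RMS, max_gain=MAX_GAIN):
    """Gain factor that moves the frame's RMS toward the target level."""
    level = rms(frame)
    if level == 0.0:
        return 0.0
    return min(target / level, max_gain)


def mix_frames(frames):
    """Mix one frame per channel; all frames must have the same length."""
    if not frames:
        return []
    length = len(frames[0])
    if any(len(f) != length for f in frames):
        raise ValueError("all channel frames must have the same length")

    mixed = [0.0] * length
    for frame in frames:
        if not is_speech(frame):
            continue  # inactive channels are skipped so their noise does not add up
        g = gain_for(frame)  # balances loud and quiet talkers
        for i in range(length):
            mixed[i] += g * frame[i]

    # Scale the sum down if it would exceed full scale (avoids saturation).
    peak = max((abs(s) for s in mixed), default=0.0)
    if peak > 1.0:
        mixed = [s / peak for s in mixed]
    return mixed


if __name__ == "__main__":
    loud = [0.5, -0.5, 0.5, -0.5]
    quiet = [0.05, -0.05, 0.05, -0.05]
    noise = [0.001, -0.001, 0.001, -0.001]
    print(mix_frames([loud, quiet, noise]))
```

In this sketch the loud and quiet talkers contribute equally to the mix, and the noise-only channel is left out. A practical mixer would also smooth the gain across frames to avoid audible level jumps, which is omitted here.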
