A Study of Noise Robust Content-Based Music Retrieval System

  • Yoon, Won-Jung (윤원중; Dept. of Computer Science and Statistics, Dankook University) ;
  • Park, Kyu-Sik (박규식; Dept. of Computer Science and Statistics, Dankook University)
  • Published : 2008.11.25

Abstract

In this paper, we construct a noise-robust content-based music retrieval system for mobile environments. The system applies the ZCPA feature, which is known for its noise robustness in speech recognition, to content-based music retrieval, and we verify its performance in that setting. In addition, we propose a new indexing method and a fast retrieval algorithm that improve retrieval speed by about 99% compared with exhaustive retrieval over a large music DB. Computer simulations in noisy environments with SNRs from 15 dB down to 0 dB confirm that the proposed system outperforms the MFCC and FBE (filter bank energy) features by about 5% to 30%.
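
As an illustration of the 15 dB to 0 dB SNR test condition mentioned above, the following is a minimal sketch of how a clean signal can be mixed with noise at a target SNR. The abstract does not state the noise type or the mixing procedure used in the paper, so the white Gaussian noise, the 440 Hz test tone, and the helper names `power`, `mix_at_snr`, and `measured_snr_db` are all assumptions made for this example.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so the signal-to-noise power ratio equals `snr_db`, then add it."""
    target_noise_power = power(signal) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    return [s + gain * n for s, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB computed from the clean signal and the noisy mixture."""
    residual = [y - x for x, y in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))


# Assumed test material: a 1-second 440 Hz tone at 8 kHz and white Gaussian noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

for snr in (15, 10, 5, 0):
    noisy = mix_at_snr(clean, noise, snr)
    print(snr, round(measured_snr_db(clean, noisy), 2))
```

At 0 dB the scaled noise carries the same power as the signal, which is the hardest condition in the range the abstract reports.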
