A Novel Speech/Music Discrimination Using Feature Dimensionality Reduction

  • Received : 2009.12.07
  • Accepted : 2010.01.15
  • Published : 2010.03.25


In this paper, we propose an improved speech/music discrimination method based on feature combination and dimensionality reduction. To improve discrimination ability, we use a feature based on spectral duration analysis and employ the hierarchical dimensionality reduction (HDR) method to reduce the effect of correlated features. Experiments on a variety of speech and music data show that the proposed method achieves higher discrimination performance than conventional methods.
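As a rough illustration of why reducing correlated features can help a discriminator, the sketch below applies plain principal component analysis (PCA) to a synthetic feature matrix. It is not the HDR method, whose details are not given in this abstract. The function name `pca_reduce`, the synthetic data, and the choice of two output components are all assumptions made for the example.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project the rows of `features` onto their top principal components."""
    centered = features - features.mean(axis=0)
    # The covariance matrix is symmetric, so eigh is the appropriate solver.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    order = np.argsort(eigvals)[::-1][:n_components]
    return centered @ eigvecs[:, order]

rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
# Four hypothetical features: the last two are near-copies of the first two,
# standing in for strongly correlated audio features.
features = np.hstack([base, base + 0.01 * rng.normal(size=(200, 2))])
reduced = pca_reduce(features, n_components=2)
```

In this toy setting the four correlated columns collapse to two uncorrelated ones, which is the general effect a dimensionality reduction stage aims for before classification.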


