Dimension-Reduced Audio Spectrum Projection Features for Classifying Video Sound Clips

Kim, Hyoung-Gook

The Journal of the Acoustical Society of Korea

Volume 25, Issue 3E / Pages 89-94 / 2006 / pISSN 1225-4428

The Acoustical Society of Korea (한국음향학회)

Kim, Hyoung-Gook (Samsung Advanced Institute of Technology)

Published: 2006.09.15

Abstract

For audio indexing and targeted search of specific audio or the corresponding visual content, the MPEG-7 standard has adopted a sound classification framework in which dimension-reduced Audio Spectrum Projection (ASP) features are used to train continuous hidden Markov models (HMMs) for classifying various sounds. MPEG-7 employs Principal Component Analysis (PCA) or Independent Component Analysis (ICA) for the dimension reduction. Other well-established techniques include Non-negative Matrix Factorization (NMF), Linear Discriminant Analysis (LDA), and the Discrete Cosine Transform (DCT). In this paper, we compare the performance of these dimension reduction methods, combined with Gaussian mixture models (GMMs) and HMMs, in classifying video sound clips.
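As a rough illustration of what dimension reduction of a spectral frame can look like, the sketch below applies the DCT option named in the abstract: each frame is transformed with an orthonormal DCT-II and only the first few coefficients are kept. This is a minimal sketch under stated assumptions, not the paper's procedure. The abstract does not describe the feature extraction steps, so the function names (`dct_ii`, `reduce_frame`), the 8-band frame size, the choice of 3 retained coefficients, and the frame values are all hypothetical.

```python
import math

def dct_ii(frame):
    """Orthonormal DCT-II of one spectral frame (a list of floats)."""
    n = len(frame)
    coeffs = []
    for k in range(n):
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(frame)
        )
        # Scaling that makes the transform orthonormal (energy preserving).
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs

def reduce_frame(frame, keep):
    """Keep only the first `keep` DCT coefficients as the reduced feature."""
    return dct_ii(frame)[:keep]

# Hypothetical 8-band log-spectrum frames (made-up values, not paper data).
frames = [
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    [0.2, 0.4, 0.9, 1.3, 1.1, 0.7, 0.3, 0.1],
]

# Each 8-dimensional frame becomes a 3-dimensional feature vector.
reduced = [reduce_frame(f, keep=3) for f in frames]
```

The first frame is flat, so all of its energy lands in the first coefficient and the remaining retained coefficients are zero up to rounding. In a classification setup such as the one the abstract describes, sequences of reduced vectors like these would be the input to a GMM or HMM; that training step is outside the scope of this sketch.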

