Dimension-Reduced Audio Spectrum Projection Features for Classifying Video Sound Clips

Kim, Hyoung-Gook;

The Journal of the Acoustical Society of Korea

Volume 25, Issue 3E / Pages 89-94 / 2006 / 1225-4428 (pISSN)

The Acoustical Society of Korea

Kim, Hyoung-Gook (Samsung Advanced Institute of Technology)

Published: 2006.09.15

Abstract

For audio indexing and targeted search of specific audio or corresponding visual content, the MPEG-7 standard has adopted a sound classification framework in which dimension-reduced Audio Spectrum Projection (ASP) features are used to train continuous hidden Markov models (HMMs) for classifying various sounds. MPEG-7 employs Principal Component Analysis (PCA) or Independent Component Analysis (ICA) for the dimensionality reduction. Other well-established techniques include Non-negative Matrix Factorization (NMF), Linear Discriminant Analysis (LDA), and the Discrete Cosine Transform (DCT). In this paper, we compare the performance of these dimensionality reduction methods, combined with Gaussian mixture models (GMMs) and HMMs, in classifying video sound clips.
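
As a rough illustration of what a dimension-reduced spectrum projection can look like, the sketch below uses the DCT, one of the techniques named in the abstract, as the projection basis. It is a minimal example under stated assumptions and not the paper's implementation: the frame values, the number of bands, the number of retained coefficients, and the helper names (`l2_normalize`, `dct_basis`, `project`) are all hypothetical. Each frame is scaled to unit L2 norm, projected onto the first few DCT-II basis vectors, and the norm is kept as an extra leading coefficient.

```python
import math


def l2_normalize(frame):
    """Scale a frame to unit L2 norm; return (unit_frame, norm)."""
    norm = math.sqrt(sum(x * x for x in frame))
    if norm == 0.0:
        return list(frame), 0.0
    return [x / norm for x in frame], norm


def dct_basis(n_bands, n_keep):
    """Return the first n_keep orthonormal DCT-II basis vectors of length n_bands."""
    basis = []
    for k in range(n_keep):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n_bands)
        basis.append([scale * math.cos(math.pi * (n + 0.5) * k / n_bands)
                      for n in range(n_bands)])
    return basis


def project(frame, basis):
    """Reduce one spectral frame to [norm, c_0, ..., c_{n_keep-1}]."""
    unit, norm = l2_normalize(frame)
    coeffs = [sum(b * x for b, x in zip(row, unit)) for row in basis]
    return [norm] + coeffs


# Hypothetical dB-scale spectrum envelope for a single frame (8 bands).
frame_db = [12.0, 15.5, 18.0, 17.2, 14.1, 10.3, 7.8, 5.0]

# Keep 3 of 8 coefficients (an arbitrary choice for illustration).
basis = dct_basis(n_bands=len(frame_db), n_keep=3)
features = project(frame_db, basis)
print(len(features))  # norm plus 3 projection coefficients
```

A sequence of such reduced vectors, one per frame, would then serve as the observations for a GMM or HMM classifier. A data-dependent basis such as PCA, ICA, NMF, or LDA would replace `dct_basis` with vectors estimated from training data.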
