• Title/Summary/Keyword: Waveform Encoding

Search Result 14, Processing Time 0.029 seconds

Music Genre Classification using Spikegram and Deep Neural Network (스파이크그램과 심층 신경망을 이용한 음악 장르 분류)

  • Jang, Woo-Jin;Yun, Ho-Won;Shin, Seong-Hyeon;Cho, Hyo-Jin;Jang, Won;Park, Hochong
    • Journal of Broadcast Engineering
    • /
    • v.22 no.6
    • /
    • pp.693-701
    • /
    • 2017
  • In this paper, we propose a new method for music genre classification using spikegram and deep neural network. The human auditory system encodes the input sound in the time and frequency domain in order to maximize the amount of sound information delivered to the brain using minimum energy and resource. Spikegram is a method of analyzing waveform based on the encoding function of auditory system. In the proposed method, we analyze the signal using spikegram and extract a feature vector composed of key information for the genre classification, which is to be used as the input to the neural network. We measure the performance of music genre classification using the GTZAN dataset consisting of 10 music genres, and confirm that the proposed method provides good performance using a low-dimensional feature vector, compared to the current state-of-the-art methods.

Real-time Implementation of Variable Transmission Bit Rate Vocoder Improved Speech Quality in SOLA-B Algorithm & G.729A Vocoder Using on the TMS320C5416 (TMS320C5416을 이용한 SOLA-B 알고리즘과 G.729A 보코더의 음질 향상된 가변 전송률 보코더의 실시간 구현)

  • Ham, Myung-Kyu;Bae, Myung-Jin
    • Speech Sciences
    • /
    • v.10 no.3
    • /
    • pp.241-250
    • /
    • 2003
  • In this paper, we implemented the vocoder of variable rate by applying the SOLA-B algorithm to the G.729A to the TMS320C5416 in real-time. This method using the SOLA-B algorithm is that it is reduced the duration of the speech in encoding and is played at the speed of normal by extending the duration of the speech in decoding. But the method applied to the existed G.729A and SOLA-B algorithm is caused the loss of speech quality in G.729A which is not reflected about length variation of speech. Therefore the proposed method is encoded according as it is modified the structure of LSP quantization table about the length of speech is reduced by using the SOLA-B algorithm. The vocoder of variable rate by applying the G.729A and SOLA-B algorithm is represented the maximum complexity of 10.2MIPS about encoder and 2.8MIPS about decoder in 8kbps transmission rate. Also it is evaluated 17.3MIPS about encoder, 9.9MIPS about decoder in 6kbps and 18.5MIPS about encoder, 11.1MIPS about decoder in 4kbps according to the transmission rate. The used memory is about program ROM 9.7kwords, table ROM 4.69kwords, RAM 5.2kwords. The waveform of output is showed by the result of C simulator and Bit Exact. Also, the result of MOS test for evaluation of speech quality of the vocoder of variable rate which is implemented in real-time, it is estimated about 3.68 in 4kbps.

  • PDF

Real-time Implementation of Variable Transmission Bit Rate Vocoder Integrating G.729A Vocoder and Reduction of the Computational Amount SOLA-B Algorithm Using the TMS320C5416 (TMS320C5416을 이용한 G.729A 보코더와 계산량 감소된 SOLA-B 알고리즘을 통합한 가변 전송율 보코더의 실시간 구현)

  • 함명규;배명진
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.40 no.6
    • /
    • pp.84-89
    • /
    • 2003
  • In this paper, we real-time implemented to the TMS320C5416 the vocoder of variable bit rate applied the SOLA-B algorithm by Henja to the ITU-T G.729A vocoder of 8kbps transmission rate. This proposed method using the SOLA-B algorithm is that it is reduced the duration of the speech in encoding and is played at the speed of normal by extending the duration of the speech in decoding. At this time, we bandied that the interval of cross correlation function if skipped every 3 sample for decreasing the computational amount of SOLA-B algorithm. The real-time implemented vocoder of C.729A and SOLA-B algorithm is represented the complexity of maximum that is 10.2MIPS in encoder and 2.8MIPS in decoder of 8kbps transmission rate. Also, it is represented the complexity of maximum that is 18.5MIPS in encoder and 13.1MIPS in decoder of 6kbps, it is 18.5MIPS in encoder and 13.1MIPS in decoder of 4kbps. The used memory is about program ROM 9.7kwords, table ROM 4.5kwords, RAM 5.1 kwords. The waveform of output is showed by the result of C simulator and Bit Exact. Also, for evaluation of speech quality of the vocoder of real-time implemented variable bit rate, it is estimated the MOS score of 3.69 in 4kbps.

The Design of Optimal Filters in Vector-Quantized Subband Codecs (벡터양자화된 부대역 코덱에서 최적필터의 구현)

  • 지인호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.1
    • /
    • pp.97-102
    • /
    • 2000
  • Subband coding is to divide the signal frequency band into a set of uncorrelated frequency bands by filtering and then to encode each of these subbands using a bit allocation rationale matched to the signal energy in that subband. The actual coding of the subband signal can be done using waveform encoding techniques such as PCM, DPCM and vector quantizer(VQ) in order to obtain higher data compression. Most researchers have focused on the error in the quantizer, but not on the overall reconstruction error and its dependence on the filter bank. This paper provides a thorough analysis of subband codecs and further development of optimum filter bank design using vector quantizer. We compute the mean squared reconstruction error(MSE) which depends on N the number of entries in each code book, k the length of each code word, and on the filter bank coefficients. We form this MSE measure in terms of the equivalent quantization model and find the optimum FIR filter coefficients for each channel in the M-band structure for a given bit rate, given filter length, and given input signal correlation model. Specific design examples are worked out for 4-tap filter in 2-band paraunitary filter bank structure. These optimum paraunitary filter coefficients are obtained by using Monte Carlo simulation. We expect that the results of this work could be contributed to study on the optimum design of subband codecs using vector quantizer.

  • PDF