• Title/Summary/Keyword: Codebook methods


HMM-based Speech Recognition using DMS Model and Double Spectral Feature (DMS 모델과 이중 스펙트럼 특징을 이용한 HMM에 의한 음성 인식)

  • Ann Tae-Ock
    • Journal of the Korea Academia-Industrial cooperation Society / v.7 no.4 / pp.649-655 / 2006
  • This paper proposes an HMM-based method for speaker-independent speech recognition that uses a DMSVQ (Dynamic Multi-Section Vector Quantization) codebook built from the DMS model together with a double spectral feature. The LPC cepstrum is used as the instantaneous spectral feature, and the regression coefficient of the LPC cepstrum is used as the dynamic spectral feature. Each of these two spectral features is quantized with its own VQ codebook. The HMM using the DMS model takes the instantaneous and dynamic spectral features as input. For comparison, recognition experiments with various conventional methods are carried out on the same data and under the same conditions. The experimental results show that the proposed method outperforms the conventional recognition methods.


HMM-based Speech Recognition using FSVQ, Fuzzy Concept and Doubly Spectral Feature (FSVQ, 퍼지 개념 및 이중 스펙트럼 특징을 이용한 HMM에 기초를 둔 음성 인식)

  • 정의봉
    • Journal of the Korea Computer Industry Society / v.5 no.4 / pp.491-502 / 2004
  • In this paper, we propose an HMM using FSVQ (First Section VQ), fuzzy theory, and a doubly spectral feature, as a study on speaker-independent isolated word recognition. LPC cepstrum coefficients and regression coefficients of the LPC cepstrum are used as the doubly spectral feature. The training data are divided into several sections, a VQ codebook is generated for the first section, and multi-observation sequences are then obtained from that codebook in descending order of probability according to a fuzzy rule. These first-section observation sequences are used for training, and in recognition the word with the highest probability under the same scheme is selected. In addition to the speech recognition experiments with the proposed method, we run experiments with other methods on the same data and under the same conditions. Across all experiments, the proposed method achieves a higher recognition rate than the others.


Iterative LBG Clustering for SIMO Channel Identification

  • Daneshgaran, Fred;Laddomada, Massimiliano
    • Journal of Communications and Networks / v.5 no.2 / pp.157-166 / 2003
  • This paper deals with the problem of channel identification for Single Input Multiple Output (SIMO) slow fading channels using clustering algorithms. Due to the intrinsic memory of the discrete-time model of the channel, over short observation periods, the received data vectors of the SIMO model are spread in clusters because of the AWGN. Each cluster is practically centered on one of the ideal, noise-free channel output labels, and the noisy received vectors are distributed according to a multivariate Gaussian distribution. Starting from the Markov SIMO channel model, simultaneous maximum likelihood estimation of the input vector and the channel coefficients reduces to finding the pair that minimizes the sum of the Euclidean norms between the received and the estimated output vectors. The Viterbi algorithm can be used for this purpose, provided the trellis diagram of the Markov model can be labeled with the noiseless channel outputs. The problem of identifying the ideal channel outputs, which is the focus of this paper, is then equivalent to designing a Vector Quantizer (VQ) from a training set corresponding to the observed noisy channel outputs. Linde-Buzo-Gray (LBG)-type clustering algorithms [1] could be used to obtain the noiseless channel output labels from the noisy received vectors. One problem with the use of such algorithms for blind time-varying channel identification is the codebook initialization. This paper looks at two critical issues with regard to the use of VQ for channel identification. The first concerns the applicability of this technique in general; we present theoretical results for the conditions under which the technique may be applicable. The second aims at overcoming the codebook initialization problem by proposing a novel approach that attempts to make the first phase of the channel estimation faster than the classical codebook initialization methods. Sample simulation results are provided confirming the effectiveness of the proposed initialization technique.
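
As a rough illustration of the LBG-type clustering this abstract refers to, the following Python sketch designs a four-vector codebook from noisy channel outputs. It uses the classical splitting initialization, not the initialization proposed in the paper. The two-branch channel taps `h1` and `h2`, the BPSK input, the noise level, and all function names are assumptions made for this example only.

```python
import random

def nearest(codebook, v):
    """Index of the code vector closest to v in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)))

def lloyd(train, codebook, iters=20):
    """Alternate nearest-neighbour partitioning and centroid updates."""
    for _ in range(iters):
        cells = [[] for _ in codebook]
        for v in train:
            cells[nearest(codebook, v)].append(v)
        # An empty cell keeps its previous code vector.
        codebook = [
            [sum(col) / len(cell) for col in zip(*cell)] if cell else old
            for cell, old in zip(cells, codebook)
        ]
    return codebook

def lbg(train, size, eps=0.01):
    """Grow a codebook by repeated splitting; `size` should be a power of two."""
    dim = len(train[0])
    # Start from the centroid of the whole training set.
    codebook = [[sum(v[d] for v in train) / len(train) for d in range(dim)]]
    while len(codebook) < size:
        # Split every code vector into two slightly perturbed copies.
        codebook = [[c + s for c in cv] for cv in codebook for s in (eps, -eps)]
        codebook = lloyd(train, codebook)
    return codebook

# Hypothetical SIMO channel: one BPSK input, two branches, memory of one symbol.
random.seed(1)
h1, h2 = (1.0, 0.5), (0.8, -0.3)
symbols = [random.choice((-1, 1)) for _ in range(801)]
train = [
    [h[0] * s + h[1] * p + random.gauss(0.0, 0.05) for h in (h1, h2)]
    for p, s in zip(symbols, symbols[1:])
]

codebook = lbg(train, 4)

# Ideal noiseless output labels, one per (current, previous) symbol pair.
labels = [[h[0] * s + h[1] * p for h in (h1, h2)]
          for s in (-1, 1) for p in (-1, 1)]
for cv in sorted(codebook):
    print([round(c, 2) for c in cv])
```

In this toy setting the code vectors should settle close to the four entries of `labels`. With a more symmetric constellation, splitting can stall in a poor local minimum, which is the kind of initialization sensitivity the abstract points to.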

A Design of a Robust Vector Quantizer for Wavelet Transformed Images (웨이브렛벤환 영상 부호화용 범용 벡터양자화기의 설계)

  • Do, Jae-Su;Cho, Young-Suk
    • Convergence Security Journal / v.6 no.4 / pp.83-90 / 2006
  • In this paper, we propose a new design method for a robust vector quantizer that is independent of the statistical characteristics of the input images in wavelet-transformed image coding. Conventional vector quantizers have failed to produce good coding results because the statistical properties of the image to be quantized differ from those of the training sequence used for the codebook. To solve this problem, we use a pseudo image as the training sequence for codebook generation; the pseudo image is created by adding correlation coefficient and edge components to uniformly distributed random numbers. Through computer simulation, we compare the conventional methods with the proposed one to clarify the problem of conventional vector quantizers, which use real images as the training sequence, and we show that the proposed vector quantizer yields better coding results.


HMM-based Speech Recognition using FSVQ and Fuzzy Concept (FSVQ와 퍼지 개념을 이용한 HMM에 기초를 둔 음성 인식)

  • 안태옥
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.6 / pp.90-97 / 2003
  • This paper proposes a speech recognition method based on HMM (Hidden Markov Model) using FSVQ (First Section Vector Quantization) and a fuzzy concept. In the proposed method, we generate a codebook for the first section and then obtain multi-observation sequences from that codebook in descending order of probability according to a fuzzy rule. These first-section observation sequences are used for training, and in recognition the word with the highest first-section probability under the same scheme is selected as the recognized word. Train station names are chosen as the target recognition vocabulary, and LPC cepstrum coefficients are used as the feature parameters. In addition to the speech recognition experiments with the proposed method, we run experiments with other methods under the same conditions and on the same data. The experimental results show that the proposed HMM-based method using FSVQ and the fuzzy concept achieves a higher recognition rate than the others.
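
The abstract does not spell out the fuzzy rule, so the sketch below is only one plausible reading of how multi-observation sequences might be derived from a VQ codebook. It assumes fuzzy c-means style memberships with fuzzifier `m` and keeps the `k` best codeword indices per frame, ranked from highest to lowest membership. The function names, the toy codebook, and the frames are hypothetical.

```python
def memberships(frame, codebook, m=2.0):
    """Fuzzy c-means style membership of one frame in every codeword."""
    d2 = [sum((a - b) ** 2 for a, b in zip(frame, cw)) for cw in codebook]
    if 0.0 in d2:
        # The frame coincides with a codeword: fall back to crisp membership.
        hits = d2.count(0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in d2]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [w / total for w in inv]

def multi_observation(frames, codebook, k=2):
    """Return k sequences; sequence r holds the r-th ranked codeword per frame."""
    seqs = [[] for _ in range(k)]
    for frame in frames:
        u = memberships(frame, codebook)
        ranked = sorted(range(len(codebook)), key=lambda i: u[i], reverse=True)
        for r in range(k):
            seqs[r].append(ranked[r])
    return seqs

# Toy two-dimensional codebook and feature frames (assumed values).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
frames = [[0.1, 0.0], [0.9, 0.2], [0.2, 0.8]]

seqs = multi_observation(frames, codebook, k=2)
print(seqs)
```

Each returned sequence could then be passed to a discrete HMM as a separate observation sequence, so that training and scoring see the second-best codewords as well as the best ones.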

VQ Codebook Design and Feature Extraction of Image Information for Multimedia Information Searching (멀티미디어 정보검색에 적합한 영상정보의 벡터 양자화 코드북 설계 및 특징추출)

  • Seo, Seok-Bae;Kim, Dae-Jin;Kang, Dae-Seong
    • Journal of the Korean Institute of Telematics and Electronics S / v.36S no.8 / pp.101-112 / 1999
  • In this paper, a VQ (vector quantization) codebook design method is proposed for extracting image feature data for multimedia information searching. Conventional VQ codebook design methods are unsuitable for extracting image feature data because they require too much computation time and memory for vector decoding and produce blocking effects like those of the DCT (discrete cosine transform). The proposed design method consists of feature extraction by WT (wavelet transform) and data group division by PCA (principal component analysis). WT is introduced to remove the blocking effect in images with a high compression ratio. Computer simulations show that the proposed method is faster than the VQ design method using SOM (self-organizing map).


Transcoding Algorithm for SMV and AMR Speech Coder (SMV와 AMR 음성부호화기를 위한 상호부호화 알고리즘)

  • Lee, Duck-Jong;Jeong, Gyu-Hyeok;Lee, In-Sung
    • The Journal of the Acoustical Society of Korea / v.27 no.8 / pp.427-434 / 2008
  • In this paper, a transcoding algorithm for the SMV and AMR speech coders is proposed. In applications that require interoperability between different networks, the two speech coders must work together in a cascaded connection, or tandem. Tandem operation is one of the simplest methods, but it has several problems, such as long delay, high complexity, and quality degradation caused by running the complete encoding/decoding process twice. These problems can be solved by using a transcoding algorithm. The proposed algorithm consists of LSP (Line Spectral Pair) conversion, pitch delay conversion, and a fast fixed codebook search. The evaluation results show that the proposed algorithm achieves speech quality equivalent to that of tandem operation with reduced computational complexity and delay.

A Comparative Study of Speaker Adaptation Methods for HMM-Based Speech Recognition (HMM 음성인식 시스템을 위한 화자적응 방법들의 성능비교)

  • Koo, Myoung-Wan;Un, Chong-Kwan;Lee, Hwang-Soo
    • The Journal of the Acoustical Society of Korea / v.10 no.3 / pp.37-43 / 1991
  • In this paper, we compare the performance of speaker adaptation methods, which consist of two stages of processing, for an HMM-based speech recognition system. We compare three kinds of VQ adaptation methods that may be used in the first stage to reduce the distortion error for a new speaker: label prototype adaptation, adaptation with a codebook from the adaptation speech itself, and adaptation with a mapped codebook. We then compare the performance of four kinds of HMM parameter adaptation methods that may be used in the second stage to transform HMM parameters for a new speaker: adaptation by the Viterbi algorithm, by the DTW algorithm, by the iterative alignment algorithm, and by the fuzzy histogram algorithm. The results show that adaptation based on the fuzzy histogram algorithm yields the highest accuracy in an HMM-based speech recognition system.


A Study on the Advanced Vector Quantization Algorithm for Edge Preserving (윤관보존을 위한 개선된 벡터 양자화 알고리즘에 관한 연구)

  • 김백기;이대영
    • Journal of the Korean Institute of Telematics and Electronics B / v.31B no.12 / pp.72-80 / 1994
  • In this paper, we present a digital image data compression method using edge-preserving vector quantization. A new vector quantization algorithm is proposed that uses a new sampling method and edge region extraction. Codebook generation is faster than with existing algorithms, and the quality of the decompressed images is much improved. Experimental results suggest that the resulting compression ratio and PSNR are better than those of the BPVQ and HMVQ methods.


Color Image Vector Quantization Using Enhanced SOM Algorithm

  • Kim, Kwang-Baek
    • Journal of Korea Multimedia Society / v.7 no.12 / pp.1737-1744 / 2004
  • Among the compression methods widely used today, image compression by VQ is the most popular and achieves a high data compression ratio. Almost all VQ methods use the LBG algorithm, which reads the entire image several times and moves the code vectors toward their optimal positions at each step. This complexity means the algorithm takes a considerable amount of time to execute. To overcome this time-consuming constraint, we propose an enhanced self-organizing neural network for color images. In this study, we improved the competitive learning method by employing three methods for codebook generation. The results demonstrate that the proposed method achieves a better compression ratio than the conventional SOM.
