Search | Korea Science

Speech Recognition Using HMM Based on Fuzzy (피지에 기초를 둔 HMM을 이용한 음성 인식)

안태옥;김순협
- Journal of the Korean Institute of Telematics and Electronics B / v.28B no.12 / pp.68-74 / 1991
This paper proposes a fuzzy-based HMM for speaker-independent speech recognition. In this method, multi-observation sequences are obtained by assigning probabilities, through a fuzzy rule, to VQ codebook entries in order of increasing distance from the input. An HMM is then trained on these multi-observation sequences, and at recognition time the word with the highest probability is selected as the recognized word. The recognition vocabulary consists of 146 DDD area names, and the feature parameters are 10th-order LPC cepstrum coefficients. For comparison, we also run experiments with DP, MSVQ, and a conventional HMM under the same conditions and data. The experimental results show that the proposed fuzzy HMM is superior to the DP method, MSVQ, and the conventional HMM in both recognition rate and computation time.
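The abstract does not state the fuzzy rule itself, so the following is only a minimal sketch of the multi-observation idea. It assumes a fuzzy c-means style membership over the k nearest codewords; the function name `fuzzy_observations` and the parameters `k` and `m` are illustrative and do not come from the paper.

```python
import numpy as np

def fuzzy_observations(frame, codebook, k=3, m=2.0):
    """Return the k nearest codeword indices and fuzzy weights that sum to 1.

    Assumption: weights follow the fuzzy c-means membership formula with
    fuzzifier m; the paper's actual fuzzy rule may differ.
    """
    dists = np.linalg.norm(codebook - frame, axis=1)
    nearest = np.argsort(dists)[:k]            # codewords in order of short distance
    d = np.maximum(dists[nearest], 1e-12)      # avoid division by zero on exact hits
    inv = d ** (-2.0 / (m - 1.0))
    weights = inv / inv.sum()                  # closer codewords get larger weights
    return nearest, weights
```

Each frame then contributes k weighted symbols instead of a single VQ index, which is the sense in which the observation sequence becomes a multi-observation sequence.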

A Reduction Algorithm of Computational Amount using Adjustment the Not Uniform Interval and Distribution Characteristic of LSP (불균등 간격조절과 선형 스펙트럼 쌍 분포특성을 이용한 계산량 단축 알고리즘)

Ju, Sang-Gyu
- Proceedings of the KAIS Fall Conference / 2010.05a / pp.261-264 / 2010
A fast algorithm is proposed that uses the mel scale and the distribution characteristics of LSP parameters to reduce the computational amount, where the computational amount means the number of calculations needed to transform LPC coefficients into LSP parameters. Among conventional methods, the real root method is considerably simpler than the others, but it still suffers from nondeterministic computation time because the root search proceeds sequentially over the frequency region. In this paper, the search interval is set non-uniformly according to the mel scale, and the search order is arranged according to the distribution of LSP parameters, most of which occur in specific frequency regions. In the experiments, the proposed algorithm reduces the computational amount by about 48.95% on average, while the transformed LSP parameters are the same as those of the real root method.
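As a rough illustration of a real-root style search on a mel-spaced grid, the sketch below looks for sign changes of the symmetric and antisymmetric LSP polynomials on the unit circle and refines each bracket by bisection. The grid size, the 8 kHz sampling rate, the common mel formula, and the function names are assumptions, not details from the paper.

```python
import numpy as np

def mel_spaced_grid(n_points, fs=8000.0):
    """Interior frequencies (rad/sample), uniform in mel and therefore denser at low Hz."""
    mel_max = 2595.0 * np.log10(1.0 + (fs / 2.0) / 700.0)
    mels = np.linspace(0.0, mel_max, n_points + 2)[1:-1]   # drop 0 and Nyquist
    hz = 700.0 * (10.0 ** (mels / 2595.0) - 1.0)
    return 2.0 * np.pi * hz / fs

def lpc_to_lsp(a, grid, n_bisect=30):
    """a = [1, a1, ..., ap] of A(z); returns sorted LSP frequencies in (0, pi)."""
    ext = np.concatenate([np.asarray(a, dtype=float), [0.0]])
    n = len(ext) - 1                      # p + 1
    k = np.arange(n + 1)
    p_sym = ext + ext[::-1]               # P(z) = A(z) + z^-(p+1) A(1/z)
    q_anti = ext - ext[::-1]              # Q(z) = A(z) - z^-(p+1) A(1/z)
    # Real-valued functions whose zeros are the unit-circle roots of P and Q.
    f_p = lambda w: np.dot(p_sym, np.cos(w * (n / 2.0 - k)))
    f_q = lambda w: np.dot(q_anti, np.sin(w * (n / 2.0 - k)))
    roots = []
    for f in (f_p, f_q):
        vals = np.array([f(w) for w in grid])
        for i in np.nonzero(vals[:-1] * vals[1:] < 0)[0]:   # sign change = bracket
            lo, hi, f_lo = grid[i], grid[i + 1], vals[i]
            for _ in range(n_bisect):
                mid = 0.5 * (lo + hi)
                f_mid = f(mid)
                if f_lo * f_mid <= 0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
    return np.sort(np.array(roots))
```

A grid that is too coarse can miss two roots of the same polynomial that fall inside one interval, so the number of grid points trades accuracy against the number of function evaluations.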

Design of the Vector-Scalar Quantizer of LSP Parameters for Wideband Speech Coder (광대역 음성부호화기를 위한 백터-스칼라 LSP 파라미터 양자화기 설계)

신재현;이인성;지덕구;윤병식;최송인
- Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.4 / pp.286-291 / 2003
In this paper, we design an LSP (Line Spectral Pairs) parameter quantizer with a cascaded vector quantizer and scalar quantizer structure for a wideband speech coder. We use 16th-order LP coefficients, which are transformed into LSP parameters because these have excellent quantization properties and allow easy stability checking of the synthesis filter. In the first quantization stage, the input LSP parameters are split-vector-quantized using two 8th-order codebooks. In the second stage, the components of the residual vector are individually quantized by a scalar quantizer that exploits the ordering property of LSP parameters. The designed adaptive VQ-SQ quantizer, using 35 bits/frame, achieves wideband transparency, defined as an average spectral distortion (SD) below 1.6 dB with fewer than 4% of frames having an SD above 3 dB. The simulation results show that the designed quantizer saves 2-3 bits/frame over a typical vector-scalar quantizer.
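The two-stage structure can be sketched as follows. This is a simplified illustration only: the codebooks are supplied by the caller, and the second stage is a plain uniform scalar quantizer with a hypothetical step size, whereas the paper's scalar quantizer additionally exploits the LSP ordering property and is adaptive.

```python
import numpy as np

def vq_sq_quantize(lsp, cb_low, cb_high, step=0.005):
    """Split VQ on the two halves of lsp, then scalar-quantize the residual.

    Assumptions: cb_low and cb_high are (entries, half) arrays; the uniform
    step is illustrative and not taken from the paper.
    """
    half = len(lsp) // 2
    # Stage 1: nearest codeword for each half (split vector quantization).
    lo_idx = int(np.argmin(np.sum((cb_low - lsp[:half]) ** 2, axis=1)))
    hi_idx = int(np.argmin(np.sum((cb_high - lsp[half:]) ** 2, axis=1)))
    stage1 = np.concatenate([cb_low[lo_idx], cb_high[hi_idx]])
    # Stage 2: each residual component is quantized individually.
    sq_idx = np.round((lsp - stage1) / step).astype(int)
    recon = stage1 + sq_idx * step
    return (lo_idx, hi_idx, sq_idx), recon
```

With a 16-dimensional LSP vector, `half` is 8, matching the two 8th-order codebooks described in the abstract.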

Analysis of the Time Delayed Effect for Speech Feature (음성 특징에 대한 시간 지연 효과 분석)

Ahn, Young-Mok
- The Journal of the Acoustical Society of Korea / v.16 no.1 / pp.100-103 / 1997
In this paper, we analyze the time-delayed effect of speech features. Here, the time-delayed effect means that the current speech feature vector is influenced by the previous feature vectors. We use a set of LPC-derived cepstral coefficients and evaluate the time-delayed effect of the cepstrum through the performance of a speech recognition system. For the experiments, we used a speech database of 22 words uttered by 50 male speakers. The utterances of 25 male speakers were used for training, and the remaining set was used for testing. The experimental results show that the time-delayed effect is large in the lower orders of the feature vector but small in the higher orders.

An Effective Feature Extraction Method for Fault Diagnosis of Induction Motors (유도전동기의 고장 진단을 위한 효과적인 특징 추출 방법)

Nguyen, Hung N.;Kim, Jong-Myon
- Journal of the Korea Society of Computer and Information / v.18 no.7 / pp.23-35 / 2013
This paper proposes an effective technique for automatically extracting feature vectors from vibration signals for fault classification systems. Conventional mel-frequency cepstral coefficients (MFCCs) are sensitive to noise in vibration signals, which degrades classification accuracy. To solve this problem, this paper proposes spectral envelope cepstral coefficients (SECC) analysis, which builds a filter bank from the spectral envelopes of vibration signals in four steps: (1) a linear predictive coding (LPC) algorithm is used to obtain the spectral envelopes of all faulty vibration signals, (2) all envelopes are averaged to get a general spectral shape, (3) a gradient descent method is used to find the extrema of the average envelope and their frequencies, and (4) non-overlapping filters are placed with centers calculated from the distances between valley frequencies of the envelope. This filter bank is then used in the cepstral coefficient computation to extract feature vectors. Finally, a multi-layer support vector machine (MLSVM) with various sigma values uses these features to identify the fault types of induction motors. Experimental results indicate that the proposed extraction method outperforms other feature extraction algorithms, yielding a classification accuracy of more than 99.65%.
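Steps (1) to (3) can be sketched roughly as below. The sketch assumes autocorrelation-method LPC with a Hamming window, and it replaces the gradient descent search of step (3) with a plain local-minimum scan; the LPC order, FFT size, and function names are illustrative and not taken from the paper.

```python
import numpy as np

def lpc_envelope(x, order=12, n_fft=512):
    """LPC spectral envelope of one signal (autocorrelation method)."""
    x = x * np.hamming(len(x))
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]  # lags 0..order
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)              # small regularization
    a = np.linalg.solve(R, -r[1 : order + 1])     # A(z) = 1 + sum a_k z^-k
    err = r[0] + np.dot(a, r[1 : order + 1])      # prediction error energy
    A = np.fft.rfft(np.concatenate([[1.0], a]), n_fft)
    return np.sqrt(max(err, 1e-12)) / np.abs(A)

def average_envelope(signals, **kw):
    """Step (2): average the envelopes of all signals."""
    return np.mean([lpc_envelope(s, **kw) for s in signals], axis=0)

def valley_bins(env):
    """Stand-in for step (3): FFT bins that are strict local minima of the envelope."""
    inner = env[1:-1]
    return np.nonzero((inner < env[:-2]) & (inner < env[2:]))[0] + 1
```

The valley bins returned by `valley_bins` would then determine the filter centers of step (4), which is not shown here because the abstract does not give the exact rule.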
https://doi.org/10.9708/jksci.2013.18.7.023

Design of a 4kb/s ACELP Codec Using the Generalized AbS Principle (Generalized AbS 구조를 이용한 4kb/s ACELP 음성 부호화기의 설계)

성호상;강상원
- The Journal of the Acoustical Society of Korea / v.18 no.7 / pp.33-38 / 1999
In this paper, we combine a generalized analysis-by-synthesis (AbS) structure with an algebraic excitation scheme to propose a new 4 kb/s speech codec. The codec partly uses the structure of G.729. We design a line spectrum pair (LSP) quantizer, an adaptive codebook, and an excitation codebook to fit the 4 kb/s bit rate. The codec has a 25 ms algorithmic delay, corresponding to a 20 ms frame size and a 5 ms lookahead. At bit rates below 4 kb/s, most CELP speech codecs based on the AbS principle suffer a rapid degradation in speech quality. To overcome this drawback, we use the generalized AbS structure, which is efficient for low-bit-rate speech codecs. LP coefficients are converted to LSP parameters and quantized using a predictive 2-stage VQ. A low-complexity algebraic codebook based on a shifting method is used for the fixed codebook excitation, and the gains of the adaptive and fixed codebooks are quantized using VQ. To evaluate the performance of the proposed codec, A-B preference tests are conducted against the fixed-rate 8 kb/s QCELP. The test results show that the performance of the codec is similar to that of the fixed-rate 8 kb/s QCELP.
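The frame figures in the abstract imply a fixed bit budget per frame, which the short calculation below makes explicit. The helper name `frame_budget` is illustrative; the inputs are the 4 kb/s rate, 20 ms frame, and 5 ms lookahead stated above.

```python
def frame_budget(bit_rate_bps, frame_ms, lookahead_ms):
    """Return (bits available per frame, algorithmic delay in ms)."""
    bits_per_frame = bit_rate_bps * frame_ms / 1000
    algorithmic_delay_ms = frame_ms + lookahead_ms
    return bits_per_frame, algorithmic_delay_ms

bits, delay = frame_budget(4000, 20, 5)
print(bits, delay)  # 80.0 25
```

All quantizers named in the abstract (LSP, adaptive codebook, fixed codebook, gains) therefore have to share 80 bits per 20 ms frame.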

A Study on A Multi-Pulse Linear Predictive Filtering And Likelihood Ratio Test with Adaptive Threshold (멀티 펄스에 의한 선형 예측 필터링과 적응 임계값을 갖는 LRT의 연구)

Lee, Ki-Yong;Lee, Joo-Hun;Song, Iick-Ho;Ann, Sou-Guil
- The Journal of the Acoustical Society of Korea / v.10 no.1 / pp.20-29 / 1991
A fundamental assumption in the conventional linear predictive coding (LPC) analysis procedure is that the input to an all-pole vocal tract filter is a white process. In the case of periodic inputs, however, a pitch bias error is introduced into the conventional LP coefficients. Multi-pulse (MP) LP analysis can reduce this bias, provided that an estimate of the excitation is available. Since the prediction error of conventional LP analysis can be modeled as the sum of an MP excitation sequence and a random noise sequence, we can view extracting MP sequences from the prediction error as a classical detection and estimation problem. In this paper, we propose an algorithm in which the locations and amplitudes of the MP sequences are first obtained by applying a likelihood ratio test (LRT) to the prediction error, and LP coefficients free of pitch bias are then obtained from the MP sequences. To verify the performance enhancement, we iterate the above procedure with an adaptive threshold at each step.
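The detection step can be illustrated with a heavily simplified sketch. It assumes the noise is Gaussian, in which case a generalized likelihood ratio test on a single sample with unknown pulse amplitude reduces to comparing the sample magnitude with a threshold, and it adapts the threshold by re-estimating the noise level from samples not flagged as pulses. The factor `k`, the iteration count, and the function name are assumptions; the paper's actual LRT and threshold rule are not given in the abstract.

```python
import numpy as np

def detect_pulses(residual, k=4.0, n_iter=5):
    """Return (pulse locations, pulse amplitudes) from an LP prediction error.

    Assumption: Gaussian noise, so the per-sample test is |e[n]| > k * sigma.
    """
    mask = np.zeros(len(residual), dtype=bool)
    for _ in range(n_iter):
        sigma = np.std(residual[~mask])        # noise level from non-pulse samples
        mask = np.abs(residual) > k * sigma    # adaptive threshold for this pass
    locations = np.nonzero(mask)[0]
    return locations, residual[mask]
```

The first pass uses a noise estimate inflated by the pulses themselves, so the threshold starts high and tightens on later passes as detected pulses are excluded.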

Double Talk Detection before the Convergence of Echo Canceller (반향제거기의 수렴전 동시통화검출)

Yoo, Jae-Ha;Kim, Soo-Chan;Kim, Dong-Yon
- The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.5 / pp.203-208 / 2013
In this paper, we propose a method for improving the performance of a double talk detector that can operate before the echo canceller converges. The microphone input signal is filtered by a linear prediction filter, and the filtered signal is used for detection. The coefficients of the linear prediction filter are obtained from the far-end talker signal. During single talk, the filtered signal has low power because the characteristics of the echo signal are similar to those of the far-end talker signal. During double talk, however, the filtered signal does not have low power because the microphone signal includes a signal with different characteristics. Double talk is detected from this difference. Simulations using real speech signals verify that the proposed method outperforms the conventional methods.
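The detection principle can be sketched as follows. The sketch assumes autocorrelation-method LPC on a block of the far-end signal and a decision based on the ratio of filtered to unfiltered microphone power; the LPC order, the threshold value, and the function names are hypothetical and not taken from the paper.

```python
import numpy as np

def lpc(x, order):
    """Coefficients a_1..a_p of A(z) = 1 + sum a_k z^-k (autocorrelation method)."""
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]  # lags 0..order
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, -r[1:])

def residual_power_ratio(mic, far_end, order=10):
    """Power of the mic signal after the far-end LP filter, relative to its input power."""
    a = lpc(far_end, order)
    filtered = np.convolve(mic, np.concatenate([[1.0], a]))[: len(mic)]
    return np.mean(filtered ** 2) / np.mean(mic ** 2)

def is_double_talk(mic, far_end, threshold=0.5, order=10):
    """Assumed decision rule: a high ratio means the mic holds more than far-end echo."""
    return residual_power_ratio(mic, far_end, order) > threshold
```

Because the filter is derived from the far-end signal alone, the test does not depend on the echo canceller's adaptive filter having converged.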
https://doi.org/10.7236/JIIBC.2013.13.5.203

The Reduction Algorithm of Complexity using Adjustment of Resolution and Search Sequence for Vocoder (해상도 조절과 검색순서 조절을 통한 음성부호화기용 복잡도 감소 알고리즘)

Min, So-Yeon;Lee, Kwang-Hyoung;Bae, Myung-Jin
- Journal of the Korea Academia-Industrial cooperation Society / v.8 no.5 / pp.1122-1127 / 2007
We propose a complexity reduction algorithm for the real root method, which is widely used in vocoders. The real root method finds the real roots of the polynomial equations, when they exist, and transforms them into LSP (Line Spectrum Pairs) parameters. However, this method takes a long time to compute because the root search proceeds sequentially over the frequency region. An important characteristic of LSP parameters is that most coefficients occur in specific frequency regions. In this paper, therefore, the frequency regions to be searched are ordered and adjusted according to the distribution of each coefficient. The proposed algorithm reduces the transformation time compared with sequential searching over the frequency region. Compared with the conventional real root method, the experiments show that the search time is reduced by about 48% on average.
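The benefit of reordering the search can be illustrated with a toy sketch that counts how many intervals are checked before a root is bracketed. The ordering rule used here (intervals sorted by distance from an expected frequency) and the names `find_bracket` and `prior_order` are assumptions for illustration; the paper derives its ordering from measured LSP distributions.

```python
import numpy as np

def find_bracket(f, edges, order):
    """Check intervals in the given order; return (interval index, checks used)."""
    for n_checks, i in enumerate(order, start=1):
        if f(edges[i]) * f(edges[i + 1]) < 0:   # sign change brackets a root
            return int(i), n_checks
    return None, len(order)

def prior_order(edges, expected):
    """Interval indices sorted by how close their centers are to an expected frequency."""
    centers = 0.5 * (edges[:-1] + edges[1:])
    return np.argsort(np.abs(centers - expected))

# Toy function with a single root at w = 2.0 rad on a 32-interval grid.
f = lambda w: np.cos(w) - np.cos(2.0)
edges = np.linspace(0.0, np.pi, 33)
seq_idx, seq_checks = find_bracket(f, edges, range(32))
ord_idx, ord_checks = find_bracket(f, edges, prior_order(edges, expected=1.9))
```

Both searches bracket the same interval, but the ordered search needs fewer checks whenever the root lies near where the prior expects it.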
