• Title/Summary/Keyword: Cepstrum


Defect Detection of Rolling Bearings Using Higher-Order Moment Cepstrum (고차 모멘트 Cepstrum을 이용한 구름 베어링의 결함검출)

  • Kim, Young-Tae;Choi, Man-Yong;Kim, Ki-Bok;Park, Hae-Won;Park, Jung-Hak;Yoo, Jun
    • Proceedings of the Korean Society of Precision Engineering Conference / 2004.05a / pp.191-191 / 2004
  • Bearings are among the most common components in rotating machinery; if an incipient bearing defect or degradation is not detected in advance, the resulting failure or breakage of the machine can cause enormous losses. The most common approach to detecting incipient bearing defects is to detect characteristic patterns in the bearing vibration signal. (abridged)
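
As a rough illustration of the idea in the abstract above, the sketch below computes a plain power cepstrum of a simulated defective-bearing vibration signal; it does not reproduce the paper's higher-order-moment cepstrum, and the sampling rate, fault frequency, and signal model are made-up values.

    # Plain power cepstrum of a simulated bearing vibration signal (illustration
    # only; not the paper's higher-order-moment variant). Periodic fault impacts
    # show up as a cepstral peak at the quefrency equal to the impact period.
    import numpy as np

    fs = 20_000                                   # assumed sampling rate [Hz]
    t = np.arange(0, 1.0, 1 / fs)
    fault_freq = 87.0                             # hypothetical ball-pass frequency [Hz]
    impacts = (np.sin(2 * np.pi * fault_freq * t) > 0.999).astype(float)
    ringing = np.exp(-2_000 * t[:200]) * np.sin(2 * np.pi * 3_000 * t[:200])
    signal = np.convolve(impacts, ringing, mode="same") + 0.05 * np.random.randn(len(t))

    cep = np.fft.irfft(np.log(np.abs(np.fft.rfft(signal)) ** 2 + 1e-12))
    quefrency = np.arange(len(cep)) / fs

    # The dominant peak away from the origin should sit near 1/fault_freq (~0.0115 s).
    idx = 100 + np.argmax(cep[100:len(cep) // 2])
    print(f"dominant quefrency ~ {quefrency[idx]:.4f} s, expected ~ {1 / fault_freq:.4f} s")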


Analysis of Impedance of Multilayer Structure using Cepstrum Technique (켑스트럼 기법을 이용한 다층구조물의 임피던스 해석)

  • Shin, Jin-Seob;Jun, Kye-Suk
    • The Journal of the Acoustical Society of Korea / v.16 no.4 / pp.85-89 / 1997
  • In this paper, the impedance of each layer has been analyzed by applying triple-cepstrum signal processing to the ultrasonic signal reflected from a multilayer structure. The reflection coefficient can be obtained from the amplitude and the polarity of the peaks in the triple cepstrum, and the impedance of each layer is then reconstructed from the reflection coefficient. In this experiment, four types of multilayer specimens consisting of different metal layers were manufactured. The reflected signals from the multilayer structures were detected by the pulse-echo method, and the impedances were reconstructed by the triple-cepstrum technique. The experimental results were in good agreement with the theoretical results.
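
For context, the reconstruction step mentioned in the abstract rests on the standard interface relation R = (Z2 - Z1) / (Z2 + Z1). The sketch below shows only that step, assuming the reflection coefficients have already been read off the triple-cepstrum peaks; the numerical values are hypothetical.

    # Layer-by-layer impedance reconstruction from interface reflection coefficients
    # (which, in the paper, come from the amplitude and polarity of triple-cepstrum
    # peaks). Uses the standard relation R = (Z2 - Z1) / (Z2 + Z1).
    def reconstruct_impedances(z0, reflection_coeffs):
        """Return the impedance of each layer, starting from the incident medium z0."""
        impedances = [z0]
        for r in reflection_coeffs:
            impedances.append(impedances[-1] * (1 + r) / (1 - r))
        return impedances

    # Hypothetical example: water (1.48 MRayl) followed by two metal-like layers.
    print(reconstruct_impedances(1.48e6, [0.84, 0.35]))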


Selecting Good Speech Features for Recognition

  • Lee, Young-Jik;Hwang, Kyu-Woong
    • ETRI Journal / v.18 no.1 / pp.29-41 / 1996
  • This paper describes a method to select a suitable feature for speech recognition using an information-theoretic measure. Conventional speech recognition systems heuristically choose a portion of the frequency components, cepstrum, mel-cepstrum, energy, and their time differences of speech waveforms as their speech features. However, these systems cannot perform well if the selected features are not suitable for speech recognition. Since the recognition rate is the only performance measure of a speech recognition system, it is hard to judge how suitable a selected feature is. To solve this problem, it is essential to analyze the feature itself and to measure how good the feature itself is. Good speech features should contain all of the class-related information and as little class-irrelevant variation as possible. In this paper, we suggest a method to measure the class-related information and the amount of class-irrelevant variation based on Shannon's information theory. Using this method, we compare the mel-scaled FFT, cepstrum, mel-cepstrum, and wavelet features of the TIMIT speech data. The result shows that, among these features, the mel-scaled FFT is the best feature for speech recognition according to the proposed measure.
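
As a rough stand-in for the idea of scoring a feature by its class-related information (not the paper's exact measure), the sketch below estimates the mutual information between a one-dimensional feature and class labels with a simple histogram; the bin count and the toy data are arbitrary choices.

    # Histogram estimate of I(feature; class) in bits for a scalar feature.
    # A higher value means the feature carries more class-related information.
    import numpy as np

    def mutual_information(feature, labels, n_bins=32):
        bins = np.histogram_bin_edges(feature, bins=n_bins)
        f_idx = np.digitize(feature, bins[1:-1])            # discretize the feature
        classes = np.unique(labels)
        joint = np.zeros((n_bins, len(classes)))
        for j, c in enumerate(classes):
            joint[:, j] = np.bincount(f_idx[labels == c], minlength=n_bins)
        joint /= joint.sum()                                 # joint distribution p(f, c)
        pf = joint.sum(axis=1, keepdims=True)                # marginal p(f)
        pc = joint.sum(axis=0, keepdims=True)                # marginal p(c)
        nz = joint > 0
        return float(np.sum(joint[nz] * np.log2(joint[nz] / (pf @ pc)[nz])))

    # Toy data: a feature that separates two classes fairly well.
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 2, 5000)
    feature = labels + 0.5 * rng.standard_normal(5000)
    print(f"I(feature; class) = {mutual_information(feature, labels):.3f} bits")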


Vowel Recognition Using the Fractal Dimension (프랙탈 차원을 이용한 모음인식)

  • 최철영
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.364-367 / 1994
  • In this paper, we carried out experiments on Korean vowel recognition using the fractal dimension of the speech signal. We chose the Minkowski-Bouligand dimension as the fractal dimension and computed it using the morphological covering method. For our experiments, we used both the fractal dimension and the LPC cepstrum, which is conventionally known to be one of the best parameters for speech recognition, and examined the usefulness of the fractal dimension. From vowel recognition experiments under various consonant contexts, we achieved vowel recognition error rates of 5.6% with the LPC cepstrum alone and 3.2% with both the LPC cepstrum and the fractal dimension. The results indicate that incorporating the fractal dimension with the LPC cepstrum gives more than a 40% reduction in recognition errors, and that the fractal dimension is a useful feature parameter for speech recognition.
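
As a rough companion to the abstract, the sketch below estimates the Minkowski-Bouligand dimension of a 1-D signal with a flat-structuring-element morphological covering (moving max minus moving min); the scale set and the test signals are arbitrary, not those of the paper.

    # Morphological covering estimate of the Minkowski-Bouligand dimension of a
    # 1-D signal. The covered area scales as area(eps) ~ eps^(2 - D), so the
    # dimension is 2 minus the slope of log(area) versus log(scale).
    import numpy as np

    def minkowski_bouligand_dimension(x, scales=(1, 2, 4, 8, 16)):
        areas = []
        for s in scales:
            win = 2 * s + 1                       # flat structuring element of scale s
            windows = np.lib.stride_tricks.sliding_window_view(x, win)
            areas.append(np.sum(windows.max(axis=1) - windows.min(axis=1)))
        slope, _ = np.polyfit(np.log(scales), np.log(areas), 1)
        return 2.0 - slope

    # Toy check: a smooth sine should come out near 1, white noise closer to 2.
    rng = np.random.default_rng(0)
    print(minkowski_bouligand_dimension(np.sin(np.linspace(0, 4 * np.pi, 4000))))
    print(minkowski_bouligand_dimension(rng.standard_normal(4000)))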


Application of the Cepstrum Signal Processing Technique for the Noise Reflection Path Analysis in Community Noise (소음전달경로 분석 : 켑스트럼(Cepstrum) 적용방안에 관한 연구)

  • Hong, Yun-H.;Kim, Jeung-T.
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.19 no.5 / pp.447-453 / 2009
  • Community noise has become a matter of great public concern; traffic noise from roads and railways causes serious damage to quiet living environments. In this paper, a noise signal measured on a street is used to extract the noise source and the propagation path by means of the complex cepstrum. An example shows that the waveforms of the source and the path can be separated if a temporal window is properly applied.
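
As a simplified illustration, the sketch below only locates the delay of a single reflection using the real power cepstrum; the paper's full source/path separation uses the complex cepstrum with a temporal window, which additionally requires careful phase unwrapping and is not reproduced here. The source burst, delay, and attenuation are made up.

    # Locating a reflection delay in the power cepstrum of a signal containing the
    # direct sound plus one attenuated, delayed copy (simplified stand-in for the
    # paper's complex-cepstrum source/path separation).
    import numpy as np

    fs = 8_000                                     # assumed sampling rate [Hz]
    n = 4_096
    t = np.arange(200) / fs
    source = np.exp(-2_000 * t) * np.sin(2 * np.pi * 1_000 * t)   # toy source "click"
    delay, atten = 1_000, 0.6                      # hypothetical reflection path

    x = np.zeros(n)
    x[:200] += source
    x[delay:delay + 200] += atten * source         # direct sound + one reflection

    cep = np.fft.irfft(np.log(np.abs(np.fft.rfft(x)) ** 2 + 1e-12))
    search_from = 100                              # skip the low-quefrency envelope part
    est = search_from + int(np.argmax(cep[search_from:n // 2]))
    print(f"estimated reflection delay: {est} samples (true: {delay})")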

Speech/Music Discrimination Using Mel-Cepstrum Modulation Energy (멜 켑스트럼 모듈레이션 에너지를 이용한 음성/음악 판별)

  • Kim, Bong-Wan;Choi, Dea-Lim;Lee, Yong-Ju
    • MALSORI / no.64 / pp.89-103 / 2007
  • In this paper, we introduce mel-cepstrum modulation energy (MCME) as a feature for discriminating between speech and music data. MCME is a mel-cepstrum-domain extension of modulation energy (ME): MCME is extracted from the time trajectories of the mel-frequency cepstral coefficients, while ME is based on the spectrum. As cepstral coefficients are mutually uncorrelated, we expect MCME to perform better than ME. To find the best modulation frequency for MCME, we perform experiments with modulation frequencies from 4 Hz to 20 Hz. To show the effectiveness of the proposed MCME feature, we compare its discrimination accuracy with the results obtained from ME and from the cepstral flux.
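
A minimal sketch of the modulation-energy step only, assuming a matrix of MFCCs at a known frame rate has already been computed; the frame rate, modulation frequency, and bandwidth below are assumptions (the paper sweeps roughly 4-20 Hz), not the paper's exact settings.

    # Mel-cepstrum modulation energy, computed per cepstral coefficient from the
    # time trajectories of already-extracted MFCCs (shape: n_coeff x n_frames).
    import numpy as np

    def mel_cepstrum_modulation_energy(mfcc, frame_rate=100.0, mod_freq=4.0, bandwidth=1.0):
        """Energy of each MFCC trajectory in a band around mod_freq (Hz)."""
        traj = mfcc - mfcc.mean(axis=1, keepdims=True)       # remove the DC of each trajectory
        spec = np.abs(np.fft.rfft(traj, axis=1)) ** 2         # modulation spectrum per coefficient
        freqs = np.fft.rfftfreq(mfcc.shape[1], d=1.0 / frame_rate)
        band = (freqs >= mod_freq - bandwidth) & (freqs <= mod_freq + bandwidth)
        return spec[:, band].sum(axis=1)                       # one energy value per coefficient

    # Toy call with random "MFCCs" just to show the shapes involved.
    print(mel_cepstrum_modulation_energy(np.random.randn(13, 300)).shape)   # -> (13,)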


On a Pitch Alteration Technique by Cepstrum Analysis of Flattened Excitation Spectrum (평탄화된 여기 스펙트럼에서 켑스트럼 피치 변경법에 관한 연구)

  • 조왕래
    • Proceedings of the Acoustical Society of Korea Conference / 1998.06c / pp.159-162 / 1998
  • Speech synthesis coding is classified into three categories: waveform coding, source coding, and hybrid coding. To obtain high-quality synthetic speech, synthesis by waveform coding is desirable. However, it is difficult to apply waveform coding to synthesis by syllable or phoneme units, because it does not separate the speech into excitation and formant components. Thus, the excitation must be altered before waveform coding can be applied to synthesis by rule. In this paper, we propose a new pitch alteration method that minimizes spectrum distortion by exploiting the behavior of the cepstrum. This method splits the spectrum of the speech signal into an excitation spectrum and a formant spectrum, and transforms the excitation spectrum into the cepstrum domain. The pitch of the excitation cepstrum is altered by zero insertion or zero deletion, and the pitch-altered spectrum is reconstructed in the spectrum domain. In performance tests, the average spectrum distortion was below 2.29%.
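
The sketch below illustrates only the first step described in the abstract: splitting a frame's log spectrum into a formant (envelope) part and an excitation part by liftering the real cepstrum. The zero-insertion/deletion pitch alteration itself is not reproduced, and the cutoff, pitch, and test frame are made-up values.

    # Cepstral split of one voiced frame into a formant (envelope) log spectrum
    # and an excitation (fine-structure) log spectrum via low-quefrency liftering.
    import numpy as np

    fs, f0, n = 8_000, 125, 1024                 # assumed sampling rate, pitch, frame length
    t = np.arange(n) / fs
    frame = np.zeros(n)
    frame[::fs // f0] = 1.0                      # crude glottal pulse train (period 64 samples)
    formant_ir = np.exp(-500 * t[:100]) * np.cos(2 * np.pi * 800 * t[:100])
    frame = np.convolve(frame, formant_ir, mode="same")

    log_mag = np.log(np.abs(np.fft.rfft(frame * np.hanning(n))) + 1e-12)
    cep = np.fft.irfft(log_mag)

    cutoff = 30                                  # quefrency cutoff, well below the pitch period
    lifter = np.zeros(n)
    lifter[:cutoff] = 1.0
    lifter[-(cutoff - 1):] = 1.0                 # keep the symmetric high-index part as well
    formant_log_spectrum = np.fft.rfft(cep * lifter).real     # smooth spectral envelope
    excitation_log_spectrum = log_mag - formant_log_spectrum  # harmonic fine structure
    print(formant_log_spectrum.shape, excitation_log_spectrum.shape)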


Speech Emotion Recognition using Feature Selection and Fusion Method (특징 선택과 융합 방법을 이용한 음성 감정 인식)

  • Kim, Weon-Goo
    • The Transactions of The Korean Institute of Electrical Engineers / v.66 no.8 / pp.1265-1271 / 2017
  • In this paper, a speech parameter fusion method is studied to improve the performance of a conventional emotion recognition system. For this purpose, the combination of the cepstrum parameters and the various pitch parameters used in conventional emotion recognition systems that shows the best performance is selected. Various pitch parameters were generated from the speech pitch using numerical and statistical methods. Performance was evaluated on an emotion recognition system using Gaussian mixture models (GMM) to select the pitch parameters that performed best in combination with the cepstrum parameters. Sequential feature selection was used as the parameter selection method. In an experiment distinguishing the four emotions of neutral, joy, sadness, and anger, fifteen of the 56 pitch parameters were selected and showed the best recognition performance when fused with the cepstrum and delta-cepstrum coefficients. This corresponds to a 48.9% reduction in error compared with an emotion recognition system using only pitch parameters.
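
A minimal sketch of the sequential (forward) feature selection loop named in the abstract; the stopping rule, candidate names, and score function are placeholders the caller would supply (for example, a GMM recognizer's validation accuracy), not the paper's exact setup.

    # Greedy sequential forward selection: repeatedly add the candidate pitch
    # parameter that most improves score(selected), starting from the cepstral baseline.
    def sequential_forward_selection(candidates, score, max_features=15):
        """Pick up to max_features items from candidates, maximizing score(selected)."""
        selected, best = [], score([])
        while len(selected) < max_features:
            remaining = [c for c in candidates if c not in selected]
            if not remaining:
                break
            gains = [(score(selected + [c]), c) for c in remaining]
            top_score, top_feat = max(gains)
            if top_score <= best:                # stop when nothing improves the score
                break
            selected.append(top_feat)
            best = top_score
        return selected, best

    # Hypothetical usage: candidates would be the 56 pitch-parameter names, and
    # score() would train and evaluate the GMM emotion recognizer on cepstrum + selected.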

New Data Extraction Method using the Difference in Speaker Recognition (화자인식에서 차분을 이용한 새로운 데이터 추출 방법)

  • Seo, Chang-Woo;Ko, Hee-Ae;Lim, Yong-Hwan;Choi, Min-Jung;Lee, Youn-Jeong
    • Speech Sciences / v.15 no.3 / pp.7-15 / 2008
  • This paper proposes a method to extract new feature vectors for speaker recognition (SR) using the difference between the cepstrum, which captures static characteristics, and the delta cepstrum, which captures dynamic characteristics. The proposed difference vector (DV) is an intermediate feature vector obtained from the difference between the static and dynamic characteristics; it contains both characteristics simultaneously and can be used as a new feature vector. Compared with the conventional method, the proposed method obtains the new feature vector without introducing any additional parameters and only requires computing the difference between the cepstrum and the delta cepstrum. Experimental results show that the proposed method improves speaker identification (SI) performance by more than 2.03% on average compared with the conventional method.
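
A small sketch of the proposed feature as described: the per-frame difference between the static cepstrum and its delta. The delta here is computed with a common two-frame regression window; the window width and the toy shapes are assumptions.

    # Difference vector (DV) = static cepstrum minus its delta, frame by frame.
    import numpy as np

    def delta(features, width=2):
        """Standard regression delta over frames; features has shape (n_frames, n_coeff)."""
        padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
        num = sum(k * (padded[width + k:len(features) + width + k] -
                       padded[width - k:len(features) + width - k])
                  for k in range(1, width + 1))
        return num / (2 * sum(k * k for k in range(1, width + 1)))

    def difference_vector(cepstrum_frames):
        """DV feature: cepstrum minus delta cepstrum for each frame."""
        return cepstrum_frames - delta(cepstrum_frames)

    # Toy shapes: 200 frames of 12 cepstral coefficients -> DV has the same shape.
    print(difference_vector(np.random.randn(200, 12)).shape)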


Performance Evaluation of Novel AMDF-Based Pitch Detection Scheme

  • Kumar, Sandeep
    • ETRI Journal / v.38 no.3 / pp.425-434 / 2016
  • A novel average magnitude difference function (AMDF)-based pitch detection scheme (PDS) is proposed to achieve better speech quality. A performance evaluation of the proposed PDS is carried out through both a simulation and a real-time implementation of a speech analysis-synthesis system. The parameters used to compare the performance of the proposed PDS with that of PDSs based on the cepstrum, the autocorrelation function (ACF), the AMDF, or the circular AMDF (CAMDF) are as follows: percentage gross pitch error (%GPE); a subjective listening test; an objective speech quality assessment; a speech intelligibility test; a synthesized speech waveform; computation time; and memory consumption. The proposed PDS results in lower %GPE and better synthesized speech quality and intelligibility for different speech signals compared with the cepstrum-, ACF-, AMDF-, and CAMDF-based PDSs. The computation time of the proposed PDS is also less than that of the cepstrum-, ACF-, and CAMDF-based PDSs. Moreover, the total memory consumed by the proposed PDS is less than that of the ACF- and cepstrum-based PDSs.
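
For readers unfamiliar with AMDF pitch detection, a plain baseline detector (the kind the paper improves on, not its proposed novel scheme) can be sketched as follows; the frame length, search range, and test signal are arbitrary.

    # Baseline AMDF pitch detector: the pitch period is the lag that minimizes the
    # average magnitude difference over a plausible range of lags.
    import numpy as np

    def amdf_pitch(frame, fs, f0_min=60.0, f0_max=400.0):
        """Return an f0 estimate in Hz for one voiced frame."""
        lags = np.arange(int(fs / f0_max), int(fs / f0_min) + 1)
        amdf = np.array([np.mean(np.abs(frame[:-lag] - frame[lag:])) for lag in lags])
        return fs / lags[np.argmin(amdf)]

    # Toy check: a 125 Hz sawtooth sampled at 8 kHz (period of exactly 64 samples).
    fs = 8_000
    t = np.arange(0, 0.04, 1 / fs)
    frame = ((125 * t) % 1.0) - 0.5
    print(f"estimated f0 ~ {amdf_pitch(frame, fs):.1f} Hz")    # expect 125.0 Hz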