• Title/Summary/Keyword: speech signals


On a Detection of the ZCR-Parameter for Higher Formants of Speech Signals (음성신호의 상위 포만트에 대한 ZCR-파라미터 검출에 관한 연구)

  • 유건수
    • Proceedings of the Acoustical Society of Korea Conference / 1992.06a / pp.49-53 / 1992
  • In many applications such as speech analysis, speech coding, and speech recognition, the voiced/unvoiced decision must be made correctly for efficient processing. One of the parameters used for the voiced/unvoiced decision is the zero-crossing rate (ZCR). However, the information in the higher formants of speech signals has not been represented as a zero-crossing rate.
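
The zero-crossing rate mentioned in this abstract can be sketched as follows. This is a minimal illustration of a plain ZCR-based voiced/unvoiced check, not the paper's higher-formant method; the function names, the 8 kHz sampling rate, and the threshold value of 0.1 are assumptions chosen for the example.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

def is_voiced(frame, zcr_threshold=0.1):
    # Hypothetical threshold: voiced speech is dominated by low
    # frequencies, so its ZCR tends to be low. A real system would
    # tune this value and usually combine it with frame energy.
    return zero_crossing_rate(frame) < zcr_threshold

# Synthetic 30 ms frames at an assumed sampling rate of 8 kHz.
fs = 8000
n = 240
low_tone = [math.sin(2 * math.pi * 100 * t / fs) for t in range(n)]
high_tone = [math.sin(2 * math.pi * 3000 * t / fs) for t in range(n)]

print(zero_crossing_rate(low_tone), is_voiced(low_tone))
print(zero_crossing_rate(high_tone), is_voiced(high_tone))
```

The low-frequency tone stands in for a voiced frame and the high-frequency tone for an unvoiced one; real speech frames would need a threshold chosen from data.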


Speech Encryption Scheme Using Frequency Band Scrambling (대역 스크램블을 이용한 음성 보호방식)

  • Ji, Hyung-Kun;Lee, Dong-Wook
    • Proceedings of the KIEE Conference / 1999.11c / pp.700-702 / 1999
  • Protecting data that must be kept secret from unauthorized users has become a major topic. This paper introduces an encryption scheme for protecting speech signals from eavesdropping. The proposed scheme adopts a secure voice cryptographic algorithm based on scrambling in the frequency band. To improve on the conventional speech signal encryption scheme, the DCT coefficients of the speech signal are randomly permuted. Simulation results are included to show the performance of the proposed algorithm for secure transmission of speech signals.
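
The idea of randomly permuting DCT coefficients can be sketched as below. This is an illustration under stated assumptions, not the authors' scheme: the frame length, the integer key, the function names, and the use of Python's `random.Random` as the permutation source are all hypothetical, and `random.Random` is not cryptographically secure.

```python
import math
import random

def dct(x):
    """Orthonormal DCT-II of a short frame (naive O(N^2) version)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of the orthonormal DCT-II above (DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1 / n)
        s += sum(
            c[k] * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out

def make_permutation(n, key):
    # Assumption: the shared key seeds a pseudo-random permutation.
    rng = random.Random(key)
    perm = list(range(n))
    rng.shuffle(perm)
    return perm

def scramble(frame, key):
    """Permute the frame's DCT coefficients and return a time-domain frame."""
    coeffs = dct(frame)
    perm = make_permutation(len(frame), key)
    return idct([coeffs[p] for p in perm])

def descramble(frame, key):
    """Undo scramble() by putting each coefficient back in its original slot."""
    coeffs = dct(frame)
    perm = make_permutation(len(frame), key)
    restored = [0.0] * len(frame)
    for j, p in enumerate(perm):
        restored[p] = coeffs[j]
    return idct(restored)

frame = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(32)]
scrambled = scramble(frame, key=1234)
recovered = descramble(scrambled, key=1234)
print(max(abs(a - b) for a, b in zip(frame, recovered)))
```

Because the transform is orthonormal, descrambling with the same key recovers the frame up to floating-point error, while the scrambled frame has the same energy as the original but a rearranged spectrum.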


Realization of Variable Bandwidth Filter for Decomposition of Speech Signals into AM-FM Components (음성신호의 AM-FM 성분 분리를 위한 가변대역폭 필터 구현)

  • 이희영;김용태
    • Proceedings of the IEEK Conference / 2003.07e / pp.2208-2211 / 2003
  • In this paper, a variable bandwidth filter (VBF) is realized for decomposing speech signals with time-varying instantaneous frequencies. The proposed VBF can extract the AM-FM components of a speech signal whose time-frequency representations (TFRs) do not overlap in the time-frequency domain.


Noise Suppression Method for Restoring Line Spectrum Pair (선스펙트럼 쌍의 복원에 의한 잡음억제 기법)

  • Choi, Jae-Seung
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.4 / pp.112-118 / 2010
  • This paper describes a noise suppression system based on a normalization method that uses a time-delay neural network and line spectrum pairs, which are frequency-domain parameters. First, a time-delay neural network is trained using the line spectrum pair values of noisy speech signals obtained by linear prediction analysis. After the time-delay neural network is trained, the proposed system enhances speech signals that are degraded by background noise. Accordingly, the proposed time-delay neural network restores the line spectrum pair values of clean speech signals from those of noisy speech signals. Spectral distortion measurements confirm that this system is effective for speech signals degraded by background noise.

Performance of music section detection in broadcast drama contents using independent component analysis and deep neural networks (ICA와 DNN을 이용한 방송 드라마 콘텐츠에서 음악구간 검출 성능)

  • Heo, Woon-Haeng;Jang, Byeong-Yong;Jo, Hyeon-Ho;Kim, Jung-Hyun;Kwon, Oh-Wook
    • Phonetics and Speech Sciences / v.10 no.3 / pp.19-29 / 2018
  • We propose to use independent component analysis (ICA) and a deep neural network (DNN) to detect music sections in broadcast drama content. Drama content mainly comprises silence, noise, speech, music, and mixed (speech+music) sections. The silence section is detected by signal activity detection. To detect the music section, we train noise, speech, music, and mixed models with a DNN. In computer experiments, we used the MUSAN corpus to train the acoustic model and conducted an experiment using 3 hours of Korean drama content. As the mixed section includes music signals, it was regarded as a music section. The segmentation error rate (SER) of music section detection was observed to be 19.0%. In addition, when stereo mixed signals were separated into music signals using ICA, the SER was reduced to 11.8%.

Segmentation of the Korean speech signals into phonetic units using the super resolution pitch determination (고해상 피치검출을 이용한 한국어 음성신호의 음소분리)

  • 이응구;이두수
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.2 / pp.270-278 / 1993
  • This paper presents a phonetic segmentation algorithm for Korean speech signals that finds the exact pitch using super resolution pitch determination and compares the cross-correlation of each pitch period against a threshold. The proposed algorithm features infinite resolution and high reliability, and it can also separate transient and silent segments. The algorithm is useful for speech processing applications that require vector quantization and speech recognition. It was implemented in 386-MATLAB on a PC 386/DX, and the exact pitch period and the phonetic segmentation of speech signals were verified.
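
The step of comparing the cross-correlation of adjacent pitch periods against a threshold can be illustrated with the sketch below. It is a simplified, integer-lag stand-in rather than the super resolution method of the paper: there is no sub-sample interpolation, and the function names, search range, and threshold of 0.8 are assumptions made for the example.

```python
import math

def normalized_cross_correlation(x, start, period):
    """Correlation between two adjacent segments of length `period`."""
    a = x[start:start + period]
    b = x[start + period:start + 2 * period]
    if len(b) < period:
        return 0.0
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0 else 0.0

def estimate_period(x, start, min_period, max_period):
    """Return the integer period with the highest adjacent-segment correlation."""
    best_period, best_score = min_period, -1.0
    for period in range(min_period, max_period + 1):
        score = normalized_cross_correlation(x, start, period)
        if score > best_score:
            best_period, best_score = period, score
    return best_period, best_score

def is_periodic(score, threshold=0.8):
    # Hypothetical threshold: a low score suggests a transient or
    # silent segment, which could mark a segment boundary.
    return score >= threshold

# Synthetic signal with a period of 80 samples (100 Hz at an assumed 8 kHz).
signal = [
    math.sin(2 * math.pi * t / 80) + 0.5 * math.sin(4 * math.pi * t / 80)
    for t in range(400)
]
period, score = estimate_period(signal, start=0, min_period=40, max_period=120)
print(period, is_periodic(score))
```

Restricting the search range to less than twice the shortest expected period avoids picking a multiple of the true period in this toy example; real speech would need additional safeguards.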


Comparison of ICA Methods for the Recognition of Corrupted Korean Speech (잡음 섞인 한국어 인식을 위한 ICA 비교 연구)

  • Kim, Seon-Il
    • 전자공학회논문지 IE / v.45 no.3 / pp.20-26 / 2008
  • Two independent component analysis (ICA) algorithms were applied to the recognition of speech signals corrupted by car engine noise. Speech recognition was performed on the estimated signals using a hidden Markov model (HMM), and the recognition rates were compared with those of the original, uncorrupted speech signals. Two different ICA methods were applied to estimate the speech signals: the FastICA algorithm, which maximizes negentropy, and the information-maximization approach, which maximizes the mutual information between inputs and outputs to give maximum independence among the outputs. The word recognition rate for Korean news sentences spoken by a male anchor is 87.85%, while the average drop in performance across various signal-to-noise ratios (SNRs) is 1.65% for speech signals estimated by FastICA and 2.02% for those estimated by information maximization. There is little difference between the two methods.

The Pitch Detection Using Variable LPF (Variable LPF에 의한 피치검출)

  • 백금란
    • Proceedings of the Acoustical Society of Korea Conference / 1993.06a / pp.88-92 / 1993
  • In speech signal processing, the pitch must be detected accurately. The pitch extraction algorithms proposed so far have difficulty detecting pitch over a wide range of speech signals. We therefore propose a new algorithm that uses G-peak extraction. The method finds the largest MZI (maximum zero-crossing interval) in each frame and convolves it with the speech signal; this is equivalent to passing the speech signal through a variable LPF. As a result, we obtain the pitch, improve the accuracy of pitch detection, and extract the pitch at high speed.


Evaluation for speech signal based on human sense and signal quality

  • Mekada, Yoshito;Hasegawa, Hiroshi;Kumagai, Takeshi;Kasuga, Masao
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1997.06a / pp.13-18 / 1997
  • Each reproduced speech signal has its own particular signal properties because of the encoding and decoding performed for communication through various media. In this paper, we examine the correlation between speech signal quality and sensory pleasantness, with the aim of improving how the signal is perceived. In experiments, we evaluate the quality of speech signals passed through various media using psychological auditory tests and the physical features of these signals.


Classification of Speech and Car Noise Signals using the Slope of Autocovariances in Frequency Domain (주파수 영역 자기 공분산 기울기를 이용한 음성과 자동차 소음 신호의 구분)

  • Kim, Seon-Il
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.10 / pp.2093-2099 / 2011
  • A speech signal and a car noise signal, such as muffler noise, are separated from a mixture of the two using a statistical method. To identify which of the separated signals is speech, FFT coefficients are obtained for all segments of a signal, where each segment consists of 128 samples. For several FFT coefficients corresponding to the low frequencies of a signal, autocovariances are calculated between coefficients of the same order across all segments, and the autocovariances are then averaged. A linear equation is fitted to the averaged autocovariances of each signal using linear regression. The slope of the fitted line serves as the reference for deciding which signal is speech, and this is what the paper proposes. The results show that the method is very useful.
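
One possible reading of the feature described in this abstract is sketched below. The abstract leaves several details open, so the following are assumptions rather than the paper's settings: FFT magnitudes are used, bins 1 to 4 stand for the low frequencies, autocovariances are taken over lags 0 to 5 across segments, and all function names are hypothetical.

```python
import cmath
import math
import random

def dft_magnitude(segment, k):
    """Magnitude of the k-th DFT coefficient of one segment (direct sum)."""
    n = len(segment)
    return abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                   for i, v in enumerate(segment)))

def autocovariance(seq, lag):
    """Biased autocovariance estimate of a sequence at the given lag."""
    mean = sum(seq) / len(seq)
    return sum((seq[i] - mean) * (seq[i + lag] - mean)
               for i in range(len(seq) - lag)) / len(seq)

def fit_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def autocovariance_slope(signal, segment_len=128, bins=(1, 2, 3, 4), max_lag=5):
    # Split into non-overlapping 128-sample segments, as in the abstract.
    segments = [signal[i:i + segment_len]
                for i in range(0, len(signal) - segment_len + 1, segment_len)]
    # For each low-frequency bin, track its magnitude across segments.
    tracks = [[dft_magnitude(s, k) for s in segments] for k in bins]
    lags = list(range(max_lag + 1))
    # Average the autocovariance over bins, then fit a line against lag.
    averaged = [sum(autocovariance(t, lag) for t in tracks) / len(tracks)
                for lag in lags]
    return fit_slope(lags, averaged)

rng = random.Random(0)
noise_like = [rng.uniform(-1.0, 1.0) for _ in range(128 * 12)]
slope = autocovariance_slope(noise_like)
print(slope)
```

Under this reading, the slope would be computed for each separated signal and the two values compared to decide which one is speech; how the comparison is thresholded is not stated in the abstract.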