Search | Korea Science

A nonlinear transformation methods for GMM to improve over-smoothing effect

Chae, Yi Geun
- Journal of Advanced Marine Engineering and Technology / v.38 no.2 / pp.182-187 / 2014
We propose nonlinear GMM-based transformation functions in an attempt to deal with the over-smoothing effects of linear transformation for voice processing. The proposed methods adopt RBF networks as a local transformation function to overcome the drawbacks of global nonlinear transformation functions. In order to obtain high-quality modifications of speech signals, our voice conversion is implemented using the Harmonic plus Noise Model analysis/synthesis framework. Experimental results are reported on the English corpus, MOCHA-TIMIT.
https://doi.org/10.5916/jkosme.2014.38.2.182

On Predictive Coding of Speech Signals (음성신호의 예측부호화에 관하여)

은종관
- The Magazine of the IEIE / v.12 no.5 / pp.23-35 / 1985
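This paper describes predictive coding methods used in digital speech communication. It focuses in particular on adaptive differential pulse code modulation (ADPCM) and adaptive delta modulation (ADM), which are widely used at transmission rates of 16 to 48 kbit/s. It also describes variable-rate ADPCM and ADM, and discusses how these systems behave over noisy channels, how their performance can be improved, and the problems that arise in transcoding to and from PCM. Following its recent standardization by the CCITT, ADPCM is expected to be widely used alongside PCM, while ADM is expected to be widely used in special-purpose communications because the system is simple and robust to channel errors.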
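The adaptive delta modulation idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the scheme analyzed in the paper: the step-size rule (grow when two consecutive bits agree, shrink when they differ) and every parameter value below are assumptions chosen only for illustration.

```python
import math


def _adm_step(estimate, step, bit, prev_bit, step_min, step_max, grow, shrink):
    """One predictor update, shared by encoder and decoder so they stay in sync."""
    if prev_bit is not None:
        # Repeated bits suggest slope overload (grow the step);
        # alternating bits suggest granular noise (shrink the step).
        step = step * grow if bit == prev_bit else step * shrink
        step = min(max(step, step_min), step_max)
    estimate += step if bit else -step
    return estimate, step


def adm_encode(samples, step_min=0.01, step_max=1.0, grow=1.5, shrink=0.5):
    """Return (bits, reconstruction): one bit per sample plus the local decode."""
    bits, recon = [], []
    estimate, step, prev_bit = 0.0, step_min, None
    for x in samples:
        bit = 1 if x >= estimate else 0
        estimate, step = _adm_step(estimate, step, bit, prev_bit,
                                   step_min, step_max, grow, shrink)
        bits.append(bit)
        recon.append(estimate)
        prev_bit = bit
    return bits, recon


def adm_decode(bits, step_min=0.01, step_max=1.0, grow=1.5, shrink=0.5):
    """Rebuild the waveform from the bit stream alone."""
    recon = []
    estimate, step, prev_bit = 0.0, step_min, None
    for bit in bits:
        estimate, step = _adm_step(estimate, step, bit, prev_bit,
                                   step_min, step_max, grow, shrink)
        recon.append(estimate)
        prev_bit = bit
    return recon


def demo_signal(num_samples=200, period=100):
    """Hypothetical test input: a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * n / period) for n in range(num_samples)]
```

Because the decoder derives the step size from the bit stream itself, no side information is transmitted; this is one reason such coders are simple to build.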

Digital Processing of Speech Signals (음성 신호의 디지털 신호처리)

김진현
- Proceedings of the KSLP Conference / 1995.11a / pp.103-110 / 1995
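The word "digital" is heard often in everyday life. Watches, thermometers, bathroom scales, car speedometers, blood pressure monitors: products with digital displays have become very common. An English dictionary gives meanings such as "of the fingers" and "numerical" for the word. In the field of measurement, "digital" means "discrete" or "discontinuous": values exist only at separated points, and a characteristic of digital representation is that a value can be expressed only with a predetermined number of digits. The opposite of digital is analog. (abridged)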

A General Analysis and Complexity Reduction for the Lattice Transversal Joint Adaptive Filter

Yoo, Jae-Ha
- Proceedings of the IEEK Conference / 2002.07c / pp.2035-2038 / 2002
The need for filter coefficient compensation in the lattice transversal joint (LTJ) adaptive filter is explained in general terms by analyzing the filter as a time-varying transform-domain adaptive filter. A method for reducing the computational complexity of filter coefficient compensation is also proposed, and its effectiveness is verified through experiments using artificial and real speech signals. The proposed adaptive filter reduces the computational complexity of filter coefficient compensation by 95%, and when the filter is applied to an acoustic echo canceller with 1000 taps, the total complexity is reduced by 82%.

Modeling of Speech Signals Using Segmental-Features (분절 특징을 이용한 음성 신호의 모델링)

윤영선;오영환
- Proceedings of the Korean Information Science Society Conference / 2000.10b / pp.371-373 / 2000
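This paper proposes a speech signal modeling method in which segmental features are represented with a parametric trajectory model and used as the input to a segmental HMM. A segmental feature is expressed as a trajectory that captures the trend of the speech, and the trajectory is obtained using a design matrix and a polynomial regression function so that it includes transition information across consecutive frames. To apply this trajectory to the segmental HMM, the probability distribution representations for extra-segmental variation and intra-segmental variation were improved. In experiments on the TIMIT database, the proposed segmental features had the same effect as dynamic features that represent the correlation between adjacent frames of the speech signal, and when the segmental features were computed with first-order derivative coefficients included, they performed better than the conventional feature representation.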
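The design-matrix and polynomial-regression step described in this abstract can be sketched as an ordinary least-squares fit over one segment. This is only an illustration under stated assumptions: a single one-dimensional feature, frame times normalized to [0, 1], and hypothetical function names. The paper's actual feature dimensions and its segmental HMM are not reproduced here.

```python
def design_matrix(num_frames, order):
    """Rows are [1, t, t^2, ...] with frame times normalized to [0, 1]."""
    if num_frames == 1:
        times = [0.0]
    else:
        times = [n / (num_frames - 1) for n in range(num_frames)]
    return [[t ** r for r in range(order + 1)] for t in times]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_trajectory(frames, order=2):
    """Least-squares polynomial coefficients for one segment of frames.

    Solves the normal equations (Z^T Z) b = Z^T y, where Z is the design matrix.
    The segment needs at least order + 1 frames.
    """
    z = design_matrix(len(frames), order)
    k = order + 1
    ztz = [[sum(row[i] * row[j] for row in z) for j in range(k)] for i in range(k)]
    zty = [sum(row[i] * y for row, y in zip(z, frames)) for i in range(k)]
    return solve_linear(ztz, zty)
```

The fitted coefficients summarize the whole segment: the constant term gives the starting level, and the higher-order terms describe how the feature moves across the frames.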

A Study on Discrete Hidden Markov Model for Vibration Monitoring and Diagnosis of Turbo Machinery (터보회전기기의 진동모니터링 및 진단을 위한 이산 은닉 마르코프 모델에 관한 연구)

Lee, Jong-Min;Hwang, Yo-ha;Song, Chang-Seop
- The KSFM Journal of Fluid Machinery / v.7 no.2 s.23 / pp.41-49 / 2004
Condition monitoring is very important in turbo machinery because a single failure can cause critical damage to the plant, so automatic fault recognition has been one of the main research topics in the condition monitoring area. We applied a relatively new fault recognition method, the Hidden Markov Model (HMM), to a mechanical system. The HMM has been widely used in speech recognition, but its application to fault recognition in mechanical signals has been very limited despite its good potential. In this paper, a discrete HMM (DHMM) was used to recognize faults of a rotor system in order to study its fault recognition ability. We set up a rotor kit under unbalance and oil whirl conditions and sampled vibration signals for the two failure conditions. A DHMM for each failure condition was trained using the sampled signals. Next, we changed the setup and the rotating speed of the rotor kit, sampled vibration signals again, and applied each DHMM to the newly sampled data. DHMMs trained on data from one rotating speed showed good fault recognition ability despite the limited training data, and DHMMs trained on data from four different rotating speeds showed better robustness.
https://doi.org/10.5293/KFMA.2004.7.2.041
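Recognition with one DHMM per fault condition, as described in this abstract, amounts to scoring a quantized observation sequence under each model and picking the most likely one. The sketch below shows only that scoring step, using the scaled forward algorithm. The two-state models, the three-symbol codebook, and all probabilities are made-up illustrative values, not parameters from the paper, and training (for example with Baum-Welch) is omitted.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | model) for a discrete HMM."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        # Rescale at every step so long sequences do not underflow.
        log_like += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_like


def classify(obs, models):
    """Return the name of the model that gives obs the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Hypothetical models as (start, transition, emission) tuples.
# Symbols 0..2 stand for vector-quantized vibration features.
MODELS = {
    "unbalance": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.80, 0.15, 0.05], [0.60, 0.30, 0.10]],
    ),
    "oil_whirl": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.05, 0.15, 0.80], [0.10, 0.30, 0.60]],
    ),
}
```

For example, `classify([0, 0, 1, 0, 0], MODELS)` selects the model whose emission probabilities favor symbol 0.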

Classification of Doppler Audio Signals for Moving Target Using Hidden Markov Model in Pulse Doppler Radar (펄스 도플러 레이더에서 HMM을 이용한 이동표적의 도플러 오디오 신호 식별)

Sim, Jae-Hun;Lee, Jung-Ho;Bae, Keun-Sung
- Journal of IKEEE / v.22 no.3 / pp.624-629 / 2018
Classification of moving targets in pulse Doppler radar (PDR) for surveillance and reconnaissance is generally carried out by a radar operator, based on listening to Doppler audio signals and on training experience. In this paper, we propose an automatic method for identifying the class of a moving target from its Doppler audio signal using Mel Frequency Cepstral Coefficients (MFCC) and the Hidden Markov Model (HMM), both of which are widely used in speech recognition. The classification performance was analyzed and verified by simulations.
https://doi.org/10.7471/ikeee.2018.22.3.624

Implementation of Music Signals Discrimination System for FM Broadcasting (FM 라디오 환경에서의 실시간 음악 판별 시스템 구현)

Kang, Hyun-Woo
- The KIPS Transactions:PartB / v.16B no.2 / pp.151-156 / 2009
This paper proposes a Gaussian mixture model (GMM)-based music discrimination system for FM broadcasting. The objective of the system is to automatically archive music signals from audio broadcasting programs, which normally mix human voices, songs, commercial music, and other sounds. To improve performance and robustness, and to cut the starting and ending points of each recording accurately, we also added a post-processing module. Experimental results on various FM radio program inputs in a PC environment show excellent performance of the proposed system. The fixed-point simulation shows the same results under 3 MIPS of computational power.
https://doi.org/10.3745/KIPSTB.2009.16-B.2.151

Blind Audio Source Separation Based On High Exploration Particle Swarm Optimization

KHALFA, Ali;AMARDJIA, Nourredine;KENANE, Elhadi;CHIKOUCHE, Djamel;ATTIA, Abdelouahab
- KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.5 / pp.2574-2587 / 2019
Blind Source Separation (BSS) is a technique used to separate supposedly independent source signals from a given set of observations. In this paper, the High Exploration Particle Swarm Optimization (HEPSO) algorithm, which is an enhancement of the Particle Swarm Optimization (PSO) algorithm, is used to separate a set of source signals. Compared with the PSO algorithm, the HEPSO algorithm depends on two additional operators. The first operator is based on the multi-crossover mechanism of the genetic algorithm, while the second relies on the bee colony mechanism. The two operators are employed to update the velocity and the position of the particles, respectively, and are thus used to find the optimal separating matrix. The proposed method enhances the overall efficiency of the standard PSO in terms of exploration and performance. Based on many tests carried out on speech and music signals supplied by the BSS demo, experimental results confirm the robustness and the accuracy of the introduced BSS technique.
https://doi.org/10.3837/tiis.2019.05.019
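For orientation, the standard PSO velocity and position update that HEPSO builds on can be sketched as follows. This is plain PSO, not HEPSO: the multi-crossover and bee colony operators from the abstract are not implemented. The function name, the parameter values, and the toy cost function are assumptions. In a BSS setting the position vector would hold the entries of a candidate separating matrix and the cost would measure the dependence between the separated signals.

```python
import random


def pso_minimize(cost, dim, n_particles=20, iters=100, bounds=(-5.0, 5.0),
                 inertia=0.7, c_cog=1.5, c_soc=1.5, seed=0):
    """Minimize cost(position) with standard global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_cost[i])
    g_pos, g_cost = best_pos[g][:], best_cost[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cog * r1 * (best_pos[i][d] - pos[i][d])
                             + c_soc * r2 * (g_pos[d] - pos[i][d]))
                # Position: move, then clamp to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            c = cost(pos[i])
            if c < best_cost[i]:
                best_cost[i], best_pos[i] = c, pos[i][:]
                if c < g_cost:
                    g_cost, g_pos = c, pos[i][:]
    return g_pos, g_cost


def toy_cost(p):
    """Hypothetical stand-in objective with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2
```

Calling `pso_minimize(toy_cost, dim=2)` returns the best position found and its cost; a fixed `seed` makes the run repeatable.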

The Design of Keyword Spotting System based on Auditory Phonetical Knowledge-Based Phonetic Value Classification (청음 음성학적 지식에 기반한 음가분류에 의한 핵심어 검출 시스템 구현)

Kim, Hack-Jin;Kim, Soon-Hyub
- The KIPS Transactions:PartB / v.10B no.2 / pp.169-178 / 2003
This study addresses two topics: the classification of phone-like units (PLUs), which is the foundation of Korean large-vocabulary speech recognition, and the effectiveness of Chiljongseong (7 final consonants) and Paljongseong (8 final consonants) in the Korean language. Phone-like units classify phonemes phonetically according to place and manner of articulation, and about 50 phone-like units are used in Korean speech recognition. In this study, auditory phonetic knowledge was applied to the classification of phone-like units to produce a set of 45 phone-like units. The vowels 'ㅔ, ㅐ' were classified as the phone-like unit [ee]; 'ㅒ, ㅖ' as [ye]; and 'ㅚ, ㅙ, ㅞ' as [we]. Second, the Chiljongseong system of the Draft for a Unified Spelling System, which is currently in use, and the Paljongseonggajokyong of the Korean script Haerye were described. Whether 'ㄷ' and 'ㅅ', among the phonemes used as final consonants in the Korean language, have the same phonetic value has long been debated in the academic world. In this study, the transition stages of Korean consonants were investigated, Chiljongseong and Paljongseonggajokyong were applied to speech recognition, and their effectiveness was verified. The experiment was divided into isolated word recognition and continuous speech recognition. PBW452 was used to test isolated word recognition: about 50 men and women, divided into 5 groups, vocalized 50 words each. For the continuous speech recognition experiment, intended for use in the implemented stock exchange system, a sentence corpus of 71 stock exchange sentences and a speech corpus of those sentences were collected and used. 5 men and women each vocalized a sentence twice.
In the experiment, when Paljongseonggajokyong was used for the final consonant, recognition performance improved by about 1.45% on average, and when phone-like units with Paljongseonggajokyong and auditory phonetic knowledge were applied together, the recognition rate increased by 1.5% to 2.02% on average. In the continuous speech recognition experiment, recognition performance improved by about 1% to 2% on average compared with using the existing 49 or 56 phone-like units.
https://doi.org/10.3745/KIPSTB.2003.10B.2.169

Search Result 499, Processing Time 0.023 seconds
