Search | Korea Science

A Study on Context-Dependent Acoustic Models to Improve the Performance of Korean Speech Recognition (한국어 음성인식 성능향상을 위한 문맥의존 음향모델에 관한 연구)

황철준;오세진;김범국;정호열;정현열
- Journal of the Institute of Convergence Signal Processing / v.2 no.4 / pp.9-15 / 2001
In this paper, we investigate context-dependent acoustic models to improve the performance of Korean speech recognition. The approach uses Korean phonological rules and a decision tree. With the Successive State Splitting (SSS) algorithm, the Hidden Markov Network (HM-Net), an efficient representation of phoneme-context-dependent HMMs, can be generated automatically. SSS is a powerful technique for designing the topologies of tied-state HMMs, but it does not adequately handle contexts that are absent from the training phoneme contexts. In addition, it has some problems in its contextual-domain procedure. In this paper, we adopt a new state-clustering algorithm for SSS, called Phonetic Decision Tree-based SSS (PDT-SSS), which includes context splits based on Korean phonological rules. This method combines the advantages of decision tree clustering and SSS, and can generate a highly accurate HM-Net that can express any context. To verify the effectiveness of the adopted methods, experiments were carried out using the KLE 452-word database and the YNU 200-sentence database. Through Korean phoneme, word, and sentence recognition experiments, we showed that the new state-clustering algorithm produces better phoneme, word, and continuous speech recognition accuracy than conventional HMMs.

Development of Feature Selection Method for Neural Network AE Signal Pattern Recognition and Its Application to Classification of Defects of Weld and Rotating Components (신경망 AE 신호 형상인식을 위한 특징값 선택법의 개발과 용접부 및 회전체 결함 분류에의 적용 연구)

Lee, Kang-Yong;Hwang, In-Bom
- Journal of the Korean Society for Nondestructive Testing / v.21 no.1 / pp.46-53 / 2001
The purpose of this paper is to develop a new feature selection method for AE signal classification. A neural network trained with the back-propagation algorithm is used as the classifier. The proposed feature selection method uses the difference between feature coordinates in feature space. It is compared with existing methods, namely Fisher's criterion, the class mean scatter criterion, and eigenvector analysis, in terms of recognition rate and convergence speed, using signals from defects in the welding zone of austenitic stainless steel and in the metal contact of a rotary compressor. The proposed feature selection methods, the 2-D and 3-D criteria, achieved better recognition rates than the existing ones.
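To make one of the baseline criteria named in this abstract concrete, the following is a minimal sketch of feature ranking with a two-class Fisher criterion. It is only an illustration under stated assumptions: the function names `fisher_scores` and `select_features` and the toy data are hypothetical, the score used is the common textbook form (squared gap between class means divided by the sum of class variances), and the paper's own difference-based 2-D and 3-D criteria are not reproduced here.

```python
from statistics import fmean, pvariance

def fisher_scores(samples, labels):
    """Two-class Fisher criterion per feature: (mean gap)^2 / (sum of variances)."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    scores = []
    for j in range(len(samples[0])):
        a = [s[j] for s, y in zip(samples, labels) if y == classes[0]]
        b = [s[j] for s, y in zip(samples, labels) if y == classes[1]]
        gap = (fmean(a) - fmean(b)) ** 2
        spread = pvariance(a) + pvariance(b)
        scores.append(gap / max(spread, 1e-12))  # guard against zero variance
    return scores

def select_features(samples, labels, k):
    """Indices of the k features with the highest Fisher score."""
    scores = fisher_scores(samples, labels)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
toy_samples = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.0], [1.1, 1.0]]
toy_labels = ["normal", "normal", "defect", "defect"]
best = select_features(toy_samples, toy_labels, k=1)
```

A feature whose class means are far apart relative to its within-class spread ranks first, which is the behavior any of the compared criteria is meant to capture in some form.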

The suppression of noise-induced speech distortions for speech recognition (음성인식을 위한 잡음하의 음성왜곡제거)

Chi, Sang-Mun;Oh, Yung-Hwan
- Journal of the Korean Institute of Telematics and Electronics S / v.35S no.12 / pp.93-102 / 1998
In noisy environments, human speech production is influenced by the noise (the Lombard effect), and speech signals are contaminated. These distortions dramatically reduce the performance of speech recognition systems. This paper proposes a method for Lombard effect compensation and noise suppression to improve speech recognition performance in noisy environments. To estimate the intensity of the Lombard effect, a nonlinear distortion that depends on the ambient noise level, the speaker, and the phonetic unit, we formulate a measure of the Lombard effect level based on the acoustic speech signal and use it to compensate for the Lombard effect. The distortions of speech in noisy environments are cancelled as follows. First, spectral subtraction and band-pass filtering are used to cancel noise. Second, energy normalization is proposed to cancel the variation in vocal intensity caused by the Lombard effect. Finally, the Lombard effect level controls the transform that converts the Lombard speech cepstrum to the clean speech cepstrum. The proposed method was validated on a 50-word Korean recognition task. Average recognition rates were 82.6%, 95.7%, and 97.6% with the proposed method, compared with 46.3%, 75.5%, and 87.4% without any compensation, at SNRs of 0, 10, and 20 dB, respectively.
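The first two steps of this pipeline can be sketched in a few lines. This is a rough illustration, not the authors' formulation: it assumes basic power spectral subtraction with a spectral floor and RMS-based energy normalization, and the names `spectral_subtract`, `normalize_energy`, and the `floor` parameter are hypothetical. Band-pass filtering and the cepstral transform controlled by the Lombard effect level are not shown.

```python
import math

def spectral_subtract(noisy_power, noise_power, floor=0.01):
    """Subtract a noise power estimate bin by bin, keeping a small spectral floor."""
    return [max(p - n, floor * p) for p, n in zip(noisy_power, noise_power)]

def normalize_energy(frame):
    """Scale a frame of samples to unit RMS so differences in vocal intensity cancel."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return [x / rms for x in frame] if rms > 0 else list(frame)

# Hypothetical power spectrum of one frame and a noise estimate for the same bins.
cleaned = spectral_subtract([1.0, 2.0, 0.5], [0.5, 0.5, 1.0], floor=0.1)
unit_frame = normalize_energy([3.0, 4.0])
```

The floor keeps a bin from going negative when the noise estimate exceeds the observed power, which is a common practical safeguard in spectral subtraction.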

A Study on the Optimization of State Tying Acoustic Models using Mixture Gaussian Clustering (혼합 가우시안 군집화를 이용한 상태공유 음향모델 최적화)

Ann, Tae-Ock
- Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.6 / pp.167-176 / 2005
This paper describes how a decision-tree-based state-tying model, one of the acoustic models used for speech recognition, can be optimized by reducing the number of mixture Gaussians in the output probability distribution. State-tying modeling uses a finite set of questions, which can incorporate phonological knowledge, together with likelihood-based decision criteria. The recognition rate can be improved by increasing the number of mixture Gaussians in the output probability distribution. In this paper, we reduce the number of mixture Gaussians at the point of highest recognition rate by clustering the Gaussians. The Bhattacharyya and Euclidean distances are used as the distance measures for clustering. After the mean and variance are calculated for the pair with the lowest distance, a new Gaussian is created; its parameters are derived from the parameters of the Gaussians from which it was formed. Experiments were performed using the STOCKNAME (1,680) databases. The test results show that the proposed method with the Bhattacharyya distance measure maintains the recognition rate at $97.2\%$ and reduces the ratio of the number of mixture Gaussians by $1.0\%$. The method with the Euclidean distance measure maintains the recognition rate at $96.9\%$ and reduces the ratio of the number of mixture Gaussians by $1.0\%$. The methods can therefore optimize the state-tying model.
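One clustering step of the kind described here might look like the sketch below. It assumes diagonal-covariance Gaussians stored as hypothetical `(weight, means, variances)` tuples, uses the standard closed-form Bhattacharyya distance between Gaussians, and merges the closest pair by weighted moment matching. The abstract does not give the paper's exact merge formula, so the `merge` function is an assumption.

```python
import math
from itertools import combinations

def bhattacharyya(g1, g2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians."""
    (_, m1, v1), (_, m2, v2) = g1, g2
    d = 0.0
    for a, b, s, t in zip(m1, m2, v1, v2):
        avg = (s + t) / 2.0
        d += (a - b) ** 2 / (8.0 * avg) + 0.5 * math.log(avg / math.sqrt(s * t))
    return d

def merge(g1, g2):
    """Moment-matched merge: preserves total weight, mean, and second moment."""
    (w1, m1, v1), (w2, m2, v2) = g1, g2
    w = w1 + w2
    mean = [(w1 * a + w2 * b) / w for a, b in zip(m1, m2)]
    var = [(w1 * (s + a * a) + w2 * (t + b * b)) / w - m * m
           for a, b, s, t, m in zip(m1, m2, v1, v2, mean)]
    return (w, mean, var)

def merge_closest(mixture):
    """Replace the pair with the lowest distance by a single merged Gaussian."""
    i, j = min(combinations(range(len(mixture)), 2),
               key=lambda p: bhattacharyya(mixture[p[0]], mixture[p[1]]))
    rest = [g for k, g in enumerate(mixture) if k not in (i, j)]
    return rest + [merge(mixture[i], mixture[j])]

# Hypothetical 1-D mixture: the first two components nearly coincide.
toy_mixture = [(0.5, [0.0], [1.0]), (0.25, [0.2], [1.0]), (0.25, [5.0], [1.0])]
reduced = merge_closest(toy_mixture)
```

Calling `merge_closest` repeatedly would shrink the mixture one component at a time; swapping `bhattacharyya` for a Euclidean distance between mean vectors would give the second variant mentioned in the abstract.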

Classification of Defects in Rotary Compressor by Neural Pattern Recognition of Acoustic Emission Signal (AE신호의 신경망 형상인식법에 의한 로터리 압축기의 결함 분류에 관한 연구)

Lee, K.Y.;Lee, C.M.;Hwang, I.B.;Kim, Y.W.;Hong, J.K.
- Journal of the Korean Society for Nondestructive Testing / v.18 no.1 / pp.17-26 / 1998
A specimen with wear between the roller and the vane and a normal specimen are classified by an AE signal pattern recognition method with a neural network classifier in an air-conditioning operation test. A specimen with scoring between the shaft and the bearing and a normal specimen are also classified by the same method. As the internal pressure increases, the wear between the roller and the vane increases. Different pairs of oils and refrigerants affect the wear.

HMM-based Music Identification System for Copyright Protection (저작권 보호를 위한 HMM기반의 음악 식별 시스템)

Kim, Hee-Dong;Kim, Do-Hyun;Kim, Ji-Hwan
- Phonetics and Speech Sciences / v.1 no.1 / pp.63-67 / 2009
In this paper, in order to protect music copyrights, we propose a music identification system that scales with the number of registered pieces of music and is robust to signal-level variations of the registered music. For its implementation, we define the new concepts of 'music word' and 'music phoneme' as recognition units to construct 'music acoustic models'. With these concepts, we apply the HMM-based framework used in continuous speech recognition to identify the music. Each music file is transformed into a sequence of 39-dimensional vectors. This sequence of vectors is represented as ordered states with Gaussian mixtures. These ordered states are trained using the Baum-Welch re-estimation method. Music files with suspected copyright issues are also transformed into sequences of vectors. The most probable music file is then identified using the Viterbi algorithm through the music identification network. We implemented a music identification system for 1,000 MP3 music files and tested it with variations in MP3 bit rate and music speed rate. The proposed music identification system performs robustly under signal variations. In addition, the scalability of this system is independent of the number of registered music files, since the system is based on the HMM method.
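The decoding step mentioned in this abstract can be illustrated with a small log-domain Viterbi routine. This is a toy sketch: it uses discrete observation symbols instead of Gaussian mixtures over 39-dimensional vectors, and the two states, their probabilities, and the helper `logs` are hypothetical values chosen for illustration only.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most probable state path and its log probability."""
    best = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    back = []
    for o in obs[1:]:
        prev, best, ptr = best, {}, {}
        for s in states:
            # Best predecessor of state s at this time step.
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            best[s] = prev[p] + log_trans[p][s] + log_emit[s][o]
            ptr[s] = p
        back.append(ptr)
    last = max(states, key=best.get)
    path = [last]
    for ptr in reversed(back):  # follow back-pointers from the final state
        path.append(ptr[path[-1]])
    return path[::-1], best[last]

def logs(table):
    """Convert a dict of probabilities (possibly nested) to log probabilities."""
    return {k: logs(v) if isinstance(v, dict) else math.log(v) for k, v in table.items()}

# Hypothetical two-state model with discrete symbols "x" and "y".
toy_states = ["A", "B"]
toy_start = logs({"A": 0.9, "B": 0.1})
toy_trans = logs({"A": {"A": 0.8, "B": 0.2}, "B": {"A": 0.2, "B": 0.8}})
toy_emit = logs({"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.1, "y": 0.9}})
toy_path, toy_logp = viterbi(["x", "x", "y", "y"], toy_states, toy_start, toy_trans, toy_emit)
```

In an identification setting, each registered item would contribute its own state sequence to a larger network, and the item whose states lie on the best path would be reported as the match.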

Implementation of Chip and Algorithm of a Speech Enhancement for an Automatic Speech Recognition Applied to Telematics Device (텔레메틱스 단말용 음성 인식을 위한 음성향상 알고리듬 및 칩 구현)

Kim, Hyoung-Gook
- The Journal of The Korea Institute of Intelligent Transport Systems / v.7 no.5 / pp.90-96 / 2008
This paper presents a single-chip acoustic speech enhancement algorithm for telematics devices. The algorithm consists of two stages: noise reduction and echo cancellation. An adaptive filter based on cross-spectral estimation is used to cancel echo. The external background noise is eliminated and the clean speech is estimated using MMSE log-spectral magnitude estimation. To suit consumer electronics, we also design a low-cost, high-speed, and flexible hardware architecture. The performance of the proposed speech enhancement algorithm was measured by both the signal-to-noise ratio (SNR) and the recognition accuracy of an automatic speech recognition (ASR) system, and it yields better results than the conventional methods.

Identification of failure mechanisms for CFRP-confined circular concrete-filled steel tubular columns through acoustic emission signals

Li, Dongsheng;Du, Fangzhu;Chen, Zhi;Wang, Yanlei
- Smart Structures and Systems / v.18 no.3 / pp.525-540 / 2016
The CFRP-confined circular concrete-filled steel tubular column is composed of concrete, steel, and CFRP, and its failure mechanisms are complex. The main difficulty is the lack of an available method for establishing a relationship between a specific damage mechanism and its acoustic emission (AE) characteristic parameters. In this study, the AE technique was used to monitor the evolution of damage in CFRP-confined circular concrete-filled steel tubular columns. A fuzzy c-means method was developed to determine the relationship between the AE signals and the failure mechanisms. Cluster analysis results indicate that the main AE sources are of five types: matrix cracking, debonding, fiber fracture, steel buckling, and concrete crushing. This technique not only fully separates the five types of damage sources but also makes it easier to judge the damage evolution process. Furthermore, typical damage waveforms were analyzed through wavelet analysis based on the cluster results, and the damage modes were determined according to the frequency distribution of the AE signals.
https://doi.org/10.12989/sss.2016.18.3.525
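For readers unfamiliar with the clustering method named in this abstract, here is a minimal sketch of standard fuzzy c-means with fuzzifier m = 2. It is a generic textbook version, not the authors' variant: the function names, the explicit `init_centers` argument, and the toy 2-D points (standing in for AE feature vectors) are all assumptions.

```python
import math

def memberships(points, centers, m=2.0):
    """Soft membership of every point in every cluster; each row sums to 1."""
    rows = []
    for x in points:
        d = [max(math.dist(x, c), 1e-12) for c in centers]  # avoid division by zero
        rows.append([1.0 / sum((di / dk) ** (2.0 / (m - 1.0)) for dk in d) for di in d])
    return rows

def fuzzy_c_means(points, init_centers, m=2.0, iters=100, tol=1e-6):
    """Alternate membership and center updates until the centers stop moving."""
    centers = [list(c) for c in init_centers]
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new = []
        for j in range(len(centers)):
            w = [row[j] ** m for row in u]
            total = sum(w)
            new.append([sum(wi * x[k] for wi, x in zip(w, points)) / total
                        for k in range(dim)])
        shift = max(math.dist(a, b) for a, b in zip(centers, new))
        centers = new
        if shift < tol:
            break
    return centers, memberships(points, centers, m)

# Hypothetical 2-D feature points forming two well-separated groups.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_centers, toy_u = fuzzy_c_means(toy_points, [(1.0, 1.0), (9.0, 9.0)])
```

Unlike hard clustering, each signal receives a degree of membership in every cluster, which is useful when damage mechanisms overlap in feature space.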

Recognition of damage pattern and evolution in CFRP cable with a novel bonding anchorage by acoustic emission

Wu, Jingyu;Lan, Chengming;Xian, Guijun;Li, Hui
- Smart Structures and Systems / v.21 no.4 / pp.421-433 / 2018
Carbon fiber reinforced polymer (CFRP) cable has good mechanical properties and corrosion resistance. However, anchoring CFRP cable is a major issue because of the anisotropic properties of the CFRP material. In this article, a highly efficient bonding anchorage with a novel configuration is developed for CFRP cables. The acoustic emission (AE) technique is employed to evaluate the performance of the anchorage in the fatigue test and the post-fatigue ultimate bearing capacity test. The obtained AE signals are analyzed using a combination of unsupervised K-means clustering and supervised K-nearest neighbor (K-NN) classification to quantify the performance of the anchorage and the damage evolution. An AE feature vector (including both frequency and energy characteristics of the AE signal) is proposed for clustering analysis, and under-sampling approaches are employed to reduce the influence of the imbalanced class distribution in the AE dataset and improve clustering quality. The results indicate that four classes exist in the AE dataset, which correspond to shear deformation of the potting compound, matrix cracking, fiber-matrix debonding, and fiber fracture in the CFRP bars. The AE intensity released by the deformation of the potting compound is very slight during the whole loading process, and no obvious premature damage caused by the anchorage effect is observed in the CFRP bars at relatively low stress levels, indicating that the anchorage configuration in this study is reliable.
https://doi.org/10.12989/sss.2018.21.4.421
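Two of the ingredients named in this abstract, under-sampling of imbalanced classes and K-NN classification, can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' pipeline: it uses random under-sampling to the size of the smallest class and a plain majority-vote K-NN, and the function names, the `seed` parameter, and the toy feature points are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict

def undersample(features, labels, seed=0):
    """Randomly keep as many samples per class as the smallest class has."""
    by_class = defaultdict(list)
    for x, y in zip(features, labels):
        by_class[y].append(x)
    n = min(len(xs) for xs in by_class.values())
    rng = random.Random(seed)
    out_x, out_y = [], []
    for y, xs in by_class.items():
        for x in rng.sample(xs, n):
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y

def knn_predict(train_x, train_y, query, k=3):
    """Label the query by majority vote among its k nearest training samples."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]

# Hypothetical imbalanced data: six samples of one class, two of another.
toy_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.1), (0.2, 0.2),
         (10.0, 10.0), (10.2, 9.9)]
toy_y = ["matrix cracking"] * 6 + ["fiber fracture"] * 2
bal_x, bal_y = undersample(toy_x, toy_y, seed=1)
predicted = knn_predict(bal_x, bal_y, (9.0, 9.0), k=3)
```

Balancing the classes first keeps the majority class from dominating the vote when a query lies near a sparsely populated class.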

Optimize the Acoustic Environment Using a Sound Masking Effects of the Audio Signal Compression Principle (음성신호의 압축원리를 이용한 사운드 마스킹 효과로 음향 환경 최적화)

Ann, Sook-Hyang
- Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.28 no.11 / pp.748-751 / 2015
A sound masking system artificially generates a constant, uniform masking sound across the audible frequency band (20 Hz to 20 kHz) so that people cannot hear or recognize speech originating from the interior. By continuously emitting this masking sound, the technology protects the content of confidential speech, interfering with or delaying decoding in cases such as interception, laser eavesdropping, internal solicitation, and recording. In an evaluation of perceived noise disturbance, the average stress index was 10.16 before the system was applied and 3.07 after it was applied, indicating that noise-related stress fell significantly and noise conditions improved.
https://doi.org/10.4313/JKEM.2015.28.11.748
