Search | Korea Science

A Spectral Compensation Method for Noise Robust Speech Recognition (잡음에 강인한 음성인식을 위한 스펙트럼 보상 방법)

Cho, Jung-Ho
- 전자공학회논문지 IE / v.49 no.2 / pp.9-17 / 2012
One of the problems in applying speech recognition systems in the real world is the degradation of performance caused by acoustical distortions. The most important source of acoustical distortion is additive noise. This paper describes a spectral compensation technique for noise-robust speech recognition, based on a spectral peak enhancement scheme followed by an efficient noise subtraction scheme. The proposed methods emphasize the formant structure and compensate the spectral tilt of the speech spectrum while maintaining broad-bandwidth spectral components. The recognition experiments were conducted using noisy speech corrupted by white Gaussian noise, car noise, babble noise, or subway noise. Compared with the case without spectral compensation, the new technique reduced the average error rate slightly under high signal-to-noise ratio (SNR) conditions and reduced it by half under low-SNR (10 dB) conditions.
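To illustrate the test conditions mentioned in this abstract, the following minimal sketch shows one common way to corrupt a clean signal with additive noise at a target SNR such as 10 dB. It is not taken from the paper: the helper names `power` and `mix_at_snr`, the synthetic sine wave standing in for speech, and the 8 kHz sampling rate are all assumptions made for the example.

```python
import math
import random


def power(x):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that the speech-to-noise power ratio equals
    `snr_db`, then add it to `speech`.

    Returns the noisy signal and the scaled noise (hypothetical helper).
    """
    target_ratio = 10 ** (snr_db / 10)  # dB -> linear power ratio
    gain = math.sqrt(power(speech) / (power(noise) * target_ratio))
    scaled = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled)]
    return noisy, scaled


random.seed(0)
# Stand-in "speech": one second of a 440 Hz tone at an assumed 8 kHz rate.
speech = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
# White Gaussian noise, one of the noise types named in the abstract.
noise = [random.gauss(0.0, 1.0) for _ in range(8000)]

noisy, scaled = mix_at_snr(speech, noise, 10.0)
measured_snr_db = 10 * math.log10(power(speech) / power(scaled))
```

Here `measured_snr_db` recovers the requested 10 dB up to floating-point error, which is a quick sanity check that the noise was scaled as intended.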

Investigating the Effects of Hearing Loss and Hearing Aid Digital Delay on Sound-Induced Flash Illusion

Moradi, Vahid;Kheirkhah, Kiana;Farahani, Saeid;Kavianpour, Iman
- Journal of Audiology & Otology / v.24 no.4 / pp.174-179 / 2020
Background and Objectives: The integration of auditory-visual speech information improves speech perception; however, if the auditory system input is disrupted due to hearing loss, auditory and visual inputs cannot be fully integrated. Additionally, temporal coincidence of auditory and visual input is an important factor in integrating the input of these two senses. The acoustic pathway is delayed when the signal passes through digital signal processing. Therefore, this study aimed to investigate the effects of hearing loss and the hearing aid digital delay circuit on sound-induced flash illusion. Subjects and Methods: A total of 13 adults with normal hearing, 13 with mild to moderate hearing loss, and 13 with moderate to severe hearing loss were enrolled in this study. Subsequently, the sound-induced flash illusion test was conducted, and the results were analyzed. Results: The results showed that hearing aid digital delay and hearing loss had no detrimental effect on sound-induced flash illusion. Conclusions: Transmission velocity and neural transduction rate of the auditory inputs decreased in patients with hearing loss. Hence, auditory and visual sensory inputs cannot be integrated completely. However, the transmission rate of the auditory input was approximately normal when a hearing aid was prescribed. Thus, it can be concluded that the processing delay in the hearing aid circuit is insufficient to disrupt the integration of auditory and visual information.
https://doi.org/10.7874/jao.2019.00507

Acoustic Monitoring and Localization for Social Care

Goetze, Stefan;Schroder, Jens;Gerlach, Stephan;Hollosi, Danilo;Appell, Jens-E.;Wallhoff, Frank
- Journal of Computing Science and Engineering / v.6 no.1 / pp.40-50 / 2012
The increase in the number of older people due to demographic change poses great challenges to social healthcare systems in both Western and Eastern countries. Support for older people by formal care givers requires enormous temporal and personal effort. Therefore, one of the most important goals is to increase the efficiency and effectiveness of today's care. This can be achieved by the use of assistive technologies. These technologies are able to increase the safety of patients or to reduce the time needed for tasks that do not relate to direct interaction between the care giver and the patient. Motivated by this goal, this contribution focuses on applications of acoustic technologies to support users and care givers in ambient assisted living (AAL) scenarios. Acoustic sensors are small and unobtrusive, and can easily be added to existing care or living environments. The information gathered by the acoustic sensors can be analyzed to calculate the position of the user by localization, and the context by detection and classification of acoustic events in the captured acoustic signal. In this way, possibly dangerous situations such as falls, screams, or an increased number of coughs can be detected, and appropriate actions can be initiated by an intelligent autonomous system for the acoustic monitoring of older persons. The proposed system is able to reduce the false alarm rate compared to other existing and commercially available approaches that basically rely only on the acoustic level. This is because it explicitly distinguishes between the various acoustic events and provides information on the type of emergency that has taken place. Furthermore, the position of the acoustic event can be determined as contextual information by the system using only the acoustic signal. In this way, the position of the user is known even if she or he does not wear a localization device such as a radio-frequency identification (RFID) tag.
https://doi.org/10.5626/JCSE.2012.6.1.40

Noise Elimination Using Improved MFCC and Gaussian Noise Deviation Estimation

Sang-Yeob, Oh
- Journal of the Korea Society of Computer and Information / v.28 no.1 / pp.87-92 / 2023
With the continuous development of speech recognition systems, the recognition rate for speech has improved rapidly, but such systems still cannot recognize speech accurately when various voices are mixed with noise in the usage environment. To increase the vocabulary recognition rate when processing speech with environmental noise, the noise must be removed. Even with existing HMM, CHMM, GMM, and DNN approaches that use AI models, unexpected noise occurs, or quantization noise is inherently added to the digital signal. When this happens, the source signal is altered or corrupted, which lowers the recognition rate. To solve this problem, the MFCC was improved so as to efficiently extract the features of the speech signal for each voice frame. To remove the noise from the speech signal, a noise removal method using noise deviation estimation based on a Gaussian model was improved and applied. The performance of the proposed model was evaluated using a cross-correlation coefficient to assess the accuracy of speech. The evaluation of the recognition rate of the proposed method confirmed that the difference in the average value of the correlation coefficient was improved by 0.53 dB.
https://doi.org/10.9708/jksci.2023.28.01.087

Real-time implementation of the 2.4kbps EHSX Speech Coder Using a TMS320C6701™ DSP Core (TMS320C6701™을 이용한 2.4kbps EHSX 음성 부호화기의 실시간 구현)

양용호;이인성;권오주
- The Journal of Korean Institute of Communications and Information Sciences / v.29 no.7C / pp.962-970 / 2004
This paper presents an efficient implementation of the 2.4 kbps EHSX (Enhanced Harmonic Stochastic Excitation) speech coder on a TMS320C6701™ floating-point digital signal processor. The EHSX speech codec uses harmonic and CELP (Code Excited Linear Prediction) modeling of the excitation signal for voiced and unvoiced frames, respectively. In this paper, we present optimization methods that reduce the complexity for real-time implementation. The complexity of the filtering in the CELP algorithm, which accounts for the main part of the EHSX algorithm's complexity, can be reduced by converting the program from floating-point variables to fixed-point variables. We also present efficient optimization methods, including code allocation that takes the DSP architecture into account and a low-complexity harmonic/pitch search algorithm in the encoder. Finally, we obtained a subjective quality of MOS 3.28 in a speech quality test using PESQ (Perceptual Evaluation of Speech Quality), ITU-T Recommendation P.862, and achieved the goal of real-time operation of the EHSX codec.

Complexity Reduction Algorithm of Speech Coder(EVRC) for CDMA Digital Cellular System

Min, So-Yeon
- Journal of Korea Multimedia Society / v.10 no.12 / pp.1551-1558 / 2007
The performance of a speech coder for mobile telecommunication is largely evaluated in terms of channel capacity, noise immunity, encryption, complexity, and encoding delay. This study proposes a complexity-reduction algorithm for CDMA (Code Division Multiple Access) mobile telecommunication systems that preserves the existing advantages of speech quality and low transmission rate. The objective of this paper is to reduce the computational complexity by controlling the frequency band nonuniformly during the conversion of LPC (Linear Predictive Coding) coefficients to LSP (Line Spectrum Pairs) parameters in EVRC (Enhanced Variable-Rate Coder, IS-127) speech coders. The experimental results showed that, compared with the existing EVRC speech coder, the speech coder using the proposed algorithm reduced complexity by 45% on average. In addition, the values of the LSP parameters, the synthetic speech signal, and the spectrogram test results were the same as those of the existing method.

Design and Implementation of Speech-Training System for Voice Disorders (발성장애아동을 위한 발성훈련시스템 설계 및 구현)

정은순;김봉완;양옥렬;이용주
- Journal of Internet Computing and Services / v.2 no.1 / pp.97-106 / 2001
In this paper, we design and implement a complement-based speech training system for voice disorders. The system consists of three levels of training: precedent training, training for speech apprehension, and training for speech enhancement. To analyze the speech of people with voice disorders, we extracted speech features such as loudness, amplitude, and pitch using digital signal processing techniques. The system presents the extracted features through a graphical interface to give visual feedback on speech.

Post Processing using Blind Signal Separation in Stereo Acoustic Echo Canceller (스테레오 음향반향제거기의 BSS 후처리방법)

Lee, Haeng Woo
- Journal of Korea Society of Digital Industry and Information Management / v.10 no.1 / pp.131-138 / 2014
This paper presents a stereo acoustic echo canceller that uses blind signal separation for post processing. The convergence speed of a stereo acoustic echo canceller deteriorates because the two residual signals, which are the update signals of each echo canceller, are mixed. To solve this problem, we use the blind signal separation (BSS) method to separate the mixed signals after the echo cancellers. The blind signal separation method can extract the source signals from two input signals by means of iterative computations. We verified the performance of the proposed stereo acoustic echo canceller through simulations. The simulation results show that the stereo acoustic echo canceller using this algorithm operates stably without divergence in the normal state. When speech signals were input, this echo canceller achieved about 2 dB higher ERLE with the BSS post processing method than without it. This stereo echo canceller showed the best performance when a real voice signal was input.
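For readers unfamiliar with the ERLE figure quoted in this abstract, the sketch below computes echo return loss enhancement under its usual definition: the ratio, in dB, of the echo power before cancellation to the residual power after cancellation. The function name `erle_db` and the signals are hypothetical and chosen only so that the two cases differ by 2 dB; they are not data from the paper.

```python
import math


def erle_db(echo, residual):
    """ERLE in dB: 10*log10(mean echo power / mean residual power)."""
    p_echo = sum(v * v for v in echo) / len(echo)
    p_res = sum(v * v for v in residual) / len(residual)
    return 10 * math.log10(p_echo / p_res)


# Made-up echo signal picked up by the microphone.
echo = [math.sin(0.05 * n) for n in range(1000)]

# Hypothetical residuals: amplitude reduced by 20 dB and by 22 dB.
residual_without = [v * 10 ** (-20 / 20) for v in echo]
residual_with = [v * 10 ** (-22 / 20) for v in echo]

improvement_db = erle_db(echo, residual_with) - erle_db(echo, residual_without)
```

With these constructed residuals, `improvement_db` is 2 dB up to floating-point error; a larger ERLE means less echo remains after cancellation.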
https://doi.org/10.17662/ksdim.2014.10.1.131

An Acoustic Echo Canceller for Stereo Using Blind Signal Separation (암묵신호분리를 이용한 스테레오 음향반향제거기)

Lee, Haeng Woo
- Journal of Korea Society of Digital Industry and Information Management / v.8 no.3 / pp.125-131 / 2012
This paper presents a stereo acoustic echo canceller that uses blind signal separation. The convergence speed of a stereo acoustic echo canceller deteriorates because two residual signals are mixed in the update signal of each echo canceller. To solve this problem, we use the blind signal separation (BSS) method to separate the mixed signals. The blind signal separation method can extract the source signals from two input signals by means of iterative computations. We verified the performance of the proposed stereo acoustic echo canceller through simulations. The simulation results show that the stereo acoustic echo canceller using this algorithm operates stably without divergence in the normal state. When speech signals were input, this echo canceller achieved about 3 dB higher ERLE with the BSS algorithm than without it. However, this echo canceller did not perform well when white noise was input as the stereo signals.

A Study on the Performance of Noise Reduction using Multi-Microphones for Digital Hearing Aids (디지털 보청기를 위한 다중 마이크로폰을 이용한 잡음제거 성능 연구)

Kang, Hyun-Deok;Song, Young-Rok;Lee, Sang-Min
- Journal of IKEEE / v.14 no.1 / pp.47-54 / 2010
In this study, we analyzed noise reduction in a noisy environment using 2, 3, 4, or 5 microphones in digital hearing aids. So that the results could be applied to actual digital hearing aids, we built an experimental microphone set similar to the behind-the-ear (BTE) type and recorded the signal in each situation. We then reduced the noise in each recorded signal with a noise reduction algorithm using multiple microphones. Comparing the SNR (Signal to Noise Ratio) and PESQ (Perceptual Evaluation of Speech Quality) measurements before and after noise reduction showed that the improvement in performance was highest when three or four microphones were used. Generally, when two or more microphones were used, we found that performance increased as the number of microphones increased.