Search | Korea Science

A simulation study of speech perception enhancement for cochlear implant patients using companding in noisy environment (잡음 환경에서 압신을 이용한 인공 와우 환자의 언어 인지 향상 시뮬레이션 연구)

Lee Young-Woo;Ji Yoon-Sang;Lee Jong-Shil;Kim In-Young;Kim Sun-I.;Hong Sung-Hwa;Lee Sang-Min
- Journal of the Institute of Electronics Engineers of Korea SP / v.43 no.5 s.311 / pp.79-87 / 2006
In this study, we evaluated the performance of a companding strategy as a preprocessing step for speech enhancement and noise reduction. The proposed algorithm is based on two-tone suppression, a characteristic of human hearing. The algorithm enhances the spectral peaks of the speech signal and reduces background noise; however, it trades off speech distortion against noise reduction because of the limited number of channels and the nonlinear block. Therefore, we designed two companding structures with different balances of noise reduction and speech distortion, and identified the suitable structure according to each individual's speech perception ability in noisy environments. We thus propose a way to enhance the speech perception of cochlear implant users in noisy environments with low SNR. The performance of the proposed algorithm was evaluated with 5 normal-hearing listeners using noise-band simulation. Speech perception improved for all subjects, and each subject preferred a different type of companding structure.

Adaptive Block Recovery Based on Subband Energy and DC Value in Wavelet Domain (웨이블릿 부대역의 에너지와 DC 값에 근거한 적응적 블록 복구)

Hyun, Seung-Hwa;Eom, Il-Kyu;Kim, Yoo-Shin
- Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.5 s.305 / pp.95-102 / 2005
When images compressed with block-based compression techniques are transmitted over a noisy channel, unexpected block losses occur. In this paper, we present a post-processing-based block recovery scheme using Haar wavelet features. Failing to consider the edge direction when recovering lost blocks can cause block-blurring effects. The proposed directional recovery method is effective for strong edges because it adaptively exploits different neighboring blocks according to the edges and the directional information in the image. First, neighboring blocks are selected adaptively based on the energy of wavelet subbands (EWS) and the difference of DC values (DDC). The lost blocks are then recovered by linear interpolation in the spatial domain using the selected blocks. The method using only EWS performs well for horizontal and vertical edges, but not as well for diagonal edges. Conversely, the method using only DDC performs well for diagonal edges, with the exception of line- or roof-type edge profiles. Therefore, we combined EWS and DDC for better results. The proposed methods outperformed previous methods that use fixed blocks.
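The spatial-domain linear interpolation step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the two neighboring blocks have already been selected; the EWS/DDC selection rule itself is not reproduced here, and the `direction` argument is a hypothetical stand-in for its outcome. The function name, its parameters, and the distance weighting are illustrative choices, not the authors' implementation.

```python
def interpolate_block(first_edge, second_edge, size, direction):
    """Fill a size x size lost block by linear interpolation.

    direction="horizontal": first_edge / second_edge are the pixel columns
    bordering the block on the left / right (one value per row).
    direction="vertical": they are the pixel rows bordering the block
    above / below (one value per column).
    """
    block = [[0.0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            if direction == "horizontal":
                k, line = c, r   # interpolate along columns, per row
            else:
                k, line = r, c   # interpolate along rows, per column
            # Weight grows with distance from the first edge (assumed scheme).
            w = (k + 1) / (size + 1)
            block[r][c] = (1 - w) * first_edge[line] + w * second_edge[line]
    return block


# Usage: a 3x3 block between a dark left neighbor and a brighter right one.
recovered = interpolate_block([0, 0, 0], [40, 40, 40], 3, "horizontal")
```

Each recovered pixel is a distance-weighted blend of the two bordering pixels on its row or column, so an edge running across the chosen direction is carried through the block rather than averaged away.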

Low Complexity Video Encoding Using Turbo Decoding Error Concealments for Sensor Network Application (센서네트워크상의 응용을 위한 터보 복호화 오류정정 기법을 이용한 경량화 비디오 부호화 방법)

Ko, Bong-Hyuck;Shim, Hyuk-Jae;Jeon, Byeung-Woo
- Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.1 / pp.11-21 / 2008
In conventional video coding, the encoder is far more complex than the decoder. However, as demand grows for extremely simple encoders in energy-constrained environments such as sensor networks, much research has sought to eliminate motion prediction/compensation, which accounts for most of the encoder's complexity and energy consumption. Wyner-Ziv coding, one of the representative schemes for this problem, reconstructs video at the decoder by correcting noise in the side information using a channel coding technique such as turbo codes. Since the encoder generates only parity bits without extracting any correlation information between frames, it has an extremely simple structure. However, turbo decoding errors occur when the side information is noisy. When there is high motion or occlusion between frames, more turbo decoding errors appear in the reconstructed frame and resemble salt-and-pepper noise. This severely degrades subjective video quality even though such noise occurs rarely. In this paper, we propose an extremely lightweight encoder based on symbol-level Wyner-Ziv coding and a corresponding new decoder that decides whether each pixel is erroneous and applies a median filter selectively, in order to minimize the loss of texture detail from filtering. The proposed method achieves extremely low encoder complexity and improves both subjective quality and PSNR. Our experiments verified an average PSNR gain of up to 0.8 dB.
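The idea of filtering only pixels judged to be erroneous can be sketched as follows. The abstract does not say how the decoder decides that a pixel is in error, so the test used here (deviation from the local 3x3 median above a `threshold`) is a hypothetical stand-in, and all names are illustrative.

```python
from statistics import median


def neighborhood(img, r, c):
    """Pixel values in the 3x3 window around (r, c), clipped at the borders."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(r - 1, 0), min(r + 2, rows))
            for j in range(max(c - 1, 0), min(c + 2, cols))]


def selective_median(img, threshold=50):
    """Replace only pixels judged erroneous with the local 3x3 median.

    The error decision (absolute deviation from the local median greater
    than `threshold`) is an assumption made for this sketch.
    """
    out = [row[:] for row in img]  # untouched pixels keep their texture
    for r in range(len(img)):
        for c in range(len(img[0])):
            m = median(neighborhood(img, r, c))
            if abs(img[r][c] - m) > threshold:
                out[r][c] = m
    return out


# Usage: one salt-like outlier in an otherwise smooth patch.
frame = [[10, 12, 11],
         [13, 255, 12],
         [11, 10, 14]]
cleaned = selective_median(frame)
```

Compared with filtering every pixel, this leaves pixels that pass the test unchanged, which is the property the abstract relies on to limit the loss of texture detail.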

Improving Recall for Context-Sensitive Spelling Correction Rules using Conditional Probability Model with Dynamic Window Sizes (동적 윈도우를 갖는 조건부확률 모델을 이용한 한국어 문맥의존 철자오류 교정 규칙의 재현율 향상)

Choi, Hyunsoo;Kwon, Hyukchul;Yoon, Aesun
- Journal of KIISE / v.42 no.5 / pp.629-636 / 2015
The types of errors corrected by a Korean spelling and grammar checker can be classified into isolated-term spelling errors and context-sensitive spelling errors (CSSE). CSSEs are difficult to detect and correct, since they are valid words when examined alone. Thus, they can be corrected only by considering their semantic and syntactic relations to the context. CSSEs, which are frequently made even by expert writers, significantly affect the reliability of spelling and grammar checkers. An existing Korean spelling and grammar checker developed by P University (KSGC 4.5) adopts hand-made rules for correcting CSSEs. KSGC 4.5 is designed to obtain very high precision, which results in extremely low recall. The overall goal of our previous work was to improve recall without considerably lowering precision, by generalizing CSSE correction rules that depend mainly on linguistic knowledge. A variety of rule-based methods were proposed in that work, and the best of them achieved an average precision of 95.19% and a recall of 37.56%. To improve recall further, this study proposes a statistics-based method using a conditional probability model with dynamic window sizes. The proposed method obtained an average precision of 97.23% and a recall of 50.50%.
https://doi.org/10.5626/JOK.2015.42.5.629

Speech extraction based on AuxIVA with weighted source variance and noise dependence for robust speech recognition (강인 음성 인식을 위한 가중화된 음원 분산 및 잡음 의존성을 활용한 보조함수 독립 벡터 분석 기반 음성 추출)

Shin, Ui-Hyeop;Park, Hyung-Min
- The Journal of the Acoustical Society of Korea / v.41 no.3 / pp.326-334 / 2022
In this paper, we propose a speech enhancement algorithm as pre-processing for robust speech recognition in noisy environments. Auxiliary-function-based Independent Vector Analysis (AuxIVA) is performed with a weighted covariance matrix using time-varying variances with a scaling factor from target masks representing the time-frequency contributions of the target speech. The mask estimates can be obtained using a Neural Network (NN) pre-trained for speech extraction, or from diffuseness using the Coherence-to-Diffuse power Ratio (CDR) to find the direct-sound component of the target speech. In addition, outputs for omni-directional noise are closely chained by sharing the time-varying variances, similarly to independent subspace analysis or IVA. The AuxIVA-based speech extraction method is also performed in the Independent Low-Rank Matrix Analysis (ILRMA) framework by extending the Non-negative Matrix Factorization (NMF) for noise outputs to Non-negative Tensor Factorization (NTF), to maintain the inter-channel dependency in the noise output channels. Experimental results on the CHiME-4 datasets demonstrate the effectiveness of the presented algorithms.
https://doi.org/10.7776/ASK.2022.41.3.326

Evaluation of Debonding Defects in Railway Concrete Slabs Using Shear Wave Tomography (전단파 토모그래피를 활용한 철도 콘크리트 궤도 슬래브 층분리 결함 평가)

Lee, Jin-Wook;Kee, Seong-Hoon;Lee, Kang Seok
- Journal of the Korea institute for structural maintenance and inspection / v.26 no.3 / pp.11-20 / 2022
The main purpose of this study is to investigate the applicability of shear wave tomography as a non-destructive testing method for evaluating debonding between the track concrete layer (TCL) and the hydraulically stabilized base course (HSB) of concrete slab tracks in the Korea high-speed railway system. A commercially available multi-channel shear wave measurement device (MIRA) is used to evaluate debonding defects in a full-scale mock-up test specimen that was designed and constructed according to the Rheda 200 system. Part of the mock-up specimen includes two artificial debonding defects, each 400 mm long and 400 mm wide, with thicknesses of 5 mm and 10 mm, respectively. The tomography images obtained with the MIRA on the surface of the concrete specimen are effective for visualizing the debonding defects in concrete. In this study, a simple image processing method is proposed to suppress the noisy signals reflected from items embedded in the TCL (reinforcing steel, precast sleepers, inserts, etc.), which significantly improves the readability of debonding defects in shear wave tomography images. Results show that the debonding maps constructed in this study are effective for visualizing the spatial distribution and depths of the debonding defects in the railway concrete slab specimen.
https://doi.org/10.11112/jksmi.2022.26.3.11

An Acoustic Event Detection Method in Tunnels Using Non-negative Tensor Factorization and Hidden Markov Model (비음수 텐서 분해와 은닉 마코프 모델을 이용한 터널 환경에서의 음향 사고 검지 방법)

Kim, Nam Kyun;Jeon, Kwang Myung;Kim, Hong Kook
- Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.8 no.9 / pp.265-273 / 2018
In this paper, we propose an acoustic event detection method for tunnels using non-negative tensor factorization (NTF) and a hidden Markov model (HMM) applied to multi-channel audio signals. Incidents in tunnels are inherent to the system and occur unavoidably with a known probability. They range from minor accidents to major disasters. Most incident detection systems deploy visual incident detection (VID), which often causes false alarms due to constraints such as night obstacles and a limited viewing angle. To address this, the proposed method first uses an NTF technique to separate and detect every acoustic event, each assumed to be an in-tunnel incident, from noisy acoustic signals. Then, maximum likelihood estimation using Gaussian mixture model (GMM)-HMMs is carried out to verify whether each detected event is an actual incident. Performance evaluation shows that the proposed method operates in real time and achieves high detection accuracy under simulated tunnel conditions.
https://doi.org/10.21742/AJMAHS.2018.09.66

A Novel Approach to a Robust A Priori SNR Estimator in Speech Enhancement (음성 향상에서 강인한 새로운 선행 SNR 추정 기법에 관한 연구)

Park, Yun-Sik;Chang, Joon-Hyuk
- The Journal of the Acoustical Society of Korea / v.25 no.8 / pp.383-388 / 2006
This paper presents a novel approach to single-channel microphone speech enhancement in noisy environments. Widely used noise reduction techniques based on spectral subtraction are generally expressed as a spectral gain that depends on the signal-to-noise ratio (SNR). The well-known decision-directed (DD) estimator of Ephraim and Malah efficiently reduces musical noise under background noise conditions, but introduces a delay in the a priori SNR estimate because the DD weights the speech spectral component of the previous frame. As a result, the noise suppression gain, which depends on the a priori SNR estimated by the DD, matches the previous frame rather than the current one, and this degrades noise reduction performance during speech transient periods. We propose a computationally simple but effective speech enhancement technique that uses a sigmoid-type function for the weight parameter of the DD. The proposed approach solves the delay problem of the DD's main parameter, the a priori SNR, while maintaining the benefits of the DD. The performance of the proposed enhancement algorithm is evaluated by ITU-T P.862 Perceptual Evaluation of Speech Quality (PESQ), the Mean Opinion Score (MOS), and speech spectrograms under various noise environments, and it yields better results than the DD with a fixed weight parameter.
https://doi.org/10.7776/ASK.2006.25.8.383
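
The decision-directed estimator referred to in the abstract above is commonly written as xi(n) = a * |A(n-1)|^2 / lambda_d + (1 - a) * max(gamma(n) - 1, 0), where gamma is the a posteriori SNR and a is the weight parameter. The sketch below tracks one frequency bin with a Wiener gain and compares a fixed weight with an adaptive one. The abstract does not give the authors' sigmoid function, so `sigmoid_weight`, its input (the frame-to-frame change in a posteriori SNR), and its parameters are assumptions made purely for illustration.

```python
import math


def wiener_gain(xi):
    """Wiener suppression gain for a priori SNR xi."""
    return xi / (1.0 + xi)


def sigmoid_weight(delta, a_min=0.1, a_max=0.98, slope=1.0, center=3.0):
    """Hypothetical weight: close to a_max when the a posteriori SNR changes
    little between frames, falling toward a_min during transients."""
    return a_min + (a_max - a_min) / (1.0 + math.exp(slope * (delta - center)))


def dd_a_priori_snr(noisy_power, noise_power, weight_fn=None, alpha=0.98):
    """Decision-directed a priori SNR track for one frequency bin.

    noisy_power: per-frame |Y|^2 values; noise_power: noise variance estimate.
    weight_fn: optional map from |gamma(n) - gamma(n-1)| to a weight; when
    None, the fixed weight alpha is used (the classic DD estimator).
    """
    xi_track = []
    prev_clean_power = 0.0  # |A(n-1)|^2, zero before the first frame
    prev_gamma = None
    for y2 in noisy_power:
        gamma = y2 / noise_power          # a posteriori SNR
        ml_term = max(gamma - 1.0, 0.0)   # instantaneous SNR estimate
        if weight_fn is None or prev_gamma is None:
            a = alpha
        else:
            a = weight_fn(abs(gamma - prev_gamma))
        xi = a * prev_clean_power / noise_power + (1.0 - a) * ml_term
        prev_clean_power = (wiener_gain(xi) ** 2) * y2  # |A(n)|^2
        prev_gamma = gamma
        xi_track.append(xi)
    return xi_track


# Usage: noise-only frames followed by a sudden speech onset.
frames = [1.0, 1.0, 21.0, 21.0]
fixed = dd_a_priori_snr(frames, 1.0)
adaptive = dd_a_priori_snr(frames, 1.0, weight_fn=sigmoid_weight)
```

At the onset frame the fixed-weight estimate stays low because it leans on the previous frame, while the adaptive weight lets the estimate follow the current frame. This is the delay the abstract describes, shown here under the assumed weighting rule only.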
