Integrated Search | Korea Science

A Study on Voice Activity Detection Using Auditory Scene and Periodic to Aperiodic Component Ratio in CASA System

김정호;고형화;강철호
- 전자공학회논문지 / Vol. 50, No. 10 / pp. 181-187 / 2013
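Through auditory scene analysis, human hearing can pick out a target speech signal even amid background noise or when several people are speaking at once. A CASA (computational auditory scene analysis) system, which models the human auditory system well, can be used to separate speech. However, the performance of a CASA system degrades when the location of speech within the CASA segments is determined incorrectly. To reduce the performance loss caused by incorrectly located speech regions in a CASA system, this paper proposes a voice activity detection algorithm that combines the auditory scene with the periodic-to-aperiodic component ratio (PAR). To evaluate voice activity detection performance, experiments were carried out in white noise and car noise environments at varying signal-to-noise ratios (SNR). Compared with existing algorithms (Pitch and Guoning Hu) at SNRs from 15 dB down to 0 dB, the proposed algorithm improved voice activity detection accuracy in white noise and car noise by up to 4% at an SNR of 15 dB and by up to 34% at 0 dB.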
https://doi.org/10.5573/ieek.2013.50.10.181
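
As a rough illustration of the periodic-to-aperiodic ratio (PAR) idea in the abstract above, the sketch below scores how periodic a single frame is from its normalized autocorrelation peak. This is not the authors' algorithm: the autocorrelation-based estimate, the function name `periodic_to_aperiodic_ratio`, the lag range, and the 8 kHz test frames are all assumptions chosen for illustration.

```python
import math
import random


def periodic_to_aperiodic_ratio(frame, min_lag=20, max_lag=160):
    """Estimate a periodic/aperiodic energy ratio for one frame.

    Assumption: the periodic share of the energy is approximated by the
    largest normalized autocorrelation r over the lag range, and the
    ratio is r / (1 - r). This stands in for the paper's PAR definition,
    which the abstract does not give.
    """
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        head = frame[:-lag]
        tail = frame[lag:]
        num = sum(a * b for a, b in zip(head, tail))
        den = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
        if den > 0.0:
            best = max(best, num / den)
    best = min(best, 0.999)  # keep the ratio finite for perfectly periodic input
    return best / (1.0 - best)


# Hypothetical test frames: a 100 Hz tone and white noise, 400 samples at 8 kHz.
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(400)]
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]

print("tone :", periodic_to_aperiodic_ratio(tone))
print("noise:", periodic_to_aperiodic_ratio(noise))
```

A voiced (periodic) frame should score far higher than a noise frame, which is the property a detector of this kind could threshold on.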

Audio Watermarking Technique Based on Digital Filter

신승원;김종원;최종욱
- 한국정보보호학회:학술대회논문집 / 한국정보보호학회 2001년도 종합학술발표회논문집 / pp. 464-468 / 2001
In this paper, we propose a robust watermarking technique that withstands time scaling, pitch shifting, additive noise, and lossy compression such as MP3, AAC, and WMA. The technique is based on digital filtering. Because the digital filters are designed according to the critical bands of the HAS (human auditory system), they hardly affect audio quality. Before digital filtering is applied, a wavelet transform decomposes the audio signal into several signals, each composed of specific frequencies. The designed digital filters then scan the decomposed signals. Each designed filter, a band-stop filter, distorts and eliminates specific frequencies of the audio signal. Watermark detection can be accomplished with the FFT (Fast Fourier Transform). First, segments of the audio signal are transformed by FFT. Then, the amplitude spectra obtained by FFT are summed repeatedly. Finally, the watermark detector identifies the filters used for watermark encoding from the eliminated frequencies. The suggested technique can embed 4 bits/s in a robust manner.

Speech Enhancement Based on Psychoacoustic Model

Lee, Jingeol;Kim, Soowon
- The Journal of the Acoustical Society of Korea / Vol. 19, No. 3E / pp. 12-18 / 2000
Methods based on psychoacoustic models have recently been introduced to enhance speech signals corrupted by ambient noise. In particular, the perceptual filter is derived analytically so that the frequency content of the noisy input signal matches that of the estimated clean signal in the auditory domain. However, the analytical derivation relies on a deconvolution associated with the spreading function in the psychoacoustic model, which results in an ill-conditioned problem. To cope with this problem, we propose a novel psychoacoustic-model-based speech enhancement filter that follows the same principle as the perceptual filter but is derived by constrained optimization, which provides solutions to the ill-conditioned problem. Experiments with artificially generated signals demonstrate that the proposed filter operates according to this principle. The proposed filter outperforms the perceptual filter provided that the clean speech signal is separable from the noise.

Perturbation and Perceptual Analysis of Pathological Sustained Vowels according to Signal Typing

이지연;최성희;;한민수;최홍식
- 말소리와 음성과학 / Vol. 2, No. 2 / pp. 109-115 / 2010
In this paper, we investigate signal typing based on the visual impression of distinctive spectrograms. Pathological voices are classified into signal type 1, 2, 3, or 4 in order to estimate perturbation parameters and to assign perceptual ratings based on the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V). The results suggest that perturbation analysis can be applied only to type 1 and type 2 signals, and that perceptual ratings of overall grade generally increase with signal type. Good inter-rater reliability was shown among the three raters. We recommend that pathological voices be labeled with both signal type and CAPE-V ratings to describe their characteristics definitively.

Asymmetric Flankers in Comodulation Masking Release

Pourbakht, Akram;Faraji, Leila
- Journal of Audiology & Otology / Vol. 23, No. 1 / pp. 27-32 / 2019
Background and Objectives: Detection of auditory signals may be improved when maskers far from the frequency of the target signal are coherently amplitude-modulated. This improvement in signal detection is called comodulation masking release (CMR). In CMR experiments, flankers have usually been arranged symmetrically. In practice, symmetric flankers pose a problem because of the limited output of clinical audiometers, especially at high frequencies. We aimed to check whether flanker arrangement has any effect on the amount of CMR, especially when there are no flankers with a frequency higher than the signal. Subjects and Methods: Eighteen normal-hearing listeners aged 20 to 46 years participated. Symmetric (2-2) and asymmetric (3-1 and 4-0) flankers were used, and the amount of CMR was compared among them. Results: With the same number of flankers, there were no statistically significant differences in CMR between symmetric and asymmetric arrangements. Likewise, when there was no flanker at a frequency higher than the signal and all flankers were placed below the signal, there was no statistically significant difference from the symmetric arrangement. Conclusions: Asymmetry of the flankers, including omission of flankers with a frequency higher than the signal, has no effect on CMR results. We concluded that CMR can be considered using a clinical audiometer.
https://doi.org/10.7874/jao.2018.00192

A Digital Audio Watermark Using Wavelet Transform and Masking Effect

Hwang, Won-Young;Kang, Hwan-Il;Han, Seung-Soo;Kim, Kab-Il;Kang, Hwan-Soo
- 대한전자공학회:학술대회논문집 / 대한전자공학회 2003년도 컴퓨터소사이어티 추계학술대회논문집 / pp. 243-246 / 2003
In this paper, we propose a new digital audio watermarking technique based on the wavelet transform. The watermark is embedded by eliminating unnecessary information from the audio signal according to the human auditory system (HAS). The algorithm does not require any original audio information in the watermark extraction process. The masking effect used for watermarking is the post-temporal masking effect. We construct a window with the synchronization signal and extract the best frame in the window using the zero-crossing rate (ZCR) and the energy of the audio signal. The watermark can be extracted using the correlation between the watermark signal and the corresponding portion of the frame. Experimental results show good robustness against MPEG1-layer3 compression and other common signal processing manipulations. All attacks are applied after D/A/D conversion.
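
The frame selection step in the abstract above relies on two standard short-time features, the zero-crossing rate and the energy. The sketch below computes both and picks a frame with them. The selection rule (`pick_frame`, highest energy among frames whose ZCR stays under a threshold), the 0.2 threshold, and the test signal are hypothetical; the abstract only states that ZCR and energy are used.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0.0) != (b >= 0.0)
    )
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(s * s for s in frame) / len(frame)


def pick_frame(signal, frame_len, max_zcr=0.2):
    """Return the start index of the highest-energy frame with ZCR <= max_zcr.

    Assumption: this selection rule and the 0.2 threshold are hypothetical;
    the abstract only says that ZCR and energy are used.
    """
    best_start, best_energy = None, -1.0
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = short_time_energy(frame)
        if zero_crossing_rate(frame) <= max_zcr and energy > best_energy:
            best_start, best_energy = start, energy
    return best_start


# Hypothetical signal: a quiet tone, a loud tone, then a loud alternating
# (high-ZCR) segment, each 200 samples long.
quiet = [0.1 * math.sin(2 * math.pi * n / 50) for n in range(200)]
loud = [0.8 * math.sin(2 * math.pi * n / 50) for n in range(200)]
buzzy = [1.0 if n % 2 == 0 else -1.0 for n in range(200)]
signal = quiet + loud + buzzy

print(pick_frame(signal, frame_len=200))
```

Here the loud tone is selected: the alternating segment has more energy but is rejected by the ZCR test.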

Estimation of Single Evoked Potential Using ARX Model and Adaptive Filter

김명남;조진호
- 대한의용생체공학회:의공학회지 / Vol. 10, No. 3 / pp. 303-308 / 1989
A new estimation method for the single EP (evoked potential) using an adaptive algorithm and a parametric model is proposed. Since the EEG (electroencephalogram) signal is stationary over a short time interval, the AR (autoregressive) parameters of the EEG are estimated by the Burg algorithm using the EEG of the prestimulus interval. After the stimulus, the single EP is estimated by the adaptive algorithm. The validity of this method is verified by simulation on generated auditory single EPs based on the parametric model.
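
The abstract above estimates AR parameters with the Burg algorithm. A minimal sketch of Burg's method follows, assuming a plain real-valued signal. The function name `burg_ar` and the synthetic AR(2) test signal are illustrative and not taken from the paper, which applies the method to prestimulus EEG.

```python
import random


def burg_ar(x, order):
    """Estimate AR coefficients a[0..order] (a[0] = 1) with Burg's method.

    The model convention is x[n] + a[1]*x[n-1] + ... + a[p]*x[n-p] = e[n].
    """
    n = len(x)
    forward = list(x)   # forward prediction errors
    backward = list(x)  # backward prediction errors
    coeffs = [1.0]
    for m in range(order):
        # Reflection coefficient from forward errors and delayed backward errors.
        num = sum(forward[i] * backward[i - 1] for i in range(m + 1, n))
        den = sum(forward[i] ** 2 + backward[i - 1] ** 2 for i in range(m + 1, n))
        k = -2.0 * num / den
        # Levinson-style update of the coefficient vector.
        padded = coeffs + [0.0]
        coeffs = [padded[i] + k * padded[m + 1 - i] for i in range(m + 2)]
        # Update the error sequences; iterate downward so backward[i - 1]
        # still holds the previous-stage value when it is read.
        for i in range(n - 1, m, -1):
            f_old, b_old = forward[i], backward[i - 1]
            forward[i] = f_old + k * b_old
            backward[i] = b_old + k * f_old
    return coeffs


# Hypothetical test signal: x[n] = 1.5*x[n-1] - 0.7*x[n-2] + white noise,
# so the estimate should be close to [1, -1.5, 0.7].
rng = random.Random(0)
x = [0.0, 0.0]
for _ in range(4000):
    x.append(1.5 * x[-1] - 0.7 * x[-2] + rng.gauss(0.0, 1.0))

a = burg_ar(x, order=2)
print(a)
```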

A Study on Physiological Signal Changes Associated with Auditory Sensibility

황민철;김지은;김철중
- 대한인간공학회:학술대회논문집 / 대한인간공학회 1996년도 춘계학술대회논문집 / pp. 259-263 / 1996
A psychological action is a physiological response to an external stimulus. The physiological response is accompanied by physiological signals such as EEG, EMG, GSR, ECG, and BP. Physiological signals have recently been studied as a way to determine a person's psychological state. Psychological activity produces electric potentials in the brain, so physiological signals are considered a measure of psychological state. Auditory sensibility, one of the human senses, may distinguish positive from negative feelings. Variations in EEG and GSR with the auditory quality of a stimulus can characterize negative and positive mental states. This study aims to characterize parameters that can determine negative and positive psychological states.

Fixed-point Optimization of a Multi-channel Digital Hearing Aid Algorithm

이근상;백용현;박영철
- 한국정보전자통신기술학회논문지 / Vol. 2, No. 2 / pp. 37-43 / 2009
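This paper presents an optimization technique for a multi-channel digital hearing aid algorithm implemented on a fixed-point processor suited to low-power systems. First, the input signal is split into frequency bands using a fast MDCT (modified discrete cosine transform), which minimizes the complexity of the algorithm. The MDCT outputs are then grouped into channels through nonlinear band division, and the gain of each channel signal is adjusted using a loudness compensation function (LCF) table built according to the hearing-impaired listener's degree of hearing loss. Finally, the compensated output is synthesized through an IMDCT (inverse MDCT), which implements the TDAC technique. All processing is implemented in 16-bit integer arithmetic, and the logarithmic computations needed to calculate the gain are implemented with a precomputed table and a fast search algorithm. The performance of the resulting hearing aid algorithm was evaluated through computer simulation.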
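
The abstract above replaces logarithm computations with a precomputed table and a fast search. One way this could look is sketched below: a table of integer thresholds built offline, with a binary search at run time so that only integer comparisons are needed. The names `THRESHOLDS` and `level_db`, the 1 dB step, and the 0 to 90 dB range are assumptions for illustration, not the paper's actual table layout.

```python
import bisect
import math

# Precomputed offline: THRESHOLDS[d] is the smallest 16-bit magnitude whose
# level is at least d dB (relative to a magnitude of 1). Assumption: a 1 dB
# step and a 0..90 dB range are illustrative choices.
THRESHOLDS = [math.ceil(10.0 ** (d / 20.0)) for d in range(91)]


def level_db(sample):
    """Return floor(20*log10(|sample|)) using only integer comparisons.

    A binary search over the precomputed table replaces the logarithm.
    """
    magnitude = min(abs(sample), 32767)
    if magnitude == 0:
        return 0  # treat silence as the lowest table entry
    return bisect.bisect_right(THRESHOLDS, magnitude) - 1


for s in (1, 2, 150, -4000, 32767):
    print(s, level_db(s))
```

The resulting level could then index a per-channel gain table such as the LCF table the abstract describes.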
