Integrated Search | Korea Science

A Speaker Change Detection Experiment that Uses a Statistical Method

이경록;김진영
- 음성과학 / Vol. 8, No. 4 / pp. 59-72 / 2001
This paper reports an experiment on statistical speaker change detection for a News on Demand (NOD) service. In news data, a change of speaker usually signals a change of content, so detecting speaker changes helps identify the content of each segment. Speaker change detection acts as a preprocessor that divides the input speech by speaker, which is an important preprocessing step for speaker tracking. We detected speaker changes using GLR (generalized likelihood ratio) distance-based segmentation and BIC (Bayesian information criterion)-based segmentation, both metric-based methods. The experiment first divided the speech into speaker units with the GLR distance-based method and then verified the speaker change points with BIC-based segmentation. At an MDR (Missed Detection Rate) of about 15%, the FAR (False Alarm Rate) was 63.29 in the high-noise environment and 54.28 in the low-noise environment.
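The BIC verification step described in the abstract above can be illustrated with a minimal sketch. It assumes scalar (1-D) features and a single Gaussian per segment; the function names, the `penalty_weight` parameter, and the small `eps` guard are illustrative choices, not details taken from the paper, which would normally work with multi-dimensional cepstral features.

```python
import math

def variance(xs):
    """Maximum-likelihood variance of a list of scalar features."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def delta_bic(frames, split, penalty_weight=1.0):
    """Delta-BIC for a candidate change point in a 1-D feature sequence.

    Compares one Gaussian fitted to all frames against two Gaussians
    fitted to frames[:split] and frames[split:]. A positive value favours
    the two-segment hypothesis, i.e. a speaker change at `split`.
    """
    left, right = frames[:split], frames[split:]
    n, n1, n2 = len(frames), len(left), len(right)
    eps = 1e-12  # assumption: guards log(0) for constant segments
    gain = 0.5 * (n * math.log(variance(frames) + eps)
                  - n1 * math.log(variance(left) + eps)
                  - n2 * math.log(variance(right) + eps))
    # A 1-D Gaussian has 2 free parameters (mean and variance).
    penalty = 0.5 * penalty_weight * 2 * math.log(n)
    return gain - penalty

# Toy data: two segments with clearly different means.
speaker_a = [0.0, 0.1, -0.1, 0.05, -0.05] * 4
speaker_b = [5.0, 5.1, 4.9, 5.05, 4.95] * 4
score = delta_bic(speaker_a + speaker_b, split=20)
```

A candidate point produced by the GLR stage would be kept when the score is positive and discarded otherwise; raising `penalty_weight` trades false alarms for missed detections.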

Acoustic Analysis of Reinke Edema

김상균;최홍식;공석철;홍원표
- 대한후두음성언어의학회지 / Vol. 7, No. 1 / pp. 11-19 / 1996
Reinke's edema describes varying degrees of chronic swelling of the vocal folds. The acoustic analysis of Reinke's edema has not been reported so far in this country. The purpose of this study is to clarify the acoustic and aerodynamic characteristics of Reinke's edema. Several acoustic evaluations and aerodynamic studies were performed in 20 patients with Reinke's edema, and the data were compared with those of 20 normal controls. Videolaryngoscopy was also performed to grade severity. We used C-Speech, Doctor Speech Science, and a phonatory function analyser. With C-Speech, we compared jitter, shimmer, and SNR (signal-to-noise ratio) between normal subjects and Reinke's edema patients. With Doctor Speech Science, we compared NNE (glottal noise energy), speech fundamental frequency, and voice quality between the two groups. With the phonatory function analyser, used for aerodynamic function testing, we compared speech intensity, airflow rate, and expiratory pressure between the two groups. In conclusion, Reinke's edema patients showed lower voice pitch than normal subjects; in addition, jitter, shimmer, SNR, NNE, airflow rate, and expiratory pressure may be meaningful parameters for diagnosis and for treatment prognosis.

A Noise Robust Speech Recognition Method Using Model Compensation Based on Speech Enhancement

신광호;정호열;정현열
- 한국음향학회지 / Vol. 27, No. 4 / pp. 191-199 / 2008
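This paper proposes MWF-PMC, a noise-handling method for speech recognition in noisy environments. It enhances the input speech with Mel-warped Wiener Filtering (MWF) in the preprocessing stage and compensates the recognition models with Parallel Model Combination (PMC) in the postprocessing stage. PMC takes the residual noise from the silence intervals of the enhanced speech and uses it to compensate recognition models trained on clean speech, which improves recognition performance in noisy environments. The speech data for the recognition experiments were the 452-word PBW (Phoneme Balanced Words) set produced by 국어공학연구소 (KLE), down-sampled to 8 kHz and mixed with Subway, Car, and Exhibition noise at five signal-to-noise ratio (SNR) levels: 0, 5, 10, 15, and 20 dB. In the recognition experiments, the proposed MWF-PMC method achieved better overall recognition performance than existing combined methods, confirming its effectiveness.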
https://doi.org/10.7776/ASK.2008.27.4.191
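As a rough illustration of the model compensation step in the abstract above, the sketch below applies the log-add approximation, one common simplification of PMC that combines clean-speech and noise means in the log-spectral domain. It is not necessarily the PMC variant used in the paper; the function name and the `gain` parameter are assumptions made for this example, and variances are left uncompensated.

```python
import math

def log_add_combine(clean_log_means, noise_log_means, gain=1.0):
    """Combine clean-speech and noise means given in the log-spectral domain.

    Log-add approximation, applied per filter-bank channel:
        mu_noisy = log(gain * exp(mu_clean) + exp(mu_noise))
    Variances are assumed unchanged and are not handled here.
    """
    return [math.log(gain * math.exp(s) + math.exp(n))
            for s, n in zip(clean_log_means, noise_log_means)]

# Hypothetical 3-channel example: the noise means would be estimated from
# the residual noise in silence intervals of the enhanced speech.
clean = [2.0, 1.0, 0.5]
residual_noise = [0.0, 0.5, 1.5]
compensated = log_add_combine(clean, residual_noise)
```

In channels where the noise is much weaker than the speech, the compensated mean stays close to the clean mean; where the noise dominates, it moves toward the noise mean.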

A Generalized Subspace Approach for Enhancing Speech Corrupted by Colored Noise Using Voice Activity Detector (VAD)

손경식;김현태
- 한국정보통신학회논문지 / Vol. 17, No. 8 / pp. 1769-1776 / 2013
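This paper proposes a modified algorithm that adds a VAD (voice activity detector) to the YL approach, a speech enhancement algorithm for speech signals corrupted by colored noise. The proposed algorithm was compared with the YL approach and the LS approach by computer simulation. The colored noises, car noise and multi-talker babble noise, were taken from the AURORA database, and the speech signals were taken from the TIMIT database. In the experiments, the proposed method improved on the two existing algorithms in terms of signal-to-noise ratio and spectral distortion.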
https://doi.org/10.6109/jkiice.2013.17.8.1769

Coding Method of Variable Threshold Dual Rate ADPCM Speech Considering the Background Noise

한경호
- 조명전기설비학회논문지 / Vol. 17, No. 6 / pp. 154-159 / 2003
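This paper proposes a coding method based on standard ADPCM conforming to ITU G.726 in which the speech coding rate switches between two values according to the background noise level, giving better speech quality than a single coding rate even at a low data rate. Speech signals larger than the background noise are compressed at 40 kbps to improve quality at the cost of more data, and smaller speech signals are compressed at 16 kbps to reduce the amount of data, so that overall the amount of compressed data decreases while speech quality improves. The zero-crossing rate (ZCR) is used to choose between the two compression rates for the input speech signal, which keeps processing fast.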
https://doi.org/10.5207/JIEIE.2003.17.6.154
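The ZCR-based rate selection mentioned in the abstract above might look roughly like the following sketch. The abstract does not give the decision rule, so the direction of the comparison (a low ZCR is treated as speech-dominant and coded at 40 kbps) and the `zcr_threshold` value are assumptions for illustration only; the paper ties its threshold to the background noise level.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def select_bit_rate(frame, zcr_threshold=0.25):
    """Pick 40 or 16 kbit/s for one frame.

    Hypothetical rule: frames with a low ZCR are assumed to carry speech
    above the background noise and get the higher rate.
    """
    return 40 if zero_crossing_rate(frame) < zcr_threshold else 16

# Toy frames: a slow oscillation versus a rapidly alternating signal.
voiced_like = [1, 2, 3, 2, 1, -1, -2, -3, -2, -1]
noise_like = [1, -1, 1, -1, 1, -1]
rates = [select_bit_rate(voiced_like), select_bit_rate(noise_like)]
```

ZCR needs only sign comparisons per sample, which is consistent with the abstract's point that it keeps the rate decision fast.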

An Adaptation Method in Noise Mismatch Conditions for DNN-based Speech Enhancement

Xu, Si-Ying;Niu, Tong;Qu, Dan;Long, Xing-Yan
- KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 10 / pp. 4930-4951 / 2018
Deep learning based speech enhancement has shown considerable success, but it still suffers performance degradation under mismatch conditions. This paper proposes an adaptation method to improve performance under noise mismatch conditions. First, we devise a noise-aware training scheme that supplies identity vectors (i-vectors) as parallel input features to adapt deep neural network (DNN) acoustic models to the target noise. Second, given a small amount of adaptation data, a noise-dependent DNN is obtained from a noise-independent DNN using $L_2$ regularization, which forces the estimated masks to stay close to the unadapted condition. Finally, experiments on different noise and SNR conditions show that the proposed method improves STOI by 0.1% to 9.6% and gives consistent improvements in PESQ and segSNR over the baseline systems.
https://doi.org/10.3837/tiis.2018.10.017
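One simple way to picture the $L_2$-regularized adaptation in the abstract above is a penalty that pulls the adapted parameters toward the noise-independent ones. The sketch below does this in weight space with plain gradient descent; this is an assumption made for illustration, since the paper may instead constrain the estimated masks directly. The names `adapt_step`, `base_weights`, `lr`, and `l2` are hypothetical.

```python
def adapt_step(weights, base_weights, grads, lr=0.1, l2=0.01):
    """One gradient step on: task_loss + l2 * sum((w - w_base) ** 2).

    `grads` holds the task-loss gradients computed on the adaptation data;
    the extra term 2 * l2 * (w - w_base) keeps the adapted weights near
    the noise-independent model.
    """
    return [w - lr * (g + 2.0 * l2 * (w - w0))
            for w, w0, g in zip(weights, base_weights, grads)]

# Toy example: with zero task gradient, the weight decays toward the base.
base = [0.0, 1.0]
current = [1.0, 1.0]
updated = adapt_step(current, base, grads=[0.0, 0.0], lr=0.1, l2=0.5)
```

A larger `l2` keeps the adapted model closer to the unadapted one, which matters when only a small amount of adaptation data is available.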

Noise Robust Speech Recognition Based on Parallel Model Combination Adaptation Using Frequency-Variant

최숙남;정현열
- 한국음향학회지 / Vol. 32, No. 3 / pp. 252-261 / 2013
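Typical speech recognition systems perform well in quiet conditions, but their performance degrades sharply in real environments where noise is present. To build a recognizer that is robust across diverse noise environments, this paper proposes frequency-variant PMC based on environment awareness (Parallel Model Combination adaptation using frequency-variant based on environment-awareness: FV-PMC), which obtains environment information for speech recognition from the frequency variation and applies it to improve the recognition models. The method computes in advance the average frequency variation between pre-classified noise groups and sets it as a threshold. When speech containing unknown noise is input, the frequency variation with respect to each noise group is recomputed; if it exceeds the threshold of a noise group, the input is regarded as speech containing noise from that group, and recognition is performed with the recognition model generated from speech containing that noise group. When noise was classified with the proposed FV-PMC method, the average classification accuracy was 56%. In the speech recognition experiments that used this classification, the average recognition rate was 79.05% for Set A, 79.43% for Set B, and 83.37% for Set C. The overall average recognition rate of 80.62% is 5.69 percentage points higher than the 74.93% obtained by PMC with the existing clean model, confirming the effectiveness of the proposed method.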
https://doi.org/10.7776/ASK.2013.32.3.252

Energy Feature Normalization for Robust Speech Recognition in Noisy Environments

Lee, Yoon-Jae;Ko, Han-Seok
- 음성과학 / Vol. 13, No. 1 / pp. 129-139 / 2006
In this paper, we propose two effective energy feature normalization methods for robust speech recognition in noisy environments. In the first method, we estimate the noise energy and remove it from the noisy speech energy. In the second method, we propose a modified algorithm for the Log-energy Dynamic Range Normalization (ERN) method. In the ERN method, the log energy of the training data in a clean environment is transformed into the log energy in noisy environments. If the minimum log energy of the test data is outside of a pre-defined range, the log energy of the test data is also transformed. Since the ERN method has several weaknesses, we propose a modified transform scheme designed to reduce the residual mismatch that it produces. In the evaluation conducted on the Aurora2.0 database, we obtained a significant performance improvement.

Subband Based Spectrum Subtraction Algorithm

최재승
- 한국전자통신학회논문지 / Vol. 8, No. 4 / pp. 555-560 / 2013
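This paper proposes a noise reduction algorithm that detects voiced, unvoiced, and silence intervals using distance measure, log power, and root-mean-square (RMS) methods and removes noise with subband filters. In each frame, the proposed algorithm uses subband filters to subtract the spectrum of white noise and road noise from the noise-corrupted speech signal. The experiments apply the spectral subtraction algorithm to speech and noise signals from the Aurora-2 database, and the signal-to-noise ratio of the noise-corrupted speech confirms that the algorithm is effective. The output signal-to-noise ratio improved by an average of 2.1 dB for white noise and 1.91 dB for road noise.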
https://doi.org/10.13067/JKIECS.2013.8.4.555
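The per-frame subtraction described in the abstract above can be sketched in its simplest generic form: subtract a noise power estimate from each frequency bin and clamp the result to a small floor. The subband filtering and the voiced/unvoiced/silence detection of the paper are not modelled here, and the `floor` parameter and function names are assumptions for this example.

```python
import math

def spectral_subtract(noisy_power, noise_power, floor=0.01):
    """Subtract a noise power estimate from each frequency bin.

    Bins that would fall below floor * noisy power are clamped to that
    value, a common guard against over-subtraction.
    """
    return [max(y - n, floor * y) for y, n in zip(noisy_power, noise_power)]

def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels from two total power values."""
    return 10.0 * math.log10(signal_power / noise_power)

# Toy frame with two bins; the noise estimate would normally be averaged
# over frames detected as silence.
enhanced = spectral_subtract([4.0, 1.0], [1.0, 2.0])
```

Comparing `snr_db` before and after enhancement gives an output-SNR improvement figure of the kind reported in the abstract.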

Bit-selective Forward Error Correction for Digital Mobile Communications

양경철;이재홍
- 대한전기학회: Conference Proceedings / Proceedings of the 대한전기학회 1988 Conference on Electrical and Electronic Engineering / pp. 198-202 / 1988
In digital mobile communications, received speech data are affected by burst errors as well as random errors. To overcome these errors, we propose a bit-selective forward error correction scheme for speech data that are sub-band coded at 13 kbps and transmitted over a 16 kbps channel. For a few error-correcting codes, the signal-to-noise ratio of the error-corrected speech is obtained and compared through simulation of mobile communication channels.
