Integrated Search | Korea Science

Voice Recognition Performance Improvement using the Convergence of Voice Signal Feature and Silence Feature Normalization in Cepstrum Feature Distribution

황재천
- 한국융합학회논문지 / Vol. 8, No. 5 / pp. 13-17 / 2017
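In speech recognition, existing speech feature extraction methods yield inaccurate recognition rates because their threshold values are not clearly defined. This study models a method for improving speech recognition performance in which feature extraction for speech and non-speech is fused with silence feature normalization. In the proposed method, the model is built so as to minimize the influence of noise, and speech signal features are extracted from each speech frame to build the speech recognition model. This is then fused with silence feature normalization: the energy spectrum is expressed in a form similar to entropy to generate the original speech signal, so that the speech features are less affected by noise. A reference value for classifying speech and non-speech is set in the cepstrum, and silence feature normalization improves performance for signals with a low signal-to-noise ratio. The performance of the proposed method was analyzed by comparing HMM and CHMM. Compared with the existing HMM, the recognition rate improved by 2.1%p in the speech-dependent stage and by 0.7%p in the speech-independent stage.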
https://doi.org/10.15207/JKCS.2017.8.5.013

Growth and Characterization of InGaP/InGaAs p-HEMT Using Compound Source MBE

Kim, J.H.;S.J. Kang;S.J. Jo;J.D. Song;Lee, Y.T.;J.I. Song
- 대한전자공학회: Conference Proceedings / Proceedings of the 대한전자공학회 2000 Summer General Conference (2) / pp. 16-19 / 2000
DC and low-frequency noise characteristics of InGaP/InGaAs pseudomorphic HEMTs (p-HEMTs) grown by compound source MBE are investigated over the temperature range of 150 K to 370 K. Equivalent input noise spectra ($S_{iv}$) were measured as a function of frequency and temperature. $S_{iv}$ was measured to be $3.4 \times 10^{-12}\ \mathrm{V^2/Hz}$ at 1 kHz for a $1.3 \times 50\ \mu\mathrm{m}^2$ InGaP/InGaAs p-HEMT at room temperature. Measurements of the low-frequency noise spectra of the p-HEMT as a function of temperature show that a trap with an activation energy of about 0.589 eV is the dominant trap accounting for the low-frequency noise behavior of the device. The normalized extrinsic $g_m$ frequency dispersion of the p-HEMT was as low as 2.5% at room temperature, indicating that the device has well-behaved low-frequency noise characteristics. A sub-micron ($0.25 \times 50\ \mu\mathrm{m}^2$) gate p-HEMT showed $f_{T}$ and $f_{max}$ of 40 GHz and 108 GHz, respectively.

Multi-axial Vibration Testing Methodology of Vehicle Component

김찬중;배철용;이동원;권성진;이봉현;나병철
- 한국소음진동공학회: Conference Proceedings / Proceedings of the 한국소음진동공학회 2007 Autumn Conference / pp. 297-302 / 2007
Vibration testing of vehicle components can now be carried out on lab-based simulators instead of in the field, owing to advances in control algorithms and computational processing. Currently, the Multi-Axial Simulation Table (MAST) is recommended as the vibration equipment: it excites a target component in 3-directional translational and rotational motion simultaneously, so the vibration condition can closely approximate that of a real road test. However, the MAST system by itself does not guarantee the vibration-free performance of the target component, since it is only a simulator subject to the operator. Rather, the reliability of a multi-axial vibration test depends on the quality of the input profile, which should cover the required severity of the vibration condition on the target component. This paper presents a multi-axial vibration testing methodology for vehicle components, from the acquisition of vehicle acceleration data to the derivation of the MAST input profile using severe data from a proving ground. To compare the severity of the vibration condition between the real road test and the proving ground test, an energy principle of equivalent damage is proposed to calculate energy matrices of the acceleration data. The optimal combination of special events on the proving ground that is equivalent to the real road test in terms of vibration fatigue is then determined using a sequential searching optimization algorithm. To explain the methodology clearly, the seat and door components of a vehicle are selected as examples.

The Effects of Nasalance on Quality of Voice

안종복;신명선;노동우;박은아;정옥란
- 음성과학 / Vol. 9, No. 3 / pp. 133-140 / 2002
The purpose of this study was to investigate changes in the acoustic qualities of voice as a function of nasalance, in order to determine the relationship between vocal quality and nasalance. Twenty normal subjects (10 males and 10 females) vocalized /a/, /$\tilde{a}$/, and /a$\eta$/. The changes in nasalance and in the acoustic characteristics of the voice were analyzed with the Nasometer (Model 6200-3, Kay Elemetrics Co.) and Dr. Speech 4.0 (Tiger Electronics Co.), respectively. One-way ANOVA was used to examine changes in jitter, shimmer, harmonics-to-noise ratio (HNR), and normalized noise energy (NNE) relative to the nasalance in the 3 types of vocalization. The Pearson r correlation coefficient was used to identify the relationship between nasalance and vocal quality. There were no statistically significant changes in jitter, shimmer, HNR, or NNE. Jitter, however, tended to increase as the nasalance score increased, compared with the other vocal parameters. In addition, NNE showed an increase on /$\tilde{a}$/ and /a$\eta$/, and more so on /a$\eta$/. Thus, it was speculated that NNE could be used to identify or screen resonance disorders with hypernasality.
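As a rough illustration of two of the parameters named in this abstract, the sketch below computes "local" jitter and shimmer from a list of glottal cycle periods and peak amplitudes. The abstract does not state which definitions Dr. Speech uses, so the formulas here (mean absolute difference between consecutive cycles, relative to the mean) are an assumption, and the function names are hypothetical.

```python
def relative_perturbation_percent(values):
    """Mean absolute difference between consecutive values, relative to
    the mean value, in percent (assumed 'local' perturbation measure)."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values[:-1], values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def local_jitter_percent(periods):
    """Cycle-to-cycle variation of the pitch period (hypothetical helper)."""
    return relative_perturbation_percent(periods)


def local_shimmer_percent(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (hypothetical helper)."""
    return relative_perturbation_percent(amplitudes)


# Illustrative values only: four cycle periods in milliseconds.
example_periods_ms = [10.0, 10.2, 9.8, 10.0]
example_jitter = local_jitter_percent(example_periods_ms)
```

A perfectly regular sequence of cycles gives 0% for both measures; larger values indicate more cycle-to-cycle irregularity.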

A Study on the Noise-Level Measurement using the Energy and relation of closed pitch

강인규;배명진
- 한국음향학회: Conference Proceedings / Proceedings of the 한국음향학회 2004 Spring Conference, Vol. 23, No. 1 / pp. 77-80 / 2004
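Humans have a "habitual pitch level", that is, the pitch they use on average when speaking naturally. When noise is added to speech, however, this pitch changes irregularly. This property can be used to measure the noise level of speech. This paper proposes a method that computes the energy of the input speech, obtains the pitch with the NAMDF (Normalized Average Magnitude Difference Function) method over the segments above a certain energy level, splits each frame into pitch periods, and measures the similarity between adjacent pitch periods to detect the noise level of the input speech data.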
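The pipeline described in this abstract (energy gate, NAMDF pitch estimate, similarity between adjacent pitch periods) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the NAMDF normalization, the use of normalized cross-correlation as the similarity measure, the threshold values, and all function names are assumptions.

```python
import math
import random


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def namdf_pitch(frame, min_lag, max_lag):
    """Return the lag (in samples) that minimises a normalized AMDF.

    Assumed normalization: summed |x[i] - x[i+lag]| divided by the
    summed magnitudes |x[i]| + |x[i+lag]|.
    """
    best_lag, best_val = min_lag, float("inf")
    for lag in range(min_lag, max_lag + 1):
        n = len(frame) - lag
        if n <= 0:
            break
        diff = sum(abs(frame[i] - frame[i + lag]) for i in range(n))
        norm = sum(abs(frame[i]) + abs(frame[i + lag]) for i in range(n))
        val = diff / norm if norm > 0 else float("inf")
        if val < best_val:
            best_lag, best_val = lag, val
    return best_lag


def adjacent_period_similarity(frame, lag):
    """Average normalized cross-correlation of consecutive pitch periods."""
    sims = []
    for start in range(0, len(frame) - 2 * lag + 1, lag):
        a = frame[start:start + lag]
        b = frame[start + lag:start + 2 * lag]
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        if den > 0:
            sims.append(sum(x * y for x, y in zip(a, b)) / den)
    return sum(sims) / len(sims) if sims else 0.0


def noise_indicator(frame, energy_threshold, min_lag, max_lag):
    """Return 1 - similarity for frames above the energy gate, else None."""
    if frame_energy(frame) < energy_threshold:
        return None  # low-energy frame: skipped, as in the abstract
    lag = namdf_pitch(frame, min_lag, max_lag)
    return 1.0 - adjacent_period_similarity(frame, lag)


# Synthetic check: a 40-sample-period sine, with and without added noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * i / 40) for i in range(400)]
noisy = [x + rng.gauss(0.0, 0.5) for x in clean]
clean_score = noise_indicator(clean, 0.01, 20, 70)
noisy_score = noise_indicator(noisy, 0.01, 20, 70)
```

On the synthetic signal, the clean sine yields an indicator close to zero, while the noisy version yields a clearly larger value, which is the behavior the abstract relies on. The lag search range is kept below twice the true period here to avoid picking a multiple of the pitch period.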

Prediction of Post-Treatment Outcome of Pathologic Voice Using Voice Synthesis

이주환;최홍식;김영호;김한수;최현승;김광문
- 대한후두음성언어의학회지 / Vol. 14, No. 1 / pp. 30-39 / 2003
Background and Objectives: Patients with pathologic voice are often concerned about the recovery of their voice after surgery. In this investigation, we controlled the values of three parameters of the Dr. Speech Science voice synthesis program, namely jitter, shimmer, and NNE (normalized noise energy), which distinguish one person's voice from another's, and devised a method to synthesize the voice predicted after surgery. Subjects and Method: Vocal jitter, vocal shimmer, and glottal noise were measured in the voices of 10 vocal cord paralysis patients and 10 vocal polyp patients 1 week prior to and 1 month after surgery. With the Dr. Speech Science voice synthesis program, we synthesized the vowel 'ae' so that it closely matched the patients' preoperative and postoperative voices by controlling the values of jitter, shimmer, and glottal noise. We then analyzed the synthesized voices and compared them with the pre- and postoperative voices. Results: 1) After the preoperative and corrected values of jitter, shimmer, and glottal noise were entered into the voice synthesis program, voices identical, within statistical significance, to the vocal polyp patients' pre- and postoperative voices were synthesized. 2) After eliminating synergistic effects among the three parameters, we were able to synthesize voices identical to the vocal cord paralysis patients' preoperative voices. 3) After entering only slightly increased jitter and shimmer into the synthesis program, we were able to synthesize voices identical to the vocal cord paralysis patients' postoperative voices. Conclusion: Voices synthesized with the Dr. Speech Science program were identical to the patients' actual pre- and postoperative voices. Clinicians will be able to give patients more information, and increased patient cooperation can therefore be expected.

Quality Improvement of Low Bitrate HE-AAC using Linear Prediction Pre-processor

이재성;이건우;박영철;윤대희
- 한국통신학회논문지 / Vol. 34, No. 8C / pp. 822-829 / 2009
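This paper proposes an HE-AAC structure suited to low-bitrate environments that uses a linear prediction pre-processor. At low bitrates, the inadequate bit allocation algorithm of HE-AAC produces many spectral holes, which cause severe degradation of sound quality. To solve this, a linear prediction pre-processor is used so that bits are allocated appropriately at low bitrates. The input signal to HE-AAC is split by the linear prediction pre-processor into LP coefficients and a residual signal, and the AAC part encodes the separated residual signal. To evaluate the proposed method, an objective experiment measuring perceptual noise and a subjective experiment using the MUSHRA test were conducted. The results confirm that the proposed method improves performance in low-bitrate environments.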
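The split into LP coefficients and a residual signal mentioned in this abstract can be illustrated with a generic linear prediction analysis. The sketch below uses the textbook autocorrelation method with the Levinson-Durbin recursion; it is not the HE-AAC pre-processor itself, and the prediction order, the test signal, and all function names are assumptions made for illustration.

```python
import random


def autocorrelation(x, order):
    """Autocorrelation values r[0..order] of the sequence x."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order], where the prediction
    is x_hat[n] = sum_k a[k] * x[n - k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(x, coeffs):
    """Residual e[n] = x[n] - prediction (samples before n = 0 taken as 0)."""
    res = []
    for n in range(len(x)):
        pred = sum(a * x[n - k]
                   for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        res.append(x[n] - pred)
    return res


def lp_synthesize(residual, coeffs):
    """Inverse operation: rebuild the signal from residual and coefficients."""
    out = []
    for n, e in enumerate(residual):
        pred = sum(a * out[n - k]
                   for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(e + pred)
    return out


# Synthetic second-order autoregressive test signal (illustrative only).
rng = random.Random(1)
signal = []
for n in range(1000):
    prev1 = signal[n - 1] if n >= 1 else 0.0
    prev2 = signal[n - 2] if n >= 2 else 0.0
    signal.append(1.3 * prev1 - 0.6 * prev2 + rng.gauss(0.0, 1.0))

coeffs = levinson_durbin(autocorrelation(signal, 2), 2)
residual = lp_residual(signal, coeffs)
rebuilt = lp_synthesize(residual, coeffs)
```

The residual carries much less energy than the input, and the input can be rebuilt from the residual and the coefficients. In the scheme described by the abstract, it is this residual that the AAC part encodes.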

Normalization of Spectral Magnitude and Cepstral Transformation for Compensation of Lombard Effect

지상문;오영환
- 한국음향학회지 / Vol. 15, No. 4 / pp. 83-92 / 1996
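To mitigate the sharp degradation of speech recognizer performance in noisy environments, this study proposes methods for compensating for the Lombard effect, which causes the degradation, and for removing noise. The Lombard effect is modeled as a variation of the spectral envelope and of the vocal intensity relative to speech uttered in a quiet environment, and spectral magnitude normalization and cepstral transformation are used to remove this variation. Distortion of the speech signal caused by additive ambient noise is mitigated with spectral subtraction, and band-pass filtering is applied to emphasize the dynamic characteristics of speech. To analyze Lombard speech uttered in noisy environments and to develop and evaluate noise-processing techniques, Lombard speech was collected and tested using car, exhibition hall, downtown public telephone booth, street, and computer room noise, which are environments where speech recognition technology is expected to be applied. Applying the proposed method to speech recognition in various noise environments confirmed that it is an effective noise-processing method.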
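The spectral subtraction step named in this abstract can be illustrated in its simplest magnitude-domain form. The sketch below assumes the magnitude spectra have already been computed, estimates the noise spectrum as the average of noise-only frames, and applies a spectral floor. The over-subtraction factor `alpha`, the floor ratio, and the function names are assumptions and are not taken from the paper.

```python
def estimate_noise_magnitude(noise_frames):
    """Average magnitude per frequency bin over noise-only frames."""
    count = len(noise_frames)
    return [sum(column) / count for column in zip(*noise_frames)]


def spectral_subtraction(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Subtract the scaled noise estimate from each bin.

    Bins that would fall below floor * noisy magnitude are clamped to that
    floor, which avoids negative magnitudes.
    """
    return [max(m - alpha * n, floor * m)
            for m, n in zip(noisy_mag, noise_mag)]


# Illustrative magnitudes for three frequency bins.
noise_estimate = estimate_noise_magnitude([[0.2, 0.4, 0.3], [0.4, 0.2, 0.3]])
cleaned = spectral_subtraction([1.0, 0.5, 0.2], noise_estimate)
```

The third bin in the example is dominated by noise, so it is clamped to the floor instead of going negative.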

Power Estimation and Optimum Design of a Buoy for the Resonant Type Wave Energy Converter Using Approximation Scheme

고혁준;유원선;조일형
- 한국해양공학회지 / Vol. 27, No. 1 / pp. 85-92 / 2013
This paper deals with a resonant-type WEC (wave energy converter) and with a method for determining its geometric parameters so as to obtain a robust structure and an optimal structure, respectively. In detail, the optimization problem is formulated with constraints composed of response surfaces that represent the resonance periods (heave, pitch) and the metacentric height of the buoy. A signal-to-noise ratio calculated from the normalized multi-objective results with a weight factor helps to select the robust design level. To obtain the sample data set, the motion responses of the power buoy were analyzed using a BEM (boundary element method)-based commercial code. The optimization result is also compared with the robust design as a feasibility study. Finally, the power efficiency of the WEC with the optimum design variables is estimated as the captured wave ratio resulting from the absorbed power, which is mainly related to the PTO (power take-off) damping. The resulting WEC design can be regarded as an economical optimal design that satisfies the given constraints.
https://doi.org/10.5574/KSOE.2013.27.1.085

Effects of Korean Syllable Structure on English Pronunciation

Lee, Mi-Hyun;Ryu, Hee-Kwan
- 대한음성학회: Conference Proceedings / Proceedings of the 대한음성학회 July 2000 Conference / p. 364 / 2000
It has been widely discussed in phonology that the syllable structure of the mother tongue influences the acquisition of a foreign language. However, the topic has hardly been examined experimentally. We therefore investigated the effects of Korean syllable structure when Korean speakers pronounce English words, focusing especially on consonant strings that are not allowed in Korean. In the experiment, the subjects are divided into 3 groups: native, experienced, and inexperienced speakers. The native group consists of 1 male native speaker of English. The experienced and inexperienced groups are each composed of 3 male Korean speakers. These 2 groups are distinguished by length of residence in a country where English is the native language. Forty-one monosyllabic words were prepared, taking into account the position (onset vs. coda), the type (stops, affricates, fricatives), and the number of consonants. The length of the consonant cluster was then measured. To eliminate tempo effects, the measured length was normalized using the length of the word 'say' in the carrier sentence. The measurement of a consonant cluster is the relative time period between the initiation of energy (onset / coda), which is acoustically representative of noise (consonant portion), and the voicing bar (vowel portion) in a syllable. Statistical methods are used to estimate the differences among the 3 groups. For each word, analysis of variance (ANOVA) and post hoc tests are carried out.

Search results: 40 items; processing time: 0.024 s
