• Title/Summary/Keyword: Acoustic phonetics

Analysis of Intention in Spoken Dialogue based on Classifying Sentence Patterns (문형구조의 분류에 따른 대화음성의 의도분석에 관한 연구)

  • Choi, Hwan-Jin;Song, Chang-Hwan;Oh, Yung-Hwan
    • The Journal of the Acoustical Society of Korea, v.15 no.1, pp.61-70, 1996
  • Depending on the topic or the speaker's intention in a dialogue, an utterance has a different sentence structure, that is, a different combination of words. Based on this observation, we propose a statistical approach, the IDT (intention decision table), which models the correlations between sentence patterns and intentions. In an IDT, a sentence is split into 5 different factors, and the intention of the sentence is determined by the similarity between an intention and the 5 factors that represent the sentence. Experimental results show that the IDT improves the intention prediction rate by 10~18% over word-intention correlations and by 3~12% over the MIG (Markov intention graph), which models intention with a transition graph over the word categories in a sentence. These results indicate that the IDT is an effective method for predicting intentions.

A DCT Adaptive Subband Filter Algorithm Using Wavelet Transform (웨이브렛 변환을 이용한 DCT 적응 서브 밴드 필터 알고리즘)

  • Kim, Seon-Woong;Kim, Sung-Hwan
    • The Journal of the Acoustical Society of Korea, v.15 no.1, pp.46-53, 1996
  • The adaptive LMS algorithm has been used in many application areas due to its low complexity. In this paper, the input signal is decomposed into subbands of arbitrary bandwidth. The dynamic range is reduced within each subband, so independent filtering in each subband converges faster than the full-band system. DCT transform-domain LMS adaptive filtering whitens the input signal in each band, which greatly speeds up convergence by reducing the eigenvalue spread. Finally, the filtered subband signals are synthesized so that the output signal has full frequency components. In this procedure, the wavelet filter bank guarantees perfect reconstruction of the signal without any inter-spectral interference. In simulations with speech corrupted by additive white Gaussian noise, the proposed algorithm performs better than the conventional NLMS algorithm at high SNR.
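
For orientation, the conventional full-band NLMS baseline that the abstract compares against can be sketched as follows. This is a minimal illustration, not the paper's wavelet/DCT subband method; the function name `nlms_identify`, the step size `mu`, and the 3-tap demo system are assumptions made for the example.

```python
import random

def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Adapt an FIR filter w so that its output tracks d (full-band NLMS)."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps              # most recent input sample first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]           # shift the new sample into the delay line
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = dn - y                      # a priori error
        norm = eps + sum(bi * bi for bi in buf)
        # Step size normalized by the input power in the delay line
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors

# Demo: identify a hypothetical 3-tap system driven by white noise.
rng = random.Random(0)
true_w = [0.5, -0.3, 0.1]
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
hist = [0.0] * len(true_w)
d = []
for xn in x:
    hist = [xn] + hist[:-1]
    d.append(sum(t * h for t, h in zip(true_w, hist)))

w, errors = nlms_identify(x, d, num_taps=len(true_w))
```

With a white input such as this one the eigenvalue spread is already small, so convergence is fast; the abstract's point is that colored inputs such as speech benefit from whitening within each subband.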

Deep neural networks for speaker verification with short speech utterances (짧은 음성을 대상으로 하는 화자 확인을 위한 심층 신경망)

  • Yang, IL-Ho;Heo, Hee-Soo;Yoon, Sung-Hyun;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea, v.35 no.6, pp.501-509, 2016
  • We propose a method to improve the robustness of speaker verification on short test utterances. The accuracy of state-of-the-art i-vector/probabilistic linear discriminant analysis systems can degrade when test utterances are short. The proposed method compensates for utterance variations of short test feature vectors using deep neural networks. We design three different types of DNN (Deep Neural Network) structures, which are trained with different target output vectors. Each DNN is trained to minimize the discrepancy between the feed-forward output for a given short utterance feature and its original long utterance feature. We use the short 2-10 s condition of the NIST (National Institute of Standards and Technology, U.S.) 2008 SRE (Speaker Recognition Evaluation) corpus to evaluate the method. The experimental results show that the proposed method reduces the minimum detection cost relative to the baseline system.

Performance Improvement of Connected Digit Recognition by Considering Phonemic Variations in Korean Digit and Speaking Styles (한국어 숫자음의 음운변화 및 화자 발성특성을 고려한 연결숫자 인식의 성능향상)

  • 송명규;김형순
    • The Journal of the Acoustical Society of Korea, v.21 no.4, pp.401-406, 2002
  • Each Korean digit consists of only one syllable, so recognizers, and even Koreans themselves, often have difficulty recognizing it. When digit strings are pronounced, the original pronunciation of each digit changes considerably due to co-articulation. In addition, distortion caused by various channels and noise degrades recognition performance for Korean connected digit strings. This paper presents techniques to improve that performance, including defining a set of PLUs that takes phonemic variations in Korean digits into account and constructing a recognizer that handles speakers' various speaking styles. In speaker-independent connected digit recognition experiments using telephone speech, the proposed techniques with 1 Gaussian/state gave a string accuracy of 83.2%, i.e., a 7.2% error rate reduction relative to the baseline system. With 11 Gaussians/state, we achieved the highest string accuracy of 91.8%, i.e., a 4.7% error rate reduction.

Optimum Pattern Synthesis for a Microphone Array (마이크로폰 어레이를 위한 최적 패턴 형성)

  • Chang, Byoung-Kun;Kwon, Tae-Neung;Byun, Youn-Shik
    • The Journal of the Acoustical Society of Korea, v.16 no.1, pp.47-53, 1997
  • This paper concerns an efficient approach to forming the beam pattern of a microphone array that handles broadband signals such as speech in a teleconference. A numerical method is proposed to find the updated locations of sidelobes so that the sidelobes can be equalized by perturbing array parameters such as the array weights or the microphone spacing. The microphone array is thus optimized in a Dolph-Chebyshev sense, such that directional or background noise incident within the array's visual range is eliminated efficiently. It is shown that perturbing the microphone spacing yields an optimum pattern that is more appropriate for broadband signals than perturbing the array weights. A novel method is also proposed to find a beam pattern that is robust with respect to sidelobes in a scanning situation. Computer simulation results are presented.

Noisy Environmental Adaptation for Word Recognition System Using Maximum a Posteriori Estimation (최대사후확률 추정법을 이용한 단어인식기의 잡음환경적응화)

  • Lee, Jung-Hoon;Lee, Shi-Wook;Chung, Hyun-Yeol
    • The Journal of the Acoustical Society of Korea, v.16 no.2, pp.107-113, 1997
  • To achieve a Korean word recognition system that is robust to both channel distortion and additive noise, maximum a posteriori (MAP) estimation adaptation is proposed, and the effectiveness of environmental adaptation for improving recognition performance is investigated. Recognition experiments using MAP adaptation are carried out on three types of speech: 1) speech with channel distortion, 2) speech with added environmental noise, and 3) speech with both channel distortion and additive noise. The effectiveness of additional feature parameters, such as regression coefficients and durations, for environmental adaptation is also investigated. In speaker-independent 100-word recognition tests, recognition improved by 9.0% in case 1), by more than 75% in case 2), and by 11%~61.4% in case 3), showing that MAP environmental adaptation is effective for recognizing both channel-distorted and noise-added speech. However, duration information used as an additional feature parameter did not play an important role in the tests.
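
The abstract does not say which model parameters are adapted. As a rough illustration only, a common form of MAP adaptation updates a Gaussian mean by interpolating between the prior (clean-speech) mean and the average of the adaptation data. The function name `map_adapt_mean` and the prior weight `tau` below are assumptions for this sketch, not details taken from the paper.

```python
def map_adapt_mean(prior_mean, samples, tau=10.0):
    """MAP estimate of a Gaussian mean vector.

    prior_mean: mean trained in the original environment
    samples:    adaptation feature vectors from the new environment
    tau:        prior weight (assumed value); larger tau trusts the prior more
    """
    n = len(samples)
    dim = len(prior_mean)
    sums = [sum(s[k] for s in samples) for k in range(dim)]
    # Weighted combination: the prior counts as tau pseudo-observations
    return [(tau * prior_mean[k] + sums[k]) / (tau + n) for k in range(dim)]

# Demo with made-up 2-dimensional features.
adapted = map_adapt_mean([0.0, 0.0], [[1.0, 2.0]] * 10, tau=10.0)
unchanged = map_adapt_mean([0.3, -0.2], [], tau=10.0)
```

With no adaptation data the estimate stays at the prior mean, and with more data it moves toward the sample average, which is why this style of adaptation degrades gracefully when little environment-specific speech is available.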

A Subband Structured Digital Hearing Aid Design for Compensating Sensorineural Hearing Loss (감음성 난청 보상을 위한 부밴드 구조 디지털 보청기 설계)

  • Park Jo-Dong;Choi Hun;Bae Hyeon-Deok
    • The Journal of the Acoustical Society of Korea, v.24 no.5, pp.238-247, 2005
  • In this paper, we present subband design techniques for a compensating filter and an adaptive feedback canceller for digital hearing aids. Sensorineural hearing loss has a hearing threshold that is nonlinear in the frequency domain, and its compensation suffers from an echo produced by an undesired time-varying feedback path. Therefore, a digital hearing aid requires a compensator that can adjust gains nonlinearly across frequency bands and eliminate the echo rapidly. In the proposed digital hearing aid, the compensating filter is designed by the adaptive system identification method in a subband structure, and the adaptive feedback canceller is designed by the subband affine projection algorithm. The designed compensation filter can control the nonlinear gain in each subband separately, so precise compensation is possible, and the feedback canceller using the subband adaptive filter achieves a fast convergence rate. The performance of the proposed method is verified by computer simulations that compare it with previous approaches.

Proposal of speaker change detection system considering speaker overlap (화자 겹침을 고려한 화자 전환 검출 시스템 제안)

  • Park, Jisu;Yun, Young-Sun;Cha, Shin;Park, Jeon Gue
    • The Journal of the Acoustical Society of Korea, v.40 no.5, pp.466-472, 2021
  • Speaker Change Detection (SCD) refers to finding the moment when the main speaker changes from one person to the next in a spoken conversation. In speaker change detection, difficulties arise from overlapping speakers, inaccurate labeling, and data imbalance. To address these problems, utterances from the TIMIT corpus, which is widely used in speech recognition, were concatenated artificially to obtain a sufficient amount of training data, and speaker change detection was performed after identifying overlapping speakers. In this paper, we propose a speaker change detection system that considers speaker overlap. We evaluated and verified the performance using various approaches. As a result, a detection system similar to the X-Vector structure was proposed to remove the speaker overlap region, while the Bi-LSTM method was selected to model the speaker change system. The experimental results show relative performance improvements of 4.6 % and 13.8 %, respectively, compared to the baseline system. In addition, the experimental results suggest that a robust speaker change detection system can be built through follow-up studies that take text and speaker information into account.

Development of the hybrid-type ultrasound speaker (하이브리드형 초음파 스피커 개발)

  • Lee, Hyoung-Sang;Kim, Bok-Kyu
    • The Journal of the Acoustical Society of Korea, v.40 no.3, pp.247-253, 2021
  • Directional ultrasonic speakers, which make sound audible only in a specific area, have been studied continuously to improve their sound quality and cost relative to general speakers. In this paper, we propose a DSP-based hybrid-type ultrasonic speaker that plays simultaneously with a general speaker to compensate for the low-band range, since sound below 500 Hz is difficult to hear because of the sensor characteristics of the ultrasonic speaker. A system implemented by simply connecting a general speaker and an ultrasonic speaker suffers from high cost and difficult control, because two amplifiers are used to play back the ultrasonic and general sound sources. In addition, sound quality deteriorates because of the difference in playback time between the ultrasonic and general sound sources. To address these cost, control, and sound quality issues, we developed a hybrid-type ultrasonic speaker with a DSP-based amplifier that plays back both sources simultaneously by synchronizing the general sound source with the regenerated ultrasonic sound source, and that also implements existing CODEC functions such as Dynamic Range Control (DRC) and Equalizer (EQ).

Wiener filtering-based ambient noise reduction technique for improved acoustic target detection of directional frequency analysis and recording sonobuoy (Directional frequency analysis and recording 소노부이의 표적 탐지 성능 향상을 위한 위너필터링 기반 주변 소음 제거 기법)

  • Hong, Jungpyo;Bae, Inyeong;Seok, Jongwon
    • The Journal of the Acoustical Society of Korea, v.41 no.2, pp.192-198, 2022
  • As an effective weapon system for anti-submarine warfare, the Directional Frequency Analysis and Recording (DIFAR) sonobuoy detects underwater targets via beamforming with three channels: one omni-directional channel and two directional channels. However, ambient noise degrades the detection performance of the DIFAR sonobuoy in specific directions (0°, 90°, 180°, 270°). We therefore propose an ambient noise reduction technique to improve the acoustic target detection performance of the DIFAR sonobuoy. The proposed method uses OTA (Order Truncate Average), which is widely used in sonar signal processing, for ambient noise estimation, and Wiener filtering, which is widely used in speech signal processing, for noise reduction. For evaluation, we compare the mean square errors of the target bearing estimates of the conventional and proposed methods, and we confirm that the proposed method is effective below a 0 dB signal-to-noise ratio.
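
The two stages named in the abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the noise estimate keeps only the values in a sliding window that do not exceed a multiple of the window median and averages them, and the Wiener-style gain is computed per frequency bin as 1 - noise/power. The function names, the window length, and the truncation ratio are all hypothetical.

```python
from statistics import median

def ota_noise_estimate(power, half_window=8, ratio=2.0):
    """Simplified order-truncate-average style noise floor per frequency bin.

    Assumes non-negative power values and ratio >= 1, so at least one value
    in every window survives the truncation.
    """
    est = []
    for i in range(len(power)):
        lo = max(0, i - half_window)
        hi = min(len(power), i + half_window + 1)
        window = power[lo:hi]
        threshold = ratio * median(window)
        kept = [p for p in window if p <= threshold]   # drop strong (target) peaks
        est.append(sum(kept) / len(kept))
    return est

def wiener_gain(power, noise, floor=0.0):
    """Per-bin Wiener-style gain, clipped from below at `floor`."""
    return [max(floor, 1.0 - n / p) if p > 0 else floor
            for p, n in zip(power, noise)]

# Demo: flat ambient noise with one narrowband target line at bin 20.
spectrum = [1.0] * 41
spectrum[20] = 100.0
noise = ota_noise_estimate(spectrum)
gain = wiener_gain(spectrum, noise)
```

In this toy spectrum the target line is excluded from the noise average, so the bin that holds the target keeps almost all of its power while the noise-only bins are attenuated.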