• Title/Summary/Keyword: Acoustic Problem

Underwater object radial velocity estimation method using two different band hyperbolic frequency modulation pulses with opposite sweep directions and its performance analysis (두 대역 상반된 스윕방향 hyperbolic frequency modulation 펄스로 수중물체 시선속도추정 기법 및 성능분석)

  • Chomgun Cho;Euicheol Jeong
    • The Journal of the Acoustical Society of Korea / v.42 no.1 / pp.25-31 / 2023
  • To estimate the radial speed of an underwater object (a target) with active sonar, a Continuous Wave (CW) pulse is generally used, but when the target is slow and nearby, acoustic reverberation in the ocean makes its radial velocity difficult to estimate. In 2017, Wang et al. overcame this problem by using a broadband signal composed of two Hyperbolic Frequency Modulation (HFM) pulses, which are known to be Doppler-invariant, with equal frequency bands and opposite sweep directions, and successfully estimated the radial speed of a slow-moving nearby target. They demonstrated the estimation in computer simulation, using the difference between the start times of the two HFM pulses and their receiving times as parameters. However, because that method uses two HFM pulses in the same frequency band, cross-correlation between the two pulses degrades detection performance. To mitigate this cross-correlation effect, we suggest using two HFM pulses in different bands with opposite sweep directions. In this paper, a radial velocity estimation method is derived and simulated using two HFM pulses with a pulse length of 1 second and a bandwidth of 400 Hz. With the suggested method, the radial velocity was estimated with a relative error of approximately 6 % in the simulation.
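The pulse pair described in this abstract can be illustrated with a minimal sketch. It assumes the textbook HFM definition, in which the instantaneous period varies linearly with time, and generates one up-sweep and one down-sweep pulse in separate bands. The 1 s pulse length and 400 Hz bandwidth come from the abstract; the band edges (2000-2400 Hz and 2900-2500 Hz), the sampling rate, and the function name `hfm_pulse` are hypothetical choices for illustration, not values from the paper.

```python
import numpy as np

def hfm_pulse(f_start, f_end, duration, fs):
    """Return (t, signal, instantaneous frequency) for one HFM pulse.

    Assumes the standard HFM form: the period 1/f(t) changes linearly
    from 1/f_start to 1/f_end over the pulse duration.
    """
    t = np.arange(int(round(duration * fs))) / fs
    # Slope of the instantaneous period (s per s).
    b = (1.0 / f_end - 1.0 / f_start) / duration
    # Phase is the integral of 2*pi*f(t), with f(t) = 1 / (1/f_start + b*t).
    phase = (2.0 * np.pi / b) * np.log1p(b * f_start * t)
    inst_freq = 1.0 / (1.0 / f_start + b * t)
    return t, np.cos(phase), inst_freq

fs = 8000.0        # sampling rate in Hz (assumed)
duration = 1.0     # pulse length from the abstract
# Hypothetical, non-overlapping bands, each 400 Hz wide.
t_up, s_up, f_up = hfm_pulse(2000.0, 2400.0, duration, fs)      # up-sweep
t_down, s_down, f_down = hfm_pulse(2900.0, 2500.0, duration, fs)  # down-sweep
```

Placing the two sweeps in disjoint bands is what reduces their mutual cross-correlation; the velocity estimator itself is not reproduced here.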

The role of voice onset time (VOT) and post-stop fundamental frequency (F0) in the perception of Tohoku Japanese stops (도호쿠 일본어의 폐쇄음 지각에 있어서 voice onset time(VOT)과 후속모음 fundamental frequency(F0)의 역할)

  • Hi-Gyung Byun
    • Phonetics and Speech Sciences / v.15 no.1 / pp.35-45 / 2023
  • Tohoku Japanese is known to have voiced stops without pre-voicing in word-initial position, whereas traditional or conservative Japanese has voiced stops with pre-voicing in the same position. One problem with this devoicing of voiced stops is that it affects the distinction between voiced and voiceless stops because their voice onset time (VOT) values overlap. Previous studies have confirmed that Tohoku speakers use post-stop fundamental frequency (F0) as an acoustic cue along with VOT to avoid overlap. However, the role of post-stop F0 as a perceptual cue in this region has barely been investigated. Therefore, this study explored the role of post-stop F0 in stop voicing perception along with VOT. Several perception tests were conducted using resynthesized stimuli, which were manipulated along a VOT continuum orthogonal to an F0 continuum. The results showed no significant regional difference (Tohoku vs. Chubu) for nonsense words (/ta-da/). However, for meaningful words (/pari/ 'Paris' vs. /bari/ 'Bali,' /piza/ 'pizza' vs. /biza/ 'visa'), a significant word effect was found, and it was confirmed that some listeners utilized the post-stop F0 more consistently and steadily than others. Based on these results, we discuss innovative listeners who may lead the change in the perception of stop voicing.

A study on the application of residual vector quantization for vector quantized-variational autoencoder-based foley sound generation model (벡터 양자화 변분 오토인코더 기반의 폴리 음향 생성 모델을 위한 잔여 벡터 양자화 적용 연구)

  • Seokjin Lee
    • The Journal of the Acoustical Society of Korea / v.43 no.2 / pp.243-252 / 2024
  • Among the Foley sound generation models that have recently begun to be studied, a sound generation technique that combines the Vector Quantized-Variational AutoEncoder (VQ-VAE) structure with a generation model such as Pixelsnail is one of the important research subjects. Meanwhile, in the field of deep learning-based acoustic signal compression, residual vector quantization is reported to be more suitable than the conventional VQ-VAE structure. This paper therefore studies whether residual vector quantization can be effectively applied to Foley sound generation. To this end, it applies the residual vector quantization technique to the conventional VQ-VAE-based Foley sound generation model and, in particular, derives a model that is compatible with existing models such as Pixelsnail and does not increase computational resource consumption. To evaluate the model, an experiment was conducted using DCASE2023 Task7 data. The results show that the proposed model improves the Fréchet audio distance by about 0.3. The improvement was limited, however, which is believed to be due to the reduced time-frequency resolution adopted so as not to increase computational resource consumption.
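As a rough illustration of the residual vector quantization idea mentioned in this abstract, the following sketch quantizes each vector in stages, with every stage encoding the residual left by the previous one. It is a generic NumPy sketch, not the paper's model: the codebooks are random rather than learned, and the names `rvq_encode` and `rvq_decode` and all sizes are hypothetical.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Quantize rows of x stage by stage; return (indices, final residual)."""
    residual = x.copy()
    indices = []
    for cb in codebooks:
        # Squared distance from each residual vector to each codeword.
        d = np.sum((residual[:, None, :] - cb[None, :, :]) ** 2, axis=-1)
        idx = np.argmin(d, axis=1)
        indices.append(idx)
        # The next stage quantizes what this stage failed to capture.
        residual = residual - cb[idx]
    return np.stack(indices, axis=1), residual

def rvq_decode(indices, codebooks):
    """Reconstruct vectors as the sum of the selected codewords."""
    return sum(cb[indices[:, i]] for i, cb in enumerate(codebooks))

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))                                 # 16 latent vectors, dim 8
codebooks = [rng.normal(size=(32, 8)) for _ in range(3)]     # 3 stages, 32 codewords each
indices, residual = rvq_encode(x, codebooks)
x_hat = rvq_decode(indices, codebooks)
```

By construction, `x_hat + residual` reproduces `x`, so the final residual is exactly the quantization error after all stages.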