• Title/Summary/Keyword: perceptual audio

A Perception Based Active Matrix Decoder with Virtual Source Location Information (가상 음원 위치 정보를 이용한 능동 메트릭스 디코더)

  • Moon, Han-Gil
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.5 / pp.18-24 / 2010
  • In this paper, a new matrix decoding system using vector-based Virtual Source Location Information (VSLI) is proposed as an alternative to the conventional Dolby Pro Logic II/IIx system for reconstructing multi-channel output signals from matrix-encoded two-channel signals, Lt/Rt. The new system is composed of a passive decoding part and an active part. The passive part produces crude multi-channel signals as linear combinations of the two encoded signals (Lt/Rt), and the active part enhances each channel according to the virtual source that emerges in each inter-channel region. Since the virtual sources are related to the perceptual sound images in the virtual sound field, the reconstructed multi-channel sound yields good dynamic perception and stable image localization. Moreover, good channel separation is maintained by a nonlinear trigonometric enhancing function.
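
The passive stage described in this abstract can be illustrated with a minimal sketch. It assumes the textbook passive matrix (center as the scaled sum and surround as the scaled difference of Lt and Rt, with a 1/sqrt(2) gain); the abstract does not give the paper's actual coefficients, and the function name `passive_decode` is hypothetical. The VSLI-based active enhancement is not shown.

```python
import math

def passive_decode(lt, rt):
    """Crude passive matrix decode of Lt/Rt sample lists.

    Assumption: standard sum/difference matrix with 1/sqrt(2) gain,
    not necessarily the coefficients used in the paper.
    Returns (left, right, center, surround) as lists of floats.
    """
    g = 1.0 / math.sqrt(2.0)
    left = list(lt)                                   # L  = Lt
    right = list(rt)                                  # R  = Rt
    center = [g * (l + r) for l, r in zip(lt, rt)]    # C  = (Lt + Rt) / sqrt(2)
    surround = [g * (l - r) for l, r in zip(lt, rt)]  # S  = (Lt - Rt) / sqrt(2)
    return left, right, center, surround

# A source encoded equally into Lt and Rt lands in the center channel
# and cancels in the surround channel.
left, right, center, surround = passive_decode([0.5, -0.25], [0.5, -0.25])
```

Because every output is a fixed linear combination, the passive stage alone gives limited channel separation, which is the gap the abstract's active part is meant to close.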

Quality Improvement of Low Bitrate HE-AAC using Linear Prediction Pre-processor (저 전송률 환경에서 선형예측 전처리기를 사용한 HE-AAC의 성능 향상)

  • Lee, Jae-Seong;Lee, Gun-Woo;Park, Young-Chul;Youn, Dae-Hee
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.8C / pp.822-829 / 2009
  • This paper proposes a new method for improving the quality of High Efficiency Advanced Audio Coding (HE-AAC). HE-AAC encodes the input source by allocating bits to each scalefactor band according to the psychoacoustic properties of the human ear. As a result, insufficient bits are assigned to bands with relatively low energy. This imbalance between bands of different energy can degrade sound quality, for example by introducing musical noise. In the proposed system, a Linear Prediction (LP) module is combined with HE-AAC as a pre-processor to improve sound quality through a more even bit distribution. To apply the psychoacoustic properties accurately, the psychoacoustic model uses the Fast Fourier Transform (FFT) spectrum of the original input signal to compute the masking threshold. In the implementation, the masking threshold of the psychoacoustic model is normalized by the LP spectral envelope prior to quantization of the LP residual. Experimental results show that the proposed algorithm allocates bits appropriately when the available bits are insufficient and improves the performance of HE-AAC.
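
The idea of normalizing a masking threshold by an LP spectral envelope can be sketched as follows. This is only an illustration under stated assumptions: the LP coefficients come from the autocorrelation method with the Levinson-Durbin recursion, the envelope is the power response err / |A(e^{jw})|^2, and normalization is a per-frequency division. The abstract does not specify these details, and all function names are hypothetical.

```python
import cmath
import math

def lp_coefficients(x, order):
    """Levinson-Durbin on the autocorrelation of x.
    Returns (a, err) with a[0] == 1 and A(z) = sum_k a[k] z^-k."""
    n = len(x)
    r = [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = sum(a[k] * r[m - k] for k in range(m))
        k_m = -acc / err                      # reflection coefficient
        prev = a[:]
        for k in range(1, m + 1):
            a[k] = prev[k] + k_m * prev[m - k]
        err *= 1.0 - k_m * k_m                # updated prediction error
    return a, err

def lp_power_envelope(a, err, omegas):
    """LP power envelope err / |A(e^{jw})|^2 at each angular frequency."""
    env = []
    for w in omegas:
        A = sum(a[k] * cmath.exp(-1j * w * k) for k in range(len(a)))
        env.append(err / abs(A) ** 2)
    return env

def normalize_threshold(threshold, envelope):
    """Assumed normalization: divide the threshold by the envelope."""
    return [t / e for t, e in zip(threshold, envelope)]

# Example: a low-pass test signal and a flat threshold at w = 0 and w = pi.
x = [0.9 ** n for n in range(200)]
a, err = lp_coefficients(x, order=1)
env = lp_power_envelope(a, err, [0.0, math.pi])
normalized = normalize_threshold([1.0, 1.0], env)
```

Dividing by the envelope expresses the threshold in the domain of the spectrally flattened LP residual, so low-energy bands are no longer dwarfed by high-energy ones when bits are allocated.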

H.263-Based Scalable Video Codec (H.263을 기반으로 한 확장 가능한 비디오 코덱)

  • 노경택
    • Journal of the Korea Society of Computer and Information / v.5 no.3 / pp.29-32 / 2000
  • Layered video coding schemes allow video information to be transmitted in multiple video bitstreams to achieve scalability. They are attractive in theory for two reasons. First, they naturally accommodate heterogeneity in networks and receivers in terms of client processing capability and network bandwidth. Second, they correspond to optimal utilization of available bandwidth when several video quality levels are desired. In this paper we propose a scalable video codec architecture with motion estimation, which is suitable for real-time audio and video communication over packet networks. The coding algorithm is compatible with ITU-T Recommendation H.263+ and includes various techniques to reduce complexity. Fast motion estimation is performed at the H.263-compatible base layer and reused at higher layers, and perceptual macroblock skipping is performed at all layers before motion estimation. Error propagation from packet loss is avoided by periodically rebuilding a valid predictor in Intra mode at each layer.

Real data-based active sonar signal synthesis method (실데이터 기반 능동 소나 신호 합성 방법론)

  • Yunsu Kim;Juho Kim;Jongwon Seok;Jungpyo Hong
    • The Journal of the Acoustical Society of Korea / v.43 no.1 / pp.9-18 / 2024
  • The importance of active sonar systems is growing because underwater targets are becoming quieter and ambient noise is increasing with maritime traffic. However, the low signal-to-noise ratio of the echo signal, caused by multipath propagation, various clutter, ambient noise, and reverberation, makes it difficult to identify underwater targets using active sonar. Attempts have been made to apply data-driven methods such as machine learning or deep learning to improve the performance of underwater target recognition systems, but it is difficult to collect enough training data due to the nature of sonar datasets. Methods based on mathematical modeling have mainly been used to compensate for insufficient active sonar data. However, such methods are limited in how accurately they can simulate complex underwater phenomena. Therefore, in this paper, we propose a sonar signal synthesis method based on a deep neural network. To apply the neural network model to sonar signal synthesis, the proposed method adapts to sonar signals the attention-based encoder and decoder, which are the main modules of the Tacotron model widely used in speech synthesis. By training the proposed model on a dataset collected with a simulated target placed in an actual marine environment, it is possible to synthesize signals that more closely resemble actual signals. To verify the performance of the proposed method, a Perceptual Evaluation of Audio Quality test was conducted, and the score difference relative to the actual signal was within -2.3 across a total of four different environments. These results show that the active sonar signal generated by the proposed method approximates the actual signal.