• Title/Summary/Keyword: psychoacoustic


New Speech Enhancement Method using Psychoacoustic Criteria (심리 음향 기준을 이용한 새로운 음질 개선 방법)

  • 김대경;박장식;손경식
    • Journal of Korea Multimedia Society
    • /
    • v.4 no.1
    • /
    • pp.56-66
    • /
    • 2001
  • A spectral subtraction algorithm that uses a criterion based on human perception has recently been developed. Speech processed with Virag's algorithm sounds more pleasant to a human listener than speech obtained by the classical methods. However, Virag's algorithm requires a robust voice activity detector (VAD). In the extended spectral subtraction (ESS) algorithm, which works without a VAD, the residual noise becomes more noticeable as the SNR decreases. In this paper we propose a new speech enhancement method that combines a Wiener filter with spectral subtraction based on the noise masking characteristics of the human auditory system. No VAD is needed because the noise estimate can be updated continuously, even during speech activity, using the Wiener filter. Adjusting the subtraction parameter according to the masking threshold makes the residual noise inaudible. The proposed method has been compared with conventional spectral subtraction algorithms. Objective and subjective evaluations of the proposed system were performed with several noise types having different time-frequency distributions. Objective measures, speech spectrograms, and subjective listening tests all confirm that speech enhanced with the proposed algorithm is more pleasant to a human listener.

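The idea of tying the subtraction parameter to a masking threshold can be illustrated with a minimal sketch. It covers only a classical power spectral subtraction step with a spectral floor; the Wiener-filter noise update from the abstract is not modeled. The function names, the linear mapping from threshold to subtraction factor, and the default values of `alpha_min`, `alpha_max`, and `beta` are all assumptions made for illustration, not the authors' settings.

```python
import math


def subtraction_factor(threshold, t_min, t_max, alpha_min=1.0, alpha_max=6.0):
    """Map a masking threshold to an over-subtraction factor.

    Assumed linear mapping: where the threshold is high the residual noise
    is well masked, so less subtraction is applied.
    """
    if t_max == t_min:
        return alpha_max
    frac = (threshold - t_min) / (t_max - t_min)
    frac = min(1.0, max(0.0, frac))
    return alpha_max - frac * (alpha_max - alpha_min)


def spectral_subtract(noisy_mag, noise_mag, thresholds, beta=0.01):
    """Power spectral subtraction per frequency bin with a spectral floor.

    noisy_mag, noise_mag: magnitude spectra of the noisy frame and of the
    noise estimate; thresholds: masking threshold per bin (hypothetical input).
    """
    t_min, t_max = min(thresholds), max(thresholds)
    enhanced = []
    for y, n, t in zip(noisy_mag, noise_mag, thresholds):
        alpha = subtraction_factor(t, t_min, t_max)
        power = y * y - alpha * n * n
        floor = beta * y * y  # keeps the result non-negative
        enhanced.append(math.sqrt(max(power, floor)))
    return enhanced
```

Calling `spectral_subtract([10.0, 10.0], [2.0, 2.0], [0.0, 1.0])` subtracts more from the first bin (low threshold) than from the second, so the first output magnitude is the smaller of the two.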

Sound Quality Evaluation Based on the Mahalanobis Distance for the Interior Noise of Driving Vehicles with Various the Tire Type (타이어 종류에 따른 차량 실내 소음의 Mahalanobis Distance 를 이용한 음질인덱스 구축)

  • Jeong, Jae-Eun;Yang, In-Hyung;Park, Goon-Dong;Lee, You-Yub;Oh, Jae-Eung
    • Transactions of the Korean Society of Mechanical Engineers A
    • /
    • v.34 no.12
    • /
    • pp.1871-1876
    • /
    • 2010
  • Reducing vehicle interior noise has been the main interest of noise, vibration, and harshness (NVH) engineers. The driver's perception of vehicle noise is strongly affected by the psychoacoustic characteristics of the noise and by its sound pressure level (SPL). Existing methods for evaluating the sound quality (SQ) of vehicle interior noise are statistical linear regression analysis of subjective SQ metrics and estimation of subjective SQ values by a neural network. However, these methods depend strongly on jury tests, which leads to difficulties. To reduce the dependence on jury tests, we suggest a new SQ evaluation method that uses the Mahalanobis distance (MD). The characteristic values that most influenced the sound quality evaluation results were identified by main-effect analysis. Finally, we developed a new method based on the MD to evaluate sound quality. The noise evaluation results revealed that sound quality could be improved considerably by changing the structural characteristics of the vehicle.
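For readers unfamiliar with the Mahalanobis distance, the sketch below computes the standard definition, the square root of (x - m)^T S^-1 (x - m), from a set of reference samples. It is a generic illustration in pure Python, not the index built in the paper; the function names and any sample values used with it are hypothetical.

```python
import math


def mean_vector(samples):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def covariance_matrix(samples, mean):
    """Unbiased sample covariance matrix (divides by n - 1)."""
    n, d = len(samples), len(mean)
    cov = [[0.0] * d for _ in range(d)]
    for x in samples:
        diff = [xi - mi for xi, mi in zip(x, mean)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mahalanobis(x, mean, cov):
    """Mahalanobis distance of x from the reference distribution."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    z = solve(cov, diff)  # z = cov^-1 @ diff without forming the inverse
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))
```

In a sound quality setting, each sample would be a vector of metrics measured for one reference recording, and a large distance would mark a test recording as unlike the reference group.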

Enhanced Pre echo Control Algorithm for MPEG Audio Coders (MPEG 오디오 부호화기를 위한 향상된 프리 에코 컨트롤 알고리듬)

  • Lee Chang-Joon;Lee Jae-Seong;Park Young-Cheol
    • Journal of Broadcast Engineering
    • /
    • v.11 no.2 s.31
    • /
    • pp.191-199
    • /
    • 2006
  • This paper presents an efficient pre-echo control scheme for MPEG audio coders based on psychoacoustic model II (PAM-II). Pre-echo control is the final step in calculating the masking threshold in PAM-II. Its purpose is to minimize the spread of quantization error over the processing frame. In conventional encoders, pre-echo is reduced by restricting the estimated masking threshold so that it does not exceed the one obtained in the previous frame. The conventional method performs pre-echo control not only for short blocks but also for long blocks, which lowers the masking threshold in long blocks and, in turn, increases the quantization noise level of the corresponding blocks. This paper proposes an efficient pre-echo control process. Test results show a mean improvement of more than 0.4 on the ITU-R 5-point audio impairment scale, especially for complex signals.
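The conventional clamp described in the abstract can be sketched in a few lines. This shows only the conventional step (limiting each band's threshold by the previous frame's value), not the proposed scheme, whose details the abstract does not give. The function name and the `scale` parameter are assumptions; `scale=1.0` corresponds to the plain "do not exceed the previous frame" rule.

```python
def pre_echo_control(current, previous, scale=1.0):
    """Limit each band's masking threshold by the previous frame's threshold.

    current, previous: per-band masking thresholds for the current and
    previous frames. scale is a hypothetical tuning factor; 1.0 means the
    current threshold may not exceed the previous one.
    """
    return [min(c, scale * p) for c, p in zip(current, previous)]
```

For example, `pre_echo_control([5.0, 1.0, 3.0], [2.0, 2.0, 2.0])` lowers the first and third bands to 2.0 and leaves the second at 1.0, which shows how the clamp reduces the threshold whenever the signal energy rises from one frame to the next.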

Research on Open Source Encoding Technology for MPEG Unified Speech and Audio Coding (MPEG 통합 음성/오디오 코덱을 위한 오픈 소스 부호화 기술에 관한 연구)

  • Song, Jeongook;Lee, Joonil;Kang, Hong-Goo
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.1
    • /
    • pp.86-96
    • /
    • 2013
  • Unified Speech and Audio Coding (USAC) is the speech/audio codec with the best quality and was approved as a Final Draft International Standard (FDIS) at an MPEG meeting in 2011. Because MPEG conventionally standardizes only the decoder, the encoder technologies are not easy to study. Furthermore, the Reference Model (RM) shows extremely poor performance. To solve these problems, the open source project JAME proposes methods that improve the performance of the main encoder technologies in USAC. In particular, this paper introduces the following encoder modules: the signal classifier for selective operation between the two coders, the psychoacoustic model in the frequency domain, and the window transition technology. Finally, the results of the verification test for the FDIS and the performance of the Common Encoder are appended.

Construction and Comparison of Sound Quality Index for the Vehicle HVAC System Using Regression Model and Neural Network Model (회귀모형과 신경망모형을 이용한 차량공조시스템의 음질 인덱스 구축 및 비교)

  • Park, Sang-Gil;Lee, Hae-Jin;Sim, Hyun-Jin;Lee, You-Yub;Oh, Jae-Eung
    • Transactions of the Korean Society for Noise and Vibration Engineering
    • /
    • v.16 no.9 s.114
    • /
    • pp.897-903
    • /
    • 2006
  • Reducing vehicle interior noise has been the main interest of noise, vibration, and harshness (NVH) engineers. The driver's perception of vehicle noise is affected largely by the psychoacoustic characteristics of the noise as well as by its sound pressure level (SPL). In particular, among the sources of vehicle interior noise, the sound of the heating, ventilation, and air conditioning (HVAC) system is perceived sensitively from a psychoacoustic point of view. Even though HVAC noise is not louder than the overall noise level, it clearly affects drivers' subjective perception, making them nervous or annoyed. Vehicle engineers therefore now aim to improve sound quality as well as to reduce noise. In this paper, we recorded HVAC noise from many vehicles. Through objective and subjective sound quality (SQ) evaluation of the recorded noise, simple and multiple regression models were obtained for the subjective rating 'Pleasant' using the semantic differential method (SDM). The regression procedure also yields diagnostic statistics for evaluating the regression estimates, including their appropriateness and accuracy. Furthermore, a neural network (NN) model was obtained using three SQ metrics (loudness, sharpness, and roughness) as inputs and the subjective 'Pleasant' rating as the single output. We used the NN model because human perception is very complex and its patterns are hard to estimate. The estimated models were compared using the correlations between their output SQ indexes and the hearing test results for the 'Pleasant' verification data. The NN model showed the largest correlation among the SQ indexes, and we found that it is possible to predict the SQ metrics.

Perceptual Evaluation of Noise Sources in a Chamber for Residential and Working Environment (주거 및 사무환경 챔버에서의 생활소음에 대한 감성적 평가)

  • Jeon, Jin-Yong;Kim, Kyong-Ho;Jung, Jeong-Ho;Ryu, Jong-Kwan;Cho, Moon-Jae
    • Transactions of the Korean Society for Noise and Vibration Engineering
    • /
    • v.12 no.6
    • /
    • pp.437-444
    • /
    • 2002
  • This paper aims to provide a basic method for acoustical evaluation and efficient noise control by investigating the limits of perceptual loudness in living environments and by finding correlations between the physical characteristics of noise and psychoacoustic parameters. The limits of perceptual loudness were selected by subjects in a chamber simulating residential and working environments. The noise sources were analyzed to determine whether they correlate with Zwicker parameters and ACF factors. In particular, to help establish a domestic evaluation grade for floor impact noise, we propose the loudness perception results, which are based on annoyance, as a fundamental resource. The upper limit for heavy-weight impact noise was L-60 and the lower limit was L-50. The upper limit for light-weight impact noise was L-70 and the lower limit was L-55. The loudness of vacuum cleaner noise did not appear to affect its perceived noisiness. Noises associated with human activity, such as footsteps and talking, were the most irritating noises in the office environment.

A Blind Audio Watermarking using the Tonal Characteristic (토널 특성을 이용한 브라인드 오디오 워터마킹)

  • 이희숙;이우선
    • Journal of Korea Multimedia Society
    • /
    • v.6 no.5
    • /
    • pp.816-823
    • /
    • 2003
  • In this paper, we propose a blind audio watermarking method that uses the tonal characteristic. First, we explain the perceptual effect of tonal components reported in existing research and show experimentally that the tonal characteristic is more stable against several signal processing operations than other characteristics used in previous watermarking studies. Based on this result, we propose a blind audio watermarking method that uses the relation among the frequency-domain signals that compose a tonal masker. To evaluate the sound quality of the watermarked audio, we used the Subjective Diff-Grade (SDG) and obtained an average SDG of 0.27. This result indicates that watermarking based on the perceptual effect of tonal components is viable in terms of imperceptibility. We also detected the watermark bits from watermarked audio that had been altered by several signal processing operations; the detection ratios, with the exception of time-shift processing, were over 98%. For time-shift processing, we applied a new method that searches for the most suitable position in the time domain, and then detected the watermark bits at a ratio of 90%.


Improvement of 3D Sound Using Psychoacoustic Characteristics (인간의 청각 특성을 이용한 입체음향의 방향감 개선)

  • Koo, Kyo-Sik;Cha, Hyung-Tai
    • The Journal of the Acoustical Society of Korea
    • /
    • v.30 no.5
    • /
    • pp.255-264
    • /
    • 2011
  • The Head Related Transfer Function (HRTF) describes the acoustic transmission from a point in 3D space to the listener's ear. In other words, it contains the information that lets a human perceive the locations of sound sources. HRTFs can therefore be used to create virtual 3D sound from sources that do not actually exist. However, because a non-individualized HRTF does not match each listener, the three-dimensional effect can deteriorate through confusion between front and back directions. In this paper, we propose a new algorithm that reduces confusion in sound image localization by using the characteristics of human hearing. The frequency spectrum and global masking threshold of 3D sounds generated with the HRTF are used to calculate the psychoacoustic differences between directions. Perceptible cues in each critical band are then boosted to create effective 3D sound. As a result, improved 3D sound is obtained, and the performance is much better than that of conventional methods.

A Design and Implementation of the Real-Time MPEG-1 Audio Encoder (실시간 MPEG-1 오디오 인코더의 설계 및 구현)

  • 전기용;이동호;조성호
    • Journal of Broadcast Engineering
    • /
    • v.2 no.1
    • /
    • pp.8-15
    • /
    • 1997
  • In this paper, a real-time Moving Picture Experts Group-1 (MPEG-1) audio encoder system is implemented using a TMS320C31 digital signal processor (DSP) chip. The basic operation of the MPEG-1 audio encoder algorithm, based on audio layer 2 and psychoacoustic model 1, is first verified in C. It is then implemented in Texas Instruments (TI) assembly language to reduce the overall execution time. Finally, the actual DSP circuit board for the encoder system is designed and built. In the system, side modules such as the analog-to-digital converter (ADC) control, the input/output (I/O) control, and the bit-stream transmission from the DSP board to the PC are implemented on a field programmable gate array (FPGA) using very high speed integrated circuit hardware description language (VHDL) code. The complete encoder system can process a stereo audio signal in real time at a sampling frequency of 48 kHz and produces an encoded bit-stream at a bit rate of 192 kbps. The real-time operation of the encoder system and the good quality of the decoded sound are also confirmed using various types of actual stereo audio signals.


A Perceptual Audio Coder Based on Temporal-Spectral Structure (시간-주파수 구조에 근거한 지각적 오디오 부호화기)

  • 김기수;서호선;이준용;윤대희
    • Journal of Broadcast Engineering
    • /
    • v.1 no.1
    • /
    • pp.67-73
    • /
    • 1996
  • In general, high quality audio coding (HQAC) combines conventional data compression techniques with models of human perception. The primary auditory characteristic applied in HQAC is the masking effect in the spectral domain. Therefore, spectral techniques such as subband coding or transform coding are widely used [1][2]. However, no effort has yet been made to apply the temporal masking effect and temporal redundancy removal in HQAC. The audio data compression method proposed in this paper eliminates statistical and perceptual redundancies in both the temporal and spectral domains. The transformed audio signal is divided into packets, each consisting of 6 frames. A packet contains 1536 samples ($256 \times 6$), and the redundancies in a packet reside in both the temporal and spectral domains. Both kinds of redundancy are eliminated at the same time in each packet. The psychoacoustic model has been improved to give finer results by taking into account temporal masking as well as fine spectral masking. For quantization, each packet is divided into subblocks designed to be analogous to the nonlinear critical bands and to reflect the temporal auditory characteristics. Consequently, high quality of the reconstructed audio is preserved at low bit rates.
