• Title/Summary/Keyword: perceptual distortion

Optimal Image Quality Assessment based on Distortion Classification and Color Perception

  • Lee, Jee-Yong;Kim, Young-Jin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.1 / pp.257-271 / 2016
  • The Structural SIMilarity (SSIM) index is one of the most widely used methods for perceptual image quality assessment (IQA). It is based on the principle that the human visual system (HVS) is sensitive to the overall structure of an image. However, it has been reported that indices predicted by SSIM tend to be biased depending on the type of distortion, which increases the deviation from the main regression curve. Consequently, SSIM can suffer serious performance degradation. In this study, we investigate this phenomenon from a new perspective and review a constant that plays a major role within the SSIM metric but has been overlooked thus far. Through an experimental study of the influence of this constant on SSIM-based image evaluation, we propose a new solution that resolves this issue. In the proposed IQA method, we first design a system to classify different types of distortion, and then match an optimal constant to each type. In addition, we supplement the proposed method by adding color perception-based structural information. For a comprehensive assessment, we compare the proposed method with 15 existing IQA methods. The experimental results show that the proposed method is more consistent with the HVS than the other methods.
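The abstract does not say which constant is meant, so the following is only a minimal sketch of where stabilizing constants enter the standard SSIM formula. It computes a single-window (global) SSIM with population statistics instead of the usual sliding window. The name `global_ssim` is illustrative, and the defaults `k1=0.01` and `k2=0.03` are the conventional SSIM settings, not values taken from this paper.

```python
from statistics import fmean


def global_ssim(x, y, k1=0.01, k2=0.03, dynamic_range=255.0):
    """Single-window SSIM over two equal-length sequences of pixel values.

    Assumption: population (not windowed) statistics, for illustration only.
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    # Stabilizing constants of the standard SSIM definition.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean((a - mu_x) ** 2 for a in x)
    var_y = fmean((b - mu_y) ** 2 for b in y)
    cov = fmean((a - mu_x) * (b - mu_y) for a, b in zip(x, y))
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


reference = [10, 50, 90, 130]
distorted = [12, 48, 95, 120]
print(global_ssim(reference, distorted))            # conventional constants
print(global_ssim(reference, distorted, k2=0.3))    # larger constant, score moves toward 1
```

Varying `k2` while holding the images fixed changes the score, which is the kind of sensitivity a per-distortion choice of constant would exploit.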

A Relevant Distortion Criterion for Interpolation of the Head-Related Transfer Functions (머리 전달 함수의 보간에 적합한 왜곡 척도)

  • Lee, Ki-Seung;Lee, Seok-Pil
    • The Journal of the Acoustical Society of Korea / v.28 no.2 / pp.85-95 / 2009
  • In binaural synthesis environments, a wide variety of head-related transfer functions (HRTFs) measured from various directions is desirable for obtaining accurate and varied spatial sound images. To reduce the size of the HRTF set, interpolation is often employed, whereby the HRTF for any direction is obtained from a limited number of representative HRTFs. In this paper, we study distortion measures for interpolation, which play an important role in the interpolation process. Using various objective distortion metrics, the differences between the interpolated and the measured HRTFs were computed. These were then compared with and analyzed against the results of listening tests. From the results, the objective distortion measure that reflected the perceptual differences in the spatial sound image was selected. This measure was employed in a practical interpolation technique. We applied the proposed method to four HRTF sets, measured from three human heads and one mannequin. As a result, the Mel-frequency cepstral distortion was shown to be a good predictor of differences in spatial sound location for the three HRTF sets measured from human heads, and the time-domain signal-to-distortion ratio gave good prediction results across all four HRTF sets.
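A time-domain signal-to-distortion ratio is commonly computed as the energy of the reference divided by the energy of the error, expressed in decibels. The sketch below assumes that common definition (the abstract does not give the paper's exact formulation) and treats a measured and an interpolated impulse response as plain lists; `signal_to_distortion_ratio_db` is an illustrative name.

```python
import math


def signal_to_distortion_ratio_db(reference, estimate):
    """Return 10*log10(sum(ref^2) / sum((ref - est)^2)) in dB."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_energy = sum(r * r for r in reference)
    distortion_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if signal_energy == 0.0:
        raise ValueError("reference signal has zero energy")
    if distortion_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / distortion_energy)


measured = [1.0, 0.0, -1.0, 0.0]        # hypothetical measured response
interpolated = [0.9, 0.0, -0.9, 0.0]    # hypothetical interpolated response
print(signal_to_distortion_ratio_db(measured, interpolated))
```

A higher value means the interpolated response is closer to the measured one in the waveform sense.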

Perceptual Color Difference based Image Quality Assessment Method and Evaluation System according to the Types of Distortion (인지적 색 차이 기반의 이미지 품질 평가 기법 및 왜곡 종류에 따른 평가 시스템 제안)

  • Lee, Jee-Yong;Kim, Young-Jin
    • Journal of KIISE / v.42 no.10 / pp.1294-1302 / 2015
  • Many image quality assessment metrics that aim to precisely reflect the human visual system (HVS) have been studied. The Structural SIMilarity (SSIM) index is a notable HVS-aware metric that uses structural information, since the HVS is sensitive to the overall structure of an image. However, SSIM fails to handle color difference as perceived by the HVS. To address this problem, the Structural and Hue SIMilarity (SHSIM) index, which uses the Hue, Saturation, Intensity (HSI) model as its color space, has been adopted, but it cannot reflect the HVS-aware color difference between two color images. In this paper, we propose a new image quality assessment method for color images that uses the CIE Lab color space. In addition, using a support vector machine (SVM) classifier, we propose an optimization system that applies the optimal metric according to the type of distortion. To evaluate the proposed index, the LIVE database, the best-known database in the area of image quality assessment, is employed and four criteria are used. Experimental results show that the proposed index is more consistent with the HVS than the other methods.
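One reason to work in CIE Lab is that Euclidean distance in that space approximates perceived color difference. The sketch below shows the classic CIE76 color difference as an example; the abstract does not state which color-difference formula the paper uses, so this is an assumption, and `delta_e_cie76` is an illustrative name.

```python
import math


def delta_e_cie76(lab1, lab2):
    """CIE76 color difference: Euclidean distance between two (L*, a*, b*) triples."""
    if len(lab1) != 3 or len(lab2) != 3:
        raise ValueError("expected (L*, a*, b*) triples")
    return math.dist(lab1, lab2)


# Two hypothetical colors with equal lightness and a small chroma shift.
print(delta_e_cie76((50.0, 0.0, 0.0), (50.0, 3.0, 4.0)))  # 5.0
```

Hue-only comparisons in HSI cannot express such a distance directly, which is the limitation of SHSIM that the abstract points out.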

Video Coding Method Using Visual Perception Model based on Motion Analysis (움직임 분석 기반의 시각인지 모델을 이용한 비디오 코딩 방법)

  • Oh, Hyung-Suk;Kim, Won-Ha
    • Journal of Broadcast Engineering / v.17 no.2 / pp.223-236 / 2012
  • We develop a video processing method that enables more advanced, human-perception-oriented video coding. The proposed method reflects the influences of both rate-distortion-based optimization and human visual perception, which is affected by visual saliency, limited spatio-temporal resolution, and regional motion history. To reflect these perceptual effects, we devise an online moving-pattern classifier using the Hedge algorithm. We then embed existing visual saliency into the proposed moving patterns to establish a human visual perception model. To realize this model, we extend the conventional foveation filtering method. Whereas the conventional foveation filter only smooths less stimulating video signals, the developed foveation filter can locally smooth and enhance signals according to human visual perception without causing artifacts. Owing to signal enhancement, the developed foveation filter more efficiently transfers the bandwidth saved on smoothed signals to the enhanced signals. Performance evaluation verifies that the proposed video processing method maintains overall video quality while improving perceptual quality by 12% to 44%.

Speech Basis Matrix Using Noise Data and NMF-Based Speech Enhancement Scheme (잡음 데이터를 활용한 음성 기저 행렬과 NMF 기반 음성 향상 기법)

  • Kwon, Kisoo;Kim, Hyung Young;Kim, Nam Soo
    • The Journal of Korean Institute of Communications and Information Sciences / v.40 no.4 / pp.619-627 / 2015
  • This paper presents a speech enhancement method using non-negative matrix factorization (NMF). In the training phase, the basis matrix of each source signal is obtained from a suitable database, and these basis matrices are then used for source separation. The performance of speech enhancement therefore relies heavily on the basis matrices. The proposed method, in which the speech basis matrix is trained to yield a high reconstruction error for noise signals, performs better than standard NMF, in which the basis matrices are trained independently. For comparison, we also propose another method and evaluate one previous method. In the experiments, performance is evaluated by the perceptual evaluation of speech quality and the signal-to-distortion ratio, and the proposed method outperforms the other methods.
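For context, the standard NMF baseline factorizes a non-negative matrix V (for example a magnitude spectrogram) as V ≈ WH. The sketch below uses the well-known multiplicative updates for the Euclidean cost; it is not the paper's noise-aware training, and the helper names (`matmul`, `transpose`, `nmf`, `frobenius_error`) and the toy matrix are illustrative assumptions.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize non-negative v (rows x cols) into w (rows x rank) and h (rank x cols)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n_rows)]
    return w, h


def frobenius_error(v, w, h):
    approx = matmul(w, h)
    return sum((a - b) ** 2
               for row_v, row_a in zip(v, approx)
               for a, b in zip(row_v, row_a)) ** 0.5


toy = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]  # rank-1 toy "spectrogram"
w, h = nmf(toy, rank=1)
print(frobenius_error(toy, w, h))
```

In an enhancement setting, the columns of W learned from speech and from noise would be concatenated and held fixed while only H is estimated for the noisy input; how well the speech basis rejects noise is what the abstract's training criterion targets.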

Online blind source separation and dereverberation of speech based on a joint diagonalizability constraint (공동 행렬대각화 조건 기반 온라인 음원 신호 분리 및 잔향제거)

  • Yu, Ho-Gun;Kim, Do-Hui;Song, Min-Hwan;Park, Hyung-Min
    • The Journal of the Acoustical Society of Korea / v.40 no.5 / pp.503-514 / 2021
  • Reverberation in speech signals tends to significantly degrade the performance of Blind Source Separation (BSS) systems. The degradation is especially severe in online systems. Methods based on joint diagonalizability constraints have recently been developed to tackle this problem. To improve the quality of separated speech in reverberant environments, in this paper we add a proposed dereverberation method to the online BSS algorithm based on these constraints. Through experiments on the WSJCAM0 corpus, the proposed method was compared with the existing online BSS algorithm. Performance evaluation by the Signal-to-Distortion Ratio (SDR) and the Perceptual Evaluation of Speech Quality (PESQ) demonstrated that SDR improved from 1.23 dB to 3.76 dB and PESQ improved from 1.15 to 2.12 on average.

Digital Audio Watermarking Scheme Using Perceptual Modeling (지각 모델링을 이용한 디지털 오디오 워터마킹 방법)

  • 석종원;홍진우
    • Journal of Broadcast Engineering / v.6 no.2 / pp.195-202 / 2001
  • Digital watermarking technology is drawing attention as a solution for copyright protection of digital multimedia content. In this paper, we present two novel audio watermarking algorithms for protecting digital audio against unauthorized copying. The proposed watermarking schemes use the psychoacoustic model of MPEG audio coding to achieve perceptual transparency after watermark embedding, and a preprocessing procedure before correlation in watermark detection to extract copyright information without access to the original audio signal. Experimental results show that our watermarking scheme is robust to common signal processing attacks and introduces no audible distortion after watermark insertion.

Subspace Speech Enhancement Using Subband Whitening Filter (서브밴드 백색화 필터를 이용한 부공간 잡음 제거)

  • 김종욱;유창동
    • The Journal of the Acoustical Society of Korea / v.22 no.3 / pp.169-174 / 2003
  • A novel subspace speech enhancement method using a subband whitening filter is proposed. Previous subspace speech enhancement methods either assume additive white noise or use a whitening filter as pre-processing for colored noise. The proposed method tries to minimize signal distortion while reducing residual noise by processing the signal with a subband whitening filter. By incorporating the subband whitening filter, spectral resolution in the Karhunen-Loeve (KL) domain is improved with negligible additional computational load. The proposed method outperforms both the subspace method suggested by Ephraim and the spectral subtraction suggested by Boll in terms of segmental signal-to-noise ratio (SNRseg) and perceptual evaluation of speech quality (PESQ).
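Segmental SNR averages a per-frame SNR in decibels instead of computing one ratio over the whole utterance. The sketch below assumes non-overlapping frames and clamps each frame to a range of -10 dB to 35 dB, a common convention that the abstract does not confirm; `segmental_snr_db` and its parameters are illustrative.

```python
import math


def segmental_snr_db(clean, enhanced, frame_len=256, floor_db=-10.0, ceil_db=35.0):
    """Mean of per-frame SNRs (dB) over non-overlapping frames, each clamped."""
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(a * a for a in s)
        noise_energy = sum((a - b) ** 2 for a, b in zip(s, e))
        if signal_energy == 0.0:
            continue  # skip silent frames (assumption)
        if noise_energy == 0.0:
            snr = ceil_db
        else:
            snr = 10.0 * math.log10(signal_energy / noise_energy)
        frame_snrs.append(min(max(snr, floor_db), ceil_db))
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)


clean = [1.0] * 8       # hypothetical clean samples
enhanced = [0.9] * 8    # hypothetical enhanced samples
print(segmental_snr_db(clean, enhanced, frame_len=4))
```

Averaging in the log domain keeps loud frames from dominating the score, which is why SNRseg is often reported alongside PESQ.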

Perceptual and Adaptive Quantization of Line Spectral Frequency Parameters (선 스펙트럼 주파수의 청각 적응 부호화)

  • 한우진;김은경;오영환
    • The Journal of the Acoustical Society of Korea / v.19 no.8 / pp.68-77 / 2000
  • Line spectral frequency (LSF) parameters have been widely used in low bit-rate speech coding because they represent the short-time speech spectrum efficiently. In this paper, a new distance measure based on the masking properties of the human ear is proposed for quantizing LSF parameters, whereas most conventional quantization methods are based on the weighted Euclidean distance measure. The proposed method derives the perceptual distance measure from the definition of the noise-to-mask ratio (NMR), which corresponds closely to the actual distortion perceived by the human ear, and uses it for quantizing LSF parameters. In addition, to reduce the average bit rate, we propose an adaptive bit allocation scheme that allocates the minimum number of bits to the LSF parameters while maintaining the perceptual transparency of the given speech frame. For performance evaluation, we report the ratio of perceptually transparent frames and the corresponding average bit rates for the conventional and proposed methods. By combining the proposed distance measure and the adaptive bit allocation scheme, the proposed system requires only 770 bps to obtain 95.5% perceptually transparent frames, while the conventional systems achieve 89.9% even at 1800 bps.

Design of the Noise Suppressor Using the Perceptual Model and Wavelet Packet Transform (인지 모델과 웨이블릿 패킷 변환을 이용한 잡음 제거기 설계)

  • Kim, Mi-Seon;Park, Seo-Young;Kim, Young-Ju;Lee, In-Sung
    • The Journal of the Acoustical Society of Korea / v.25 no.7 / pp.325-332 / 2006
  • In this paper, we propose a noise suppressor based on a perceptual model and the wavelet packet transform. The objective is to enhance speech corrupted by colored or non-stationary noise. If the corrupting noise is colored, a subband approach is more efficient than a full-band one. To avoid serious residual noise and speech distortion, the Wavelet Coefficient Threshold (WCT) must be adjusted. In this paper, the subbands are designed to match the critical bands, and the WCT is adapted according to the noise masking threshold (NMT) and the segmental signal-to-noise ratio (seg_SNR). Consequently, the proposed suppressor performs similarly to EVRC in PESQ-MOS, and it outperforms the wavelet packet transform with a universal threshold by about 0.289 in PESQ-MOS. Importantly, it is more useful than EVRC for coded speech, where its PESQ-MOS is about 0.23 higher than that of EVRC.
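The universal threshold used as the comparison baseline is usually defined as sigma * sqrt(2 ln N), with sigma estimated from the median absolute coefficient. The sketch below shows that baseline with soft thresholding on a list of coefficients; the paper's adaptive WCT is not reproduced, the abstract does not say whether the baseline uses soft or hard thresholding, and the function names are illustrative.

```python
import math
from statistics import median


def universal_threshold(detail_coeffs):
    """Universal threshold sigma * sqrt(2 ln N), with sigma = median(|c|) / 0.6745."""
    if len(detail_coeffs) < 2:
        raise ValueError("need at least two coefficients")
    sigma = median(abs(c) for c in detail_coeffs) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(len(detail_coeffs)))


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`; small ones become zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


coeffs = [3.0, -3.0, 0.5]  # hypothetical wavelet packet coefficients
print(soft_threshold(coeffs, 1.0))  # [2.0, -2.0, 0.0]
```

A single global threshold of this kind ignores both the critical-band structure and masking, which is the gap an NMT- and seg_SNR-adapted threshold is meant to close.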