• Title/Summary/Keyword: perceptual audio

Modified Generic Mode Coding Scheme for Enhanced Sound Quality of G.718 SWB (G.718 초광대역 코덱의 음질 향상을 위한 개선된 Generic Mode Coding 방법)

  • Cho, Keun-Seok;Jeong, Sang-Bae
    • Phonetics and Speech Sciences / v.4 no.3 / pp.119-125 / 2012
  • This paper describes a new algorithm for encoding spectral shape and envelope in the generic mode of G.718 super-wideband (SWB). In the G.718 SWB coder, generic mode coding and sinusoidal enhancement are used to quantize modified discrete cosine transform (MDCT)-based parameters in the high-frequency band. In the generic mode, the high-frequency band is divided into sub-bands, and for every sub-band the closest match under the selected similarity criterion is searched for in the coded, envelope-normalized wideband content. To improve quantization in the high-frequency region of speech/audio signals, a modified version of the G.718 SWB generic mode is proposed. The proposed generic mode uses perceptual vector quantization of spectral envelopes and an increased resolution for the spectral copy. The performance of the proposed algorithm is evaluated in terms of objective quality. Experimental results show that the proposed algorithm significantly improves sound quality.
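
The sub-band matching step described in this abstract can be illustrated with a minimal Python sketch. It is not the G.718 procedure or the authors' method: the function names, the band size, the RMS-based envelope normalization, and the normalized-correlation similarity measure are all assumptions chosen for illustration, and `step` is a hypothetical stand-in for the search resolution of the spectral copy.

```python
import numpy as np

def normalize_envelope(spectrum, band_size=8):
    """Divide each band by its RMS so that only the spectral shape remains.
    band_size is an illustrative value, not taken from the standard."""
    spectrum = np.asarray(spectrum, dtype=float)
    out = np.zeros_like(spectrum)
    for start in range(0, len(spectrum), band_size):
        band = spectrum[start:start + band_size]
        rms = np.sqrt(np.mean(band ** 2))
        out[start:start + band_size] = band / rms if rms > 0 else band
    return out

def best_match_lag(low_band_norm, target, step=1):
    """Search the envelope-normalized low band for the segment most similar
    to one high-band sub-band, using normalized correlation (an assumption).
    A smaller step gives a finer search resolution."""
    target = np.asarray(target, dtype=float)
    n = len(target)
    best_lag, best_score = 0, -np.inf
    for lag in range(0, len(low_band_norm) - n + 1, step):
        cand = low_band_norm[lag:lag + n]
        denom = np.linalg.norm(cand) * np.linalg.norm(target)
        score = float(np.dot(cand, target) / denom) if denom > 0 else -np.inf
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score
```

In a coder built along these lines, only the chosen lag and a gain per sub-band would need to be transmitted, since the decoder already holds the coded low band.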

Audio Quality Enhancement using Perceptual Property at a Low-bitrate Compression (지각적 특성을 이용한 저 비트오율 압축 오디오 음질개선)

  • Cha Hyuk-Geun;Chae Byoung-Koog;Cha Hyung-Tai
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.275-278 / 2004
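  • This paper proposes an algorithm that uses human perceptual characteristics to improve sound quality degraded by the signal distortion that arises in low-bitrate compression. Sound quality is improved by regenerating the high-frequency signal lost during low-bitrate compression from information in the preserved band, without using any side information. Tonal and non-tonal components are detected in the preserved band, and the corresponding harmonic components are scaled by auditory excitation energy and added to the lost band as a new signal. The improvement was confirmed by measuring the signal-to-noise ratio of, and running listening tests on, the original signal, the signal distorted by low-bitrate compression, and the signal enhanced by the proposed algorithm.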

A Study on the Acoustic Characteristics of Sexy Voice (섹시한 음성의 음향학적 특징 연구)

  • Jeong Ok-Ran;Jo Sung-Mi
    • MALSORI / no.57 / pp.73-84 / 2006
  • The purpose of this study was to explore the acoustic characteristics of sexy voice. We measured acoustic parameters (fundamental frequency, jitter, shimmer, and nasalance) of a sustained vowel produced by 40 actors (20 males and 20 females) and 40 non-actors (20 males and 20 females). Digital audio recordings of the sustained vowel /a/ were made for acoustic analysis using Praat (version 4.1.9) and Nasal View (version 4.5). Twenty voice pathologists participated in the listening experiment and judged the degree of sexiness on a 7-point scale. The results showed that fundamental frequency, shimmer, and nasalance differed significantly between actors and non-actors. The acoustic parameters of sexy voice matched the perceptual aspects reported in a previous study: low fundamental frequency corresponds to low pitch, and high shimmer to a husky voice. On the other hand, the nasalance score did not match that of the previous study: decreased nasalance received higher scores on the sexiness scale judged by the listeners. It would be desirable to study voice quality by analyzing and controlling more acoustic and auditory parameters for practical applications in the future.

Analysis of Perceptual, Cognitive, and Motoral Characteristics and their Effects on Driving Performance (운전자의 운전수행과 관련된 지각적, 인지적 특성분석 및 그 특성이 운전에 미치는 영향분석)

  • 유완석;손정현;김광석;이재식
    • Transactions of the Korean Society of Automotive Engineers / v.7 no.6 / pp.222-230 / 1999
  • A fixed-type driving simulator is constructed with a car body, beam projector, operating software, driving scenarios, and audio equipment. With the simulator, the cognitive effects of fatigue due to two hours of continuous driving on a straight road are investigated. The effects of alcohol on driving performance are also studied. Braking operation and lane-keeping performance under fatigue and alcohol are investigated. Changes in vehicle motion due to these effects are verified by computer simulation.

Multi Mode Harmonic Transform Coding for Speech and Music

  • Kim, Jonghark;Shin, Jae-Hyun;Lee, Insung
    • The Journal of the Acoustical Society of Korea / v.22 no.3E / pp.101-109 / 2003
  • A multi-mode harmonic transform coding (MMHTC) scheme for speech and music signals is proposed. It is structured as a linear prediction model driven by harmonic and transform-based excitation. The proposed coder also uses harmonic prediction and an improved quantizer for the excitation signal. To quantize the excitation of music signals efficiently, the modulated lapped transform (MLT) is introduced. The coder thus combines time-domain (linear prediction) and frequency-domain techniques to achieve the best perceptual quality. At a bit rate of 4 kbps, the proposed coder showed better speech quality than the 8 kbps QCELP coder.

Design of Video Quality Assurance and Integrated Quality Management System using No Reference QoE (비 참조 QoE를 이용한 영상품질 측정 및 통합품질 관리 시스템의 설계)

  • Kim, Sang-Soo;Park, Dong-Soo
    • The Journal of Information Technology / v.12 no.3 / pp.49-57 / 2009
  • This paper provides perceptual metrics for video quality based on properties of the human visual system, and for audio quality based on human audition. All metrics work without reference signals, allowing non-intrusive, in-service measurements. A simple and easy-to-learn user interface displays the metrics and saves them in popular file formats such as CSV. The proposed method enabled varied and accurate measurement of video quality for multimedia services. It can therefore be applied to establishing service guidelines and to the measurement methods and systems needed to standardize high-quality video services.

Perception of Korean stops with a three-way laryngeal contrast

  • Kong, Eun-Jong
    • Phonetics and Speech Sciences / v.4 no.1 / pp.13-20 / 2012
  • A lax stop in Korean, one of the three laryngeal contrastive stops, has undergone a sound change in terms of its acoustic properties. Prior production studies described this recent lax stop as being differentiated from tense and aspirated stops primarily by fundamental frequency (f0). In addition, voice onset time (VOT) further separates tense stops from lax and aspirated stops. The current research explores how these two major acoustic parameters, f0 and VOT, cue the three stop categories in Korean adult listeners' perception. Thirty-one native speakers of Korean participated in two experimental tasks: categorization judgment and within-category goodness ratings. Two sets of audio stimuli were prepared by synthesizing English and Korean male speakers' CV productions. The findings showed that while f0 cues listeners to lax stops as production patterns would predict, VOT was also closely related to listeners' categorization and goodness ratings of lax stops. This suggests that accurate characterizations of the recent lax stop category need to be based on Korean speakers' perceptual behavior as well as production patterns.

Digital Color Image Watermarking for HVS(Human Visual System) using Daubechies wavelet

  • Park, Jong-Tae;Rhee, Kang-Hyeon
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.7 / pp.1488-1492 / 2004
  • Digital signals have replaced analog signals in nearly every field of multimedia, including still images, animation, and audio, owing to the widespread adoption of computers and the rapid development of computer networks. Consumers of information enjoy an abundance of information because digital data is very easy to reproduce without loss. For the same reason, however, it is very hard for producers of information to protect their copyright, since copies match the original in quality. In this paper, a watermarking technique is proposed that inserts an RGB color watermark into a color image using the visual characteristics of wavelet coefficients. The PSNR of the watermarked image varied with the perceptual parameter, but about 32 dB was obtained overall.
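
For readers unfamiliar with the PSNR figure quoted in this abstract, the following is a minimal sketch of the standard definition, PSNR = 10 log10(peak^2 / MSE). The function name and the 8-bit peak value of 255 are assumptions; the abstract does not state how the authors computed the metric or how they handled the three color channels.

```python
import numpy as np

def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.
    peak=255 assumes 8-bit samples (an assumption, not from the abstract)."""
    original = np.asarray(original, dtype=np.float64)
    distorted = np.asarray(distorted, dtype=np.float64)
    mse = np.mean((original - distorted) ** 2)  # averaged over all pixels and channels
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Under this definition, a higher value means the watermarked image is closer to the original.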

Design of the TCX module transform coefficients quantizer in AMR-WB+ codec using PVQ (PVQ 방식을 이용한 AMR-WB+ 코덱의 TCX 모듈 변환계수 양자화기 설계)

  • Park, Sang-Kuk;Park, Jung-Eun;Kang, Sang-Won
    • Proceedings of the IEEK Conference / 2007.07a / pp.345-346 / 2007
  • In this paper, we propose a pyramid VQ (PVQ) to quantize the transform coefficients of the TCX module in order to improve the music performance of the AMR-WB+ codec. The proposed PVQ is compared with the $RE_8$ lattice VQ used in the standard AMR-WB+ codec. The 8-dimensional and 16-dimensional pyramid VQs improve mean squared error (MSE) by 4% and 5.7%, respectively, and Perceptual Evaluation of Audio Quality (PEAQ) by 3.3% and 4.7%, respectively.

A Study on Implementation of Objective Quality Assurance System for Mobile Multimedia Video (이동 멀티미디어 영상의 객관적인 품질측정 시스템 구현에 관한 연구)

  • Paek, Seung-Eun;Ohn, Jin-Ho;Joo, Hae-Jong;Hong, Bong-Wha;Kim, Eun-Won;Park, Young-Bae
    • Proceedings of the IEEK Conference / 2007.07a / pp.487-488 / 2007
  • This paper provides perceptual metrics for video quality based on properties of the human visual system, and for audio quality based on human audition. All metrics work without reference signals, allowing non-intrusive, in-service measurements. A simple and easy-to-learn user interface displays the metrics and saves them in popular file formats such as CSV.