• Title/Summary/Keyword: Audio Level


A Study on Real-Time Loudness Metering Algorithm for Digital Broadcasting (디지털 방송용 오디오 레벨 계측 알고리즘의 실시간화 연구)

  • Park Seong-Gyoon
    • The Journal of Korean Institute of Electromagnetic Engineering and Science
    • /
    • v.16 no.4 s.95
    • /
    • pp.427-437
    • /
    • 2005
  • This paper proposes a perceived audio level metering algorithm for digital audio that can operate in real time. Based on an analysis of the existing recommendation ITU-R BS.1387-1 for objective audio quality assessment, an FFT-based loudness metering algorithm is implemented, and a real-time version of that algorithm is devised and verified. The proposed method is based on look-up tables. To verify the proposed method, its performance and operation time are evaluated using 23 pure tones and 30 preselected digital audio samples. Compared with the original algorithm, its error is less than $2\;\%$ even when the look-up table for spectral spreading has a coarse level resolution of $10\;\mathrm{dB}$. The proposed algorithm takes only 1/21 of the original algorithm's measuring time. Within the proposed algorithm, the auditory pitch group energy calculation takes 1/450 of the original's time and the excitation calculation takes 1/3.57. In conclusion, the proposed algorithm is expected to be implementable in a DSP-based real-time loudness meter.
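The look-up-table idea in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function `spread_gain`, its slope formula, the 0 to 120 dB level range, and the band spacing are all hypothetical placeholders. Only the 10 dB table resolution comes from the abstract. The sketch shows how a level-dependent gain that would otherwise need a power evaluation per call can be precomputed on a coarse level grid and then read back by nearest-level lookup.

```python
LEVEL_STEP_DB = 10.0                                 # table resolution (from the abstract)
LEVELS_DB = [i * LEVEL_STEP_DB for i in range(13)]   # 0..120 dB, assumed range
MAX_BANDS = 8                                        # assumed number of neighbouring bands
BAND_WIDTH_BARK = 0.25                               # assumed band spacing


def spread_gain(level_db, band_offset):
    """Hypothetical level-dependent spreading gain as a linear power ratio."""
    slope_db_per_bark = -24.0 + 0.2 * level_db       # illustrative slope only
    return 10.0 ** (slope_db_per_bark * band_offset * BAND_WIDTH_BARK / 10.0)


# Precompute the gain once for every grid level and band offset.
TABLE = [[spread_gain(level, k) for k in range(MAX_BANDS)] for level in LEVELS_DB]


def spread_gain_lut(level_db, band_offset):
    """Approximate spread_gain by reading the nearest grid level from TABLE."""
    idx = int(round(level_db / LEVEL_STEP_DB))
    idx = max(0, min(idx, len(LEVELS_DB) - 1))       # clamp to the table range
    return TABLE[idx][band_offset]
```

At run time `spread_gain_lut(44.0, 2)` returns the value stored for 40 dB, so the cost per call is an index computation and a memory read. The approximation error depends on how quickly the underlying function changes between grid levels.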

Salience of Envelope Interaural Time Difference of High Frequency as Spatial Feature (공간감 인자로서의 고주파 대역 포락선 양이 시간차의 유효성)

  • Seo, Jeong-Hun;Chon, Sang-Bae;Sung, Koeng-Mo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.6
    • /
    • pp.381-387
    • /
    • 2010
  • Both timbral features and spatial features are important in the assessment of multichannel audio coding systems. Choi et al. proposed a prediction model that extends ITU-R Rec. BS.1387-1 to multichannel audio coding systems by using spatial features such as ITDDist (Interaural Time Difference Distortion), ILDDist (Interaural Level Difference Distortion), and IACCDist (InterAural Cross-correlation Coefficient Distortion). In that model, ITDDists were computed only for low frequency bands (below 1500 Hz), and ILDDists were computed only for high frequency bands (over 2500 Hz), in accordance with classical duplex theory. However, in the high frequency range, information in the temporal envelope is also important for spatial perception, especially for sound localization. This paper introduces a new model that computes the ITD distortions of temporal envelopes in high frequency components in order to investigate quantitatively the role of such ITDs in spatial perception. The computed ITD distortions of temporal envelopes in high frequency components were highly correlated with the perceived sound quality of multichannel audio.
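As a rough illustration of what an envelope interaural time difference is, the following Python sketch estimates the lag between the temporal envelopes of two channels. It is a generic textbook approach (Hilbert envelope followed by cross-correlation), not the model proposed in the paper. The function names, the Gaussian test burst, the 4 kHz carrier, and the 12-sample delay are all assumptions made for the example.

```python
import numpy as np


def hilbert_envelope(x):
    """Magnitude of the analytic signal, computed with an FFT."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                       # keep DC once
    if n % 2 == 0:
        weights[n // 2] = 1.0              # keep Nyquist once
        weights[1:n // 2] = 2.0            # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def envelope_itd_samples(left, right, max_lag):
    """Lag in samples by which the right envelope trails the left envelope."""
    env_l = hilbert_envelope(left)
    env_r = hilbert_envelope(right)
    n = len(env_l)
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        # score(lag) = sum over m of env_l[m] * env_r[m + lag]
        if lag >= 0:
            score = np.dot(env_l[:n - lag], env_r[lag:])
        else:
            score = np.dot(env_l[-lag:], env_r[:n + lag])
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Assumed test signal: a 4 kHz tone burst with a Gaussian envelope.
fs = 48000
n = np.arange(4800)
burst = np.exp(-0.5 * ((n - 2400) / 50.0) ** 2) * np.sin(2 * np.pi * 4000 * n / fs)
delayed = np.roll(burst, 12)               # right channel delayed by 12 samples
lag = envelope_itd_samples(burst, delayed, max_lag=48)
itd_seconds = lag / fs
```

The raw correlation used here is adequate for an isolated burst. For continuous signals, a normalized correlation per frequency band would be a more robust choice.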

Enhancement of SBR for Speech Signal Using Adaptive Noise Floor Level (가변 잡음 레벨을 이용한 음성신호에 대한 SBR 성능 향상 기술)

  • Lee, Se-Won;Oh, Seoung-Jun;Ahn, Chang-Beom;Lee, Tae-Jin;Kang, Kyoung-Ok;Park, Ho-Chong
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.2
    • /
    • pp.148-154
    • /
    • 2009
  • In audio coding, SBR technology synthesizes the high bands using patched time-frequency information from the low bands together with correction parameters. Since SBR transmits only the correction parameters for the high bands, it provides low-rate coding of the high bands and is used as a core module of MPEG-4 HE-AAC. SBR was originally designed for audio signals, and its performance tends to decrease for speech signals. The major reason is an excessive noise floor in the high bands, which is caused by incorrect tonality computation. In this paper, a new method that determines the noise floor level adaptively according to the speech characteristics is proposed in order to solve this problem of SBR for speech signals. The proposed method maintains compatibility with the standard SBR, and subjective performance evaluation shows that it improves SBR performance compared with the standard SBR, especially for male speech signals.

A Study of the spatial perception by audio-visual information (시각과 청각에 의한 공간적 지각에 관한 연구)

  • Lee, Chai-Bong;Kang, Dae-Gee
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.11 no.2
    • /
    • pp.132-136
    • /
    • 2010
  • A psychophysical experiment was performed to investigate how audio-visual spatial disparity affects perceptual space in peripheral vision. In the experiment, participants were exposed to a visual stimulus and an auditory stimulus that arrived simultaneously from different directions. The visual stimuli were implemented with 7 white LEDs located at an equal distance and at 7 different angles of $-70^{\circ}$, $-40^{\circ}$, $-20^{\circ}$, $0^{\circ}$, $20^{\circ}$, $40^{\circ}$, and $70^{\circ}$ from directly in front. The auditory stimuli were implemented with loudspeakers placed in 9 different directions, equally spaced by $5^{\circ}$ and ranging from $-20^{\circ}$ to $20^{\circ}$. Each participant then rated the spatial disparity between the visual and auditory stimuli on a 5-level response scale, in which a higher level indicates a larger gap. When the visual stimulus is presented from the right, the results show that the response level gets higher for a larger angle between the visual and auditory stimuli. A similar tendency was also observed for the visual stimulus at $0^{\circ}$. On the other hand, when the visual stimulus is presented from the left, the response level gets lower for a larger angle.

Content-based Music Information Retrieval using Pitch Histogram (Pitch 히스토그램을 이용한 내용기반 음악 정보 검색)

  • 박만수;박철의;김회린;강경옥
    • Journal of Broadcast Engineering
    • /
    • v.9 no.1
    • /
    • pp.2-7
    • /
    • 2004
  • In this paper, we propose a content-based music information retrieval technique using several MPEG-7 low-level descriptors. In particular, pitch information and timbral features can be applied to music genre classification, music retrieval, or QBH (Query By Humming) because they can model the stochastic pattern or timbral information of a music signal. In this work, we restrict the music domain to the O.S.T. of movies or soap operas for application to broadcasting systems. That is, when the user hears background music, the user can retrieve information about the unknown music using only an audio clip of a few seconds extracted from the video content. We propose an audio feature set organized by MPEG-7 descriptors and distance functions based on vector distance or ratio computation. We observed that the feature set organized by pitch information is superior to the timbral spectral feature set, and that IFCR (Intra-Feature Component Ratio) is better than ED (Euclidean Distance) as a vector distance function. To evaluate music recognition, k-NN is used as a classifier.

Auditory Model Design for Objective Audio Quality Measurement

  • Dongil Seo;Park, Se-Hyoung;Ryu, Seung-wan;Jaeho Shin
    • Proceedings of the IEEK Conference
    • /
    • 2002.07c
    • /
    • pp.1717-1720
    • /
    • 2002
  • Objective quality measurement schemes incorporate properties of the human auditory system. The basilar membrane (BM) acts as a spectrum analyzer, spatially decomposing the signal into frequency components. Each filterbank is an implementation of the ERB gammachirp function and consists of level-dependent asymmetric compensation filters. To validate the auditory model, we calculate the CPD. The quality measurement is obtained from the result.


The Content Based Analysis According to the Composition of the Feature Parameters for the Auditory Data (오디오 데이터의 특징 파라메터 구성에 따른 내용기반 분석)

  • 한학용;허강인;김수훈
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.2
    • /
    • pp.182-189
    • /
    • 2002
  • In this paper, we study content-based analysis and classification according to the composition of the feature parameter pool for auditory signals, with the aim of implementing an auditory indexing and searching system. Auditory data is classified into various primitive auditory types. We describe the analysis and feature extraction methods for the feature parameters available for auditory data classification. We compose the feature parameter pool in units of indexing groups, then compare and analyze the auditory data with respect to the inclusion level and indexing criteria of the audio categories. Based on this result, we compose a classification procedure and simulate auditory data classification.

Development of Auto Presentation System of Toolbook Using Object Auto Transition on Multimedia Authoring Tool (멀티미디어를 기반으로 하는 저작도구 툴북에서 객체 자동 변환을 이용한 자동 프리젠테이션 시스템 개발)

  • Yang, Ok-Yul;Jeong, Yeong-Sik;Lee, Yong-Ju
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.5
    • /
    • pp.1182-1195
    • /
    • 1997
  • When presenting information, we can use application programs built with multimedia-based authoring tools. In particular, many programmers have proposed ways to improve integration time, reduce programming time, and make the tools easier to use. However, multimedia-based authoring tools do not cover all programming methodologies and do not supply the special functions that users request. Therefore, effective functions have to be supplied through high-level programming languages. In this paper, we propose using small application programs through linking methods, which reduces the overhead of memory loading. In authoring tools, MCI (Media Control Interface) call functions can be used to play back audio files. We developed ATS (Auto Transition System), which provides several functions: closing audio files opened through MCI calls, getting object status, and page-to-page transition. We show that an optimal presentation configuration is obtained by the ATS algorithm.


A Human Sensibility Ergonomic Establishment of Customer-Satisfying Strategy for a Multimedia Telecommunication System (멀티미디어 통신시스템을 대상으로한 사용자 만족 전략의 감성공학적 수립)

  • Park, Min-Yong;Park, Hui-Seok
    • Journal of the Ergonomics Society of Korea
    • /
    • v.17 no.1
    • /
    • pp.23-36
    • /
    • 1998
  • The primary objective of this research was to establish and quantify the relationship between the physical degradation factors of a multimedia telecommunications (teleconferencing) system and subjective human perception. The research was performed in two stages. In the first stage, a field survey of real users and pilot experiments were carried out to determine customers' major complaints and the corresponding system degradation factors. A prototype teleconferencing simulator was developed in two separate sound-treated chambers equipped with audio/video equipment running under a custom-developed software program. In the second stage, simulation experiments using the semantic differential methodology were performed with 26 paid participants (14 college students and 12 housewives). The results indicated that audio/video synchronization and the frame rate were the main system factors for both subject groups, but the pattern of the factors' influence differed between the groups, implying that the system configuration should accommodate the characteristics of the end users. Also, a single quality index developed for system preference was found to be highly correlated with user satisfaction. The results provide fundamental data on human subjective perception of multimedia telecommunications quality and can further help establish quality standards to enhance the service level.
