• Title/Summary/Keyword: Audio Effect


A Study on Immersive Audio Improvement of FTV using an effective noise (유효 잡음을 활용한 FTV 입체음향 개선방안 연구)

  • Kim, Jong-Un;Cho, Hyun-Seok;Lee, Yoon-Bae;Yeo, Sung-Dae;Kim, Seong-Kweon
    • The Journal of the Korea institute of electronic communication sciences / v.10 no.2 / pp.233-238 / 2015
  • In this paper, we propose an immersive audio effect method that uses effective noise to improve engagement in the free-viewpoint TV (FTV) service. On a basketball court, we monitored frequency spectra by acquiring continuous audio of the players and referee with shotgun and wireless microphones. By analyzing these spectra for the case in which users zoom in, we determined whether each frequency component is effective. Accordingly, when users of an FTV service zoom in toward an object, we propose utilizing the seemingly unnecessary noise instead of removing it. This approach should be useful for implementing immersive audio in FTV.
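
The decision rule described in the abstract — analyze the spectrum of the zoomed-in audio and keep "noise" bands that carry useful energy instead of filtering them out — can be sketched as follows. The band limits and the `threshold` value are illustrative assumptions, not the paper's actual criteria:

```python
import numpy as np

def band_energy_ratio(segment, sr, band, n_fft=1024):
    """Share of spectral energy falling inside `band` = (f_lo, f_hi) in Hz."""
    spectrum = np.abs(np.fft.rfft(segment, n=n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    mask = (freqs >= band[0]) & (freqs < band[1])
    return spectrum[mask].sum() / spectrum.sum()

def is_effective_noise(segment, sr, band, threshold=0.1):
    """Keep (rather than remove) a noise band whose energy share exceeds
    `threshold` when the user zooms in on an object."""
    return band_energy_ratio(segment, sr, band) > threshold
```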

Audio-Visual Content Analysis Based Clustering for Unsupervised Debate Indexing (비교사 토론 인덱싱을 위한 시청각 콘텐츠 분석 기반 클러스터링)

  • Keum, Ji-Soo;Lee, Hyon-Soo
    • The Journal of the Acoustical Society of Korea / v.27 no.5 / pp.244-251 / 2008
  • In this research, we propose an unsupervised debate indexing method using audio and visual information. The proposed method combines clustering results from speech, obtained with the Bayesian Information Criterion (BIC), and from visual information, obtained with a distance function. Combining audio-visual information reduces the problems of using speech or visual information alone and enables effective content-based analysis. We performed various experiments to evaluate the proposed method on five types of debate data, varying the use of audio-visual information. The experimental results show that audio-visual integration outperforms the individual use of speech or visual information for debate indexing.
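
The BIC-based speech clustering step can be illustrated with the standard ΔBIC merge criterion for two Gaussian-modeled segments. This is a 1-D sketch; the paper's actual features and penalty weight λ are not specified here:

```python
import numpy as np

def delta_bic(x, y, lam=1.0):
    """Delta-BIC for merging two 1-D feature segments modeled as Gaussians.
    A positive value favors keeping them separate (e.g. a speaker change)."""
    z = np.concatenate([x, y])
    n1, n2, n = len(x), len(y), len(z)
    # likelihood gain of the two-model hypothesis over the merged model
    gain = 0.5 * (n * np.log(np.var(z))
                  - n1 * np.log(np.var(x))
                  - n2 * np.log(np.var(y)))
    penalty = lam * 0.5 * 2 * np.log(n)  # 2 free parameters (mean, variance)
    return gain - penalty
```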

Multichannel Audio Reproduction Technology based on 10.2ch for UHDTV (UHDTV를 위한 10.2 채널 기반 다채널 오디오 재현 기술)

  • Lee, Tae-Jin;Yoo, Jae-Hyoun;Seo, Jeong-Il;Kang, Kyeong-Ok;Kim, Whan-Woo
    • Journal of Broadcast Engineering / v.17 no.5 / pp.827-837 / 2012
  • As broadcasting environments rapidly go digital, user demand for next-generation broadcasting services that surpass the current HDTV service keeps growing. Next-generation broadcasting services are progressing from 2D to 3D, from HD to UHD, and from 5.1ch audio to more than 10ch audio for high-quality, realistic broadcasting. In this paper, we propose a 10.2ch-based multichannel audio reproduction system for UHDTV. The 10.2ch-based system adds two side loudspeakers to enhance surround sound localization, and two height loudspeakers plus one ceiling loudspeaker to enhance elevation localization. To evaluate the proposed system, we used the Auditory Process Model (APM) for an objective localization test and also conducted a subjective localization test. The objective/subjective results show that the proposed system performs statistically the same as a 22.2ch audio system and significantly better than a 5.1ch audio system.

A Study on the Extension of the Description Elements for Audio-visual Archives (시청각기록물의 기술요소 확장에 관한 연구)

  • Nam, Young-Joon;Moon, Jung-Hyun
    • Journal of the Korean BIBLIA Society for library and Information Science / v.21 no.4 / pp.67-80 / 2010
  • The output and usage of audio-visual materials have increased sharply as the information industry advances and diverse archives become available. However, audio-visual archives are often regarded as merely supplementary records of collateral value, and the organizations that hold these materials have very weak systems in areas such as categorization and archiving methods. Moreover, because the management system varies among organizations, users have difficulty retrieving and utilizing audio-visual materials. This study therefore examined the feasibility of synchronized management of audio-visual archives by comparing the descriptive elements used for such archives in major domestic agencies. On that basis, it assesses the feasibility of synchronizing each organization's metadata elements and proposes improvements to the descriptive metadata elements for efficient management, retrieval, and service of AV materials.

Analysis and Synthesis of Audio Signals using a Sinusoidal Model with Psychoacoustic Criteria (정현파 모델을 이용한 오디오 신호의 심리음향적 분석 및 합성)

  • 남승현;강경옥;홍진우
    • The Journal of the Acoustical Society of Korea / v.18 no.2 / pp.77-82 / 1999
  • A sinusoidal model has been widely used in the analysis and synthesis of speech and audio signals, and has become one of the efficient candidates for high-quality, low-bit-rate audio coders. One of the crucial steps in analysis and synthesis with a sinusoidal model is the detection of tonal components. This paper proposes an efficient method for the analysis and synthesis of audio signals using a sinusoidal model with psychoacoustic criteria such as the masking effect, the masking index, and the JNDf (Just Noticeable Difference in Frequency). Simulation results show that the proposed method reduces the number of sinusoids significantly without degrading the quality of the synthesized audio signals.
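
The tonal-component detection step can be sketched as spectral peak picking with a relative level threshold standing in for the psychoacoustic criteria. The threshold value and frame setup below are illustrative, not the paper's settings:

```python
import numpy as np

def pick_sinusoids(frame, sr, n_fft=2048, floor_db=-30.0):
    """Detect tonal peaks in one frame: local maxima of the magnitude
    spectrum that rise above a threshold relative to the strongest peak
    (a crude stand-in for masking-based selection). Returns frequencies
    in Hz."""
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n=n_fft))
    mag_db = 20 * np.log10(mag + 1e-12)
    ref = mag_db.max()
    peaks = []
    for k in range(1, len(mag_db) - 1):
        if mag_db[k] > mag_db[k - 1] and mag_db[k] > mag_db[k + 1] \
           and mag_db[k] > ref + floor_db:
            peaks.append(k * sr / n_fft)   # bin index -> frequency
    return peaks
```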


Implementation of the High-Quality Audio System with the Separately Processed Musical Instrument Channels (악기별 분리처리를 통한 고음질 오디오 시스템 구현)

  • Kim, Tae-Hoon;Lee, Sang-Hak;Kim, Dae-Kyung;Lee, Sang-Chan
    • The Journal of the Acoustical Society of Korea / v.32 no.4 / pp.346-353 / 2013
  • This paper deals with the implementation of a high-quality audio system for karaoke. To improve key/tempo change performance, we separated the audio into multiple musical instrument channels. With the instrument channels separated, high-quality key/tempo changes can be achieved, which we confirmed using the cross-correlation distribution and a MOS evaluation. The improved audio system was implemented on a TMS320C6747 DSP with fixed/floating-point operations. The implemented system can perform multi-channel WMA decoding, MP3 encoding/decoding, wav playback, EQ, and key/tempo changes in real time. The WMA channels are used for processing the separated instrument channels; the MP3 encoding/decoding functions support playback and recording, and the wav channel is used for effect sounds.
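
The per-instrument processing idea — change the key on each separated channel independently, then remix — can be sketched as follows. The naive resampling shown here shifts tempo along with key and is only meant to illustrate the per-channel structure; the actual system performs proper key/tempo changes on the DSP:

```python
import numpy as np

def pitch_shift(channel, semitones):
    """Naive key change by resampling (also changes duration; shown only
    to illustrate per-channel processing, not the paper's algorithm)."""
    ratio = 2.0 ** (semitones / 12.0)
    idx = np.arange(0, len(channel) - 1, ratio)
    return np.interp(idx, np.arange(len(channel)), channel)

def render_mix(channels, semitones):
    """Process each separated instrument channel independently, then mix,
    so processing artifacts stay confined to a single instrument."""
    shifted = [pitch_shift(c, semitones) for c in channels]
    n = min(len(s) for s in shifted)
    return sum(s[:n] for s in shifted) / len(shifted)
```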

Comparison of environmental sound classification performance of convolutional neural networks according to audio preprocessing methods (오디오 전처리 방법에 따른 콘벌루션 신경망의 환경음 분류 성능 비교)

  • Oh, Wongeun
    • The Journal of the Acoustical Society of Korea / v.39 no.3 / pp.143-149 / 2020
  • This paper presents the effect of the feature extraction methods used in audio preprocessing on the classification performance of Convolutional Neural Networks (CNNs). We extract the mel spectrogram, log mel spectrogram, Mel-Frequency Cepstral Coefficients (MFCC), and delta MFCC from the UrbanSound8K dataset, which is widely used in environmental sound classification studies, and then scale the data to three distributions. Using these data, we test four CNNs as well as the VGG16 and MobileNetV2 networks to assess performance according to the audio features and scaling. The highest recognition rate is achieved when the unscaled log mel spectrogram is used as the audio feature. Although this result may not hold for all audio recognition problems, it is useful for classifying the environmental sounds in UrbanSound8K.
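
The winning feature, the unscaled log mel spectrogram, can be computed from scratch roughly as follows; the frame and filterbank parameters here are illustrative, not the paper's settings:

```python
import numpy as np

def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)

def log_mel_spectrogram(y, sr, n_fft=1024, hop=512, n_mels=40):
    """Log mel spectrogram of signal y: STFT power spectrogram passed
    through a triangular mel filterbank, then log-compressed."""
    frames = [y[i:i + n_fft] * np.hanning(n_fft)
              for i in range(0, len(y) - n_fft + 1, hop)]
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2        # (T, n_fft//2+1)
    # triangular filters centered on points equally spaced on the mel scale
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fb[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising edge
        fb[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling edge
    return np.log(power @ fb.T + 1e-10)                     # (T, n_mels)
```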

A Blind Audio Watermarking using the Tonal Characteristic (토널 특성을 이용한 브라인드 오디오 워터마킹)

  • 이희숙;이우선
    • Journal of Korea Multimedia Society / v.6 no.5 / pp.816-823 / 2003
  • In this paper, we propose a blind audio watermarking method that uses the tonal characteristic. First, we review the perceptual effect of tonal components in existing research and show experimentally that the tonal characteristic is more stable against several kinds of signal processing than other characteristics used in previous watermarking studies. Based on this result, we propose blind audio watermarking that uses the relation among the frequency-domain components composing a tonal masker. To evaluate the sound quality of the watermarked audio, we used the SDG (Subjective Diff-Grades) and obtained an average SDG of 0.27, which shows that watermarking based on the perceptual effect of tonal components is viable from the viewpoint of imperceptibility. We then detected the watermark bits from watermarked audio altered by several kinds of signal processing; the detection ratios, excluding time-shift processing, were over 98%. For time-shift processing, we applied a new method that searches for the most appropriate position in the time domain and then detects the watermark bits, achieving a detection ratio of 90%.
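
The frequency-domain embedding idea — hide a bit in the relation among the components that make up a tonal masker — can be sketched with a simple magnitude-order rule. The bin indices `k1`/`k2` and the `strength` factor are illustrative assumptions, not the paper's actual embedding rule:

```python
import numpy as np

def embed_bit(frame, bit, k1=20, k2=21, strength=1.5):
    """Embed one bit by forcing an order relation between the magnitudes
    of two neighboring frequency bins, preserving their phases."""
    spec = np.fft.rfft(frame)
    a, b = np.abs(spec[k1]), np.abs(spec[k2])
    avg = (a + b) / 2 + 1e-9
    if bit:                       # bit 1 -> make |X[k1]| dominate
        spec[k1] *= strength * avg / (a + 1e-9)
        spec[k2] *= avg / strength / (b + 1e-9)
    else:                         # bit 0 -> make |X[k2]| dominate
        spec[k1] *= avg / strength / (a + 1e-9)
        spec[k2] *= strength * avg / (b + 1e-9)
    return np.fft.irfft(spec, n=len(frame))

def detect_bit(frame, k1=20, k2=21):
    """Blind detection: no original signal needed, only the bin relation."""
    spec = np.fft.rfft(frame)
    return int(np.abs(spec[k1]) > np.abs(spec[k2]))
```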


A Perceptual Audio Coder Based on Temporal-Spectral Structure (시간-주파수 구조에 근거한 지각적 오디오 부호화기)

  • 김기수;서호선;이준용;윤대희
    • Journal of Broadcast Engineering / v.1 no.1 / pp.67-73 / 1996
  • In general, high-quality audio coding (HQAC) combines conventional data compression techniques with models of human perception. The primary auditory characteristic applied in HQAC is the masking effect in the spectral domain, so spectral techniques such as subband coding or transform coding are widely used [1][2]. However, no effort has yet been made to apply the temporal masking effect and temporal redundancy removal in HQAC. The audio data compression method proposed in this paper eliminates statistical and perceptual redundancies in both the temporal and spectral domains. The transformed audio signal is divided into packets consisting of 6 frames. A packet contains 1536 samples (256 × 6), and its redundancies reside in both the temporal and spectral domains; both are eliminated at the same time in each packet. The psychoacoustic model has been improved to give more precise results by taking into account temporal masking as well as fine spectral masking. For quantization, each packet is divided into subblocks designed to approximate the nonlinear critical bands and to reflect temporal auditory characteristics. Consequently, high quality of the reconstructed audio is preserved at low bit rates.


A Real Time 6 DoF Spatial Audio Rendering System based on MPEG-I AEP (MPEG-I AEP 기반 실시간 6 자유도 공간음향 렌더링 시스템)

  • Kyeongok Kang;Jae-hyoun Yoo;Daeyoung Jang;Yong Ju Lee;Taejin Lee
    • Journal of Broadcast Engineering / v.28 no.2 / pp.213-229 / 2023
  • In this paper, we introduce a spatial audio rendering system that provides 6DoF spatial sound in real time in response to the movement of a listener located in a virtual environment. The system was implemented using MPEG-I AEP as the development environment for the CfP response of MPEG-I Immersive Audio, and it consists of an encoder and a renderer that includes a decoder. The encoder encodes, offline, metadata such as the spatial audio parameters of the virtual scene described in the EIF and the directivity information of each sound source provided in SOFA files, and delivers them in the bitstream. The renderer receives the transmitted bitstream and performs 6DoF spatial audio rendering in real time according to the listener's position. The main spatial audio processing technologies applied in the rendering system include the sound source effect and the obstacle effect; others applied for system processing include the Doppler effect, the sound field effect, etc. The results of a self-conducted subjective evaluation of the developed system are presented.