• Title/Summary/Keyword: Spatial Audio

A Study on Vocal Removal Scheme of SAOC Using Harmonic Information (하모닉 정보를 이용한 SAOC의 보컬 신호 제거 방법에 관한 연구)

  • Park, Ji-Hoon;Jang, Dae-Geun;Hahn, Min-Soo
    • Journal of Korea Multimedia Society / v.16 no.10 / pp.1171-1179 / 2013
  • Interactive audio services provide audio generation and editing functionality according to the user's preference. Spatial audio object coding (SAOC) is an audio coding technology that can support interactive audio services at a relatively low bit rate. However, when the SAOC scheme removes a specific object, such as the vocal object signal for Karaoke mode, quality is poor because the removed vocal object remains in the SAOC-decoded background music. Thus, we propose a new SAOC vocal harmonic extraction and elimination technique to improve the background music quality in the Karaoke service. Namely, utilizing the harmonic information of the vocal object, we remove the harmonics of the vocal object remaining in the background music. As harmonic parameters, we utilize the pitch, maximum voiced frequency (MVF), and harmonic amplitude. To evaluate the performance of the proposed scheme, we perform objective and subjective evaluations. The experimental results confirm that the proposed scheme improves background music quality compared with the SAOC scheme.

A Study of the spatial perception by audio-visual information (시각과 청각에 의한 공간적 지각에 관한 연구)

  • Lee, Chai-Bong;Kang, Dae-Gee
    • Journal of the Institute of Convergence Signal Processing / v.11 no.2 / pp.132-136 / 2010
  • A psychophysical experiment was performed to investigate how audio-visual spatial disparity affects perceived space in peripheral vision. In the experiment, participants were exposed to a visual stimulus and a sound stimulus presented simultaneously from different directions. The visual stimuli were implemented by 7 white LEDs located at an equal distance at 7 different angles of $-70^{\circ}$, $-40^{\circ}$, $-20^{\circ}$, $0^{\circ}$, $20^{\circ}$, $40^{\circ}$, and $70^{\circ}$ measured from the front. The auditory stimuli were implemented by loudspeakers placed at 9 different directions spaced $5^{\circ}$ apart from $-20^{\circ}$ to $20^{\circ}$. Each participant then rated the spatial disparity between the visual and auditory stimuli on a 5-level scale, in which a higher level indicates a larger gap. When the visual stimulus is presented from the right, the results show that the response level gets higher for a larger angle between the visual and auditory stimuli. A similar tendency was also observed for the visual stimulus at $0^{\circ}$. On the other hand, when the visual stimulus is presented from the left, the response level gets lower for a larger angle.

Improved Phase Synthesis for Parametric Stereo Audio Coding (파라메트릭 스테레오 오디오 부호화를 위한 향상된 위상 합성 기법)

  • Hyun, Dong-Il;Park, Young-Cheol;Youn, Dae Hee
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.12 / pp.184-190 / 2013
  • Parametric stereo (PS) audio coding is a specific version of spatial audio coding. This paper addresses a problem caused by the conventional synthesis of phase differences. In the conventional upmix matrix, phase differences are synthesized not only on the downmix signal but also on the ambient signal, which violates the assumption that the ambient signals are anti-phased. The deterioration due to this phase synthesis is analyzed, especially for low interchannel correlation. To solve this problem, a new upmix matrix is proposed that synthesizes phase differences only on the downmix signal. The performance of the proposed upmix matrix is verified by subjective listening tests.

A DNN-Based Personalized HRTF Estimation Method for 3D Immersive Audio

  • Son, Ji Su;Choi, Seung Ho
    • International Journal of Internet, Broadcasting and Communication / v.13 no.1 / pp.161-167 / 2021
  • This paper proposes a new personalized HRTF estimation method that is based on a deep neural network (DNN) model and improves elevation reproduction using a notch filter. In a previous study, a DNN model was proposed that estimates the magnitude of the HRTF from anthropometric measurements [1]. However, since that method uses zero phase without estimating the phase, it causes internalization (i.e., inside-the-head localization) of sound when listening to spatial sound. We devise a method that estimates both the magnitude and phase of the HRTF based on the DNN model. A personalized HRIR was estimated using anthropometric measurements, including detailed data of the head, torso, shoulders, and ears, as inputs to the DNN model. The estimated HRIR was then filtered with an appropriate notch filter to improve elevation reproduction. To evaluate the performance, both objective and subjective evaluations are conducted. For the objective evaluation, the root mean square error (RMSE) and the log spectral distance (LSD) between the reference HRTF and the estimated HRTF are measured. For the subjective evaluation, a MUSHRA test and a preference test are conducted. The results indicate that the proposed method lets listeners experience more immersive audio than the previous methods.
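
The two objective measures named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not give its formulas, so the LSD here uses one common definition (the RMS over frequency bins of the difference in dB magnitude), and the function names, the `floor` parameter, and the toy magnitudes are hypothetical.

```python
import math


def rmse(reference, estimate):
    """Root mean square error between two equal-length sequences."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))


def log_spectral_distance(ref_mag, est_mag, floor=1e-12):
    """LSD in dB: RMS over frequency bins of the log-magnitude difference.

    Assumed definition; `floor` (hypothetical) guards against log of zero.
    """
    if len(ref_mag) != len(est_mag) or len(ref_mag) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs_db = [
        20.0 * math.log10(max(r, floor) / max(e, floor))
        for r, e in zip(ref_mag, est_mag)
    ]
    return math.sqrt(sum(d * d for d in diffs_db) / len(diffs_db))


# Toy magnitude responses (made-up values, not data from the paper).
reference = [1.0, 0.8, 0.5, 0.2]
estimate = [0.9, 0.8, 0.6, 0.2]
print("RMSE:", rmse(reference, estimate))
print("LSD (dB):", log_spectral_distance(reference, estimate))
```

Both measures are zero for a perfect estimate. Because the LSD works on log magnitudes, it weights errors in spectral notches more heavily than the linear RMSE does.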

Interactive Spatial Augmented Reality Book on Cultural Heritage of Myanmar

  • Hta, Aye Chan Zay;Lee, Yunli
    • Journal of information and communication convergence engineering / v.18 no.2 / pp.69-74 / 2020
  • Myanmar, also known as Burma, has a rich cultural heritage, and its historical tourist attractions are well known around the world. Therefore, we designed and developed an interactive spatial augmented reality (iSAR) book on the cultural heritage of Myanmar. This iSAR book has a total of 18 pages with rich media content, including videos, animations, audio, and images featuring the cultural heritage of Myanmar in a digital format. In addition to virtual content, navigational features such as virtual buttons and touch-based hand gestures were implemented using Leap Motion and VVVV. The developed iSAR book therefore allows virtual content and navigational features to merge seamlessly into a physical book. Five participants were recruited to evaluate the prototype iSAR book, and interviews were conducted to gather their feedback on its immersive qualities. The developed iSAR book on Myanmar effectively shares the cultural heritage of Myanmar and ultimately allows users to explore and gain more insight into the country.

Personal monitor & TV audio system by using speaker array (스피커 어레이를 이용한 개인용 모니터와 TV 오디오 시스템)

  • Lee, Chan-Hui;Chang, Ji-Ho;Park, Jin-Young;Kim, Yang-Hann
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2007.11a / pp.638-643 / 2007
  • With the development of high-quality TV and monitor displays, personal audio systems are attracting great interest. In this study, we applied a method that creates a bright zone around the user and a dark zone elsewhere by maximizing the ratio of sound energy between the bright and dark zones. We attempted to use a line speaker array to localize sound in the listening zone. The result depends on the size of the zone and on array parameters such as array size, speaker spacing, and the wavelength of the sound.

Improved Channel Level Difference Quantization for Spatial Audio Coding

  • Kim, Kwang-Ki;Beack, Seung-Kwon;Seo, Jeong-Il;Jang, Dae-Young;Hahn, Min-Soo
    • ETRI Journal / v.29 no.1 / pp.99-102 / 2007
  • The channel level difference (CLD) is a main parameter in the reference model 0 (RM0) for MPEG Surround. Nevertheless, the CLD quantization method in the RM0 has problems such as the lack of theoretical background and inappropriate quantization levels. In this letter, a new CLD quantization method is proposed based on the virtual source location information which has strength in the quantization process. From experimental results, it is confirmed that the proposed scheme greatly reduces the quantization distortions measured in dB and degrees without any additional complexity.
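
To make the idea of quantizing the CLD through a source location concrete, here is a hedged sketch. It is not the method of the letter: it assumes a constant-power panning law (gains cos θ and sin θ) to map a CLD to a virtual source angle, and the 3-degree step and all function names are hypothetical choices made for illustration.

```python
import math


def cld_db(ch1, ch2, floor=1e-12):
    """Channel level difference in dB: 10*log10 of the energy ratio."""
    p1 = sum(x * x for x in ch1)
    p2 = sum(x * x for x in ch2)
    return 10.0 * math.log10((p1 + floor) / (p2 + floor))


def cld_to_angle_deg(cld):
    """Map a CLD to a panning angle in (0, 90) degrees.

    Assumes constant-power panning: g1 = cos(theta), g2 = sin(theta),
    so CLD = 20*log10(cos(theta) / sin(theta)).
    """
    return math.degrees(math.atan(10.0 ** (-cld / 20.0)))


def angle_deg_to_cld(angle_deg):
    """Inverse of cld_to_angle_deg under the same panning assumption."""
    theta = math.radians(angle_deg)
    return 20.0 * math.log10(math.cos(theta) / math.sin(theta))


def quantize_cld_via_angle(cld, step_deg=3.0):
    """Quantize uniformly in the angle domain, then map back to dB."""
    angle = cld_to_angle_deg(cld)
    quantized = round(angle / step_deg) * step_deg
    # Keep away from 0 and 90 degrees, where the CLD would be infinite.
    quantized = min(max(quantized, step_deg), 90.0 - step_deg)
    return angle_deg_to_cld(quantized)


for cld in (-20.0, -10.0, 0.0, 10.0, 20.0):
    print(cld, "->", quantize_cld_via_angle(cld))
```

Under this assumption, uniform steps in angle become non-uniform steps in dB: fine near 0 dB, where the source sits between the channels, and coarse at large level differences.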

A research of UHD audio converting system based on a spatial audio coding (공간오디오 코딩기법을 사용한 UHD 오디오 변환 시스템에 대한 연구)

  • Cho, Choongsang;Lee, Yong Han;Kim, Jewoo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2015.07a / pp.425-428 / 2015
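  • This paper describes various multichannel audio formats and proposes a structure that makes UHD audio channel systems compatible with widely used audio systems such as stereo and 5.1-channel. Building on a structure that uses spatial audio coding to convert two channels into a one-channel audio signal plus spatial parameters, the proposed design includes a module that converts three channels into one channel and a system that converts four channels into one channel. Using these conversion modules, we design multichannel audio conversion systems that convert 22.2 channels to 10.2 channels and 10.2 channels to 5.1 channels. To test the designed conversion structure, 22.2-channel audio is converted to stereo plus spatial parameters, restored to 22.2 channels from the stereo signal and spatial parameters, and then compared channel by channel. As the experiments show, even though the signal is restored from stereo and spatial parameters, the resulting waveforms are very similar to the original.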

The Application of Digital Watermarking in Remote Sensing Image

  • Jin, Peidong;Qin, Xuwen
    • Proceedings of the KSRS Conference / 2003.11a / pp.1264-1267 / 2003
  • To protect digital images, video, and audio from unauthorized use, digital watermarking technology has received great attention in the multimedia field in recent years. This paper gives an overview of the development of watermarking techniques, followed by a discussion of potential applications of spatial-domain and transform-domain watermarking techniques to copyright protection and verification of remote sensing images in different processed forms.

Joint Channel Coding Based on Principal Component Analysis

  • Hyun, Dong-Il;Lee, Dong-Geum;Park, Young-Cheol;Youn, Dae-Hee;Seo, Jeong-Il
    • ETRI Journal / v.32 no.5 / pp.831-834 / 2010
  • This paper proposes a new joint channel coding algorithm based on principal component analysis. A conventional joint channel coder using passive downmixing suffers a reduction of both the primary-to-ambient energy ratio (PAR) of the downmix signal and the panning gain ratio of the primary source. The proposed system preserves the PAR of the downmix signal by using active downmixing that reflects spatial characteristics. The proposed system also improves the accuracy of the panning gain ratio estimation. Computer simulations and subjective listening tests verify the performance of the proposed system.
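
The contrast between passive and PCA-based active downmixing can be illustrated with a generic two-channel sketch. This is not the algorithm of the paper: it assumes zero-mean signals, estimates a single rotation angle over the whole signal rather than per band or frame, and the function names are hypothetical.

```python
import math


def passive_downmix(left, right):
    """Fixed-weight downmix that ignores where the source is panned."""
    return [0.5 * (l + r) for l, r in zip(left, right)]


def pca_downmix(left, right):
    """Rotate a stereo pair onto its principal axis.

    Returns (theta, primary, ambient), where theta is the rotation angle
    in radians. Uses raw second moments, so inputs are assumed zero-mean.
    """
    c11 = sum(l * l for l in left)
    c22 = sum(r * r for r in right)
    c12 = sum(l * r for l, r in zip(left, right))
    # Principal-axis angle of the 2x2 covariance [[c11, c12], [c12, c22]].
    theta = 0.5 * math.atan2(2.0 * c12, c11 - c22)
    c, s = math.cos(theta), math.sin(theta)
    primary = [c * l + s * r for l, r in zip(left, right)]
    ambient = [-s * l + c * r for l, r in zip(left, right)]
    return theta, primary, ambient


# Toy example: one source panned so the right channel is half the left.
source = [math.sin(0.1 * n) for n in range(200)]
left = list(source)
right = [0.5 * x for x in source]

theta, primary, ambient = pca_downmix(left, right)
print("estimated panning gain ratio:", math.tan(theta))
print("ambient energy:", sum(a * a for a in ambient))
```

For a single fully correlated source, the rotation collects all of the energy in the primary signal and the tangent of the angle recovers the panning gain ratio. The passive downmix uses the same weights regardless of how the source is panned.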