• Title/Summary/Keyword: audio signal processing

An Efficient Guitar Chords Classification System Using Transfer Learning (전이학습을 이용한 효율적인 기타코드 분류 시스템)

  • Park, Sun Bae;Lee, Ho-Kyoung;Yoo, Do Sik
    • Journal of Korea Multimedia Society / v.21 no.10 / pp.1195-1202 / 2018
  • Artificial neural networks are widely used for their excellent performance and ease of implementation. However, a traditional neural network must be retrained from scratch whenever new input data are added, the observation environment varies, or the form of the input/output data changes. Transfer learning has been proposed to resolve this problem: it constructs a new target system by partially updating an existing system and hence provides a much more efficient learning process. Until now, transfer learning has been studied mainly in the field of image processing and is not yet widely employed in acoustic data processing. In this paper, focusing on the scalability of transfer learning, we apply the concept to the problem of guitar chord classification and evaluate its performance. For this purpose, we build a target system, a convolutional neural network (CNN) based classifier for 48 guitar chords, by applying transfer learning to a source system, a CNN based classifier for 24 guitar chords. We show that the system with transfer learning performs similarly to the conventional system but requires only half the learning time.

Watermarking Algorithm for Copyright Protection of Haegeum Sound Contents (해금 사운드 콘텐츠의 저작권 보호를 위한 워터마킹 알고리듬)

  • Hong, Yeon-Woo;Kang, Myeong-Su;Cho, Sang-Jin;Chong, Ui-Pil
    • Journal of the Institute of Convergence Signal Processing / v.10 no.4 / pp.214-219 / 2009
  • This paper proposes a watermarking algorithm that considers the frequency characteristics of Haegeum sounds for copyright protection of digital Haegeum sound contents. The harmonics of Haegeum sounds commonly have large magnitudes in the 1500Hz~2000Hz and 2800Hz~3500Hz bands, so those bands are selected for embedding the watermark. The proposed method computes the FFT (fast Fourier transform) of the original sound signal and embeds watermark bits generated by a PN (pseudo noise) sequence into the harmonics in the selected bands. The proposed method is robust to lowpass filtering, bandpass filtering, cropping, noise addition, and MP3 compression attacks; the maximum BER (bit error rate) is 1.41%, observed after the lowpass filter attack. To measure the quality of the watermarked sound, a subjective listening test, MUSHRA (multiple stimuli with hidden reference and anchor), was conducted. The mean MUSHRA score is greater than 98 and 96.67 for every Haegeum sound and for Korean classical music with Haegeum, respectively.

Analysis of Power Saving Factor for a DVS Based Multimedia Processor (DVS 기반 멀티미디어 프로세서의 전력절감율 분석)

  • Kim Byoung-Il;Chang Tae-Gyu
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.1 / pp.95-100 / 2005
  • This paper proposes a DVS method that effectively reduces the power consumption of a multimedia signal processor. The effective range of its power saving factor is derived analytically under the assumption that the frame-based computational burden of the multimedia processor follows a Gaussian distribution. A closed-form equation for the power saving factor is derived in terms of the mean-standard deviation of the distribution. An MPEG-2 video decoder algorithm and an AAC encoder algorithm are tested on an ARM9 RISC processor to verify the power saving of the proposed DVS approach experimentally. The experimental results with diverse MPEG-2 video and audio files show a power saving factor of 50~30%, in good agreement with the analytically derived values.
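
The relation between a Gaussian frame workload and the power saving factor can be illustrated with a small Monte Carlo sketch. This is not the paper's closed-form equation. It assumes a simplified model that the abstract does not state: the per-frame workload `w` is normalized to the worst case, supply voltage scales linearly with clock frequency, energy per cycle scales with the voltage squared, and an idle processor consumes nothing. The function name and parameters are hypothetical.

```python
import random

def power_saving_factor(mean, std, n_frames=100_000, seed=0):
    """Estimate 1 - E_dvs / E_fixed for a Gaussian per-frame workload.

    Assumed model (not from the paper): workload w in (0, 1] is the
    fraction of worst-case cycles; with DVS the clock and voltage are
    both scaled to w so the frame finishes exactly at its deadline.
    """
    rng = random.Random(seed)
    e_fixed = 0.0
    e_dvs = 0.0
    for _ in range(n_frames):
        # Clip the Gaussian sample to a physically meaningful range.
        w = min(1.0, max(0.01, rng.gauss(mean, std)))
        e_fixed += w          # w cycles at full voltage: energy w * 1^2
        e_dvs += w * w * w    # w cycles at voltage w:    energy w * w^2
    return 1.0 - e_dvs / e_fixed

print(power_saving_factor(0.6, 0.15))
```

Under these assumptions a constant workload of half the worst case saves 75% of the energy, and a wider spread of workloads lowers the saving because heavy frames dominate the cubic term.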

A Study on Center Speaker in Television Receiver with Sound Image Expansion (음상 확장 기능을 갖는 텔레비전 수상기에서 센터 스피커에 관한 연구)

  • 이상훈;김동수
    • Proceedings of the IEEK Conference / 1998.10a / pp.1231-1234 / 1998
  • Many signal processing methods for widening the sound image to create a spatial impression have been studied. The most typical methods rely on phase shifting and the precedence effect. However, these methods are not effective for the center sound image. As the listener moves from the center toward the side, the center sound image shifts toward the nearer speaker; that is, the directional localization of the center sound image is unstable. In this paper, we propose a television audio system that includes a center speaker, and we analyze the role of the center speaker using the theory of Makita and the precedence effect. In experiments, we confirm the usefulness of the center speaker for stabilizing the center sound image.

A Study on Visualization of Musical Rhythm Based on Music Information Retrieval (Music Information Retrieval(MIR)을 활용한 음악적 리듬의 시각화 연구 -Onset 검출(Onset Detection) 알고리즘에 의한 시각화 어플리케이션)

  • Che, Swann
    • 한국HCI학회:학술대회논문집 / 2009.02a / pp.1075-1080 / 2009
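  • This paper discusses how to automatically analyze the rhythm information of audio content using Music Information Retrieval (MIR) techniques and how to visualize it. In particular, by introducing a simple sound visualization application built on MIR, it aims to show that music information analysis can be used in diverse ways in design and the visual arts. Attempts to capture musical information in visual art began in earnest with the avant-garde painters of the early 20th century. Since the 1980s, rapid advances in computer technology have made it easy to handle sound and image together in the digital domain, and a variety of audiovisual artworks have appeared as a result. MIR is a DSP (Digital Signal Processing) technology that analyzes musical information from audio content, and research on it has recently become active along with the expansion of the digital content market. Various commercial applications are already in use, especially on the web and on mobile devices; music recognition applications such as query-by-humming are representative examples. This paper examines algorithms that analyze musical rhythm, centering on onset detection, and introduces an example application that visualizes the result according to basic principles of visual composition.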
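
Onset detection can be sketched in its simplest energy-based form as follows. The abstract does not say which onset detection function the application uses, so this is only one common variant, shown under stated assumptions: an onset is flagged when the short-time energy of a frame exceeds that of the previous frame by a fixed ratio. The function name, `frame_size`, `ratio`, and `floor` are all hypothetical.

```python
import math

def detect_onsets(samples, frame_size=256, ratio=1.5, floor=1e-4):
    """Return indices of frames whose energy jumps sharply.

    A frame is an onset if its mean energy exceeds `ratio` times the
    previous frame's energy plus a small `floor` (to ignore silence).
    """
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(sum(x * x for x in frame) / frame_size)
    onsets = []
    for i in range(1, len(energies)):
        if energies[i] > ratio * energies[i - 1] + floor:
            onsets.append(i)
    return onsets

# Synthetic test signal: silence, tone, silence, tone (1024 samples each).
tone = [math.sin(2 * math.pi * 8 * n / 256) for n in range(1024)]
signal = [0.0] * 1024 + tone + [0.0] * 1024 + tone
print(detect_onsets(signal))  # frame indices where each tone starts
```

A frame index converts to a time stamp as `index * frame_size / sample_rate`, and a visualization layer could then map each time stamp to a shape, color, or motion event.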

An Implementation of Real-Time Speaker Verification System on Telephone Voices Using DSP Board (DSP보드를 이용한 전화음성용 실시간 화자인증 시스템의 구현에 관한 연구)

  • Lee Hyeon Seung;Choi Hong Sub
    • MALSORI / no.49 / pp.145-158 / 2004
  • This paper aims to implement a real-time speaker verification system using a DSP board. The Dialog/4 board, which combines a microprocessor and a DSP processor, is selected because it makes it easy to control telephone signals and to process audio/voice signals. The speaker verification system performs signal processing and feature extraction after receiving a voice sample and the claimed ID. It then computes the likelihood ratio of the claimed speaker model to the background model and decides in real time whether to accept or reject the claim. For the verification experiments, a total of 15 speaker models and 6 background models are used. The experimental results show a verification accuracy of 99.5% when telephone speech-based speaker models are used.
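
The accept/reject decision based on a likelihood ratio can be sketched as below. The abstract does not specify the model type, so this example assumes, purely for illustration, that each model is a single diagonal-covariance Gaussian over the feature vectors; the function names, model tuples, and threshold are hypothetical.

```python
import math

def avg_log_likelihood(features, mean, var):
    """Average per-frame log-likelihood under a diagonal Gaussian."""
    total = 0.0
    for frame in features:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(features)

def verify(features, speaker_model, background_model, threshold=0.0):
    """Accept the identity claim if the log-likelihood ratio exceeds threshold.

    Each model is a (mean, variance) pair of equal-length lists.
    Returns (accepted, log_likelihood_ratio).
    """
    llr = (avg_log_likelihood(features, *speaker_model)
           - avg_log_likelihood(features, *background_model))
    return llr > threshold, llr

speaker = ([1.0, 1.0], [1.0, 1.0])      # toy claimed-speaker model
background = ([0.0, 0.0], [1.0, 1.0])   # toy background model
print(verify([[1.0, 1.0], [0.9, 1.1]], speaker, background))
print(verify([[0.0, 0.0]], speaker, background))
```

Raising the threshold trades false acceptances for false rejections, so in practice it would be tuned on held-out speech.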

Active Noise Control of the Plane Wave Travelling in a Duct Using Filtered-x LMS Algorithm (Filtered-x LMS 알고리즘을 응용한 덕트내 평면파 소음의 능동제어)

  • 우재학;김인수;이정권;김광준
    • Journal of KSNVE / v.2 no.2 / pp.107-116 / 1992
  • An adaptive signal processing technique is implemented for active cancellation of plane acoustic wave noise propagating in a duct. To avoid the instability caused by acoustic feedback from the control speaker to the detection microphone, the acoustic feedback plant is modeled off-line with an FIR filter. The auxiliary path required by the filtered-x LMS algorithm is modeled as well. Before the experiments, a simulation is carried out under the same conditions as the experiments. The simulation shows that the longer the adaptive filter, the better the results. Experiments are carried out in the lower audio frequency range (50 - 400Hz), and the results are in good agreement with those of the simulation study. As a result of this adaptive noise control, a reduction of around 50dB is obtained for pure tone noise, and a maximum noise reduction of 30dB is attained for bandlimited noise with a bandwidth of 316Hz.
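
The filtered-x LMS update can be sketched in a few lines. This is a minimal simulation under stated assumptions and does not reproduce the paper's duct setup: the primary path `p_path` and secondary (auxiliary) path `s_path` are made-up FIR filters, the secondary path estimate is taken to be perfect, and acoustic feedback to the reference microphone is ignored. All names and values are hypothetical.

```python
import math

def fir(coeffs, history):
    """Dot product of FIR coefficients with a newest-first sample history."""
    return sum(c * h for c, h in zip(coeffs, history))

def fxlms(x, d, s_path, s_hat, n_taps=8, mu=0.01):
    """Adapt controller weights w so that d[n] + (s_path * y)[n] tends to 0."""
    w = [0.0] * n_taps
    x_hist = [0.0] * max(n_taps, len(s_hat))  # reference signal history
    fx_hist = [0.0] * n_taps                  # filtered-reference history
    y_hist = [0.0] * len(s_path)              # controller output history
    errors = []
    for xn, dn in zip(x, d):
        x_hist = [xn] + x_hist[:-1]
        y = fir(w, x_hist)                    # anti-noise sent to the speaker
        y_hist = [y] + y_hist[:-1]
        e = dn + fir(s_path, y_hist)          # residual at the error microphone
        fx_hist = [fir(s_hat, x_hist)] + fx_hist[:-1]
        # LMS step using the reference filtered by the path estimate.
        w = [wi - mu * e * fxi for wi, fxi in zip(w, fx_hist)]
        errors.append(e)
    return w, errors

# Pure-tone demo with hypothetical paths.
n = 4000
x = [math.sin(2 * math.pi * 0.05 * i) for i in range(n)]
p_path = [0.0, 0.0, 0.8, 0.4]
s_path = [0.0, 0.9, 0.3]
d = [sum(p * x[i - k] for k, p in enumerate(p_path) if i - k >= 0)
     for i in range(n)]
w, errors = fxlms(x, d, s_path, s_hat=s_path)
before = sum(v * v for v in d[-200:]) / 200
after = sum(e * e for e in errors[-200:]) / 200
reduction_db = 10 * math.log10(before / max(after, 1e-30))
print(round(reduction_db, 1))
```

The reference is filtered through the path estimate before the weight update because the controller output reaches the error microphone only after passing through the secondary path; without that filtering, the gradient estimate would be misaligned in phase.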

A Study on Adaptive Information Hiding Technique for Copyright Protection of Digital Images (디지털 영상물의 저작권 보호를 위한 적응적 정보 은닉 기술에 관한 연구)

  • Park, Kang-Seo;Chung, Tae-Yun;Oh, Sang-Rok;Park, Sang-Hee
    • Proceedings of the KIEE Conference / 1998.07g / pp.2427-2429 / 1998
  • Digital watermarking is a technique that embeds an invisible signal into multimedia data such as audio, video, and images for copyright protection, including owner identification and copy control information. This paper proposes a new watermark embedding and extraction technique that extends the direct sequence spread spectrum technique. The proposed technique approximates the frequency component of pixels in the spatial domain using a Laplacian mask and embeds the watermark adaptively, taking the HVS into account, to reduce degradation of the image. In the watermark extraction process, the proposed technique strengthens the high frequency components of the image and extracts the watermark by demodulation. All these processes are performed in the spatial domain to reduce the processing time.

A Study on 2-Dimensional Sound Source Tracking System (2차원적 음원추적에 관한 연구)

  • 문성배;전승환
    • Journal of the Korean Institute of Navigation / v.20 no.4 / pp.71-79 / 1996
  • When navigating in or near an area of restricted visibility, it is at times necessary to hear the whistle, bell, and/or siren of lighthouses or ships. Even though we can obtain rough information about the properties of the sound and the direction and range of the sound source, this is not accurate enough for decision making. The audible frequency range is generally taken to be 16~20,000Hz, but the audible distance is shortened and sounds are harder to discriminate when noise is present. The sound pressure level of human speech at a distance of 1 meter is 60dB. Typical noise levels are 40dB in a silent room and 60dB on a quiet street. In this study, a basic algorithm and a signal processing method are suggested for tracing the direction and range of the sound source using signals received through microphone sensors rather than through the physical senses.
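
One common way to estimate the direction of a sound source from microphone signals is the time difference of arrival between two microphones. The abstract does not describe the paper's algorithm, so the sketch below is only an illustration under stated assumptions: a far-field source, two microphones a known distance apart, and a speed of sound of 343 m/s. It covers direction only, not range, and the function names and values are hypothetical.

```python
import math

def estimate_delay(sig_a, sig_b, max_lag):
    """Lag in samples by which sig_b trails sig_a (brute-force cross-correlation).

    Assumes both signals have the same length.
    """
    n = len(sig_a)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        lo, hi = max(0, -lag), min(n, n - lag)
        score = sum(sig_a[i] * sig_b[i + lag] for i in range(lo, hi))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def bearing_degrees(lag, sample_rate, mic_spacing, speed_of_sound=343.0):
    """Far-field arrival angle from broadside, in degrees."""
    sin_theta = speed_of_sound * lag / (sample_rate * mic_spacing)
    sin_theta = max(-1.0, min(1.0, sin_theta))  # guard against rounding
    return math.degrees(math.asin(sin_theta))

# Toy pulse that reaches microphone B three samples after microphone A.
pulse = [1.0, 2.0, 3.0, 2.0, 1.0]
mic_a = [0.0] * 50 + pulse + [0.0] * 50
mic_b = [0.0] * 53 + pulse + [0.0] * 47
lag = estimate_delay(mic_a, mic_b, max_lag=10)
print(lag, bearing_degrees(lag, sample_rate=8000, mic_spacing=0.5))
```

Estimating range as well would need at least one more microphone pair, so that two bearings can be intersected.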

An Optimization on the Psychoacoustic Model for MPEG-2 AAC Encoder (MPEG-2 AAC Encoder의 심리음향 모델 최적화)

  • Park, Jong-Tae;Moon, Kyu-Sung;Rhee, Kang-Hyeon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.38 no.2 / pp.33-41 / 2001
  • Compression is currently one of the most important technologies in the multimedia society. Audio files are spreading rapidly over the Internet. Among the formats, the best known is MP3 (MPEG-1 Layer 3), which achieves CD-quality sound at 128kbps, but its sound quality drops sharply below 64kbps. MPEG-2 AAC (Advanced Audio Coding) is not compatible with MPEG-1, but it offers a compression ratio 1.4 times higher than MP3 and supports up to 7.1 channels and a 96kHz sampling rate. In this paper, we propose an algorithm that reduces the computational load of AAC encoding and increases the processing speed by optimizing the psychoacoustic model, which accounts for an enormous amount of computation in the MPEG-2 AAC encoder. The optimized psychoacoustic model algorithm was implemented in C++. The experiment shows that the psychoacoustic model carries out a 3048-point FFT (Fast Fourier Transform) at a 44.1kHz sampling rate to obtain the SMR (Signal to Masking Ratio), and each entropy value is fed to the subband filters to control the encoder block. The proposed psychoacoustic model runs at high speed because the unpredictability value is optimized. Also, when the unpredictability value is transformed into a tonality index, processing is sped up by a tonality index optimized for the high frequency range.
