• Title/Summary/Keyword: sound source section detection (음원 구간 검출)

Sound recognition and tracking system design using robust sound extraction section (주변 배경음에 강인한 구간 검출을 통한 음원 인식 및 위치 추적 시스템 설계)

  • Kim, Woo-Jun;Kim, Young-Sub;Lee, Gwang-Seok
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.11 no.8 / pp.759-766 / 2016
  • This paper presents the design of a system that recognizes sound sources occurring in abnormal situations and tracks their location. It first detects a sound-source section that is robust to surrounding environmental sound, then uses the signals within that section. To detect the section, the weighted average delta energy of short segments is calculated from the received audio signal and passed through a low-pass filter; the section is then defined by comparing the filter output values. For recognition, an HMM (Hidden Markov Model), a traditional recognition method, is trained and applied to the data of the detected section. For sound-source signals that include surrounding background sound, the recognition rate is 3.94% higher than that of HMM recognition after section detection based on conventional signal energy. In addition, based on the recognition result, location estimation using the TDOA (Time Difference of Arrival) between signals in the section agrees with 97.44% of the angles of the actual source locations.
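
The section-detection steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the frame length, the weights, the one-pole low-pass filter, and the fixed threshold are all assumptions, and `short_time_energy` and `detect_sections` are hypothetical names.

```python
def short_time_energy(signal, frame_len):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def detect_sections(signal, frame_len=4, weights=(0.5, 0.3, 0.2),
                    alpha=0.5, threshold=0.1):
    """Return (start_frame, end_frame) pairs, end exclusive.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    energy = short_time_energy(signal, frame_len)
    # Delta energy: absolute frame-to-frame change in energy.
    delta = [0.0] + [abs(energy[i] - energy[i - 1]) for i in range(1, len(energy))]
    # Weighted average of the most recent delta values (newest weighted highest).
    weighted = []
    for i in range(len(delta)):
        acc = 0.0
        for k, w in enumerate(weights):
            if i - k >= 0:
                acc += w * delta[i - k]
        weighted.append(acc)
    # One-pole low-pass filter: y[n] = alpha * x[n] + (1 - alpha) * y[n - 1].
    smoothed, prev = [], 0.0
    for v in weighted:
        prev = alpha * v + (1 - alpha) * prev
        smoothed.append(prev)
    # Group consecutive frames above the threshold into sections.
    sections, start = [], None
    for i, v in enumerate(smoothed):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            sections.append((start, i))
            start = None
    if start is not None:
        sections.append((start, len(smoothed)))
    return sections
```

Because the detector reacts to changes in energy rather than to its absolute level, a steady background contributes little to the filtered output, which is one plausible reading of why such a section is described as robust to background sound.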

Background Music Identification in TV Broadcasting Program Algorithm using Audio Peak Detection (오디오 피크 검출을 적용한 TV 방송 프로그램 내 배경음악 식별 알고리즘)

  • Lee, Jung-Sung;Kim, Hyoung-Gook
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2013.06a / pp.34-35 / 2013
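  • In this paper, we propose an algorithm for identifying background music in TV broadcast programs using audio peak detection. The proposed algorithm consists of three parts: a music fingerprint extraction and transmission unit, a music section detection unit, and a music fingerprint fast-matching and information transmission unit. The fingerprint extraction and transmission unit applies a Fourier transform to the original music audio data to extract spectral coefficients. Among the extracted spectral components, values whose energy exceeds a certain threshold are detected as peaks, and the detected peaks are used to generate fingerprints, which are stored in a database. The music section detection unit applies a GMM (Gaussian Mixture Model) to the input broadcast audio data to classify it into music and non-music audio. The fast-matching and information transmission unit generates fingerprints from query audio recognized as a music section, using the same process as the extraction unit, compares them with the fingerprints of the original music in the database, and displays information about the most similar track as a caption on the TV screen.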
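
The peak-based fingerprinting step can be illustrated with a small sketch. It assumes a naive DFT on short frames, a local-maximum test in addition to the energy threshold, and a fingerprint that is simply the tuple of peak bins per frame; none of these details come from the paper, and `peak_bins`, `fingerprint`, and `match_score` are hypothetical names.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short example frames)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def peak_bins(frame, threshold):
    """Bins that are local maxima and whose energy exceeds the threshold."""
    energy = [m * m for m in magnitude_spectrum(frame)]
    return [
        k for k in range(1, len(energy) - 1)
        if energy[k] > threshold
        and energy[k] >= energy[k - 1]
        and energy[k] >= energy[k + 1]
    ]


def fingerprint(signal, frame_len=32, threshold=10.0):
    """One tuple of peak bins per non-overlapping frame (assumed format)."""
    return [
        tuple(peak_bins(signal[i:i + frame_len], threshold))
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def match_score(query_fp, reference_fp):
    """Number of aligned frames whose non-empty peak sets are identical."""
    return sum(1 for q, r in zip(query_fp, reference_fp) if q and q == r)
```

A query fingerprint would be compared against every stored reference with `match_score`, and the reference with the highest score reported. A production system would need a faster index than this linear comparison.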

Implementation of Real-time Sound-location Tracking Method using TDoA for Smart Lecture System (스마트 강의 시스템을 위한 시간차 검출 방식의 실시간 음원 추적 기법 구현)

  • Kang, Minsoo;Oh, Woojin
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.4 / pp.708-717 / 2017
  • Sound-location tracking is widely used in areas such as intelligent CCTV, video conferencing, and voice command. In this paper, we introduce a real-time sound-location tracking method for a smart lecture system, using TDoA (Time Difference of Arrival) with an orthogonal microphone array mounted on the ceiling. After discussing several models of TDoA detection, we propose a cross-correlation method based on a linear microphone array. An orthogonal array of five microphones can detect the sound location in all directions. For real-time detection, we apply a threshold on the received energy to eliminate non-voice intervals and use signed cross-correlation to reduce computational complexity. The detected azimuth angles are passed through a median filter to lower the angle deviation. The proposed system is implemented with a high-performance MCU, the TMS320F379D, and a MEMS microphone module, and shows an accuracy of 0.5 and 6.5 degrees for white noise and lecture speech, respectively.
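
As a rough illustration of the processing chain in this abstract (delay estimation for one microphone pair, conversion to an angle, median filtering), consider the sketch below. It uses plain cross-correlation rather than the signed variant the authors adopt, a far-field model, and an assumed speed of sound; the function names and parameter values are hypothetical.

```python
import math
import statistics

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def estimate_delay(a, b, max_lag):
    """Lag in samples by which b trails a, maximising the cross-correlation."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        val = sum(
            a[i] * b[i + lag]
            for i in range(len(a))
            if 0 <= i + lag < len(b)
        )
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


def delay_to_angle(lag, fs, spacing):
    """Far-field arrival angle in degrees for one microphone pair."""
    ratio = lag / fs * SPEED_OF_SOUND / spacing
    ratio = max(-1.0, min(1.0, ratio))  # clamp against noise and rounding
    return math.degrees(math.asin(ratio))


def median_smooth(angles, width=3):
    """Sliding median over the angle track (the window shrinks at the edges)."""
    half = width // 2
    return [
        statistics.median(angles[max(0, i - half):i + half + 1])
        for i in range(len(angles))
    ]
```

The median filter is a natural choice for this step because a single wildly wrong angle estimate is discarded outright instead of being averaged into its neighbours.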

Generalized cross correlation with phase transform sound source localization combined with steered response power method (조정 응답 파워 방법과 결합된 generalized cross correlation with phase transform 음원 위치 추정)

  • Kim, Young-Joon;Oh, Min-Jae;Lee, In-Sung
    • The Journal of the Acoustical Society of Korea / v.36 no.5 / pp.345-352 / 2017
  • We propose a method that reduces the direction estimation error of a sound source in reverberant and noisy environments. The proposed algorithm divides the speech signal into voiced and unvoiced frames using VAD (voice activity detection) and estimates the source direction only when the current frame is voiced. In that frame, the TDOA (Time Difference of Arrival) between the microphones of the array is estimated using the GCC-PHAT (Generalized Cross Correlation with Phase Transform) method. To improve the accuracy of the source location, we then compare the peak value of the cross-correlation of the two signals at the estimated time delay with the values at the other time delays in the time table. If, within successive voiced frames, the angle of the current frame differs greatly from those of the preceding and following frames, it is replaced with the mean of the angles estimated in those two frames.
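
Two of the steps named in this abstract, GCC-PHAT delay estimation and the replacement of outlier angles, can be sketched in a few lines. This is an illustration under assumptions: a naive DFT is used so the example has no dependencies, zero-padding is chosen to avoid circular wrap-around, and the outlier rule uses a hypothetical `max_jump` threshold that the abstract does not specify.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def gcc_phat_delay(a, b, max_lag):
    """Delay of b relative to a in samples, via PHAT-weighted cross-correlation."""
    n = len(a) + len(b)  # zero-pad so the circular correlation does not wrap
    spec_a = dft(list(a) + [0.0] * (n - len(a)))
    spec_b = dft(list(b) + [0.0] * (n - len(b)))
    cross = [bk * ak.conjugate() for ak, bk in zip(spec_a, spec_b)]
    # PHAT weighting: discard magnitude and keep only the phase.
    weighted = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    corr = [v.real for v in idft(weighted)]
    # Negative lags live at the end of the array, hence the modulo index.
    return max(range(-max_lag, max_lag + 1), key=lambda lag: corr[lag % n])


def replace_outliers(angles, max_jump):
    """Replace an angle far from both neighbours with the neighbours' mean."""
    out = list(angles)
    for i in range(1, len(angles) - 1):
        prev, nxt = angles[i - 1], angles[i + 1]
        if abs(angles[i] - prev) > max_jump and abs(angles[i] - nxt) > max_jump:
            out[i] = (prev + nxt) / 2
    return out
```

The PHAT weighting whitens the cross-spectrum, which sharpens the correlation peak and is the usual reason this method is preferred in reverberant rooms.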

A Study of Automatic Detection of Music Signal from Broadcasting Audio Signal (방송 오디오 신호로부터 음악 신호 검출에 관한 연구)

  • Yoon, Won-Jung;Park, Kyu-Sik
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.5 / pp.81-88 / 2010
  • In this paper, we propose an automatic music/non-music discrimination system for broadcast audio signals, as a preliminary study toward building a sound source monitoring system for real broadcasting environments. Reflecting the articulation characteristics of human speech, we use three simple time-domain features: energy standard deviation, log energy standard deviation, and log energy mean. Based on experimentally determined threshold values for each feature, we developed a rule-based algorithm to classify the music portions of the input audio signal. To verify the proposed algorithm, an actual FM broadcast signal was recorded for 24 hours and used as the input audio. The experimental results show that the proposed system recognizes music sections with 96% accuracy and non-music sections with 87% accuracy, which is good enough for use as a preprocessing module for a sound source monitoring system.
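
The three time-domain features can be computed as in the sketch below. The feature definitions follow the abstract, but the frame length, the log floor, the threshold values, and the direction of each rule in `is_music` are assumptions made for illustration; the paper derives its thresholds experimentally.

```python
import math
import statistics


def frame_energies(signal, frame_len):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def energy_features(signal, frame_len=4, floor=1e-10):
    """Energy std, log-energy std, and log-energy mean over one analysis window."""
    energies = frame_energies(signal, frame_len)
    # The floor keeps log10 defined for silent frames.
    log_energies = [math.log10(max(e, floor)) for e in energies]
    return {
        "energy_std": statistics.pstdev(energies),
        "log_energy_std": statistics.pstdev(log_energies),
        "log_energy_mean": statistics.mean(log_energies),
    }


def is_music(features, max_energy_std=0.2, max_log_energy_std=1.0,
             min_log_energy_mean=-3.0):
    """Rule-based decision with hypothetical thresholds.

    Assumed rationale: speech alternates between syllables and pauses, so its
    frame energy fluctuates more than that of continuous music.
    """
    return (
        features["energy_std"] <= max_energy_std
        and features["log_energy_std"] <= max_log_energy_std
        and features["log_energy_mean"] >= min_log_energy_mean
    )
```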

Robust Noise Detection for Digital Audio Restoration in Old Films (고전 영화의 디지털 음원 복원을 위한 강인한 노이즈 검출 기법)

  • You, Su-Jeong;Cho, Nam-Ik
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2010.11a / pp.53-54 / 2010
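  • In this paper, we propose an algorithm that detects crackle noise in single-channel digital audio signals using spectrograms and image processing techniques. To analyze the frequency characteristics of the audio signal efficiently, the spectrogram is converted into a color image using a specific colormap, and image processing techniques are then applied to detect the sections containing crackle noise, which are used for digital audio restoration. In particular, crackle noise in old films is similar to speech and music signals in energy and signal duration, so it is difficult to detect with existing spectral speech detection techniques. In the spectrogram image, by contrast, crackle noise shows characteristics that distinguish it from other signals, so it is detected with image processing techniques, namely edge detection followed by line detection with the Hough transform. The proposed algorithm was applied to digital audio restoration of old films and shows excellent performance in crackle noise detection.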
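
The line-detection step can be illustrated with a basic Hough transform over a binary edge map. The sketch assumes the spectrogram has already been converted to an image and passed through an edge detector, so `edges` is a 2D list of 0/1 values with columns as time and rows as frequency; the angular resolution and vote threshold are hypothetical, and `hough_lines` is not the authors' code.

```python
import math


def hough_lines(edges, n_theta=180, min_votes=5):
    """Return (rho, theta_index, votes) for lines with at least min_votes.

    Lines are parameterised as rho = x * cos(theta) + y * sin(theta), with
    theta = pi * theta_index / n_theta. Results are sorted by votes, descending.
    """
    votes = {}
    for y, row in enumerate(edges):
        for x, is_edge in enumerate(row):
            if not is_edge:
                continue
            # Each edge pixel votes for every line that could pass through it.
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                rho = round(x * math.cos(theta) + y * math.sin(theta))
                votes[(rho, t)] = votes.get((rho, t), 0) + 1
    lines = [(rho, t, v) for (rho, t), v in votes.items() if v >= min_votes]
    return sorted(lines, key=lambda item: -item[2])
```

With this parameterisation, a line at `theta_index` 0 and `rho` equal to a column index is a vertical line in the image, that is, a single time instant spanning many frequencies.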

Position Estimation of Underwater Acoustic Source Using Pulsed CW Signal (Pulsed CW 신호를 사용하는 수중 음원의 위치 추정을 위한 시간지연차 추정법)

  • 최영근;손권;도경철;김기만
    • The Journal of the Acoustical Society of Korea / v.23 no.7 / pp.514-520 / 2004
  • There are many techniques for underwater source localization, including methods based on TDOA (Time Difference Of Arrival) estimation, beamforming techniques, and high-resolution techniques. In this paper, we estimate the underwater source position using the MCPSP (Modified Cross Power Spectrum Phase) function, which is calculated in the frequency domain using a small number of sensors. However, the performance of the MCPSP-based localization method drops greatly for CW (Continuous Wave) signals. We therefore propose a TDOA estimation method for pulsed CW signals. The proposed method forms a new segment that includes an edge of the ping, located by short-time energy detection. The performance of the proposed method is analyzed theoretically under various environments.

Speech Transition Detection and approximate-synthesis Method for Speech Signal Compression and Recovery (음성신호 압축 및 복원을 위한 음성 천이구간 검출과 근사합성 방식)

  • Lee, Kwang-Seok;Kim, Bong-Gi;Kang, Seong-Soo;Kim, Hyun-Deok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.05a / pp.763-767 / 2008
  • In a speech coding system that uses voiced and unvoiced excitation sources, speech quality is distorted when a voiced sound and an unvoiced consonant coexist in a frame. We therefore propose a method for searching for and extracting the TS (Transition Segment), which includes the unvoiced consonant, so that voiced sounds and unvoiced consonants do not coexist in a frame. This research presents a new method of TS approximate synthesis using Least Mean Square and frequency band division. As a result, the method obtains high-quality approximate-synthesis waveforms within the TS by using frequency information below 0.547 kHz and above 2.813 kHz. Importantly, even the maximum error signal yields a low-distortion approximate-synthesis waveform within the TS. The method can be applied to a new Voiced/Silence/TS speech coding scheme, to speech analysis, and to speech synthesis.

Speech Signal Compression and Recovery Using Transition Detection and Approximate-Synthesis (천이구간 추출 및 근사합성에 의한 음성신호 압축과 복원)

  • Lee, Kwang-Seok;Lee, Byeong-Ro
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.2 / pp.413-418 / 2009
  • In a speech coding system that uses voiced and unvoiced excitation sources, speech quality is distorted when a voiced sound and an unvoiced consonant coexist in a frame. We therefore propose a method for searching for and extracting the TS (Transition Segment), which includes the unvoiced consonant, so that voiced sounds and unvoiced consonants do not coexist in a frame. This research presents a new method of TS approximate synthesis using Least Mean Square and frequency band division. As a result, the method obtains high-quality approximate-synthesis waveforms within the TS by using frequency information below 0.547 kHz and above 2.813 kHz. Importantly, even the maximum error signal yields a low-distortion approximate-synthesis waveform within the TS. The method can be applied to a new Voiced/Silence/TS speech coding scheme, to speech analysis, and to speech synthesis.

Vocal Separation Using Selective Frequency Subtraction Considering with Energies and Phases (에너지와 위상을 고려한 선택적 주파수 차감법을 이용한 보컬 분리)

  • Kim, Hyuntae;Park, Jangsik
    • Journal of Broadcast Engineering / v.20 no.3 / pp.408-413 / 2015
  • Recently, with growing interest in karaoke machines that use original sound, manufacturers of MIDI-type karaoke machines have been looking for a cheaper alternative to recording the original accompaniment. Specifically, the approach is to produce the original accompaniment by removing only the singer's voice from the singer's album. In this paper, we propose a system that separates vocal components from the music accompaniment in stereo recordings. The proposed system consists of two stages. The first stage is vocal detection, which classifies the input into vocal and non-vocal portions using an SVM with MFCC features. In the second stage, selective frequency subtraction is performed at each frequency bin in the vocal portions. The decision takes into account not only the energy of each frequency bin but also its phase in each channel signal. A listening test on music with the vocals removed by the proposed system shows a relatively high level of satisfaction.