• Title/Summary/Keyword: Audio Analysis

Automatic Indexing Algorithm of Golf Video Using Audio Information (오디오 정보를 이용한 골프 동영상 자동 색인 알고리즘)

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.28 no.5 / pp.441-446 / 2009
  • This paper proposes an automatic indexing algorithm for golf video using audio information. In the proposed algorithm, the input stream is demultiplexed into video and audio streams. An AdaBoost cascade classifier then classifies the continuous audio stream into five segment types: announcer's speech recorded in the studio, music accompanying the players' names on the TV screen, audience reaction to the play, reporter's speech with field background, and field noise such as wind or waves. Golf swing sounds, including drive shots, iron shots, and putts, are detected by impulse onset detection followed by modulation spectrum verification. The detected swings and applause are then used to index action or highlight units. Compared with video-based semantic analysis, the main advantage of the proposed system is its small computational requirement, which makes it easier to apply the technology to embedded consumer electronic devices for fast browsing.
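
The impulse onset detection step mentioned in the abstract above can be illustrated with a simple frame-energy detector. This is only a sketch under stated assumptions, not the paper's method: the abstract does not give the detector's details, so the frame length, the energy-ratio rule, and the names `frame_energies` and `detect_impulse_onsets` are all hypothetical, and the modulation spectrum verification stage is not covered.

```python
def frame_energies(samples, frame_len):
    """Sum of squared samples for each non-overlapping frame."""
    return [
        sum(x * x for x in samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_impulse_onsets(samples, frame_len=256, ratio=8.0, floor=1e-6):
    """Return sample indices of frames whose energy jumps sharply.

    A frame counts as an onset when its energy exceeds `ratio` times the
    energy of the previous frame. `floor` (an assumed constant) keeps
    digital silence from producing spurious onsets.
    """
    energies = frame_energies(samples, frame_len)
    onsets = []
    for i in range(1, len(energies)):
        if energies[i] > ratio * max(energies[i - 1], floor):
            onsets.append(i * frame_len)
    return onsets
```

A detector like this is cheap (one multiply-add per sample), which is consistent with the low computational cost the abstract emphasizes, but a practical system would still need a verification stage to reject other impulsive sounds.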

The Effect of Reminiscence with Audio-Visual Stimulation on Senile Dementia (치매노인에게 시청각 자극을 병행한 회상요법의 적용효과)

  • 김남초;유양숙;한숙원
    • Journal of Korean Academy of Nursing / v.30 no.1 / pp.98-109 / 2000
  • The purpose of this study was to identify the effect of reminiscence with audio-visual stimulation on cognitive function, agitation behaviors, and the Activity of Daily Living (ADL) in elderly people with senile dementia. A quasi-experimental design was used. The subjects were 26 people with mild senile dementia who were cared for at a Day Care Center for Dementia in Seoul. The data were collected from March to July 1999. Subjects were divided into three groups: the Control I group with 10 subjects, the reminiscence group (Control II group with 8 subjects), and the reminiscence with audio-visual stimulation group (experimental group with 8 subjects). The Control I group received routine care as usual. The Control II group participated in reminiscence sessions for one hour a day, five times a week, for a period of 4 weeks. The experimental group participated in reminiscence with audio-visual stimulation sessions for one hour a day, five times a week, for a period of 4 weeks. The instrument of this study was color photography with sound, developed through an open questionnaire about events, objects, humans in action, and animals that 100 Korean elderly people over 60 would like to remember. It was modeled on the Sensory Stimuli Package by Namazi and Haynes (1994). Cognitive function was evaluated with the MMSE-K by Kwon & Park (1989) and the Brief Cognitive Rating Scale (BCRS) by Reisberg et al. (1983), behavioral response with the Agitation Inventory by Cohen-Mansfield and colleagues (1989), and the activity of daily living with the Rapid Disability Rating Scale-2 (RDRS-2) by Linn & Linn (1982). Data were analyzed in SPSS using the $\chi^2$ test, ANOVA, and repeated measures ANOVA. The results were as follows: 1. Reminiscence with audio-visual stimulation did not improve overall cognitive function in senile dementia, but it significantly improved verbal expression, a subscale of cognitive function. 2. Reminiscence with audio-visual stimulation significantly reduced agitation behavior in the experimental group, but there was no significant difference between groups. 3. Reminiscence with audio-visual stimulation did not significantly affect the activity of daily living after treatment. In conclusion, reminiscence with audio-visual stimulation was an effective therapy for improving verbal expression and reducing agitation behaviors in senile dementia. Further research with a more in-depth approach is needed, taking into account the characteristics and level of each individual with senile dementia.

MPEG-H 3D Audio Decoder Structure and Complexity Analysis (MPEG-H 3D 오디오 표준 복호화기 구조 및 연산량 분석)

  • Moon, Hyeongi;Park, Young-cheol;Lee, Yong Ju;Whang, Young-soo
    • The Journal of Korean Institute of Communications and Information Sciences / v.42 no.2 / pp.432-443 / 2017
  • The primary goal of the MPEG-H 3D Audio standard is to provide immersive audio environments for high-resolution broadcasting services such as UHDTV. The standard incorporates a wide range of technologies, including encoding/decoding technology for multi-channel, object-based, and scene-based signals, rendering technology for providing 3D audio in various playback environments, and post-processing technology. The reference software decoder of the standard combines several modules and can operate in various modes. Because each module is an independent executable file and the modules are executed sequentially, real-time decoding is impossible. In this paper, we build DLL libraries for the core decoder, format converter, object renderer, and binaural renderer of the standard and integrate them to enable frame-based decoding. In addition, by measuring the computational complexity of each mode of the MPEG-H 3D Audio decoder, this paper provides a reference for selecting the appropriate decoding mode for various hardware platforms. According to the measurements, the low-complexity profile included in the Korean broadcasting standard has a computational complexity of 2.8 to 12.4 times that of the QMF synthesis operation when rendering to channel signals, and 4.1 to 15.3 times that of the QMF synthesis operation when rendering to binaural signals.

LED Emotional Lighting Algorithm and Application using Audio Spectrum (오디오 스펙트럼을 이용한 LED 감성 조명 알고리즘과 응용)

  • Jang, Young-Beom;Seok, Sang-Chul
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.10B / pp.1252-1257 / 2011
  • In this paper, efficient functions for mapping the audio spectrum to the visible spectrum are proposed. Mapping the overall audible frequency band onto the visible frequency band makes emotional lighting possible. We propose a basic linear mapping function and non-linear mapping functions that emphasize specific audio frequency bands. For the algorithm implementation, a spectrum analysis method and a filter method are introduced. In particular, a prototype LED lighting device using the digital filter method is implemented. The proposed lighting method can be applied to many LED lighting applications that use music.
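
The basic linear mapping described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' function: the band edges (20 Hz to 20 kHz for hearing, roughly 400 THz to 790 THz for visible light) and the name `audio_to_visible_thz` are assumptions, since the abstract gives no numeric limits.

```python
# Assumed band edges; the abstract does not state the values used in the paper.
AUDIO_MIN_HZ = 20.0
AUDIO_MAX_HZ = 20_000.0
VISIBLE_MIN_THZ = 400.0   # red end of the visible band (approximate)
VISIBLE_MAX_THZ = 790.0   # violet end of the visible band (approximate)


def audio_to_visible_thz(audio_hz: float) -> float:
    """Linearly map an audio frequency in Hz to a visible-light frequency in THz.

    Inputs outside the assumed hearing band are clamped to its edges.
    """
    clamped = min(max(audio_hz, AUDIO_MIN_HZ), AUDIO_MAX_HZ)
    fraction = (clamped - AUDIO_MIN_HZ) / (AUDIO_MAX_HZ - AUDIO_MIN_HZ)
    return VISIBLE_MIN_THZ + fraction * (VISIBLE_MAX_THZ - VISIBLE_MIN_THZ)
```

With a purely linear map, most musical content (below a few kHz) lands near the red end of the range, which suggests why the abstract also proposes non-linear functions that emphasize specific audio bands.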

DCT and DWT Based Robust Audio Watermarking Scheme for Copyright Protection

  • Deb, Kaushik;Rahman, Md. Ashikur;Sultana, Kazi Zakia;Sarker, Md. Iqbal Hasan;Chong, Ui-Pil
    • Journal of the Institute of Convergence Signal Processing / v.15 no.1 / pp.1-8 / 2014
  • Digital watermarking techniques are attracting attention as a suitable solution for protecting the copyright of multimedia data. This paper proposes a new audio watermarking method based on the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT) for copyright protection. In the proposed method, the original audio is transformed into the DCT domain and divided into two parts. A synchronization code is applied to the signal in the first part, and a 2-level DWT is applied to the signal in the second part. The absolute values of the DWT coefficients are divided into an arbitrary number of segments, and the energy and middle peak of each segment are calculated. Watermarks are then embedded into each middle peak. Watermarks are extracted by performing the inverse of the embedding process. Experimental results show that the hidden watermark data is robust to re-sampling, low-pass filtering, re-quantization, MP3 compression, cropping, echo addition, delay, pitch shifting, and amplitude change. Performance analysis of the proposed scheme shows low error probability rates.

An advertisement method using inaudible sound of speaker

  • Chung, Myoungbeom
    • Journal of the Korea Society of Computer and Information / v.20 no.8 / pp.7-13 / 2015
  • Recently, various types of user-customized advertisements have been served through smart devices. Representative services use the light of a smart TV screen or the audible sound of a smart TV to transmit advertisement information. However, those services require a specific action from the smart device user to obtain the advertisement information, or they need the audible audio of the TV content. To overcome these weaknesses, we propose an advertisement method using inaudible sound from a speaker, based on a smart device. This method supports the transfer of advertising content to the smart device user with no additional action or TV audio signal required to access that content. The proposed method uses two high frequencies in the 18 kHz to 22 kHz portion of the audio frequency range that a smart TV can emit. It synthesizes those frequencies with the audio of the TV content as a trigger signal that can send advertisements to a smart device. Next, the smart device analyzes the trigger signal and requests the advertisement content related to the signal from a server. The smart device can then show the downloaded content to the user. Because the proposed method uses the high frequencies of sound signals via the inner speaker of the smart device, its main advantage is that it does not affect the audio signal of the TV content. To evaluate the efficacy of the proposed method, we developed an application to implement it and subsequently carried out an advertisement transmission experiment. The success rate of the transmission experiment was approximately 97%. Based on this result, we believe the proposed method will be a useful technique for introducing a customized user advertising service.
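
The trigger-signal analysis described in the abstract above amounts to checking whether two known high-frequency tones are present in the captured audio. One common way to do that is the Goertzel algorithm, sketched below. This is an illustration under assumptions, not the paper's implementation: the abstract does not say which detector or which two frequencies were used, so the 18 kHz and 20 kHz defaults, the amplitude threshold, and all function names here are hypothetical.

```python
import math


def goertzel_power(samples, sample_rate, target_hz):
    """Squared DFT magnitude at the bin nearest target_hz (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)  # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def tone_amplitude(samples, sample_rate, target_hz):
    """Estimated amplitude of a sinusoid at target_hz."""
    power = goertzel_power(samples, sample_rate, target_hz)
    return 2.0 * math.sqrt(max(power, 0.0)) / len(samples)


def has_trigger(samples, sample_rate, freq_a_hz=18_000.0, freq_b_hz=20_000.0,
                min_amplitude=0.05):
    """True when both assumed trigger tones exceed the assumed threshold."""
    return all(
        tone_amplitude(samples, sample_rate, f) >= min_amplitude
        for f in (freq_a_hz, freq_b_hz)
    )


def mix_tones(tones, sample_rate, n):
    """Test helper: sum of sine waves given as (frequency_hz, amplitude) pairs."""
    return [
        sum(a * math.sin(2.0 * math.pi * f * i / sample_rate) for f, a in tones)
        for i in range(n)
    ]
```

For example, `has_trigger(mix_tones([(440.0, 1.0), (18_000.0, 0.1), (20_000.0, 0.1)], 48_000, 4_800), 48_000)` checks a 100 ms block containing ordinary programme audio plus both tones. Goertzel evaluates only the bins of interest, so it is cheaper than a full FFT when just two frequencies matter.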

Similar Movie Retrieval using Low Peak Feature and Image Color (Low Peak Feature와 영상 Color를 이용한 유사 동영상 검색)

  • Chung, Myoung-Beom;Ko, Il-Ju
    • Journal of the Korea Society of Computer and Information / v.14 no.8 / pp.51-58 / 2009
  • In this paper, we propose a search algorithm that uses the Low Peak Feature of audio and image color values to identify similar movies. Combing through entire video files to recognize and retrieve matching movies requires much time and memory space. Moreover, such methods still share a critical problem: they erroneously treat as different those matching videos that have been altered only in resolution or merely converted with a different codec. We therefore present a similar-video-retrieval method that relies on analysis of audio patterns, whose peak features are not greatly affected by changes in resolution or codec, and on image color values, which are used for similarity comparison. The method showed a 97.7% search success rate on a set of 2,000 video files whose audio bit rate had been altered or which had been purposefully encoded with a different codec.

Classification of pathological and normal voice based on dimension reduction of feature vectors (피처벡터 축소방법에 기반한 장애음성 분류)

  • Lee, Ji-Yeoun;Jeong, Sang-Bae;Choi, Hong-Shik;Hahn, Min-Soo
    • Proceedings of the KSPS conference / 2007.05a / pp.123-126 / 2007
  • This paper suggests a method to improve the performance of pathological/normal voice classification. The effectiveness of the mel frequency-based filter bank energies is analyzed using the Fisher discriminant ratio (FDR). Mel frequency cepstrum coefficients (MFCCs) and feature vectors obtained through the linear discriminant analysis (LDA) transformation of the filter bank energies (FBE) are implemented. This paper shows that the FBE LDA-based GMM distinguishes pathological from normal voices better than the MFCC-based GMM.
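
The Fisher discriminant ratio used in the abstract above to assess filter bank energies has a standard two-class form for a single feature, FDR = (μ1 − μ2)² / (σ1² + σ2²). The sketch below computes it per band and ranks the bands. It is an illustration only: the function names, the use of population variance, and the frame-list data layout are assumptions, not details taken from the paper.

```python
from statistics import fmean, pvariance


def fisher_discriminant_ratio(class_a, class_b):
    """Two-class FDR for one feature: squared mean gap over summed variances.

    Raises ZeroDivisionError if both classes have zero variance.
    """
    mean_gap = fmean(class_a) - fmean(class_b)
    spread = pvariance(class_a) + pvariance(class_b)
    return mean_gap ** 2 / spread


def rank_bands_by_fdr(normal_frames, pathological_frames):
    """Return band indices sorted from most to least discriminative.

    Each argument is a list of frames; each frame is a list of
    filter bank energies with one value per band.
    """
    n_bands = len(normal_frames[0])
    scores = [
        fisher_discriminant_ratio(
            [frame[band] for frame in normal_frames],
            [frame[band] for frame in pathological_frames],
        )
        for band in range(n_bands)
    ]
    return sorted(range(n_bands), key=lambda band: scores[band], reverse=True)
```

A band whose class means are far apart relative to the within-class spread scores high, so a ranking like this indicates which filter bank energies carry the most class information before any LDA transformation is applied.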

Dual-Domain Connection Scheme for HE-AAC and MPEG Surround

  • Pang, Hee-Suk
    • The Journal of the Acoustical Society of Korea / v.28 no.1E / pp.29-34 / 2009
  • MPEG-4 High Efficiency Advanced Audio Coding (HE-AAC) and MPEG Surround form one of the most efficient combinations for low bit rate multi-channel audio coding. Based on the fact that these two codecs have identical quadrature mirror filter (QMF) analysis and synthesis structures, we propose a dual-domain connection scheme for the codecs. Specifically, two time-domain connection methods are analyzed and compared with the QMF subband-domain connection method. Experimental results show that both time-domain connection methods cause no subjective sound quality degradation compared with the QMF subband-domain connection method, which verifies that one can select either of them depending on the application scenario.

Performance Analysis of Watermarking using Audio and Image Watermark in Wireless Channel Environment (무선 전송 채널 환경에서 오디오와 로고 영상을 이용한 워터마킹 성능분석)

  • Kim, Yoon-Ho;Park, Ki-Hong
    • Journal of Advanced Navigation Technology / v.10 no.4 / pp.406-412 / 2006
  • In this paper, we analyze the performance of digital watermarking using an audio signal as well as a logo image as the watermark. Watermarked images are transmitted and detected using an OFDM/QPSK system in an AWGN channel environment. Experimental results show that the audio signal-based watermark embedding scheme is superior to the logo image-based scheme and is able to restore the signal at SNR = 3 dB.
