• Title/Summary/Keyword: Audio retrieval


Retrieval of Broadcast News Using Audio Content Analysis

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.26 no.3E / pp.74-79 / 2007
  • In this paper, we report our recent work on an indexing and retrieval system for broadcast news using audio content analysis. The work addresses two major parts of the audio indexing system: anchorperson detection based on audio segmentation, and phone-based spoken document retrieval, both developed in the framework of the emerging MPEG-7 standard. Experiments are conducted on a database of British broadcast news videos. We discuss the development of the retrieval system and the evaluation of each part and of the system as a whole.

Audio Fingerprint Retrieval Method Based on Feature Dimension Reduction and Feature Combination

  • Zhang, Qiu-yu;Xu, Fu-jiu;Bai, Jian
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.2 / pp.522-539 / 2021
  • To solve the problems that existing audio fingerprint methods exhibit when extracting fingerprints from long speech segments, such as excessive fingerprint dimension, poor robustness, and low retrieval accuracy and efficiency, a robust audio fingerprint retrieval method based on feature dimension reduction and feature combination is proposed. First, the Mel-frequency cepstral coefficients (MFCC) and linear prediction cepstral coefficients (LPCC) of the original speech are extracted, and the MFCC and LPCC feature matrices are combined. Second, a dimension reduction method based on information entropy is applied to the columns, and an energy-based dimension reduction method is then applied to the rows of the resulting matrix. Finally, the audio fingerprint is constructed from the reduced feature combination matrix. When a user retrieves speech, the normalized Hamming distance algorithm is used for matching. Experimental results show that the proposed method yields a smaller audio fingerprint dimension and better robustness for long speech segments, and achieves higher retrieval efficiency while maintaining high recall and precision.
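
The matching step named in this abstract, normalized Hamming distance, can be illustrated with a short sketch. This is not the authors' implementation: the bit-string fingerprint representation, the function names `normalized_hamming` and `best_match`, and the `threshold` value are all assumptions made for illustration.

```python
def normalized_hamming(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length binary fingerprints differ."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    if not a:
        raise ValueError("fingerprints must be non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def best_match(query: str, database: dict, threshold: float = 0.35):
    """Return (key, distance) of the closest stored fingerprint.

    Returns None when no stored fingerprint is within `threshold`
    (a hypothetical cut-off, not a value taken from the paper).
    """
    best = None
    for key, fingerprint in database.items():
        distance = normalized_hamming(query, fingerprint)
        if best is None or distance < best[1]:
            best = (key, distance)
    if best is not None and best[1] <= threshold:
        return best
    return None


# Toy fingerprints; real ones would be far longer.
stored = {"clip_a": "1011", "clip_b": "0101"}
print(best_match("1010", stored))
```

Dividing by the fingerprint length keeps the distance in the range 0 to 1, so a single threshold can be used regardless of how long the fingerprints are.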

Implementation of an Efficient Wavelet Based Audio Data Retrieval System (효율적인 웨이블렛 기반 오디오 데이터 검색 시스템 구현)

  • 이배호;조용춘;김광희
    • The Journal of the Acoustical Society of Korea / v.21 no.1 / pp.82-88 / 2002
  • In this paper, we propose an audio indexing method that uses the wavelet transform for audio data retrieval. It is difficult to build an efficient index for audio data because of its particular properties, such as large storage requirements, real-time transfer, and wide bandwidth. An audio data index based on the wavelet transform makes indexing and retrieval possible by exploiting the properties of the transform. The proposed indexing method does not split the data into blocks; instead, it uses both the high-pass and low-pass parts of the last-level wavelet coefficients. The index is built by applying a string matching algorithm to the high-pass part and a zero-crossing histogram to the low-pass part. These are converted into continuous strings, and we describe the retrieval efficiency obtained with this method. Retrieval compares the database index strings with the query string and returns the data with the minimum value. Our simulation determined a suitable comparison coefficient and showed how retrieval efficiency changes with audio data length. The results show that the proposed method improves retrieval efficiency compared with the conventional method.

An Efficient Audio Indexing Scheme based on User Query Patterns (사용자 질의 패턴을 이용한 효율적인 오디오 색인기법)

  • 노승민;박동문;황인준
    • Journal of KIISE:Databases / v.31 no.4 / pp.341-351 / 2004
  • With the popularity of digital audio content, querying and retrieving audio content efficiently from a database has become essential. In this paper, we propose a new index scheme for retrieving audio content efficiently using audio portions that have been queried frequently. This scheme is based on the observation that users tend to memorize and query a small number of audio portions. Detecting and indexing such portions enables fast retrieval and shows better performance than sequential search-based audio retrieval. Moreover, the scheme is independent of the underlying retrieval system, which means it can work together with any other audio retrieval system. We have implemented a prototype system and demonstrated its performance gain through experiments.

A Content-based Audio Retrieval System Supporting Efficient Expansion of Audio Database (음원 데이터베이스의 효율적 확장을 지원하는 내용 기반 음원 검색 시스템)

  • Park, Ji Hun;Kang, Hyunchul
    • Journal of Digital Contents Society / v.18 no.5 / pp.811-820 / 2017
  • For content-based audio retrieval, which is one of the main functions of an audio service, techniques for extracting fingerprints from the audio source and storing and indexing them in a database are widely used. However, if the fingerprints of new audio sources are continually inserted into the database, both space efficiency and audio retrieval performance gradually deteriorate. Therefore, there is a need for techniques that support efficient expansion of the audio database without periodic reorganization of the database, which would increase the system operation cost. In this paper, we design a content-based audio retrieval system that solves this problem by using MapReduce and a NoSQL database in a cluster computing environment, based on Shazam's fingerprinting algorithm, and evaluate its performance through a detailed set of experiments using real-world audio data.
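
As a rough illustration of the Shazam-style fingerprinting this abstract builds on, the sketch below hashes pairs of spectrogram peaks and identifies a query by voting on consistent time offsets. It is a simplified in-memory toy, not the paper's MapReduce/NoSQL design: peak extraction is assumed to have happened already, and the names `pair_hashes`, `add_track`, `identify`, and the parameters `fan_out` and `max_dt` are hypothetical.

```python
from collections import Counter, defaultdict


def pair_hashes(peaks, fan_out=3, max_dt=50):
    """Yield ((f1, f2, dt), t1) for each anchor peak paired with later peaks.

    `peaks` is a list of (time_frame, freq_bin) tuples, assumed to come
    from an earlier spectrogram peak-picking step.
    """
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # peaks are time-sorted, so later ones are farther away
            if dt == 0:
                continue  # skip peaks in the same frame
            yield (f1, f2, dt), t1
            paired += 1
            if paired == fan_out:
                break


def add_track(index, track_id, peaks):
    """Append a track's hashes to the index; existing entries are untouched."""
    for h, t in pair_hashes(peaks):
        index[h].append((track_id, t))


def identify(index, query_peaks):
    """Return the track whose hashes agree on a single time offset most often."""
    votes = Counter()
    for h, t_query in pair_hashes(query_peaks):
        for track_id, t_db in index.get(h, ()):
            votes[(track_id, t_db - t_query)] += 1
    if not votes:
        return None
    (track_id, _offset), _count = votes.most_common(1)[0]
    return track_id


index = defaultdict(list)
add_track(index, "song_a", [(0, 10), (5, 20), (9, 15), (14, 30)])
add_track(index, "song_b", [(0, 11), (4, 22), (8, 33), (12, 44)])
# Query: an excerpt of song_a starting 5 frames in, with times re-based to 0.
print(identify(index, [(0, 20), (4, 15), (9, 30)]))
```

Because adding a track only appends to per-hash lists, the index grows without rewriting existing entries, which is the property a key-value store would need to preserve at scale.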

Pretreatment For The Problem Solution Of Contents-Based Music Retrieval (내용 기반 음악 검색의 문제점 해결을 위한 전처리)

  • Chung, Myoung-Beom;Sung, Bo-Kyung;Ko, Il-Ju
    • Journal of the Korea Society of Computer and Information / v.12 no.6 / pp.97-104 / 2007
  • This paper presents a problem with the feature extraction techniques that have been used for content-based analysis, classification, and retrieval of audio data, and proposes a preprocessing step for a new content-based retrieval method. Because the feature vector changes with the sampling value, existing audio data analysis has the problem that the same music is judged to be different music. Therefore, we propose a method that extracts waveform information from PCM data so that audio data in various formats can be retrieved by content. With this method, audio data sampled in various formats can be identified as the same data, and the method may be applied in content-based music retrieval systems. To verify the performance of the method, an experiment compared feature extraction using STFT with waveform information extraction using PCM data. The results show that the proposed method is more effective.


A Study on Improvement of Retrieval Algorithm for Audio Response Service (음성정보 서비스의 검색 알고리즘 개선 연구)

  • Jeong, Yoo-Hyeon;Kim, Soon-Hyop
    • The Journal of the Acoustical Society of Korea / v.16 no.5 / pp.92-95 / 1997
  • Telephone pushbuttons consist only of the digits 0 to 9, #, and *. It is therefore difficult for users to enter varied query commands for information retrieval in an audio response service. We suggest a new retrieval algorithm for audio response services that uses sequences of Korean initial sounds. Users who do not know the retrieval code can retrieve a service by pushing the telephone digit buttons that correspond to the initial sounds of its name.
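
The idea of keying a service name by its initial sounds can be sketched as follows. The extraction of initial consonants uses the standard Unicode arithmetic for precomposed Hangul syllables; the `KEYPAD` consonant-to-digit table, the function names, and the sample service names are hypothetical and are not taken from the paper.

```python
# The 19 initial consonants in Unicode Hangul syllable order.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"

# Hypothetical consonant-to-digit assignment (assumption for illustration).
KEYPAD = {
    "ㄱ": "1", "ㄲ": "1", "ㅋ": "1",
    "ㄴ": "2",
    "ㄷ": "3", "ㄸ": "3", "ㅌ": "3",
    "ㄹ": "4",
    "ㅁ": "5",
    "ㅂ": "6", "ㅃ": "6", "ㅍ": "6",
    "ㅅ": "7", "ㅆ": "7",
    "ㅇ": "8",
    "ㅈ": "9", "ㅉ": "9", "ㅊ": "9",
    "ㅎ": "0",
}


def initial_sounds(text):
    """Yield the initial consonant of each precomposed Hangul syllable in text."""
    for ch in text:
        offset = ord(ch) - 0xAC00
        if 0 <= offset < 11172:  # 19 * 21 * 28 precomposed syllables
            yield CHOSEONG[offset // 588]  # 588 = 21 vowels * 28 finals


def to_digits(name):
    """Map a service name to the digit sequence of its initial sounds."""
    return "".join(KEYPAD[c] for c in initial_sounds(name))


def lookup(directory, digits):
    """Return every service name whose initial-sound digits equal `digits`."""
    return [name for name in directory if to_digits(name) == digits]


services = ["날씨", "교통", "뉴스"]  # weather, traffic, news
print(to_digits("교통"))
print(lookup(services, "27"))
```

With this particular table, "날씨" and "뉴스" map to the same digits, so a practical system would need some way to disambiguate colliding names, for example by prompting the caller to choose.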


Representative Melodies Retrieval using Waveform and FFT Analysis of Audio (오디오의 파형과 FFT 분석을 이용한 대표 선율 검색)

  • Chung, Myoung-Bum;Ko, Il-Ju
    • Journal of KIISE:Software and Applications / v.34 no.12 / pp.1037-1044 / 2007
  • Recently, content-based music retrieval systems have extracted the representative melody of a piece of music and indexed the music by it to reduce search time. Existing studies have used MIDI data to extract a representative melody, which has the weakness that only MIDI data can be used. Therefore, this paper proposes a representative melody retrieval method that can be used with all audio file formats and that uses digital signal processing. First, we use the Fast Fourier Transform (FFT) to find the tempo and nodes for representative melody retrieval. We then measure the frequency of high values that appear in the PCM data of each node. The point where the high values cluster most is the starting point of the representative melody, and the eight nodes from the starting point form the representative melody section of the audio data. To verify the performance of the method, we chose a thousand songs and ran an experiment to extract a representative melody from each. As a result, the accuracy of the extracted representative melody was 79.5% among the 737 songs for which a tempo was found.

Retrieval of Player Event in Golf Videos Using Spoken Content Analysis (음성정보 내용분석을 통한 골프 동영상에서의 선수별 이벤트 구간 검색)

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.28 no.7 / pp.674-679 / 2009
  • This paper proposes a method of player event retrieval that combines two functions: detection of player names in speech information and detection of sound events from audio information in golf videos. The system consists of an indexing module and a retrieval module. At indexing time, audio segmentation and noise reduction are applied to the audio stream demultiplexed from the golf videos. The noise-reduced speech is then fed into a speech recognizer, which outputs spoken descriptors. The player name and sound event are indexed by the spoken descriptors. At search time, the text query is converted into phoneme sequences. The lists of each query term are retrieved through a description matcher to identify full and partial phrase hits. For the retrieval of the player name, this paper compares the results of word-based, phoneme-based, and hybrid approaches.

Musician Search in Time-Series Pattern Index Files using Features of Audio (오디오 특징계수를 이용한 시계열 패턴 인덱스 화일의 뮤지션 검색 기법)

  • Kim, Young-In
    • Journal of the Korea Society of Computer and Information / v.11 no.5 s.43 / pp.69-74 / 2006
  • The recent development of multimedia content-based retrieval technologies has drawn great attention to musician retrieval using features of digital audio data among music information retrieval technologies. However, indexing techniques for music databases have not been studied thoroughly. In this paper, we present a musician retrieval technique for audio features using space split methods in a time-series pattern index file. We use features of audio to retrieve the musician and a time-series pattern index file to search the candidate musicians. Experimental results show that the time-series pattern index file using the rotational split method is efficient for musician retrieval in time-series pattern files.
