• Title/Abstract/Keywords: sequence retrieval

108 search results (processing time: 0.017 s)

Spoken Document Retrieval Based on Phone Sequence Strings Decoded by PVDHMM

  • 최대림;김봉완;김종교;이용주
    • 대한음성학회지:말소리 / No. 62 / pp.133-147 / 2007
  • In this paper, we introduce a phone vector discrete HMM (PVDHMM) that decodes a phone sequence string, and demonstrate its applicability to spoken document retrieval. The PVDHMM treats a phone recognizer or large vocabulary continuous speech recognizer (LVCSR) as a vector quantizer whose codebook size equals the size of its phone set. We apply the PVDHMM to decode the phone sequence strings and compare the outputs with those of a continuous speech recognizer (CSR). We also carry out a spoken document retrieval experiment using a PVDHMM word spotter on the phone sequence strings generated by a phone recognizer or LVCSR, and compare its results with those of retrieval through the phone-based vector space model.

An Efficient Video Retrieval Algorithm Using Key Frame Matching for Video Content Management

  • Kim, Sang Hyun
    • International Journal of Contents / Vol. 12, No. 1 / pp.1-5 / 2016
  • To manipulate large volumes of video content, effective video indexing and retrieval are required. A large number of video indexing and retrieval algorithms have been presented for frame-wise user queries or video content queries, whereas relatively few video sequence matching algorithms have been proposed for video sequence queries. In this paper, we propose an efficient algorithm that extracts key frames using color histograms and matches the video sequences using edge features. To match video sequences effectively with a low computational load, we make use of the key frames extracted by the cumulative measure and the distance between key frames, and compare two sets of key frames using the modified Hausdorff distance. Experimental results with real sequences show that the proposed video sequence matching algorithm using edge features yields higher accuracy and performance than conventional methods such as histogram difference, Euclidean metric, Bhattacharyya distance, and directed divergence methods.
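
The key-frame comparison step named in the abstract can be illustrated with the usual definition of the modified Hausdorff distance: the larger of the two directed mean nearest-neighbor distances. The sketch below assumes each key frame has already been reduced to a numeric feature vector; the two-dimensional vectors and the function names are hypothetical, and the abstract does not specify the paper's edge features.

```python
import math

def directed_mean_min(a_set, b_set):
    """Mean, over points in a_set, of the distance to the nearest point in b_set."""
    return sum(min(math.dist(a, b) for b in b_set) for a in a_set) / len(a_set)

def modified_hausdorff(a_set, b_set):
    """Modified Hausdorff distance: the larger of the two directed mean distances."""
    return max(directed_mean_min(a_set, b_set), directed_mean_min(b_set, a_set))

# Hypothetical key-frame feature vectors (illustrative values only).
query_keys = [(0.0, 0.0), (1.0, 0.0)]
ref_keys = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]

print(modified_hausdorff(query_keys, ref_keys))
```

Averaging the nearest-neighbor distances, rather than taking their maximum as the classical Hausdorff distance does, makes the measure less sensitive to a single outlying key frame.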

Protein Sequence Search based on N-gram Indexing

  • Hwang, Mi-Nyeong;Kim, Jin-Suk
    • Bioinformatics and Biosystems / Vol. 1, No. 1 / pp.46-50 / 2006
  • With the advancement of experimental techniques in molecular biology, genomic and protein sequence databases are growing exponentially in size, and mean sequence lengths are also increasing. As these databases grow, it becomes difficult to search biological databases for sequences with significant homology to a query sequence. In this paper, we present an N-gram indexing method to retrieve similar sequences quickly, precisely, and comparably. This method regards a protein sequence as a text written in a language of 20 amino acid codes and adopts fixed-length N-gram tokens as its indexing scheme for sequence strings. After such tokens are indexed for all the sequences in the database, sequences can be searched with information retrieval algorithms. Using this new method, we have developed a protein sequence search system named ProSeS (PROtein Sequence Search). ProSeS is a protein sequence analysis system that provides overall analysis results such as similar sequences with significant homology, predicted subcellular locations of the query sequence, and major keywords extracted from annotations of similar sequences. We show experimentally that the N-gram indexing approach reduces retrieval time significantly and that it is as accurate as the currently popular search tool BLAST.

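The N-gram indexing idea in the abstract above can be sketched as an inverted index from fixed-length tokens to sequence identifiers, with candidates ranked by the number of distinct query tokens they share. This is a minimal illustration and not the ProSeS implementation: the toy sequences, the choice of N = 3, and the shared-token ranking are all assumptions.

```python
from collections import Counter, defaultdict

def ngrams(seq, n=3):
    """All overlapping fixed-length tokens of a sequence string."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]

def build_index(db, n=3):
    """Inverted index: N-gram token -> set of sequence ids containing it."""
    index = defaultdict(set)
    for seq_id, seq in db.items():
        for tok in ngrams(seq, n):
            index[tok].add(seq_id)
    return index

def search(index, query, n=3):
    """Rank sequences by how many distinct query tokens they contain."""
    hits = Counter()
    for tok in set(ngrams(query, n)):
        for seq_id in index.get(tok, ()):
            hits[seq_id] += 1
    return hits.most_common()

# Toy database of made-up amino-acid strings (hypothetical data).
db = {"p1": "MKTAYIAKQR", "p2": "MKTAYLLKQR", "p3": "GGGSSGGG"}
index = build_index(db)
ranked = search(index, "KTAYIA")
print(ranked)
```

Only sequences that share at least one token with the query are ever touched, which is where an index of this kind saves time over scanning every sequence.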

An Efficient Video Retrieval Algorithm Using Color and Edge Features

  • Kim Sang-Hyun
    • 융합신호처리학회논문지 / Vol. 7, No. 1 / pp.11-16 / 2006
  • To manipulate large video databases, effective video indexing and retrieval are required. A large number of video indexing and retrieval algorithms have been presented for frame-wise user queries or video content queries, whereas relatively few video sequence matching algorithms have been proposed for video sequence queries. In this paper, we propose an efficient algorithm to extract key frames using color histograms and to match the video sequences using edge features. To match video sequences effectively with a low computational load, we make use of the key frames extracted by the cumulative measure and the distance between key frames, and compare two sets of key frames using the modified Hausdorff distance. Experimental results with several real sequences show that the proposed video retrieval algorithm using color and edge features yields higher accuracy and performance than conventional methods such as histogram difference, Euclidean metric, Bhattacharyya distance, and directed divergence methods.

Score Image Retrieval to Inaccurate OMR performance

  • Kim, Haekwang
    • 방송공학회논문지 / Vol. 26, No. 7 / pp.838-843 / 2021
  • This paper presents an algorithm for effectively retrieving score information for an input score image. The originality of the proposed algorithm is that it is designed to be robust to recognition errors by OMR (Optical Music Recognition), whereas existing methods such as pitch histograms require that erroneous OMR results be corrected before the retrieval process. This approach helps people retrieve scores without the music-score training needed for error correction. OMR takes a score image as input, recognizes musical symbols, and produces a structural symbolic notation of the score as output, for example in MusicXML format. Among the musical symbols on a score, filled noteheads are observed to be rarely misdetected by OMR, owing to their simple black-filled round shape. Barlines that separate measures are also robust to OMR errors because they are long vertical lines of uniform length. The proposed algorithm consists of a descriptor for a score and a similarity measure between a query score and a reference score. The descriptor is based on note-count, the number of filled noteheads in a measure. Each part of a score is represented by a sequence of note-count numbers. The descriptor is an n-gram sequence of the note-count sequence. Simulation results show that the proposed algorithm works successfully to a certain degree in score image-based retrieval for an erroneous OMR output.
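
The note-count descriptor described above can be sketched as follows: each part is a list of per-measure filled-notehead counts, and the descriptor is the list of n-grams over that list. The abstract does not state the similarity measure, so the overlap ratio used here is an assumption, as are the sample counts and the choice of n = 3.

```python
def note_count_ngrams(note_counts, n=3):
    """Descriptor: overlapping n-grams over the per-measure note-count sequence."""
    return [tuple(note_counts[i:i + n]) for i in range(len(note_counts) - n + 1)]

def ngram_overlap(query_counts, ref_counts, n=3):
    """Assumed similarity: fraction of distinct query n-grams found in the reference."""
    q = set(note_count_ngrams(query_counts, n))
    r = set(note_count_ngrams(ref_counts, n))
    if not q:
        return 0.0
    return len(q & r) / len(q)

# Hypothetical note counts per measure; the query has one miscounted measure (9).
reference = [4, 3, 5, 2, 4, 4, 6]
query = [3, 5, 2, 9, 4, 6]

print(ngram_overlap(query, reference))
```

A single miscounted measure corrupts only the n-grams that contain it, so the remaining n-grams can still match the reference; this is one way to read the robustness claim in the abstract.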

Shape-Based Retrieval of Similar Subsequences in Time-Series Databases

  • 윤지희;김상욱;김태훈;박상현
    • 한국정보과학회논문지:데이타베이스 / Vol. 29, No. 5 / pp.381-392 / 2002
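  • This paper discusses the problem of shape-based retrieval in time-series databases. Shape-based retrieval is an operation that finds (sub)sequences whose shapes are similar to that of a query sequence, regardless of their actual element values. We propose a new technique for shape-based subsequence retrieval. First, we present a new similarity model for shape-based retrieval that supports various combinations of transformations such as shifting, scaling, moving average, and time warping. We also propose efficient indexing and query processing techniques for processing shape-based retrieval effectively under this similarity model. To verify the usefulness of the proposed techniques, we perform a variety of experiments using real S&P 500 stock data. The results show that the proposed technique not only successfully retrieves subsequences whose shapes are similar to that of the query sequence, but also achieves a substantial performance improvement of up to 66 times over sequential scan.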
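
One simple way to compare shapes independently of element values is to remove shifting and scaling by z-normalization and then apply dynamic time warping. The sketch below illustrates only that combination; it is not the paper's similarity model, its moving-average transformation, or its indexing scheme, and the function names and sample values are hypothetical.

```python
import math

def normalize(seq):
    """Remove shifting and scaling: subtract the mean, divide by the standard deviation."""
    mean = sum(seq) / len(seq)
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / len(seq))
    if std == 0:
        return [0.0] * len(seq)
    return [(x - mean) / std for x in seq]

def dtw(a, b):
    """Classic dynamic time warping cost with absolute difference as the local cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]

def shape_distance(query, candidate):
    """Shape distance: time-warping cost between the normalized sequences."""
    return dtw(normalize(query), normalize(candidate))

query = [1.0, 2.0, 3.0, 2.0, 1.0]
shifted_scaled = [10.0 + 5.0 * x for x in query]  # same shape, different values
inverted = [3.0, 2.0, 1.0, 2.0, 3.0]              # different shape

print(shape_distance(query, shifted_scaled), shape_distance(query, inverted))
```

The quadratic cost of the warping step is what makes a sequential scan expensive, which is the motivation for dedicated indexing and query processing in this setting.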

Pattern Similarity Retrieval of Data Sequences for Video Retrieval System

  • 이석룡
    • 정보처리학회논문지D / Vol. 13D, No. 3 / pp.347-356 / 2006
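  • A video stream can be represented as a sequence of data points in a multidimensional space. In this paper, we introduce the trend vector, which captures the approximate values of the data points in a sequence and the direction in which the points move, and we propose a similar-pattern retrieval technique for data sequences that uses this vector. A sequence is partitioned into multiple segments, and each segment is represented by a trend vector. Query processing is performed on the trend vectors rather than on each individual point in the sequence. The proposed technique is designed to use these vectors to filter out data sequences irrelevant to the query from the database and to retrieve sequences similar to the query sequence. To validate the proposed technique, we conducted experiments on video streams and synthetically generated data; the results show that its precision improved by up to 2.1 times over existing methods and that processing time was reduced by up to 45%.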

A Database Retrieval Model for Efficient Gene Sequence Alignment

  • 김민준;임성화;김재훈;이원태;정진원
    • 한국정보과학회논문지:데이타베이스 / Vol. 31, No. 3 / pp.243-251 / 2004
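  • Most bioinformatics programs retrieve and process data such as genes from databases to provide services to biochemists and biologists. If a database search is performed separately for each client request, considerable disk access time is consumed. This can also overload the server and lengthen response time. In this paper, we propose a grouping technique that exploits the database usage pattern of sequence search programs in bioinformatics to share the disk accesses for database retrieval among many database requests. We also propose a coupling scheme that processes a user request without waiting time, concurrently with the job already in progress, by sharing the disk accesses for database retrieval, thereby raising system throughput and shortening response time. The performance of the proposed techniques was verified through mathematical analysis and simulation.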

Content similarity matching for video sequence identification

  • Kim, Sang-Hyun
    • International Journal of Contents / Vol. 6, No. 3 / pp.5-9 / 2010
  • To manage large database systems with video, effective video indexing and retrieval are required. A large number of video retrieval algorithms have been presented for frame-wise user queries or video content queries, whereas few video identification algorithms have been proposed for video sequence queries. In this paper, we propose an effective video identification algorithm for video sequence queries that employs the Cauchy function of histograms between successive frames and the modified Hausdorff distance. To match the video sequences effectively with a low computational load, we make use of the key frames extracted by the cumulative Cauchy function and compare the sets of key frames using the modified Hausdorff distance. Experimental results with several color video sequences show that the proposed algorithm for video identification yields remarkably higher performance than conventional algorithms such as the Euclidean metric and directed divergence methods.

Pruning and Matching Scheme for Rotation Invariant Leaf Image Retrieval

  • Tak, Yoon-Sik;Hwang, Een-Jun
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 2, No. 6 / pp.280-298 / 2008
  • For efficient content-based image retrieval, diverse visual features such as color, texture, and shape have been widely used. In the case of leaf images, further improvement can be achieved based on the following observations. Most plants have uniquely shaped leaves that consist of one or more blades. Hence, blade-based matching can be more efficient than whole shape-based matching, since the number and shape of blades are very effective for filtering out dissimilar leaves. Guaranteeing rotational invariance is critical for matching accuracy. In this paper, we propose a new shape representation, indexing, and matching scheme for leaf image retrieval. For leaf shape representation, we generate a distance curve, which is a sequence of distances between the leaf’s center and all the contour points. For matching, we develop a blade-based matching algorithm called rotation invariant - partial dynamic time warping (RI-PDTW). To speed up the matching, we suggest two additional techniques: i) priority queue-based pruning of unnecessary blade sequences for rotational invariance, and ii) lower bound-based pruning of unnecessary partial dynamic time warping (PDTW) calculations. We implemented a prototype system on the GEMINI framework [1][2]. Experimental results show that our scheme achieves excellent performance compared to competing schemes.
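
The distance-curve representation in the abstract above can be sketched directly: compute the centroid of the contour points and record each point's distance from it. Rotating the leaf then corresponds, for uniformly sampled contours, to a circular shift of the curve. The brute-force search over shifts below is a simplified stand-in for rotation invariance and is not RI-PDTW or its pruning techniques; the contour coordinates and function names are hypothetical.

```python
import math

def distance_curve(contour):
    """Distances from the contour's centroid to each contour point, in order."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    return [math.hypot(x - cx, y - cy) for x, y in contour]

def rotation_invariant_distance(curve_a, curve_b):
    """Minimum Euclidean distance over all circular shifts of curve_b.

    Assumes both curves have the same number of samples.
    """
    best = float("inf")
    for shift in range(len(curve_b)):
        rotated = curve_b[shift:] + curve_b[:shift]
        d = math.sqrt(sum((p - q) ** 2 for p, q in zip(curve_a, rotated)))
        best = min(best, d)
    return best

# Hypothetical contour, and the same shape rotated by 90 degrees and
# traced from a different starting point.
contour = [(2.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
rotated_contour = [(-1.0, 0.0), (0.0, -1.0), (1.0, 0.0), (0.0, 2.0)]

print(rotation_invariant_distance(distance_curve(contour),
                                  distance_curve(rotated_contour)))
```

Trying every shift costs one full comparison per contour point, which suggests why the abstract pairs rotational invariance with pruning of unnecessary candidates.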