• Title/Summary/Keyword: Pattern Similarity Retrieval

A Study on the Musical Theme Clustering for Searching Note Sequences (음렬 탐색을 위한 주제소절 자동분류에 관한 연구)

  • 심지영;김태수
    • Journal of the Korean Society for Information Management, v.19 no.3, pp.5-30, 2002
  • This paper selects classification features based on musical content, namely note sequence patterns, measures the similarity between note sequences, and then builds clusters of similar note sequences. Showing these similar note sequences alongside the search result makes searching easier for users of a CBMR system. The experimental data was the theme index of A Dictionary of Musical Themes, which focuses on classical music, obtained as kern-format files. Humdrum Toolkit version 1.0 was used to process the note sequences. Hierarchical clustering was carried out in stages on four types of similarity matrix, distinguished by whether the note sequences were segmented and by where the starting point was. To evaluate the results, the WACS standard was used in the case of manual classification; in the case of note sequences starting from any point, the distribution of common feature patterns within the resulting clusters was used. According to the results, clustering with segmented features, independent of the starting point, performed distinctly better than clustering with non-segmented features.

QP-DTW: Upgrading Dynamic Time Warping to Handle Quasi Periodic Time Series Alignment

  • Boulnemour, Imen;Boucheham, Bachir
    • Journal of Information Processing Systems, v.14 no.4, pp.851-876, 2018
  • Dynamic time warping (DTW) is the main algorithm for time series alignment. However, it is unsuitable for quasi-periodic time series. At present, apart from the recently published shape exchange algorithm (SEA) and its derivatives, no other technique can align this very complex type of time series. In this work, we propose a novel algorithm that combines the advantages of the SEA and DTW methods. Our main contribution is to raise the alignment power of DTW from the lowest level (Class A, non-periodic time series) to the highest level (Class C, multiple-period time series, each containing a different number of periods), according to the recent classification of time series alignment methods proposed by Boucheham (Int J Mach Learn Cybern, vol. 4, no. 5, pp. 537-550, 2013). The new method (quasi-periodic dynamic time warping [QP-DTW]) was compared to both the SEA and DTW methods on electrocardiogram (ECG) time series selected from the Massachusetts Institute of Technology - Beth Israel Hospital (MIT-BIH) public database and from the PTB Diagnostic ECG Database. Results show that the proposed algorithm is more effective than DTW and SEA in terms of alignment accuracy, both qualitatively and quantitatively. QP-DTW is therefore potentially more suitable for many time series applications (e.g., data mining, pattern recognition, search/retrieval, motif discovery, and classification).
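
For readers unfamiliar with the baseline that this abstract builds on, the following is a minimal sketch of classic DTW only. It is not QP-DTW and not the SEA, neither of which the abstract describes in enough detail to reproduce. The function name `dtw_distance` and the use of absolute difference as the local cost are assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences (illustrative only)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost: absolute difference
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched again with the previous b element
                cost[i][j - 1],      # b[j-1] matched again with the previous a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


if __name__ == "__main__":
    # A locally stretched copy aligns at zero cost.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because every point of one series must be matched to some point of the other, this formulation handles local stretching but not series that contain a different number of periods, which is the limitation the abstract targets.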

Similarity checking between XML tags through expanding synonym vector (유사어 벡터 확장을 통한 XML태그의 유사성 검사)

  • Lee, Jung-Won;Lee, Hye-Soo;Lee, Ki-Ho
    • Journal of KIISE: Software and Applications, v.29 no.9, pp.676-683, 2002
  • The success of XML (eXtensible Markup Language) rests primarily on its flexibility: anyone can define the structure of XML documents to represent information in whatever form they desire. XML is so flexible, however, that XML documents cannot be automatically provided with underlying semantics. Different tag sets, different names for elements or attributes, and different document structures in general hinder the precise classification and clustering of XML documents. In this paper, we design and implement a system for checking semantic similarity between XML tags. First, the system extracts the underlying semantics of tags and expands the synonym set of each tag using the WordNet thesaurus and a user-defined word library that supports the abbreviated forms and compound words found in XML tags. Second, taking into account the relative importance of XML tags within XML documents, we extend the conventional vector space model, the document model most widely used in the information retrieval field. Using this method, we were able to check the similarity between XML tags that express the same meaning with different tag names.
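
The general idea of comparing tags through synonym-expanded vectors can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the `SYNONYMS` table is a hypothetical stand-in for WordNet plus a user-defined word library, and the synonym weight of 0.5 is an arbitrary choice.

```python
import math
from collections import Counter

# Hypothetical synonym table; a real system would query a thesaurus instead.
SYNONYMS = {
    "author": {"writer", "creator"},
    "writer": {"author", "creator"},
    "price": {"cost"},
    "cost": {"price"},
}


def expand(terms, weight=0.5):
    """Build a term vector in which each term also contributes its synonyms."""
    vec = Counter()
    for term in terms:
        vec[term] += 1.0
        for synonym in SYNONYMS.get(term, ()):
            vec[synonym] += weight  # assumed: synonyms count less than the term itself
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    # Without expansion the two tags share no term; with expansion they overlap.
    print(cosine(Counter(["author"]), Counter(["writer"])))
    print(cosine(expand(["author"]), expand(["writer"])))
```

Expansion is what lets two tags with no literal term in common receive a non-zero similarity.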

Discovery Methods of Similar Web Service Operations by Learning Ontologies (온톨로지 학습에 의한 유사 웹 서비스 오퍼레이션 발견 방법)

  • Lee, Yong-Ju
    • The KIPS Transactions: Part D, v.18D no.2, pp.133-142, 2011
  • For semantic web services to be employed successfully, they must rely on high-quality ontologies. However, building such ontologies is difficult and costly, which hampers web service deployment. This study automatically builds ontologies from WSDL documents and their underlying semantics, and presents methods for discovering similar web service operations using these ontologies. The key ingredients are techniques that cluster the parameters in a collection of web services into semantically meaningful concepts and capture the hierarchical relationships between the words contained in a tag. We implement an operation retrieval system for web services. The system finds a ranked set of similar operations using a novel similarity measure and selects the operation that best satisfies the user's requirements. It can be used directly for web service composition.

Story-based Information Retrieval (스토리 기반의 정보 검색 연구)

  • You, Eun-Soon;Park, Seung-Bo
    • Journal of Intelligence and Information Systems, v.19 no.4, pp.81-96, 2013
  • Video information retrieval has become a very important issue because of the explosive increase in video data from Web content development. Content-based video analysis using visual features has been the main basis for video information retrieval and browsing. Content in video can be represented with content-based analysis techniques, which extract various features from audio-visual data such as frames, shots, colors, texture, or shape. Similarity between videos can also be measured through content-based analysis. However, a movie, which is a typical type of video data, is organized by story as well as by audio-visual data. When content-based video analysis using only low-level audio-visual data is applied to movie information retrieval, this causes a semantic gap between the significant information recognized by people and the information produced by the analysis. The reason for this gap is that the story line of a movie is high-level information, with relationships in the content that change as the movie progresses. Information retrieval related to the story line of a movie cannot be carried out by content-based analysis techniques alone. A formal model that can determine relationships among movie contents, or track changes in meaning, is needed in order to retrieve story information accurately. Recently, story-based video analysis techniques using a social network concept have emerged for story information retrieval. These approaches represent a story through the relationships between characters in a movie, but they have problems. First, they do not express dynamic changes in the relationships between characters as the story develops. Second, they miss deeper information, such as emotions indicating the identities and psychological states of the characters. Emotion is essential to understanding a character's motivation, conflict, and resolution. Third, they do not take account of the events and background that contribute to the story. Accordingly, this paper reviews the importance and weaknesses of previous video analysis methods, ranging from content-based approaches to story analysis based on social networks. We also suggest necessary elements, such as character, background, and events, based on narrative structures introduced in the literature. First, we extract characters' emotional words from the script of the movie Pretty Woman by using the hierarchical attributes of WordNet, an extensive English thesaurus that offers relationships between words (e.g., synonyms, hypernyms, hyponyms, antonyms), and we present a method to visualize the emotional pattern of a character over time. Second, a character's inner nature must be determined in order to model a character arc that can depict the character's growth and development. To this end, we analyze the amount of the character's dialogue in the script and track the character's inner nature using social network concepts such as in-degree (incoming links) and out-degree (outgoing links). Additionally, we propose a method that tracks a character's inner nature by tracing indices such as degree, in-degree, and out-degree of the character network as the movie progresses. Finally, the spatial background where characters meet and where events take place is an important element of the story. We use the movie script to extract significant spatial background and suggest a scene map describing spatial arrangements and distances in the movie. Important places where the main characters first meet, or where they stay for long periods of time, can be extracted through this scene map. In view of these three elements (character, event, background), we extract a variety of information related to the story and evaluate the performance of the proposed method. We can track the extracted story information over time and detect changes in a character's emotion or inner nature, spatial movement, and conflicts and resolutions in the story.
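
The idea of tracing in-degree and out-degree through a movie's progression can be sketched as below. This is only an illustration of the general bookkeeping, not the authors' method: the input format (a list of `(scene, speaker, listener)` tuples), the function name `degree_timeline`, and the choice to count each line of dialogue as one weighted link are all assumptions, and the sample data is invented.

```python
from collections import Counter


def degree_timeline(dialogues, character):
    """Return [(scene, in_degree, out_degree)] accumulated up to each scene.

    Each dialogue line is treated as one directed link from speaker to
    listener, so the degrees here are weighted by the amount of dialogue.
    """
    in_per_scene = Counter()
    out_per_scene = Counter()
    for scene, speaker, listener in dialogues:
        if speaker == character:
            out_per_scene[scene] += 1  # outgoing link: the character speaks
        if listener == character:
            in_per_scene[scene] += 1   # incoming link: the character is addressed

    timeline = []
    in_total = out_total = 0
    for scene in sorted({scene for scene, _, _ in dialogues}):
        in_total += in_per_scene[scene]
        out_total += out_per_scene[scene]
        timeline.append((scene, in_total, out_total))
    return timeline


if __name__ == "__main__":
    # Invented toy data: scene number, speaker, listener.
    toy = [(1, "A", "B"), (1, "B", "A"), (2, "B", "A"), (3, "A", "C")]
    for row in degree_timeline(toy, "A"):
        print(row)
```

Plotting the two running totals against the scene number gives one simple view of how central a character is, and in which direction, as the story unfolds.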

Vector Approximation Bitmap Indexing Method for High Dimensional Multimedia Database (고차원 멀티미디어 데이터 검색을 위한 벡터 근사 비트맵 색인 방법)

  • Park Joo-Hyoun;Son Dea-On;Nang Jong-Ho;Joo Bok-Gyu
    • The KIPS Transactions: Part D, v.13D no.4 s.107, pp.455-462, 2006
  • Recently, filtering approaches based on vector approximation, such as the VA-file[1] and the LPC-file[2], have been proposed to support similarity search in high-dimensional data spaces. These approaches filter out many irrelevant vectors by calculating an approximate distance from the query vector using compact approximations of the vectors in the database. The total elapsed time for similarity search is reduced because disk I/O time is cut by reading the compact approximations instead of the original vectors. However, the search time of the VA-file or LPC-file is not much lower than that of brute-force search, because calculating the approximate distance requires a lot of computation. This paper proposes a new bitmap index structure to minimize this calculation time. To speed up the calculation, the value of an object is stored as a bit pattern that represents the spatial position of its feature vector in the data space, and the distance between objects is computed with a bitwise XOR operation, which is much faster than a calculation on real-valued vectors. In the experiments, the proposed method reduced the total search time to about one fourth of the sequential search time and was up to twice as fast as the existing methods, because it greatly shortens the calculation time, although it has a longer data reading time than the existing vector approximation based approaches. This confirms that search performance can be improved further by shortening the filtering calculation time of existing vector approximation methods when the database is fast enough.
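
How an XOR can stand in for a distance calculation can be sketched as follows. The abstract does not specify the paper's bit layout, so this sketch assumes a thermometer (unary) code per dimension: a coordinate that falls in grid cell `c` is stored as `c` one-bits. Under that assumption, the number of set bits in the XOR of two codes equals the L1 distance between their cell indices. The names `to_bitmap` and `xor_distance` and the parameters `cells`, `lo`, and `hi` are hypothetical.

```python
def to_bitmap(vector, cells=4, lo=0.0, hi=1.0):
    """Pack a feature vector into one integer, using (cells - 1) bits per dimension."""
    code = 0
    for x in vector:
        # Quantize the coordinate into a grid cell index in [0, cells - 1].
        cell = min(cells - 1, max(0, int((x - lo) / (hi - lo) * cells)))
        thermometer = (1 << cell) - 1  # assumed encoding: `cell` low-order one-bits
        code = (code << (cells - 1)) | thermometer
    return code


def xor_distance(a, b):
    """Approximate distance: number of differing bits between two bitmaps."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    query = to_bitmap([0.1, 0.9])
    near = to_bitmap([0.3, 0.6])
    far = to_bitmap([0.9, 0.1])
    # The nearer vector differs from the query in fewer bits than the farther one.
    print(xor_distance(query, near), xor_distance(query, far))
```

In a filtering scheme of the kind the abstract describes, such a cheap bit-level distance would be used to discard unlikely candidates before the exact distance is computed on the remaining vectors.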