• Title/Summary/Keyword: computing matching statistics

Search Result 5, Processing Time 0.015 seconds

Appearance-Order-Based Schema Matching

  • Ding, Guohui;Cao, Keyan;Wang, Guoren;Han, Dong
    • Journal of Computing Science and Engineering
    • /
    • v.8 no.2
    • /
    • pp.94-106
    • /
    • 2014
  • Schema matching is widely used in many applications, such as data integration, ontology merging, data warehouse and dataspaces. In this paper, we propose a novel matching technique that is based on the order of attributes appearing in the schema structure of query results. The appearance order embodies the extent of the importance of an attribute for the user examining the query results. The core idea of our approach is to collect statistics about the appearance order of attributes from the query logs, to find correspondences between attributes in the schemas to be matched. As a first step, we employ a matrix to structure the statistics around the appearance order of attributes. Then, two scoring functions are considered to measure the similarity of the collected statistics. Finally, a traditional algorithm is employed to find the mapping with the highest score. Furthermore, our approach can be seen as a complementary member to the family of the existing matchers, and can also be combined with them to obtain more accurate results. We validate our approach with an experimental study, the results of which demonstrate that our approach is effective, and has good performance.

Efficient Construction of Generalized Suffix Arrays by Merging Suffix Arrays (써픽스 배열 합병을 이용한 일반화된 써픽스 배열의 효율적인 구축 알고리즘)

  • Jeon, Jeong-Eun;Park, Heejin;Kim, Dong-Kyue
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.32 no.6
    • /
    • pp.268-278
    • /
    • 2005
  • We consider constructing the generalized suffix way of strings A and B when the suffix arrays of A and B are given, j.e., merging two suffix arrays of A and B. There are efficient algorithms to merge some special suffix arrays such as the odd array and the even array. However, for the general case that A and B are arbitrary strings, no efficient merging algorithms have been developed. Thus, one had to construct the generalized suffix arrays of A and B by constructing the suffix array of A$\#$B$\$$ from scratch, even though the suffix ways of A and B are given. In this paper, we Present efficient merging algorithms for the suffix arrays of two arbitrary strings A and B drawn from constant and integer alphabets. The experimental results show that merging two suffix ways of A and B are about 5 times faster than constructing the suffix way of A$\#$B$\$$ from scratch for constant alphabets. Our algorithms include searching all suffixes of string B in the suffix array of A. To do this, we use suffix links in suffix ways and we developed efficient algorithms for computing the suffix links. Efficient computation of suffix links is another contribution of this paper because it can be used to solve other problems occurred in bioinformatics that should search all suffixes of a given string in the suffix array of another string such as computing matching statistics, finding longest common substrings, and so on. The experimental results show that our methods for computing suffix links is about 3-4 times faster than the previous fastest methods.

Motion Estimation in Video Coding using Search Candidate Point on Region by Binary-Tree Structure (이진트리 구조에 따른 구간별 탐색 후보점을 이용한 비디오 코딩의 움직임 추정)

  • Kwak, Sung-Keun
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.14 no.1
    • /
    • pp.402-410
    • /
    • 2013
  • In this paper, we propose a new fast block matching algorithm for block matching using the temporal and spatially correlation of the video sequence and local statistics of neighboring motion vectors. Since the temporal correlation of the video sequence between the motion vector of current block and the motion vector of previous block. The proposed algorithm determines the location of a better starting point for the search of an exact motion vector using the point of the smallest SAD(sum of absolute difference) value by the predicted motion vectors of neighboring blocks around the same block of the previous frame and the current frame and the predictor candidate point on each division region by binary-tree structure. Experimental results show that the proposed algorithm has the capability to dramatically reduce the search points and computing cost for motion estimation, comparing to fast FS(full search) motion estimation and other fast motion estimation.

A Design and Implementation of Music & Image Retrieval Recommendation System based on Emotion (감성기반 음악.이미지 검색 추천 시스템 설계 및 구현)

  • Kim, Tae-Yeun;Song, Byoung-Ho;Bae, Sang-Hyun
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.47 no.1
    • /
    • pp.73-79
    • /
    • 2010
  • Emotion intelligence computing is able to processing of human emotion through it's studying and adaptation. Also, Be able more efficient to interaction of human and computer. As sight and hearing, music & image is constitute of short time and continue for long. Cause to success marketing, understand-translate of humanity emotion. In this paper, Be design of check system that matched music and image by user emotion keyword(irritability, gloom, calmness, joy). Suggested system is definition by 4 stage situations. Then, Using music & image and emotion ontology to retrieval normalized music & image. Also, A sampling of image peculiarity information and similarity measurement is able to get wanted result. At the same time, Matched on one space through pared correspondence analysis and factor analysis for classify image emotion recognition information. Experimentation findings, Suggest system was show 82.4% matching rate about 4 stage emotion condition.

Direct Divergence Approximation between Probability Distributions and Its Applications in Machine Learning

  • Sugiyama, Masashi;Liu, Song;du Plessis, Marthinus Christoffel;Yamanaka, Masao;Yamada, Makoto;Suzuki, Taiji;Kanamori, Takafumi
    • Journal of Computing Science and Engineering
    • /
    • v.7 no.2
    • /
    • pp.99-111
    • /
    • 2013
  • Approximating a divergence between two probability distributions from their samples is a fundamental challenge in statistics, information theory, and machine learning. A divergence approximator can be used for various purposes, such as two-sample homogeneity testing, change-point detection, and class-balance estimation. Furthermore, an approximator of a divergence between the joint distribution and the product of marginals can be used for independence testing, which has a wide range of applications, including feature selection and extraction, clustering, object matching, independent component analysis, and causal direction estimation. In this paper, we review recent advances in divergence approximation. Our emphasis is that directly approximating the divergence without estimating probability distributions is more sensible than a naive two-step approach of first estimating probability distributions and then approximating the divergence. Furthermore, despite the overwhelming popularity of the Kullback-Leibler divergence as a divergence measure, we argue that alternatives such as the Pearson divergence, the relative Pearson divergence, and the $L^2$-distance are more useful in practice because of their computationally efficient approximability, high numerical stability, and superior robustness against outliers.