• Title/Summary/Keyword: Similarity Measures

Image Denoising via Fast and Fuzzy Non-local Means Algorithm

  • Lv, Junrui;Luo, Xuegang
    • Journal of Information Processing Systems / v.15 no.5 / pp.1108-1118 / 2019
  • The non-local means (NLM) algorithm is an effective denoising method, but it is computationally heavy. To address this, we propose a novel NLM algorithm with a fuzzy metric (FM-NLM) for image denoising. A new feature metric of visual features, based on the fuzzy metric, measures the similarity between image pixels in the presence of Gaussian noise. Similarity measures of luminance and structure information are calculated using the fuzzy metric. A smooth kernel is constructed with the proposed fuzzy metric in place of the Gaussian-weighted L2-norm kernel. The fuzzy metric and smooth kernel simplify the computation of the NLM algorithm and avoid the filter parameters. In addition, by using visual structure, the proposed FM-NLM better preserves the original undistorted image structures. The performance of the improved method is visually and quantitatively comparable with or better than that of current state-of-the-art NLM-based denoising algorithms.
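
For orientation, the baseline that the abstract contrasts itself with (classic NLM with an exponential kernel on the squared L2 patch distance) can be sketched as follows. This is not the FM-NLM method: the fuzzy metric is not specified in the abstract and is not reproduced here. The sketch works on a 1D signal for brevity, and the function name `nlm_denoise_1d` and the parameters `patch_radius`, `search_radius`, and `h` are illustrative assumptions.

```python
import math

def nlm_denoise_1d(signal, patch_radius=1, search_radius=3, h=10.0):
    """Classic non-local means on a 1D signal (baseline sketch, not FM-NLM)."""
    n = len(signal)

    def patch(i):
        # Patch centred on i; indices are clamped so edges replicate.
        return [signal[min(max(i + k, 0), n - 1)]
                for k in range(-patch_radius, patch_radius + 1)]

    out = []
    for i in range(n):
        p_i = patch(i)
        weight_sum = 0.0
        acc = 0.0
        lo = max(0, i - search_radius)
        hi = min(n, i + search_radius + 1)
        for j in range(lo, hi):
            p_j = patch(j)
            # Mean squared L2 distance between the two patches.
            d2 = sum((a - b) ** 2 for a, b in zip(p_i, p_j)) / len(p_i)
            # Exponential kernel; h is the filter parameter that FM-NLM
            # is said to avoid.
            w = math.exp(-d2 / (h * h))
            weight_sum += w
            acc += w * signal[j]
        out.append(acc / weight_sum)
    return out
```

The nested loop over every pixel and every candidate in its search window is what makes the baseline computationally heavy, and `h` is the kind of filter parameter that has to be tuned to the noise level.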

Parametric and Non Parametric Measures for Text Similarity (텍스트 유사성을 위한 파라미터 및 비 파라미터 측정)

  • Mlyahilu, John;Kim, Jong-Nam
    • Journal of the Institute of Convergence Signal Processing / v.20 no.4 / pp.193-198 / 2019
  • The wide spread of genuine and fake information on the internet has led to various studies on text analysis. Copying and pasting others' work without acknowledgement and manipulating research results without proof have been common for some time in the era of data science. Various tools have been developed to reduce, combat, and possibly eradicate plagiarism in various research fields. Text similarity measurement can be carried out manually using both parametric and non-parametric methods; this study implements cosine similarity and Pearson correlation as parametric measures and Spearman correlation as a non-parametric measure. Cosine similarity and Pearson correlation achieved the highest similarity coefficients, while Spearman correlation showed low similarity coefficients. We recommend non-parametric methods for measuring text similarity because they do not assume normality, whereas parametric methods rely on normality assumptions and are subject to bias.
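
The three measures named in this abstract can be sketched on simple term-frequency vectors. The abstract does not say how the texts were tokenized or vectorized, so the whitespace tokenizer and the helper names (`term_vectors`, `ranks`) are assumptions made only for illustration.

```python
import math
from collections import Counter

def term_vectors(text_a, text_b):
    """Aligned term-frequency vectors over the union vocabulary (assumed setup)."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    vocab = sorted(set(counts_a) | set(counts_b))
    return [counts_a[w] for w in vocab], [counts_b[w] for w in vocab]

def cosine(x, y):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))
```

Term-frequency vectors contain many tied counts (mostly zeros and ones), so the rank transform behaves quite differently from the raw counts. That is one possible reason the coefficients can diverge, although the abstract itself does not give an explanation.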

Improving The Performance of Triple Generation Based on Distant Supervision By Using Semantic Similarity (의미 유사도를 활용한 Distant Supervision 기반의 트리플 생성 성능 향상)

  • Yoon, Hee-Geun;Choi, Su Jeong;Park, Seong-Bae
    • Journal of KIISE / v.43 no.6 / pp.653-661 / 2016
  • Existing pattern-based triple generation systems based on distant supervision can be flawed by the assumption underlying distant supervision. To resolve flaws arising from this excessive assumption, previous studies have commonly used statistical information to measure the confidence of patterns. In this study, we propose a more accurate confidence measure based on the semantic similarity between patterns and properties. An unsupervised learning method, word embedding, and WordNet-based similarity measures were adopted for learning the meaning of words and measuring semantic similarity. To resolve the language mismatch between patterns and properties, we adopted CCA for aligning bilingual word embedding models and a translation-based approach for the WordNet-based measure. Our experiments indicated that the accuracy of triples filtered by the semantic similarity-based confidence measure was 16% higher than that of the statistics-based approach. These results suggest that the semantic similarity-based confidence measure is more effective than the statistics-based approach for generating high-quality triples.

Clustering-based Statistical Machine Translation Using Syntactic Structure and Word Similarity (문장구조 유사도와 단어 유사도를 이용한 클러스터링 기반의 통계기계번역)

  • Kim, Han-Kyong;Na, Hwi-Dong;Li, Jin-Ji;Lee, Jong-Hyeok
    • Journal of KIISE:Software and Applications / v.37 no.4 / pp.297-304 / 2010
  • Clustering based on sentence type or document genre is a technique used to improve the translation quality of statistical machine translation (SMT) through domain-specific translation. However, no previous research has used sentence type and document genre information simultaneously. In this paper, we propose an integrated clustering method that classifies sentence type by syntactic structure similarity and document genre by word similarity. We interpolated domain-specific models built from the clusters with general models to improve the translation quality of the SMT system. Kernel functions and cosine measures are applied to calculate structural similarity and word similarity. With these similarities, we used a machine learning algorithm similar to K-means for clustering. On a Japanese-English patent translation corpus, we obtained a 2.5% point relative improvement in translation quality in the best case.

An Artificial Intelligence Approach for Word Semantic Similarity Measure of Hindi Language

  • Younas, Farah;Nadir, Jumana;Usman, Muhammad;Khan, Muhammad Attique;Khan, Sajid Ali;Kadry, Seifedine;Nam, Yunyoung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.6 / pp.2049-2068 / 2021
  • AI combined with NLP techniques has promoted the use of virtual assistants and has made people rely on them for many diverse uses. Conversational agents are the most promising technique for assisting computer users through their operation. An important challenge in developing conversational agents globally is transferring the groundbreaking expertise obtained in English to other languages. AI is making it possible to transfer this learning. There is a dire need to develop systems that understand secular languages. One such difficult language is Hindi, which is the fourth most spoken language in the world. Semantic similarity is an important part of natural language processing, with applications such as ontology learning and information extraction, for developing conversational agents. Most research is concentrated on English and other European languages. This paper presents a corpus-based word semantic similarity measure for Hindi. An experiment involving the translation of an English benchmark dataset into Hindi is performed, investigating the incorporation of the corpus with human and machine similarity ratings. A significant correlation between human intuition and the algorithm ratings was calculated to analyze the accuracy of the proposed similarity measures. The method can be adapted to various applications of word semantic similarity or used as a module for any other language.

A Strategy for Neighborhood Selection in Collaborative Filtering-based Recommender Systems (협력 필터링 기반의 추천 시스템을 위한 이웃 선정 전략)

  • Lee, Soojung
    • Journal of KIISE / v.42 no.11 / pp.1380-1385 / 2015
  • Collaborative filtering is one of the most successfully used methods for recommender systems and has been applied in various areas such as books and music. The key point of this method is selecting the most suitable recommenders, for which various similarity measures have been studied. To improve recommendation performance, this study analyzes the problems of existing similarity-based recommender selection methods and presents a method of dynamically determining recommenders based on the rate of co-rated items as well as similarity. Experiments with varying thresholds revealed that the proposed method yielded greatly improved results in both prediction and recommendation quality, and in particular that it showed performance improvements with only a few recommenders satisfying the given thresholds.
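
The idea of selecting recommenders by both similarity and the rate of co-rated items can be sketched as below. The abstract does not define the rate or the similarity measure, so this sketch assumes Pearson correlation over co-rated items and a Jaccard-style ratio (co-rated items divided by all items rated by either user). The names `select_neighbors`, `min_sim`, and `min_rate` are hypothetical.

```python
import math

def pearson_on_common(ratings_u, ratings_v):
    """Pearson correlation computed over the items both users rated."""
    common = sorted(set(ratings_u) & set(ratings_v))
    if len(common) < 2:
        return 0.0  # too little overlap to estimate a correlation
    xs = [ratings_u[i] for i in common]
    ys = [ratings_v[i] for i in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def co_rated_rate(ratings_u, ratings_v):
    """Assumed definition: co-rated items / items rated by either user."""
    union = set(ratings_u) | set(ratings_v)
    common = set(ratings_u) & set(ratings_v)
    return len(common) / len(union) if union else 0.0

def select_neighbors(target, candidates, min_sim=0.5, min_rate=0.5):
    """Keep candidates that satisfy both thresholds (hypothetical rule)."""
    chosen = []
    for name, ratings in candidates.items():
        similar_enough = pearson_on_common(target, ratings) >= min_sim
        enough_overlap = co_rated_rate(target, ratings) >= min_rate
        if similar_enough and enough_overlap:
            chosen.append(name)
    return sorted(chosen)
```

Requiring a minimum overlap guards against a high correlation that rests on only two or three shared items, which is one of the known weaknesses of purely similarity-based neighbor selection.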

Determination of Object Similarity Closure Using Shared Neighborhood Connectivity

  • Radhakrishnan, Palanikumar;Arokiasamy, Clementking
    • Journal of the Korea Convergence Society / v.5 no.3 / pp.41-44 / 2014
  • Sequential object analysis plays a vital role in real-time computer vision and object detection applications. Measuring the similarity between two images is an important issue in any authentication activity, in particular how best to compare two independent images. Identifying similarities between two or more sequential images is also important with respect to the movement of neighboring pixels. In this study, we introduce a morphological and shared near-neighborhood concept that produces sufficient results when comparing two images containing objects. Each pixel is compared with the 8-connected pixels of the second image. Because the comparison assumes noise-free images, we apply morphological transformations such as opening and closing, using erosion and dilation. The RGB pixel values of the two sequential images are compared; if they are similar, the pixel is included in the resultant image, and otherwise it is ignored. All dissimilar pixels are identified and ignored, which yields the similarity of the two independent images. Results are reported for images with objects and gray levels, and the process produces the expected results.
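
The pixel comparison step described above can be sketched in a few lines. The abstract leaves several details open, so this sketch makes explicit assumptions: images are equal-sized 2D lists of (R, G, B) tuples, a pixel counts as similar if it matches the pixel at the same position or any of its 8-connected neighbors in the second image, and the per-channel tolerance `tol` is a hypothetical parameter. The morphological opening and closing step is omitted.

```python
def similar_pixel_mask(img_a, img_b, tol=0):
    """True where a pixel of img_a matches the same position or one of its
    8-connected neighbors in img_b (assumed reading of the abstract)."""
    rows, cols = len(img_a), len(img_a[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            pixel = img_a[r][c]
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < rows and 0 <= cc < cols):
                        continue  # neighbor falls outside the image
                    other = img_b[rr][cc]
                    if all(abs(p - q) <= tol for p, q in zip(pixel, other)):
                        mask[r][c] = True
    return mask

def similarity_ratio(mask):
    """Fraction of pixels kept in the resultant image."""
    total = sum(len(row) for row in mask)
    kept = sum(1 for row in mask for flag in row if flag)
    return kept / total if total else 0.0
```

Pixels marked `False` correspond to the dissimilar pixels that the described process ignores when building the resultant image.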

A Study on Preprocessing Method for Effective Semantic-based Similarity Measures using Approximate Matching Algorithm (의미적 유사성의 효과적 탐지를 위한 데이터 전처리 연구)

  • Kang, Hari;Jeong, Doowon;Lee, Sangjin
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.3 / pp.595-602 / 2015
  • One of the challenges of digital forensics is how to handle certain amounts of data efficiently. Although various reliable approximate matching algorithms have been presented to quickly identify similarities between digital objects, their practical effectiveness in identifying semantic similarity is low because of frequent false positives. To solve this problem, we suggest adding a pre-processing step for the approximate matching target dataset to increase matching accuracy while maintaining the reliability of the approximate matching algorithm. To verify its effectiveness, we experimented with two datasets, of eml and hwp files, using sdhash to identify semantic similarity.

The analysis of relationships between facial impressions and physical features (얼굴 인상과 물리적 특징의 관계 구조 분석)

  • 김효선;한재현
    • Korean Journal of Cognitive Science / v.14 no.4 / pp.53-63 / 2003
  • We analyzed the relationships between facial impressions and physical features, and investigated the effects of impressions on facial similarity judgments. Using 79 faces extracted from a face database, we collected impression ratings along four dimensions (mild-fierce, bright-dull, feminine-manly, and youthful-mature) and measurements of 41 physical features. Multiple regression analyses showed that the impression ratings and the feature measurements are closely connected with each other. Our experiments using facial similarity judgments confirmed the possibility that facial impressions are used in the processing of facial information. We found that people tend to perceive faces as similar when they share the same impressions rather than neutral ones, even though all of them are physically alike. These results imply that facial impressions are used as a psychological structure representing facial appearance, and that facial processing includes impression information.

Content based Video Segmentation Algorithm using Comparison of Pattern Similarity (장면의 유사도 패턴 비교를 이용한 내용기반 동영상 분할 알고리즘)

  • Won, In-Su;Cho, Ju-Hee;Na, Sang-Il;Jin, Ju-Kyong;Jeong, Jae-Hyup;Jeong, Dong-Seok
    • Journal of Korea Multimedia Society / v.14 no.10 / pp.1252-1261 / 2011
  • In this paper, we propose a pattern similarity comparison method for a video segmentation algorithm. Shot boundaries are categorized into two types, abrupt change and gradual change. Representative examples of gradual change are dissolve, fade-in, fade-out, and wipe transitions. The proposed method treats shot boundary detection as a two-class problem: we focus on whether or not a shot boundary event occurs. Defining the similarity between frames is essential for shot boundary detection. We propose two similarity measures, within similarity and between similarity. The within similarity is defined by feature comparison between frames belonging to the same shot. The between similarity is defined by feature comparison between frames belonging to different scenes. Finally, we compare the statistical patterns of the within similarity and the between similarity. Because this measure is robust to flash light and object movement, the proposed algorithm helps reduce the false positive rate. We used the color histogram and the sub-block means of the frame image as frame features. We evaluated the method experimentally on a video dataset including the TREC-2001 and TREC-2002 sets. The proposed algorithm achieved 91.84% recall and 86.43% precision in our experiments.
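
A much simplified version of histogram-based shot boundary detection, together with the recall and precision figures used in the evaluation, can be sketched as follows. This is not the within/between pattern comparison of the paper, whose details the abstract does not give. It only compares consecutive frames by histogram intersection against a fixed threshold. Frames are assumed to be flat lists of 8-bit intensity values, and `bins` and `threshold` are illustrative parameters.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalized intensity histogram of one frame (flat list of values)."""
    hist = [0] * bins
    for value in frame:
        hist[value * bins // max_value] += 1
    return [count / len(frame) for count in hist]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def detect_boundaries(frames, threshold=0.5):
    """Indices i where frame i differs strongly from frame i - 1."""
    hists = [histogram(f) for f in frames]
    return [i for i in range(1, len(hists))
            if intersection(hists[i - 1], hists[i]) < threshold]

def precision_recall(detected, truth):
    """Precision = TP / detected, recall = TP / true boundaries."""
    true_positives = len(set(detected) & set(truth))
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall
```

A fixed threshold on consecutive frames handles abrupt cuts but tends to fire on flashes and to miss gradual transitions, which is the kind of weakness the statistical comparison of within and between similarity is meant to address.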