• Title/Summary/Keyword: similarity measure

A New Similarity Measure using Fuzzy Logic for User-based Collaborative Filtering (사용자 기반의 협력필터링을 위한 퍼지 논리를 이용한 새로운 유사도 척도)

  • Lee, Soojung
    • The Journal of Korean Association of Computer Education / v.21 no.5 / pp.61-68 / 2018
  • Collaborative filtering is a fundamental technique implemented in many commercial recommender systems and provides a successful service to online users. This technique recommends items by referring to other users who have similar rating records to the current user. Hence, similarity measures critically affect the system performance. This study addresses problems of previous similarity measures and suggests a new similarity measure. The proposed measure reflects the subjectivity or vagueness of user ratings and the users' rating behavior by using fuzzy logic. We conduct experimental studies for performance evaluation, whose results show that the proposed measure demonstrates outstanding performance improvements in terms of prediction accuracy and recommendation accuracy.

Similarity Measure Between Interval-valued Vague Sets (구간값 모호집합 사이의 유사척도)

  • Cho, Sang-Yeop
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.5 / pp.603-608 / 2009
  • This paper proposes a similarity measure between interval-valued vague sets. In the interval-valued vague set representation, the upper and lower bounds of a vague set are each represented as an interval of an interval-valued fuzzy set. The proposed method combines the geometric distance with the center-of-gravity point of an interval-valued vague set to evaluate the degree of similarity between interval-valued vague sets. We also prove three properties of the proposed similarity measure. The measure provides a useful way to evaluate the degree of similarity between interval-valued vague sets.

A New Similarity Measure based on Separation of Common Ratings for Collaborative Filtering

  • Lee, Soojung
    • Journal of the Korea Society of Computer and Information / v.26 no.11 / pp.149-156 / 2021
  • Among the various implementation techniques for recommender systems, collaborative filtering selects nearest neighbors with high similarity based on past rating history and recommends the products they prefer; it has been used successfully by many commercial sites. Accurate estimation of similarity is an important factor in determining system performance. Most existing similarity measures integrate traditional similarity measures with several previously developed indices. This study proposes a similarity measure that takes a novel approach. It separates the common rating area between two users by the magnitude of the ratings, estimates similarity for each subarea, and integrates the results with weights. This makes it possible to identify similar subareas and reflect them in the final similarity value. A performance evaluation on two open datasets shows that the proposed measure outperforms the previous one in terms of prediction accuracy, rank accuracy, and mean average precision, especially on the dense dataset. The proposed similarity measure is expected to be useful in commercial systems for recommending products better suited to user preferences.
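
The abstract above describes splitting the ratings two users have in common by rating magnitude, scoring each subarea, and combining the scores with weights. The following is only a minimal sketch of that general idea, not the author's actual measure: the three-way split by the mean of the two ratings, the `bounds` thresholds, the per-subarea score (one minus the normalized mean absolute difference), and the equal default `weights` are all assumptions made for illustration.

```python
def subarea_similarity(ratings_u, ratings_v, r_min=1.0, r_max=5.0,
                       bounds=(2.5, 3.5), weights=(1.0, 1.0, 1.0)):
    """Weighted similarity over low / middle / high subareas of common ratings.

    ratings_u, ratings_v: dicts mapping item id -> rating.
    bounds, weights: hypothetical parameters, not taken from the paper.
    """
    common = set(ratings_u) & set(ratings_v)
    buckets = [[], [], []]  # absolute rating differences per subarea
    for item in common:
        a, b = ratings_u[item], ratings_v[item]
        mean = (a + b) / 2
        if mean < bounds[0]:
            idx = 0          # low-rating subarea
        elif mean <= bounds[1]:
            idx = 1          # middle subarea
        else:
            idx = 2          # high-rating subarea
        buckets[idx].append(abs(a - b))

    span = r_max - r_min
    num = den = 0.0
    for w, diffs in zip(weights, buckets):
        if not diffs:
            continue         # skip subareas with no common items
        sim = 1.0 - (sum(diffs) / len(diffs)) / span
        num += w * sim
        den += w
    return num / den if den else 0.0


u = {"a": 5, "b": 1}
v = {"a": 4, "b": 1, "c": 3}
print(subarea_similarity(u, v))
```

Raising the weight of one subarea (for example the high-rating one) would make agreement on liked items count more toward the final value; whether the paper weights subareas this way is not stated in the abstract.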

Using Fuzzy Rating Information for Collaborative Filtering-based Recommender Systems

  • Lee, Soojung
    • International journal of advanced smart convergence / v.9 no.3 / pp.42-48 / 2020
  • These days people are overwhelmed by information on the Internet, so searching for useful information becomes burdensome, and users often fail to find what they need in a reasonable time. Recommender systems are indispensable for meeting such user needs and are deployed on many commercial sites. This study proposes a novel similarity measure for user-based collaborative filtering, one of the most popular techniques for recommender systems. Compared to existing similarity measures, the suggested measure has two main advantages: it takes all the ratings given by users into account when computing similarity, thus relieving the inherent data sparsity problem, and it reflects the uncertainty or vagueness of user ratings through fuzzy logic. Extensive experiments show that the proposed measure outperforms previous relevant measures in terms of major quality metrics.

On some properties of distance measures and fuzzy entropy

  • Lee, Sang-Hyuk; Kim, Sungshin
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2002.12a / pp.9-12 / 2002
  • Representation and quantification of fuzziness are required for uncertain system modelling and controller design. Conventional results show that the entropy of a fuzzy set represents its fuzziness. In this paper, the relations among fuzzy entropy, distance measures, and similarity measures are discussed, and a distance measure is proposed. Using these relations, fuzzy entropy is expressed in terms of the newly proposed distance measure. An example with a simple fuzzy set illustrates the result.

A Study on the Unsupervised Change Detection for Hyperspectral Data Using Similarity Measure Techniques (화소간 유사도 측정 기법을 이용한 하이퍼스펙트럴 데이터의 무감독 변화탐지에 관한 연구)

  • Kim Dae-Sung; Kim Yong-Il
    • Proceedings of the Korean Society of Surveying, Geodesy, Photogrammetry, and Cartography Conference / 2006.04a / pp.243-248 / 2006
  • In this paper, we propose an unsupervised change detection algorithm that applies similarity measure techniques to hyperspectral images. General similarity measures, including Euclidean distance and spectral angle, were compared. The spectral similarity scale algorithm, which reduces the problems of those techniques, was studied and tested with Hyperion data. The thresholds for detecting changed areas were estimated with the EM (Expectation-Maximization) algorithm. The experimental results show that similarity measure techniques combined with the EM algorithm can be applied effectively to unsupervised change detection in hyperspectral data.
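
The two general measures named in the abstract above can be sketched for a single pair of pixel spectra as follows. This is an illustrative sketch only: the function names and the sample spectra are assumptions, and it does not cover the spectral similarity scale or the EM thresholding step, whose details the abstract does not give.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two spectra of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def spectral_angle(x, y):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("spectral angle is undefined for an all-zero spectrum")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_theta = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cos_theta)


# Hypothetical spectra of the same pixel at two dates; the second is twice as bright.
before = [1.0, 2.0, 3.0]
after = [2.0, 4.0, 6.0]
print(euclidean_distance(before, after))  # large: sensitive to brightness
print(spectral_angle(before, after))      # near zero: insensitive to scaling
```

The example shows why the two measures can disagree: a uniform brightness change moves the Euclidean distance but leaves the spectral angle essentially unchanged.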

Spectral clustering based on the local similarity measure of shared neighbors

  • Cao, Zongqi; Chen, Hongjia; Wang, Xiang
    • ETRI Journal / v.44 no.5 / pp.769-779 / 2022
  • Spectral clustering has become a typical and efficient clustering method used in a variety of applications. The critical step of spectral clustering is the similarity measurement, which largely determines the performance of the spectral clustering method. In this paper, we propose a novel spectral clustering algorithm based on the local similarity measure of shared neighbors. This similarity measurement exploits the local density information between data points based on the weight of the shared neighbors in a directed k-nearest neighbor graph with only one parameter k, that is, the number of nearest neighbors. Numerical experiments on synthetic and real-world datasets demonstrate that our proposed algorithm outperforms other existing spectral clustering algorithms in terms of the clustering performance measured via the normalized mutual information, clustering accuracy, and F-measure. As an example, the proposed method can provide an improvement of 15.82% in the clustering performance for the Soybean dataset.
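
As a rough illustration of a shared-neighbor similarity on a directed k-nearest neighbor graph, the sketch below scores each pair of points by the fraction of their k nearest neighbors that coincide. It is a simplified, unweighted variant written for illustration; the paper weights the shared neighbors, and that weighting scheme is not described in the abstract, so it is not reproduced here. The function names and sample points are assumptions.

```python
import math


def knn_sets(points, k):
    """For each point, the index set of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors


def shared_neighbor_similarity(points, k):
    """Symmetric matrix: fraction of k nearest neighbors two points share."""
    nn = knn_sets(points, k)
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim[i][j] = sim[j][i] = len(nn[i] & nn[j]) / k
    return sim


# Two well-separated groups of three points each.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sim = shared_neighbor_similarity(pts, k=2)
print(sim[0][1], sim[0][3])
```

A matrix like `sim` could then serve as the affinity matrix for a standard spectral clustering step; the only parameter is `k`, as in the abstract.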

Fuzzy similarity measure in Hypergraph

  • Lee, H.-Kwang
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 1998.06a / pp.549-551 / 1998
  • For a fuzzy system modeled by a fuzzy hypergraph, two fuzzy similarity measures are proposed: one for the similarity between fuzzy sets and the other for the similarity between elements in fuzzy sets. The proposed measures can represent realistic similarities that cannot be given by the existing measures. An example shows that they can be used for behavior analysis in an organization.

Improving the Performance of Document Clustering with Distributional Similarities (분포유사도를 이용한 문헌클러스터링의 성능향상에 대한 연구)

  • Lee, Jae-Yun
    • Journal of the Korean Society for Information Management / v.24 no.4 / pp.267-283 / 2007
  • In this study, measures of distributional similarity such as KL-divergence are applied to document clustering instead of the traditional cosine measure, which is the most prevalent vector similarity measure for document clustering. Three variations of KL-divergence are investigated: Jensen-Shannon divergence, symmetric skew divergence, and minimum skew divergence. To verify the contribution of distributional similarities to document clustering, two experiments are designed and carried out on three test collections. In the first experiment, the clustering performance of the three divergence measures is compared to that of the cosine measure. The results show that minimum skew divergence outperforms the other divergence measures as well as the cosine measure. In the second experiment, second-order distributional similarities are calculated with the Pearson correlation coefficient from the first-order similarity matrices. The results of the second experiment show that second-order distributional similarities improve the overall performance of document clustering. These results suggest that minimum skew divergence should be selected as the document vector similarity measure when both time and accuracy are considered, and that second-order similarity is a good choice when only clustering accuracy matters.
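
The divergences named in the abstract above can be sketched for two discrete probability distributions as follows. KL-divergence and Jensen-Shannon divergence follow their standard definitions. The skew divergence uses one common formulation, KL(p || alpha*q + (1 - alpha)*p), and the "minimum" variant is assumed here to be the smaller of the two directions; the paper's exact definitions and its value of `alpha` are not given in the abstract, so both are assumptions.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; infinite if q is zero where p is not."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue              # 0 * log(0 / q) is taken as 0
        if qi == 0:
            return math.inf
        total += pi * math.log2(pi / qi)
    return total


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by 1 bit."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def skew_divergence(p, q, alpha=0.99):
    """KL(p || alpha*q + (1-alpha)*p); smoothing q with p keeps it finite."""
    mix = [alpha * qi + (1 - alpha) * pi for pi, qi in zip(p, q)]
    return kl_divergence(p, mix)


def min_skew_divergence(p, q, alpha=0.99):
    """Assumed definition: the smaller of the two directed skew divergences."""
    return min(skew_divergence(p, q, alpha), skew_divergence(q, p, alpha))


# Two term distributions with no terms in common.
p, q = [1.0, 0.0], [0.0, 1.0]
print(kl_divergence(p, q))        # infinite
print(js_divergence(p, q))
print(min_skew_divergence(p, q))  # finite
```

The disjoint example illustrates a practical reason for preferring the smoothed variants on sparse document vectors: plain KL-divergence becomes infinite whenever one document lacks a term the other contains.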

Multi-Modal Based Malware Similarity Estimation Method (멀티모달 기반 악성코드 유사도 계산 기법)

  • Yoo, Jeong Do;Kim, Taekyu;Kim, In-sung;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology / v.29 no.2 / pp.347-363 / 2019
  • Malware has its own unique behavioral characteristics, like DNA for living things. To respond to APT (Advanced Persistent Threat) attacks in advance, these behavioral characteristics must be extracted from malware, and each malware sample must then be classified based on its behavioral similarity. In this paper, various similarities of Windows malware are estimated, and the malware family is predicted from these similarity values. The similarity measures used in this paper are 'TF-IDF cosine similarity', 'Nilsimsa similarity', 'malware function cosine similarity', and 'Jaccard similarity'. We find that the prediction rate differs widely across similarity measures. Although no single similarity measure classifies malware with high accuracy, the results can help in selecting a similarity measure for classifying a specific malware family.
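
Two of the measures listed in the abstract above, Jaccard similarity and cosine similarity, can be sketched over token sequences as shown below. This is a simplified illustration: the cosine here uses raw term counts rather than TF-IDF weights, and the API-call token lists are made-up sample data, not features from the paper.

```python
import math
from collections import Counter


def jaccard_similarity(tokens_a, tokens_b):
    """Size of the intersection over size of the union of the token sets."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(a & b) / len(a | b)


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between raw term-count vectors (no IDF weighting)."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical API-call traces from two samples.
sample_1 = ["CreateFile", "WriteFile", "RegSetValue"]
sample_2 = ["WriteFile", "RegSetValue", "Connect"]
print(jaccard_similarity(sample_1, sample_2))
print(cosine_similarity(sample_1, sample_2))
```

Jaccard ignores how often a token occurs while cosine does not, which is one reason different measures can rank the same pair of samples differently.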