• Title/Abstract/Keywords: Improved similarity

327 search results (processing time: 0.021 seconds)

Improved Collaborative Filtering Using Entropy Weighting

  • Kwon, Hyeong-Joon
    • International Journal of Advanced Culture Technology
    • /
    • Vol. 1, No. 2
    • /
    • pp.1-6
    • /
    • 2013
  • In this paper, we evaluate the performance of existing similarity metrics and propose a novel method that uses the information entropy of users' preferences to reduce the mean absolute error (MAE) in memory-based collaborative recommender systems. The proposed method adds a similarity of individual inclination to traditional similarity measures. To verify it, we experiment with various similarity metrics under different conditions, including the amount of data and significance weighting from n/10 to n/60. The results confirm that the proposed method is robust and efficient on sparse data sets when applied together with various existing similarity measures and significance weighting.
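The abstract above does not give the formula, so the following is only a minimal sketch of how an entropy term and significance weighting might be layered on a Pearson similarity. The function names, the multiplicative combination, and the `1 - |H_u - H_v| / log2(levels)` entropy similarity are assumptions made for illustration, not the paper's definition. The `threshold` parameter plays the role of the denominator in the n/10 to n/60 significance weighting.

```python
import math
from collections import Counter

def pearson(u, v):
    """Pearson correlation over co-rated items; u and v map item -> rating."""
    common = sorted(set(u) & set(v))
    n = len(common)
    if n < 2:
        return 0.0, n
    mu = sum(u[i] for i in common) / n
    mv = sum(v[i] for i in common) / n
    num = sum((u[i] - mu) * (v[i] - mv) for i in common)
    den = (math.sqrt(sum((u[i] - mu) ** 2 for i in common))
           * math.sqrt(sum((v[i] - mv) ** 2 for i in common)))
    return (num / den if den else 0.0), n

def rating_entropy(ratings):
    """Shannon entropy (bits) of a user's rating-value distribution."""
    counts = Counter(ratings.values())
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_weighted_similarity(u, v, threshold=10, levels=5):
    """Hypothetical combination: Pearson x significance weight x entropy similarity."""
    sim, n = pearson(u, v)
    # Significance weighting: devalue similarities built on few co-rated items.
    significance = min(n, threshold) / threshold
    # Assumed entropy similarity: 1 when both users spread ratings equally widely.
    entropy_sim = 1.0 - abs(rating_entropy(u) - rating_entropy(v)) / math.log2(levels)
    return sim * significance * entropy_sim
```

With only a handful of co-rated items the significance factor dominates, which is the behavior a sparse data set would exercise most.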


A Multi-Agent Improved Semantic Similarity Matching Algorithm Based on Ontology Tree

  • ;조영임
    • 제어로봇시스템학회논문지
    • /
    • Vol. 18, No. 11
    • /
    • pp.1027-1033
    • /
    • 2012
  • Semantic-based information retrieval techniques understand the meanings of the concepts that users specify in their queries. However, traditional semantic matching methods based on the ontology tree have three weaknesses that may lead to many false matches and thus lower precision. To improve the precision and recall of information retrieval, this paper proposes a multi-agent improved semantic similarity matching algorithm based on the ontology tree, which avoids considerable redundant computation and mismatching during the matching process. Experiments show that the algorithm improves precision and recall compared with information retrieval techniques based on traditional semantic similarity matching methods.

A Text Similarity Measurement Method Based on Singular Value Decomposition and Semantic Relevance

  • Li, Xu;Yao, Chunlong;Fan, Fenglong;Yu, Xiaoqiang
    • Journal of Information Processing Systems
    • /
    • Vol. 13, No. 4
    • /
    • pp.863-875
    • /
    • 2017
  • Traditional text similarity measurement methods based on word-frequency vectors ignore the semantic relationships between words, which, together with the high dimensionality and sparsity of document vectors, has become an obstacle to text similarity calculation. To address these problems, an improved singular value decomposition is used to reduce the dimensionality and remove noise from the text representation model. The optimal number of singular values is analyzed, and the semantic relevance between words is calculated in the constructed semantic space. An inverted index construction algorithm and definitions of similarity between vectors are proposed to calculate the similarity between two documents at the semantic level. Experimental results on a benchmark corpus demonstrate that the proposed method improves the F-measure.
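To make the dimensionality-reduction step concrete, here is a small sketch using plain truncated SVD from NumPy. It is not the paper's improved SVD, and it omits the inverted index. The toy term-document matrix, the function names, and the choice `k=2` are assumptions for illustration only.

```python
import numpy as np

def lsa_document_vectors(term_doc, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep the k largest singular values; return one row per document.
    return (np.diag(s[:k]) @ vt[:k]).T

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Rows: terms (car, automobile, engine, flower, petal); columns: four documents.
term_doc = np.array([
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
], dtype=float)

docs = lsa_document_vectors(term_doc, k=2)
raw_01 = cosine(term_doc[:, 0], term_doc[:, 1])   # word-frequency space
latent_01 = cosine(docs[0], docs[1])              # latent semantic space
```

Documents 0 and 1 share only the term "engine", so their word-frequency cosine is modest, while the latent space places them much closer because "car" and "automobile" co-occur in document 2.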

An Effect of Similarity Judgement on Human Performance in Inspection Tasks

  • 손일문;이동춘;이상도
    • 품질경영학회지
    • /
    • Vol. 20, No. 2
    • /
    • pp.109-117
    • /
    • 1992
  • An inspection task can largely be seen as a job divided into a series of visual search and classification subtasks. In these subtasks, an inspector must compare the visual stimuli to be inspected with standard references that are either presented in the visual environment or recalled from memory. Inspection tasks therefore demand judgements of similarity, so the inspector's ability to judge similarity and the difference in similarity between inspection materials are important factors affecting inspection performance. To analyze the effect of these factors on inspection time, this paper designs an inspection task and presents it by means of a computer simulator. In particular, the skin conductance responses (SCR) of subjects are measured to evaluate the task complexity caused by the difference in similarity between materials. The experimental results show that the more clearly similar or clearly different the materials are, the shorter the inspection time, because task complexity is reduced. In addition, when the inspector's perception of similarity between materials is consistent, the inspection time improves. In conclusion, the consistency of responses in similarity judgement can serve as a measure of performance level, and the inspection time attributable to the difference in similarity between materials must be considered when planning and scheduling inspection tasks.


Learning Discriminative Fisher Kernel for Image Retrieval

  • Wang, Bin;Li, Xiong;Liu, Yuncai
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 7, No. 3
    • /
    • pp.522-538
    • /
    • 2013
  • Content-based image retrieval has become an increasingly important research topic because of its wide application. It is highly challenging on large-scale databases with large variance. Retrieval systems rely on a key component: predefined or learned similarity measures over images. We note that similarity measures can potentially be improved if data distribution information is exploited in a more sophisticated way. In this paper, we propose a similarity measure learning approach for image retrieval. The similarity measure, the so-called Fisher kernel, is derived from the probabilistic distribution of images and is a function of the observed data, hidden variables, and model parameters. The hidden variables encode high-level information that is powerful for discrimination but was not exploited by previous methods. We further propose a discriminative learning method for the similarity measure, which encourages the learned similarity to take a large value for a pair of images with the same label and a small value for a pair of images with distinct labels. The learned similarity measure, which fully exploits the data distribution, adapts well to the dataset and would improve the retrieval system. We evaluate the proposed method on the Corel-1000, Corel5k, Caltech101, and MIRFlickr 25,000 databases. The results show the competitive performance of the proposed method.

Collaborative Filtering Algorithm Based on User-Item Attribute Preference

  • Ji, JiaQi;Chung, Yeongjee
    • Journal of information and communication convergence engineering
    • /
    • Vol. 17, No. 2
    • /
    • pp.135-141
    • /
    • 2019
  • Collaborative filtering algorithms often encounter data sparsity issues. To overcome this, auxiliary information about relevant items is analyzed and an item attribute matrix is derived. In this study, we combine the user-item attribute preference with the traditional similarity calculation method to develop an improved similarity calculation approach, using weights to control the importance of the two elements. A collaborative filtering algorithm based on user-item attribute preference is proposed. The experimental results show that the recommender system performs best when the weight of the traditional similarity equals that of the user-item attribute preference similarity. Even when the rating matrix is sparse, better recommendation results can be obtained by adding a suitable proportion of user-item attribute preference similarity. Moreover, the mean absolute error of the proposed approach is lower than that of two traditional collaborative filtering algorithms.
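The weighted combination described above can be sketched as follows. The abstract does not specify how attribute preferences are computed, so `attribute_preference` (summing a user's ratings per item attribute) and the use of cosine for both components are assumptions. The default `weight=0.5` mirrors the equal weighting the abstract reports as best.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def attribute_preference(ratings, item_attrs):
    """Assumed preference vector: a user's ratings summed per attribute.

    ratings[i] is the rating of item i (0 means unrated);
    item_attrs[i][k] is 1 if item i has attribute k, else 0.
    """
    n_attrs = len(item_attrs[0])
    return [sum(r * attrs[k] for r, attrs in zip(ratings, item_attrs))
            for k in range(n_attrs)]

def combined_similarity(ru, rv, item_attrs, weight=0.5):
    """Blend rating-based and attribute-preference similarity."""
    traditional = cosine(ru, rv)
    attribute = cosine(attribute_preference(ru, item_attrs),
                       attribute_preference(rv, item_attrs))
    return weight * traditional + (1 - weight) * attribute

# Two users with no co-rated items but a shared taste for attribute 0.
item_attrs = [[1, 0], [1, 0], [0, 1]]
sim = combined_similarity([5, 0, 0], [0, 4, 0], item_attrs)
```

In this toy case the rating-based cosine is zero because the users rated different items, yet the blended score is non-zero, which is the sparsity benefit the abstract points to.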

Gated Recurrent Unit Architecture for Context-Aware Recommendations with Improved Similarity Measures

  • Kala, K.U.;Nandhini, M.
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 14, No. 2
    • /
    • pp.538-561
    • /
    • 2020
  • Recommender systems (RecSys) play a major role in e-commerce by recommending products that each user may like, thereby improving business outcomes. Although many types of RecSys exist in the research field, state-of-the-art RecSys focus on finding user similarity through sequence analysis (e.g., purchase history or movie-watching history) and prediction techniques such as recurrent neural networks in deep learning. That is, recommendation is treated as a sequence prediction problem. However, evaluating similarities among customers is challenging when temporal aspects, context, and multi-component ratings of the item records in the customer sequences are considered. To address this issue, we propose a deep learning based model that learns customer similarity directly from sequence-to-sequence similarity as well as item-to-item similarity by considering all features of the items, contexts, and rating components, using the Dynamic Temporal Warping (DTW) distance measure for dynamic temporal matching and a 2D-GRU (two-dimensional gated recurrent unit) architecture. This overcomes the limitation of non-linearity in the time dimension when measuring similarity and finds patterns more accurately and quickly from temporal and spatial contexts. Experiments on the real-world movie data set LDOS-CoMoDa demonstrate the efficacy and promising utility of the proposed personalized RecSys architecture.
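The DTW distance named in this abstract (more commonly called dynamic time warping) can be illustrated with the textbook dynamic-programming recurrence below. This sketch works on plain numeric sequences with an absolute-difference cost. The paper's item features, contexts, rating components, and the 2D-GRU are not modeled here, and the function name is illustrative.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a[i-1] repeats a match
                                    cost[i][j - 1],      # b[j-1] repeats a match
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Same shape, different pacing: the repeated 2 is absorbed by warping.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

Because the alignment may stretch or compress either sequence, two histories with the same pattern at different speeds stay close, which a point-by-point distance would not capture.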

Learning Probabilistic Kernel from Latent Dirichlet Allocation

  • Lv, Qi;Pang, Lin;Li, Xiong
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 10, No. 6
    • /
    • pp.2527-2545
    • /
    • 2016
  • Measuring the similarity of given samples is a key problem in recognition, clustering, retrieval, and related applications. A number of approaches, such as kernel methods and metric learning, have been applied to this problem. The challenge of similarity learning is to find a similarity that is robust to intra-class variance and at the same time selective to inter-class characteristics. We observe that the similarity measure can be improved if the data distribution and hidden semantic information are exploited in a more sophisticated way. In this paper, we propose a similarity learning approach for retrieval and recognition. The approach, termed LDA-FEK, derives a free energy kernel (FEK) from Latent Dirichlet Allocation (LDA). First, it trains LDA and constructs the kernel using the parameters and variables of the trained model. Then, the unknown kernel parameters are learned by a discriminative learning approach. The main contributions of the proposed method are twofold: (1) the method is computationally efficient and scalable, since the kernel parameters are determined in stages; (2) the method exploits the data distribution and semantic-level hidden information by means of LDA. To evaluate the performance of LDA-FEK, we apply it to image retrieval on two data sets and to text categorization on four popular data sets. The results show the competitive performance of our method.

Improved Similarity Detection Algorithm of the Video Scene

  • 유주원;김종원;최종욱;배경율
    • 한국콘텐츠학회논문지
    • /
    • Vol. 9, No. 2
    • /
    • pp.43-50
    • /
    • 2009
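  • This study investigates a method for detecting similar video frame data by extracting feature data unique to video frames and generating a one-dimensional signal from the extracted feature data. To detect similarity between videos, the boundaries between similar frames are obtained first, and a representative frame is then obtained within each boundary range. Blurred frames are generated from the representative frames, and feature data are extracted using DOG values. The resulting feature data are arranged as a one-dimensional signal, and the similarity between contents is compared. In experiments, the method proved highly robust, with similarity values of 0.9 or higher even under noise addition, rotation, scaling, frame cropping, and frame removal attacks.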

Performance Analysis of Similarity Reflecting Jaccard Index for Solving Data Sparsity in Collaborative Filtering

  • 이수정
    • 컴퓨터교육학회논문지
    • /
    • Vol. 19, No. 4
    • /
    • pp.59-66
    • /
    • 2016
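  • Methods that reflect the number of co-rated items have been studied to address the data sparsity problem in collaborative filtering systems. The Jaccard index, a well-known method of this kind, has improved performance when combined with existing similarity measures. However, few studies have analyzed how much performance improves when it is combined with each of several similarity measures in various data environments, so this study aims to provide such an analysis. First, when used as a similarity measure by itself, the Jaccard index showed far better prediction performance than traditional measures on sparse datasets, and its recommendation performance was also excellent. Combining existing similarity measures with the Jaccard index generally improved their performance regardless of data characteristics. Cosine similarity in particular improved the most on sparse datasets, whereas the Mean Squared Difference similarity showed degraded prediction performance on dense datasets. Therefore, the characteristics of the data environment and the similarity measure need to be considered when combining with the Jaccard index.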
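As a rough illustration of combining the Jaccard index with an existing measure, the sketch below multiplies Jaccard by a cosine similarity computed over co-rated items. The abstract does not state how the two are combined, so the multiplicative form and all function names here are assumptions.

```python
import math

def jaccard(u, v):
    """Jaccard index of the item sets two users rated (dicts item -> rating)."""
    union = set(u) | set(v)
    return len(set(u) & set(v)) / len(union) if union else 0.0

def cosine_corated(u, v):
    """Cosine similarity restricted to items both users rated."""
    common = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in common)
    norm = (math.sqrt(sum(u[i] ** 2 for i in common))
            * math.sqrt(sum(v[i] ** 2 for i in common)))
    return dot / norm if norm else 0.0

def jaccard_cosine(u, v):
    """Assumed combination: scale cosine by the share of co-rated items."""
    return jaccard(u, v) * cosine_corated(u, v)

# One co-rated item out of three: cosine alone is maximal, Jaccard tempers it.
u = {"a": 5, "b": 3}
v = {"a": 5, "c": 1}
plain = cosine_corated(u, v)
combined = jaccard_cosine(u, v)
```

On sparse data, cosine over a single co-rated item is trivially high, so scaling by the overlap ratio gives less credit to neighbors that share little evidence.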