• Title/Summary/Keyword: Node Similarity

Community Detection using Closeness Similarity based on Common Neighbor Node Clustering Entropy

  • Jiang, Wanchang;Zhang, Xiaoxi;Zhu, Weihua
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.8 / pp.2587-2605 / 2022
  • To detect community structure in complex networks efficiently, community detection algorithms can be designed from the perspective of node similarity. However, existing algorithms require appropriate parameters to be chosen to achieve community division; furthermore, those based on the similarity of common neighbors discriminate poorly between node pairs. To solve these problems, a novel community detection algorithm using closeness similarity based on common neighbor node clustering entropy, abbreviated CSCDA, is proposed. Firstly, to improve detection accuracy, common neighbors and the clustering coefficient are combined in the form of entropy, and a new closeness similarity measure is proposed. With this similarity measure, the closeness similar node set of each node can be identified more accurately. Secondly, to reduce the randomness of the community detection result, node leadership is used, based on the closeness similar node set, to determine the most closeness-similar first-order neighbor node for merging to create the initial communities. Thirdly, to address the difficult problem of parameter selection in existing algorithms, two levels of merging are used to iteratively detect the final communities with the idea of modularity optimization. Finally, experiments show that the normalized mutual information values are increased by an average of 8.06% and 5.94% on two scales of synthetic networks and on real-world networks with real communities, and modularity is increased by an average of 0.80% on real-world networks without real communities.
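
The abstract above relies on modularity both as the merging criterion and as an evaluation index. As a minimal sketch, and not the CSCDA procedure itself, the standard Newman modularity of a given partition can be computed as follows. The function name `modularity` and the edge-list input format are assumptions made for this illustration; the graph is taken to be undirected and unweighted.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to its community label.
    Uses Q = sum over communities c of [ l_c / m - (d_c / (2m))^2 ],
    where l_c is the number of edges inside c, d_c the total degree of
    the nodes in c, and m the number of edges in the graph.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)    # l_c: edges with both ends in c
    degree_sum = defaultdict(int)  # d_c: summed degree of nodes in c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, split))  # 5/14, about 0.357
```

A merge step driven by modularity optimization would compare this value before and after joining two communities and keep the merge only if Q increases.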

A Study on the Performance of Similarity Indices and its Relationship with Link Prediction: a Two-State Random Network Case

  • Ahn, Min-Woo;Jung, Woo-Sung
    • Journal of the Korean Physical Society / v.73 no.10 / pp.1589-1595 / 2018
  • A similarity index measures the topological proximity of node pairs in a complex network. Numerous similarity indices have been defined and investigated, but the dependence of their performance on network structure has not been sufficiently investigated. In this study, we investigated the relationship between the performance of similarity indices and the structural properties of a network by employing a two-state random network. Each node in a two-state network is initially assigned one of two types, and the connection probability is determined by the states of the node pair. The performance of similarity indices is affected by the number of links and the ratio of intra-connections to inter-connections. Similarity indices have different characteristics depending on their type. Local indices perform well in small networks and do not depend on whether the structure is intra-dominant or inter-dominant. In contrast, global indices perform better in large networks, and some of them do not perform well in an inter-dominant structure. We also found that link prediction performance and the performance of similarity indices are correlated in both model networks and empirical networks. This relationship implies that link prediction performance can be used as an approximation for the performance of a similarity index when information about node type is unavailable, which may help in finding the appropriate index for a given network.
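
To make the two-state model concrete, here is a minimal sketch of a generator. It assumes, beyond what the abstract states, that states are drawn uniformly at random and that a single probability `p_intra` applies to same-state pairs and a single probability `p_inter` to different-state pairs; the function and parameter names are hypothetical.

```python
import random
from itertools import combinations


def two_state_network(n, p_intra, p_inter, seed=None):
    """Generate an undirected two-state random network.

    Each of the n nodes gets a binary state; a pair of nodes is linked
    with probability p_intra if the states match, else p_inter.
    Returns (state, edges): a dict node -> 0/1 and a list of (u, v).
    """
    rng = random.Random(seed)
    state = {i: rng.randint(0, 1) for i in range(n)}
    edges = []
    for u, v in combinations(range(n), 2):
        p = p_intra if state[u] == state[v] else p_inter
        if rng.random() < p:
            edges.append((u, v))
    return state, edges


def count_intra_inter(state, edges):
    """Return (number of intra-connections, number of inter-connections)."""
    intra = sum(1 for u, v in edges if state[u] == state[v])
    return intra, len(edges) - intra


state, edges = two_state_network(200, p_intra=0.10, p_inter=0.02, seed=0)
print(count_intra_inter(state, edges))
```

Choosing `p_intra > p_inter` gives an intra-dominant structure, and the reverse gives an inter-dominant one, which are the two regimes the abstract compares.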

Similarity Analysis of Sibling Nodes in SNOMED CT Terminology System (SNOMED CT 용어체계에서 형제 노드의 유사도 분석 기법)

  • Woo-Seok Ryu
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.1 / pp.295-300 / 2024
  • This paper discusses the incompleteness of SNOMED CT and proposes a novel metric that evaluates similarity among sibling nodes as a method to address this incompleteness. SNOMED CT encompasses an extensive range of medical terms, but it faces issues of ontology incompleteness, such as missing concepts in the hierarchy. We propose a novel metric for evaluating similarity among nodes within a node group composed of multiple sibling nodes, in order to identify missing concepts and to identify groups with low similarity. Analyzing the similarity of sibling node groups in the March 2023 international release of SNOMED CT, the average similarity of 29,199 sibling node groups, which are sub-concepts of the clinical finding concept and consist of two or more sibling nodes, was found to be 0.81. The group with the lowest similarity was associated with child concepts of poisoning, with a similarity of 0.0036.

A METHOD OF IMAGE DATA RETRIEVAL BASED ON SELF-ORGANIZING MAPS

  • Lee, Mal-Rey;Oh, Jong-Chul
    • Journal of applied mathematics & informatics / v.9 no.2 / pp.793-806 / 2002
  • Feature-based similarity retrieval has become an important research issue in image database systems. The features of image data are useful for discriminating between images. In this paper, we propose a high-speed k-Nearest Neighbor search algorithm based on Self-Organizing Maps. A Self-Organizing Map (SOM) provides a mapping from high-dimensional feature vectors onto a two-dimensional space. The mapping preserves the topology of the feature vectors, and the resulting map is called a topological feature map. A topological feature map preserves the mutual relations (similarity) in the feature space of the input data and clusters mutually similar feature vectors in neighboring nodes. Each node of the topological feature map holds a node vector and the similar images that are closest to that node vector. In a topological feature map, there are empty nodes in which no image is classified. We evaluate the performance of our algorithm using color feature vectors extracted from images. Promising results have been obtained in the experiments.

A Study on the Performance of Structured Document Retrieval Using Node Information (노드정보를 이용한 문서검색의 성능에 관한 연구)

  • Yoon, So-Young
    • Journal of the Korean Society for information Management / v.24 no.1 s.63 / pp.103-120 / 2007
  • A node is a semantic unit and a part of a structured document. Information retrieval from structured documents offers an opportunity to go below the document level in search of relevant information, making any element in a structured document a retrievable unit. Node-based document retrieval comprises several similarity calculation methods and an extended node retrieval method that uses structure information. Retrieval performance is hardly influenced by the method used for determining document similarity. The extended node method outperformed the others as a whole.

A Tree-Compare Algorithm for Similarity Evaluation (유사도 평가를 위한 트리 비교 알고리즘)

  • Kim, Young-Chul;Yoo, Chae-Woo
    • The KIPS Transactions:PartA / v.11A no.2 / pp.159-164 / 2004
  • In previous research, tree comparison methods have mostly been studied for comparing weighted or labeled trees (decorated trees). In this paper, we propose a tree comparison and similarity evaluation algorithm that can be applied to the comparison of two normal trees. The algorithm converts the two trees into node strings using an unparser, evaluates their similarity, and finally returns a similarity value from 0.0 to 1.0. In the experimental part of this paper, we visually present the matched and unmatched nodes between two trees. By using this tree similarity algorithm, we can not only evaluate the similarity between two specific programs or documents but also detect duplicated code.
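
The pipeline described above (flatten each tree into a node string, then score the two strings between 0.0 and 1.0) can be sketched as follows. This is only an illustration under stated assumptions: trees are represented as `(label, [children])` tuples, the "unparser" is approximated by a preorder traversal, and the scoring uses `difflib.SequenceMatcher` from the Python standard library as a stand-in, since the abstract does not specify the paper's own matching rule.

```python
from difflib import SequenceMatcher


def to_node_string(tree):
    """Flatten a (label, [children]) tree into a preorder list of labels."""
    label, children = tree
    out = [label]
    for child in children:
        out.extend(to_node_string(child))
    return out


def tree_similarity(a, b):
    """Return a similarity in [0.0, 1.0] between two trees.

    The score is SequenceMatcher's ratio: 2 * M / T, where M is the
    number of matched labels and T the combined length of both strings.
    """
    sa, sb = to_node_string(a), to_node_string(b)
    return SequenceMatcher(None, sa, sb, autojunk=False).ratio()


t1 = ("if", [("cond", []), ("assign", [])])
t2 = ("if", [("cond", []), ("call", [])])
print(tree_similarity(t1, t1))  # 1.0
print(tree_similarity(t1, t2))  # 2 of 3 labels match on each side: 4/6
```

A plain preorder flattening discards nesting depth, so two differently shaped trees with the same label order would score 1.0; emitting explicit open and close markers per node is one way to avoid that.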

The transmission Network clustering using a fuzzy entropy function (퍼지 엔트로피 함수를 이용한 송전 네트워크 클러스터링)

  • Jang, Se-Hwan;Kim, Jin-Ho;Lee, Sang-Hyuk;Park, Jun-Ho
    • Proceedings of the KIEE Conference / 2006.11a / pp.225-227 / 2006
  • Transmission network clustering using a fuzzy entropy function is proposed in this paper. A similarity measure can be defined through a fuzzy entropy. Each node in the transmission network system has its own values indicating the physical characteristics of that system, and the similarity measure in this paper is defined through the system-wide characteristic values at each node. However, to tackle the geometric mis-clustering problem, that is, to avoid clustering geometrically distant locations that have similar measures, locational information is incorporated into the proposed similarity measure. In this paper, a new regional clustering measure for the transmission network system is proposed and proved. The proposed measure is verified on the IEEE 39-bus system.

Semantic Similarity Search using the Signature Tree (시그니처 트리를 사용한 의미적 유사성 검색 기법)

  • Kim, Ki-Sung;Im, Dong-Hyuk;Kim, Cheol-Han;Kim, Hyoung-Joo
    • Journal of KIISE:Databases / v.34 no.6 / pp.546-553 / 2007
  • As ontologies are widely used, interest in semantic similarity search is also increasing. In this paper, we suggest a query evaluation scheme for the k-nearest neighbor query, which retrieves the k objects most similar to the query object. We use the best match method to calculate the semantic similarity between objects and use the signature tree to index the annotation information of objects in the database. The signature tree is usually used for set similarity search. When we use the signature tree in similarity search, we are required to predict the upper bound of similarity for a node, that is, the highest similarity value that can be found when we traverse into the node. We therefore suggest a prediction function for the best match similarity function and prove the correctness of the prediction. We also modify the original signature tree structure so that identical signatures are not stored redundantly. This improved structure not only reduces the size of the signature tree but also increases the efficiency of query evaluation. We use the Gene Ontology (GO) for our experiments, which provides large ontologies and a large amount of annotation data. Using GO, we show that the proposed method improves query efficiency and present several experimental results obtained by varying the page size and using several node-splitting methods.

Similarity Measure and Clustering Technique for XML Documents by a Parent-Child Matrix (부모-자식 행렬을 사용한 XML 문서 유사도 측정과 군집 기법)

  • Lee, Yun-Gu;Kim, Woosaeng
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.7 / pp.1599-1607 / 2015
  • Recently, researchers have been developing efficient techniques for accessing, querying, and managing XML documents, which are frequently used on the Internet. In this paper, we propose a parent-child matrix to cluster XML documents efficiently. A parent-child matrix analyzes both the content and structural features of an XML document. Each cell of a parent-child matrix has either the value of a node in an XML tree or the value of a child node, where a parent-child relationship exists in the XML tree. Then, the similarity between two XML documents can be measured by the similarity between the two corresponding parent-child matrices. The experiment shows that our proposed method has good performance.

Community Discovery in Weighted Networks Based on the Similarity of Common Neighbors

  • Liu, Miaomiao;Guo, Jingfeng;Chen, Jing
    • Journal of Information Processing Systems / v.15 no.5 / pp.1055-1067 / 2019
  • In view of the deficiencies of existing weighted similarity indexes, a hierarchical clustering method, initialize-expand-merge (IEM), is proposed based on the similarity of common neighbors for community discovery in weighted networks. Firstly, the similarity of a node pair is defined based on the attributes of their common neighbors. Secondly, the most closely related nodes are quickly clustered according to their similarity to form initial communities, which are then expanded. Finally, communities are merged by maximizing the modularity so as to optimize the division results. Experiments carried out on many weighted networks verify the effectiveness of the proposed algorithm. The results show that IEM is superior to weighted common neighbor (CN), weighted Adamic-Adar (AA), and weighted resource allocation (RA) when the weighted modularity is used as the evaluation index. Moreover, the proposed algorithm achieves a more reasonable community division for weighted networks than the cluster-recluster-merge algorithm (CRMA).
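
For reference, the three baseline indices named above can be sketched in their standard unweighted forms; the weighted variants used in the paper, and IEM's own similarity, are not reproduced here because the abstract does not define them. The helper names and the edge-list input are assumptions made for this illustration, and the two nodes being compared are assumed to be distinct.

```python
import math
from collections import defaultdict


def build_adjacency(edges):
    """Build a dict node -> set of neighbors from undirected (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def _shared(adj, u, v):
    # Common neighbors of two distinct nodes u and v.
    return adj.get(u, set()) & adj.get(v, set())


def common_neighbors(adj, u, v):
    """CN: the number of neighbors shared by u and v."""
    return len(_shared(adj, u, v))


def adamic_adar(adj, u, v):
    """AA: sum of 1 / log(degree) over shared neighbors.

    A shared neighbor of two distinct nodes has degree >= 2, so the
    logarithm is always positive.
    """
    return sum(1.0 / math.log(len(adj[z])) for z in _shared(adj, u, v))


def resource_allocation(adj, u, v):
    """RA: sum of 1 / degree over shared neighbors."""
    return sum(1.0 / len(adj[z]) for z in _shared(adj, u, v))


adj = build_adjacency([("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("d", "e")])
print(common_neighbors(adj, "a", "b"))     # 2 (shared neighbors c and d)
print(resource_allocation(adj, "a", "b"))  # 1/2 + 1/3
```

AA and RA both down-weight high-degree shared neighbors; RA penalizes them more strongly because it divides by the degree itself rather than its logarithm.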