• Title/Summary/Keyword: Graph Data

An Efficient Large Graph Clustering Technique based on Min-Hash (Min-Hash를 이용한 효율적인 대용량 그래프 클러스터링 기법)

  • Lee, Seok-Joo;Min, Jun-Ki
    • Journal of KIISE / v.43 no.3 / pp.380-388 / 2016
  • Graph clustering is widely used to analyze a graph and identify its properties by generating clusters of similar vertices. Recently, large graph data has been generated in diverse applications such as Social Network Services (SNS), the World Wide Web (WWW), and telephone networks. Consequently, graph clustering algorithms that process large graph data efficiently have become increasingly important. In this paper, we propose a clustering algorithm that generates clusters for large graph data efficiently. The proposed algorithm estimates similarities between clusters in graph data using Min-Hash and constructs clusters according to the computed similarities. In experiments with real-world data sets, we demonstrate the efficiency of the proposed algorithm by comparing it with existing algorithms.
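
The abstract above says that Min-Hash is used to estimate similarities, but it does not give the details. As a minimal sketch of the general Min-Hash idea only (not the authors' algorithm), the following Python code estimates the Jaccard similarity of two vertex neighbor sets. The function names, the use of neighbor sets, the BLAKE2b-based hash family, and the example data are all assumptions made for illustration.

```python
import hashlib


def hash_value(seed, item):
    """Deterministic 64-bit hash of (seed, item); each seed acts as one hash function."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=256):
    """Keep, for every hash function, the minimum hash value over the set."""
    return [min(hash_value(seed, x) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of agreeing signature positions estimates the Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity, used here only to check the estimate."""
    return len(a & b) / len(a | b)


# Hypothetical neighbor sets of two vertices u and v (integer vertex IDs).
neighbors_u = set(range(0, 80))
neighbors_v = set(range(40, 120))

sig_u = minhash_signature(neighbors_u)
sig_v = minhash_signature(neighbors_v)

print("exact    :", exact_jaccard(neighbors_u, neighbors_v))
print("estimated:", estimate_jaccard(sig_u, sig_v))
```

The signatures have a fixed length regardless of the set sizes, which is why this kind of estimate is attractive for large graphs: comparing two signatures costs time proportional to the number of hash functions, not to the number of neighbors.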

A Methodology for Searching Frequent Pattern Using Graph-Mining Technique (그래프마이닝을 활용한 빈발 패턴 탐색에 관한 연구)

  • Hong, June Seok
    • Journal of Information Technology Applications and Management / v.26 no.1 / pp.65-75 / 2019
  • As the use of the XML-based semantic web increases in the field of data management, many studies have tried to extract useful information from data stored in ontologies using association rule mining. Ontology data has the advantage that data can be expressed freely, because it has a flexible and scalable structure, unlike a conventional database with a predefined structure. On the other hand, this makes it difficult to find frequent patterns with a uniform analysis method. The goal of this study is to provide a basis for extracting useful knowledge from an ontology by searching for frequently occurring subgraph patterns, applying transaction-based graph mining techniques to the ontology schema graph data and instance graph data that constitute the ontology. To overcome the structural limitations of existing ontology mining, the frequent pattern search methodology in this study adopts the approach used in graph mining and applies it to the ontology through an iterative node chunking method. The suggested methodology is expected to play an important role in knowledge extraction.

Efficient Mining of Frequent Subgraph with Connectivity Constraint

  • Moon, Hyun-S.;Lee, Kwang-H.;Lee, Do-Heon
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.267-271 / 2005
  • The goal of data mining is to extract new and useful knowledge from large-scale datasets. As the amount of available data grows explosively, it has become vitally important to develop faster data mining algorithms for various types of data. Recently, interest in developing data mining algorithms that operate on graphs has increased. In particular, mining frequent patterns from structured data such as graphs has attracted many research groups. A graph is a highly adaptable representation scheme that is used in many domains, including chemistry, bioinformatics, and physics. For example, the chemical structure of a given substance can be modelled as an undirected labelled graph in which each node corresponds to an atom and each edge corresponds to a chemical bond between atoms. The Internet can also be modelled as a directed graph in which each node corresponds to a web site and each edge corresponds to a hypertext link between web sites. Notably, in bioinformatics, various kinds of newly discovered data such as gene regulation networks or protein interaction networks can be modelled as graphs. There have been a number of attempts to find useful knowledge in such graph-structured data. One of the most powerful analysis tools for graph-structured data is frequent subgraph analysis. Recurring patterns in graph data can provide unique insights into that data. However, finding recurring subgraphs is computationally very expensive. At the core lie two computationally challenging problems: 1) subgraph isomorphism and 2) enumeration of subgraphs. The former covers the subgraph isomorphism problem (does graph A contain graph B?) and the graph isomorphism problem (are two graphs A and B the same or not?). Even these simplified versions of the subgraph mining problem are known to be NP-complete or isomorphism-complete, and no polynomial-time algorithm is known so far. The latter is also a difficult problem: without any constraint, all $2^n$ subgraphs must be generated, where n is the number of vertices of the input graph. To find frequent subgraphs in a larger graph database, it is essential to impose appropriate constraints on the subgraphs to be found. Most current approaches focus on the frequency of a subgraph: the higher the frequency of a subgraph, the more attention it should receive. Recently, several algorithms that use level-by-level approaches to find frequent subgraphs have been developed. Some recently emerging applications suggest that other constraints, such as connectivity, could also be useful in mining subgraphs: more strongly connected parts of a graph are more informative. If the set of subgraphs to mine is restricted to more strongly connected parts, the computational complexity can be decreased significantly. In this paper, we present an efficient algorithm to mine frequent subgraphs that are more strongly connected. An experimental study shows that the algorithm scales to larger graphs with more than ten thousand vertices.
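
To make the enumeration cost concrete, the sketch below counts all vertex subsets of a small graph and then counts only those that induce a connected subgraph. It is an illustration under simplifying assumptions, not the paper's algorithm: plain connectedness stands in for the stronger connectivity constraint, which the abstract does not define, and the example graph and function names are hypothetical.

```python
from itertools import combinations


def is_connected(vertices, adjacency):
    """Depth-first search restricted to the chosen vertex subset."""
    vertices = set(vertices)
    if not vertices:
        return False  # the empty subset is not counted as a connected subgraph
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for w in adjacency[v]:
            if w in vertices and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def count_vertex_subsets(adjacency):
    """Return (number of all vertex subsets, number inducing a connected subgraph)."""
    nodes = sorted(adjacency)
    total = 0
    connected = 0
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            total += 1
            if is_connected(subset, adjacency):
                connected += 1
    return total, connected


# Hypothetical input: the path graph 0 - 1 - 2 - 3 - 4.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

total, connected = count_vertex_subsets(path_graph)
print(total, connected)  # 32 subsets in total, 15 of them connected
```

For the 5-vertex path, all $2^5 = 32$ subsets are visited, but only the 15 contiguous runs of vertices are connected. A practical miner would generate only the admissible candidates instead of filtering afterwards; the brute-force loop here is only meant to show how much a structural constraint shrinks the search space.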

Efficient Storage Management Scheme for Graph Historical Retrieval (그래프 이력 데이터 접근을 위한 효과적인 저장 관리 기법)

  • Kim, Gihoon;Kim, Ina;Choi, Dojin;Kim, Minsoo;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.18 no.2 / pp.438-449 / 2018
  • Recently, graph data have been used in various fields such as social networks and citation networks. As a graph changes dynamically over time, its historical data must be managed in order to track changes and retrieve point-in-time graphs. Most historical data changes only partially over time, so unchanged data is stored redundantly when data is stored per time unit. In this paper, we propose a graph history storage management method that minimizes the redundant storage of time graphs. The proposed method continuously detects changes in the graph and stores the overlapping subgraph in an intersection snapshot. Intersection snapshots are connected by a number of delta snapshots that maintain the changed data over time. Space efficiency is improved by collectively managing the overlapping data stored in intersection snapshots. Intersection snapshots and delta snapshots are also linked so that the graph at a given point in time can be retrieved. Various performance evaluations are performed to show the superiority of the proposed scheme.
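
The intersection/delta idea described above can be sketched as follows. This is a simplified illustration, assuming each point-in-time graph is a set of edges and that one intersection snapshot covers the whole time window; the abstract does not specify the paper's actual data layout, and all names and data below are hypothetical.

```python
def build_snapshots(graphs_by_time):
    """Split a non-empty history into one intersection snapshot and per-time deltas.

    graphs_by_time maps a timestamp to the set of edges present at that time.
    """
    edge_sets = list(graphs_by_time.values())
    intersection = set.intersection(*edge_sets)  # edges shared by every timestamp
    deltas = {t: edges - intersection for t, edges in graphs_by_time.items()}
    return intersection, deltas


def graph_at(t, intersection, deltas):
    """Rebuild the point-in-time graph from the shared part and its delta."""
    return intersection | deltas[t]


# Hypothetical history of a small graph at three timestamps.
history = {
    1: {("a", "b"), ("b", "c"), ("c", "d")},
    2: {("a", "b"), ("b", "c"), ("c", "e")},
    3: {("a", "b"), ("b", "c"), ("c", "e"), ("e", "f")},
}

intersection, deltas = build_snapshots(history)

naive_edges = sum(len(g) for g in history.values())
stored_edges = len(intersection) + sum(len(d) for d in deltas.values())
print(naive_edges, stored_edges)  # 10 edges stored naively, 6 with snapshots
```

In this toy history the two edges shared by all timestamps are stored once instead of three times, which is the source of the space saving the abstract refers to.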

A Path Partitioning Technique for Indexing XML Data (XML 데이타 색인을 위한 경로 분할 기법)

  • 김종익;김형주
    • Journal of KIISE:Databases / v.31 no.3 / pp.320-330 / 2004
  • Query languages for XML use paths in a data graph to represent queries; paths are a basic building block of an XML query. Users can write more expressive queries by using patterns (e.g., regular expressions) for paths. Because of the nature of semi-structured data, a data graph contains many identical paths. Current research on indexing XML exploits these identical paths, but such an index can grow larger than the source data graph and cannot guarantee an efficient access path. In this paper, we propose a partitioning technique that can partition all the paths in a data graph. We develop an index graph that efficiently finds the appropriate partitions for a path query. The size of our index graph can be adjusted regardless of the source data, so the cost of index graph traversals can be reduced significantly. In the performance study, we show that our index is much faster than other graph-based indexes.

Using an educational software Graphers in elementary school mathematics (초등 수학 수업에서의 소프트웨어(Graphers) 활용)

  • 황혜정
    • School Mathematics / v.1 no.2 / pp.555-569 / 1999
  • The graph unit (chapter) is a good example of a topic in elementary school mathematics for which computer use can be incorporated into instruction. Teaching graphs can be facilitated by the graphing utilities of computers, which make it possible to observe the properties of many types of graphs. This study was concerned with using the educational software Graphers as an instructional tool to help young students gain a better understanding of graph concepts. For this purpose, three types of instructional activities using Graphers are presented in the paper. Graphers is a data-gathering tool for creating pictorial data chosen from several data sets. Students can represent their data in a table or with six types of graphs: Pictograph, Bar Graph, Line Graph, Circle Graph, Grid Plot, and Loops. Graphers helps students select the graph(s) most appropriate for analyzing the data while comparing various types of graphs. It also lets them modify graphs, for example by adding grid lines, changing the axis scale, or adding a title and labels. Eventually, students have a chance to interpret graphs meaningfully and in their own way.

Knowledge Recommendation Based on Dual Channel Hypergraph Convolution

  • Yue Li
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.11 / pp.2903-2923 / 2023
  • Knowledge recommendation is a type of recommendation system that recommends knowledge content to users in order to satisfy their needs. Although using graph neural networks to extract data features is an effective way to solve the recommendation problem, information is lost when modeling real-world problems because an edge in a graph structure can only be associated with two nodes. Because a single super-edge in a hypergraph structure can connect several nodes, and because knowledge graphs are effective for expressing knowledge, a dual-channel hypergraph convolutional neural network model (DCHC) based on the hypergraph structure and the knowledge graph is proposed. The model divides user data and knowledge data into a user subhypergraph and a knowledge subhypergraph, respectively. It extracts user data features by dual-channel hypergraph convolution and knowledge data features by combining them with knowledge graph technology, and finally generates recommendation results based on the obtained user embedding and knowledge embedding. The DCHC model outperforms the comparison models on the AUC and F1 evaluation metrics, and comparative experiments with the baseline also demonstrate its validity.
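
The information loss mentioned above, where an ordinary edge can join only two nodes, can be illustrated with a small sketch. It is not part of the DCHC model: the function and the example data are hypothetical, and the code only shows that two different hypergraphs can collapse to the same ordinary graph once each super-edge (hyperedge) is replaced by pairwise edges.

```python
from itertools import combinations


def pairwise_projection(hyperedges):
    """Replace each hyperedge by all two-node edges among its members."""
    edges = set()
    for members in hyperedges:
        for u, v in combinations(sorted(members), 2):
            edges.add((u, v))
    return edges


# Hypothetical data: three users sharing one knowledge item together ...
one_group = [{"u1", "u2", "u3"}]
# ... versus three separate pairwise interactions.
three_pairs = [{"u1", "u2"}, {"u2", "u3"}, {"u1", "u3"}]

print(pairwise_projection(one_group) == pairwise_projection(three_pairs))  # True
```

Both inputs project to the same triangle, so a model that sees only the ordinary graph cannot tell a single three-way relation from three two-way relations, whereas the hypergraph keeps them distinct.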

The Status Quo of Graph Databases in Construction Research

  • Jeon, Kahyun;Lee, Ghang
    • International conference on construction engineering and project management / 2022.06a / pp.800-807 / 2022
  • This study reviews the use of graph databases in construction research and, based on a diagnosis of the current research status, proposes a future research direction. The use of graph databases in construction research has been increasing because they express complex relations between entities in construction big data efficiently. However, no study has systematically reviewed the status quo of graph databases. This study analyzes, both quantitatively and qualitatively, a total of 42 papers that deployed a graph model and graph database in construction research. A keyword analysis, topic modeling, and qualitative content analysis were conducted. The review identified the research topics, the types of data sources that compose a graph, and the graph database application methods and algorithms. Although current research is still at a nascent stage, graph database research has great potential to develop into an advanced stage, fused with artificial intelligence (AI), based on the active usage trends this study revealed.

Structural Analysis and Performance Test of Graph Databases using Relational Data (관계형데이터를 이용한 그래프 데이터베이스의 모델별 구조 분석과 쿼리 성능 비교 연구)

  • Bae, Suk Min;Kim, Jin Hyung;Yoo, Jae Min;Yang, Seong Ryul;Jung, Jai Jin
    • Journal of Korea Multimedia Society / v.22 no.9 / pp.1036-1045 / 2019
  • Relational databases have a notion of normalization, in which the model for storing data is standardized according to the organization's business processes or data operations. Graph databases, however, are at a relatively early stage of such standardization and allow a high degree of freedom in modeling, so different designers can create various models from the same data. The graph database has two essential aspects. First, it allows relationships between objects to be accessed semantically. Second, it treats relationships between entities as being as important as the individual data. These aspects increase the degree of freedom in modeling and give modelers a more creative system. This paper introduces different graph models with test data and compares their query performance, measured by the response time of query executions per graph model, to find out how the efficiency of each model can be maximized.

GOMS: Large-scale ontology management system using graph databases

  • Lee, Chun-Hee;Kang, Dong-oh
    • ETRI Journal / v.44 no.5 / pp.780-793 / 2022
  • Large-scale ontology management is one of the main issues when using ontology data in practice. Although many approaches based on relational database management systems (RDBMSs) or object-oriented DBMSs (OODBMSs) have been proposed for developing large-scale ontology management systems, they have several limitations because ontology data structures are intrinsically different from the traditional data structures in RDBMSs or OODBMSs. In addition, users have difficulty using ontology data because many terms (ontology nodes) in large-scale ontology data match a given string keyword. Therefore, in this study, we propose a graph database-based ontology management system (GOMS) to manage large-scale ontology data efficiently. GOMS uses a graph DBMS and provides new query templates to help users find key concepts or instances. Furthermore, to run queries with multiple joins and path conditions efficiently, we propose GOMS encoding as a filtering tool and develop hash-based join processing algorithms in the graph DBMS. Finally, we experimentally show that GOMS can process various types of queries efficiently.