• Title/Summary/Keyword: Graph Databases

The Status Quo of Graph Databases in Construction Research

  • Jeon, Kahyun;Lee, Ghang
    • International conference on construction engineering and project management / 2022.06a / pp.800-807 / 2022
  • This study reviews the use of graph databases in construction research and, based on a diagnosis of the current research status, proposes a future research direction. The use of graph databases in construction research has been increasing because they efficiently express complex relations between entities in construction big data. However, no study has systematically reviewed the status quo of graph databases in this field. This study analyzes, both quantitatively and qualitatively, a total of 42 papers that deployed a graph model and graph database in construction research. A keyword analysis, topic modeling, and qualitative content analysis were conducted. The review identified the research topics, the types of data sources that compose a graph, and the graph database application methods and algorithms. Although current research is still at a nascent stage, the active usage trends revealed by this study suggest that graph database research has great potential to develop into an advanced stage, fused with artificial intelligence (AI).

Graph Database Benchmarking Systems Supporting Diversity (다양성을 지원하는 그래프 데이터베이스 벤치마킹 시스템)

  • Choi, Do-Jin;Baek, Yeon-Hee;Lee, So-Min;Kim, Yun-A;Kim, Nam-Young;Choi, Jae-Young;Lee, Hyeon-Byeong;Lim, Jong-Tae;Bok, Kyoung-Soo;Song, Seok-Il;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.21 no.12 / pp.84-94 / 2021
  • Graph databases have been developed to efficiently store and query graph data, which is composed of vertices and edges that express relationships between objects. Since the query types of graph databases show very different characteristics from those of traditional NoSQL databases, benchmarking tools suited to graph databases are needed to verify their performance. In this paper, we propose an efficient graph database benchmarking system that supports diversity in graph inputs and queries. The proposed system uses OrientDB to conduct benchmarking for graph databases. To support the diversity of input graphs and query graphs, we use LDBC, an existing graph data generation tool. We demonstrate the feasibility and effectiveness of the proposed scheme through an analysis of benchmarking results. The performance evaluation shows that the proposed system can generate customizable synthetic graph data and that benchmarking can be performed on the generated graph data.

Application Plan of Graph Databases in the Big Data Environment (빅데이터환경에서의 그래프데이터베이스 활용방안)

  • Park, Sungbum;Lee, Sangwon;Ahn, Hyunsup;Jung, In-Hwan
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2013.10a / pp.247-249 / 2013
  • Even though relational databases have been widely used in many enterprises, they do not manage the relations among entities effectively and efficiently. To analyze Big Data, it is essential to express the various relations among entities in a graphical form. In this paper, we define graph databases and their structure. We then examine their characteristics, such as transactions, consistency, availability, retrieval functions, and expandability. We also identify subjects that are appropriate or inappropriate for the application of graph databases.

Use of Graph Database for the Integration of Heterogeneous Biological Data

  • Yoon, Byoung-Ha;Kim, Seon-Kyu;Kim, Seon-Young
    • Genomics & Informatics / v.15 no.1 / pp.19-27 / 2017
  • Understanding complex relationships among heterogeneous biological data is one of the fundamental goals in biology. In most cases, diverse biological data are stored in relational databases, such as MySQL and Oracle, which store data in multiple tables and then infer relationships by multiple-join statements. Recently, a new type of database, called the graph-based database, was developed to natively represent various kinds of complex relationships, and it is widely used among computer science communities and IT industries. Here, we demonstrate the feasibility of using a graph-based database for complex biological relationships by comparing the performance between MySQL and Neo4j, one of the most widely used graph databases. We collected various biological data (protein-protein interaction, drug-target, gene-disease, etc.) from several existing sources, removed duplicate and redundant data, and finally constructed a graph database containing 114,550 nodes and 82,674,321 relationships. When we tested the query execution performance of MySQL versus Neo4j, we found that Neo4j outperformed MySQL in all cases. While Neo4j exhibited a very fast response for various queries, MySQL exhibited latent or unfinished responses for complex queries with multiple-join statements. These results show that using graph-based databases, such as Neo4j, is an efficient way to store complex biological relationships. Moreover, querying a graph database in diverse ways has the potential to reveal novel relationships among heterogeneous biological data.

Structural Analysis and Performance Test of Graph Databases using Relational Data (관계형데이터를 이용한 그래프 데이터베이스의 모델별 구조 분석과 쿼리 성능 비교 연구)

  • Bae, Suk Min;Kim, Jin Hyung;Yoo, Jae Min;Yang, Seong Ryul;Jung, Jai Jin
    • Journal of Korea Multimedia Society / v.22 no.9 / pp.1036-1045 / 2019
  • Relational databases have a notion of normalization, in which the model for storing data is standardized according to the organization's business processes or data operations. The graph database, however, is at a relatively early stage of such standardization and allows a high degree of freedom in modeling. Therefore, various models can be created from the same data, depending on the database designer. The essence of the graph database lies in two aspects. First, the graph database allows relationships between objects to be accessed semantically. Second, it makes relationships between entities as important as individual data. These aspects increase the degree of freedom in modeling and give model developers a more creative system. This paper introduces different graph models built from test data. It compares query performance, measured as the response time of query executions per graph model, to find out how the efficiency of each model can be maximized.

Combining Local and Global Features to Reduce 2-Hop Label Size of Directed Acyclic Graphs

  • Ahn, Jinhyun;Im, Dong-Hyuk
    • Journal of Information Processing Systems / v.16 no.1 / pp.201-209 / 2020
  • The graph data structure is popular because it can intuitively represent real-world knowledge. Graph databases have attracted attention in academia and industry because they can be used to maintain graph data and allow users to mine knowledge. Mining reachability relationships between two nodes in a graph, termed reachability query processing, is an important functionality of graph databases. Online traversals, such as the breadth-first and depth-first search, are inefficient in processing reachability queries when dealing with large-scale graphs. Labeling schemes have been proposed to overcome these disadvantages. The state-of-the-art is the 2-hop labeling scheme: each node has in and out labels containing reachable node IDs as integers. Unfortunately, existing 2-hop labeling schemes generate huge 2-hop label sizes because they only consider local features, such as degrees. In this paper, we propose a more efficient 2-hop label size reduction approach. We consider the topological sort index, which is a global feature. A linear combination is suggested for utilizing both local and global features. We conduct experiments over real-world and synthetic directed acyclic graph datasets and show that the proposed approach generates smaller labels than existing approaches.
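
The 2-hop idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: it uses a generic pruned labeling construction, and the hub ordering (degree only, a local feature) is an assumption chosen for simplicity. The function names `build_2hop_labels`, `is_reachable`, and `label_size` are hypothetical. A node u reaches v exactly when the out-label of u and the in-label of v share a node ID.

```python
from collections import defaultdict, deque


def build_2hop_labels(edges):
    """Build in/out labels for a directed graph given as (source, target) pairs."""
    succ, pred = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
        nodes.update((u, v))

    # Assumption: rank hubs by total degree (a local feature), ties by node ID.
    # The paper proposes combining this with a global feature instead.
    order = sorted(nodes, key=lambda n: (-(len(succ[n]) + len(pred[n])), n))

    # Every node starts with itself in both labels.
    l_in = {n: {n} for n in nodes}
    l_out = {n: {n} for n in nodes}

    def covered(u, v):
        return not l_out[u].isdisjoint(l_in[v])

    def spread(hub, neighbours, forward):
        # Breadth-first search from the hub; stop expanding at any node whose
        # reachability to/from the hub is already answered by existing labels.
        seen, queue = {hub}, deque([hub])
        while queue:
            x = queue.popleft()
            if x != hub:
                if covered(hub, x) if forward else covered(x, hub):
                    continue  # pruned: an earlier hub already covers this pair
                (l_in if forward else l_out)[x].add(hub)
            for y in neighbours[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)

    for hub in order:
        spread(hub, succ, forward=True)   # hub reaches x: record hub in l_in[x]
        spread(hub, pred, forward=False)  # x reaches hub: record hub in l_out[x]
    return l_in, l_out


def is_reachable(l_in, l_out, u, v):
    """2-hop query: u reaches v iff the two labels share at least one node."""
    return not l_out[u].isdisjoint(l_in[v])


def label_size(l_in, l_out):
    """Total number of label entries, the quantity a better ordering reduces."""
    return sum(len(s) for s in l_in.values()) + sum(len(s) for s in l_out.values())
```

Swapping the sort key in `order` for a different scoring function is where a combination of local and global features would plug in; the total reported by `label_size` then shows how the ordering affects label size.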

A Metabolic Pathway Drawing Algorithm for Reducing the Number of Edge Crossings

  • Song Eun-Ha;Kim Min-Kyung;Lee Sang-Ho
    • Genomics & Informatics / v.4 no.3 / pp.118-124 / 2006
  • To make flow directly understandable, pathway data are usually represented as directed graphs in biological journals and texts. Databases of metabolic pathways or signal transduction pathways inevitably contain these kinds of graphs to show the flow. KEGG, one of the representative pathway databases, uses manually drawn figures, which cannot be easily maintained. Graph layout algorithms are applied for visualizing metabolic pathways in some databases, such as EcoCyc. Although these can reflect any changes in the data in real time, the number of edge crossings increases exponentially as the number of nodes increases. For understanding the genome-scale flow of metabolism, it is very important to reduce the unnecessary edge crossings that exist in automatic graph layouts. We propose a metabolic pathway drawing algorithm that reduces the number of edge crossings by exploiting the fact that a metabolic pathway graph is a scale-free network. The experimental results show that taking the scale-free property into account reduces the number of edge crossings by about 37 to 40% compared with not taking it into account. We also found that an increase in nodes does not always mean an increase in edge crossings.

Graph Database Design and Implementation for Ransomware Detection (랜섬웨어 탐지를 위한 그래프 데이터베이스 설계 및 구현)

  • Choi, Do-Hyeon
    • Journal of Convergence for Information Technology / v.11 no.6 / pp.24-32 / 2021
  • Recently, ransomware has been spreading through various channels such as e-mail, phishing, and device hacking, and the extent of the damage is increasing rapidly. However, existing known-malware analysis engines (static/dynamic) have great difficulty detecting and blocking novel ransomware that has evolved like Advanced Persistent Threat (APT) attacks. This work proposes a method for modeling malicious ransomware behavior based on graph databases and detecting novel, multi-complex malicious behavior. The study confirms that pattern detection of ransomware is possible in a graph database environment, which differs from existing relational databases. Furthermore, we show that the associative analysis technique of graph theory is significantly efficient for ransomware analysis performance.

A Horizontal Partition of the Object-Oriented Database for Efficient Clustering

  • Chung, Chin-Wan;Kim, Chang-Ryong;Lee, Ju-Hong
    • Journal of Electrical Engineering and Information Science / v.1 no.1 / pp.164-172 / 1996
  • The partitioning of related objects should be performed before clustering for efficient access in object-oriented databases. In this paper, a horizontal partition of related objects in object-oriented databases is presented. All subclass nodes in a class inheritance hierarchy of a schema graph are shrunk to a class node, yielding a graph called the condensed schema graph, because the aggregation hierarchy has more influence on the partition than the class inheritance hierarchy. A set function and an accessibility function are defined to find a maximal subset of related objects among the set of objects in a class. A set function maps a subset of the domain class objects to a subset of the range class objects. An accessibility function maps a subset of the objects of a class into a subset of the objects of the same class through a composition of set functions. The algorithm derived in this paper finds the related objects of a condensed schema graph using accessibility functions and set functions. The existence of a maximal subset of the related objects in a class is proved to show the validity of the partition algorithm using the accessibility function.

Contribution to Improve Database Classification Algorithms for Multi-Database Mining

  • Miloudi, Salim;Rahal, Sid Ahmed;Khiat, Salim
    • Journal of Information Processing Systems / v.14 no.3 / pp.709-726 / 2018
  • Database classification is an important preprocessing step for multi-database mining (MDM). In fact, when a multi-branch company needs to explore its distributed data for decision making, it is imperative to classify these multiple databases into similar clusters before analyzing the data. To search for the best classification of a set of n databases, existing algorithms generate from 1 to $(n^2-n)/2$ candidate classifications. Although each candidate classification is included in the next one (i.e., clusters in the current classification are subsets of clusters in the next classification), existing algorithms generate each classification independently, that is, without taking into account the use of clusters from the previous classification. Consequently, existing algorithms are time consuming, especially when the number of candidate classifications increases. To overcome the latter problem, we propose in this paper an efficient approach that represents the problem of classifying the multiple databases as a problem of identifying the connected components of an undirected weighted graph. Theoretical analysis and experiments on public databases confirm the efficiency of our algorithm against existing works and that it overcomes the problem of increase in the execution time.
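
The connected-components formulation in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' algorithm: databases are numbered 0 to n-1, each edge weight is taken to be a pairwise similarity score, and a hypothetical `threshold` decides which edges count. The function name `classify_databases` and the input layout are also hypothetical. Components are found with a union-find structure.

```python
def classify_databases(n, similarity, threshold):
    """Group n databases into clusters of connected components.

    similarity: dict mapping an (i, j) pair to an assumed similarity score.
    Two databases are linked when their score is at least `threshold`.
    """
    parent = list(range(n))

    def find(x):
        # Find the representative of x, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), score in similarity.items():
        if score >= threshold:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i  # merge the two components

    clusters = {}
    for db in range(n):
        clusters.setdefault(find(db), []).append(db)
    return sorted(clusters.values())


if __name__ == "__main__":
    scores = {(0, 1): 0.9, (1, 2): 0.6, (2, 3): 0.2, (0, 3): 0.1}
    for level in (0.8, 0.5, 0.1):
        print(level, classify_databases(4, scores, level))
```

Lowering the threshold only ever merges clusters, which mirrors the abstract's remark that each candidate classification is included in the next one.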