• Title/Summary/Keyword: Graph Databases

Graph Database Solution for Higher Order Spatial Statistics in the Era of Big Data

  • Sabiu, Cristiano G.;Kim, Juhan
    • The Bulletin of The Korean Astronomical Society / v.44 no.1 / pp.79.1-79.1 / 2019
  • We present an algorithm for the fast computation of the general N-point spatial correlation functions of any discrete point set embedded within a Euclidean space ${\mathbb{R}}^n$. Utilizing the concepts of kd-trees and graph databases, we describe how to count all possible N-tuples in binned configurations within a given length scale, e.g. all pairs of points or all triplets of points with side lengths $< r_{\max}$. Through benchmarking we show the computational advantage of our new graph-based algorithm over more traditional methods. We show that all 3-point configurations up to and beyond the Baryon Acoustic Oscillation scale (~200 Mpc in physical units) can be computed on current Sloan Digital Sky Survey (SDSS) data in reasonable time. Finally, we present the first measurements of the 4-point correlation function of ~0.5 million SDSS galaxies over the redshift range $0.43 < z < 0.7$. We present the publicly available code GRAMSCI (GRAph Made Statistics for Cosmological Information; bitbucket.org/csabiu/gramsci), released under a GNU General Public License.
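
The counting idea in this abstract can be illustrated with a minimal sketch. It is not the GRAMSCI implementation: it assumes a brute-force neighbour search in place of the kd-tree, skips the binning of side lengths, and uses made-up function names and sample points. Points closer than `r_max` are linked in a neighbour graph, so pairs become edges and triplets with all sides below `r_max` become triangles.

```python
import itertools
import math

def build_neighbor_graph(points, r_max):
    """Link every pair of points closer than r_max (brute force, O(n^2))."""
    graph = {i: set() for i in range(len(points))}
    for i, j in itertools.combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) < r_max:
            graph[i].add(j)
            graph[j].add(i)
    return graph

def count_triplets(graph):
    """Count triplets whose three sides are all shorter than r_max (triangles)."""
    total = 0
    for i, neighbors in graph.items():
        for j in neighbors:
            if j > i:
                # Common neighbours k > j close the triangle (i, j, k) exactly once.
                total += sum(1 for k in neighbors & graph[j] if k > j)
    return total

# Hypothetical sample: three nearby points and one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
graph = build_neighbor_graph(points, r_max=1.5)
n_pairs = sum(len(nbrs) for nbrs in graph.values()) // 2
n_triplets = count_triplets(graph)
print(n_pairs, n_triplets)
```

Once the neighbour graph exists, higher-order tuples are found by graph traversal alone, with no further distance computations for points that are not already linked.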

Representation and Implementation of Graph Algorithms based on Relational Database (관계형 데이타베이스에 기반한 그래프 알고리즘의 표현과 구현)

  • Park, Hyu-Chan
    • Journal of KIISE:Databases / v.29 no.5 / pp.347-357 / 2002
  • Graphs provide a powerful methodology for solving many real-world problems, and many graph representations and algorithms have therefore been proposed. However, because most of them consider only memory-based graphs, they remain difficult to apply to large-scale problems. To cope with these difficulties, this paper proposes a graph representation and graph algorithms based on well-developed relational database theory. Graphs are represented in the form of relations, which can be visualized as relational tables. Each vertex and edge of a graph is represented as a tuple in the tables. Graph algorithms are also defined in terms of relational algebraic operations such as projection, selection, and join. They can be implemented with a database language such as SQL. We also developed a library of basic graph operations for the management of graphs and the development of graph applications. This database approach provides an efficient methodology for dealing with very large-scale graphs, and the graph library supports the development of graph applications. Furthermore, it has many advantages, such as concurrent graph sharing among users by virtue of the capabilities of the database.
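
As a rough illustration of the relational representation described above, the sketch below stores vertices and edges as tuples in two tables and expresses simple graph operations as selection, projection, and join. The table and column names are assumptions, not the paper's library, and the recursive query is a convenience of SQLite rather than something the abstract mentions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vertex (id INTEGER PRIMARY KEY);
    CREATE TABLE edge (src INTEGER, dst INTEGER);
    INSERT INTO vertex (id) VALUES (1), (2), (3), (4), (5);
    INSERT INTO edge (src, dst) VALUES (1, 2), (2, 3), (3, 4);
""")

# Selection + projection: direct successors of vertex 1.
successors = [row[0] for row in conn.execute(
    "SELECT dst FROM edge WHERE src = ? ORDER BY dst", (1,))]

# Self-join: vertices exactly two edges away from vertex 1.
two_hops = [row[0] for row in conn.execute(
    """SELECT DISTINCT e2.dst
       FROM edge AS e1 JOIN edge AS e2 ON e1.dst = e2.src
       WHERE e1.src = ? ORDER BY e2.dst""", (1,))]

# Repeated join (recursive CTE): every vertex reachable from vertex 1.
reachable = [row[0] for row in conn.execute(
    """WITH RECURSIVE reach(id) AS (
           SELECT ?
           UNION
           SELECT edge.dst FROM edge JOIN reach ON edge.src = reach.id
       )
       SELECT id FROM reach ORDER BY id""", (1,))]
conn.close()
print(successors, two_hops, reachable)
```

Because the graph lives in ordinary tables, it is not limited by main memory and can be shared among concurrent users through the database.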

User Interaction-based Graph Query Formulation and Processing (사용자 상호작용에 기반한 그래프질의 생성 및 처리)

  • Jung, Sung-Jae;Kim, Taehong;Lee, Seungwoo;Lee, Hwasik;Jung, Hanmin
    • Journal of KIISE:Databases / v.41 no.4 / pp.242-248 / 2014
  • With the rapidly growing amount of information represented in RDF format, efficient querying of RDF graphs has become a fundamental challenge. SPARQL is one of the most widely used query languages for retrieving information from RDF datasets. SPARQL is not only simple in its syntax but also powerful in its representation of graph pattern queries. However, users need to make considerable effort to understand the ontology schema of a dataset in order to compose a relevant SPARQL query. In this paper, we propose a graph query formulation and processing scheme based on ontology schema information, which can be obtained by summarizing the RDF graph. With the proposed querying scheme, a user can interactively formulate graph queries in a graphical user interface without having to understand the ontology schema or even learn SPARQL syntax. The graph query formulated by a user is transformed into a set of class paths, which are stored in a relational database and used as constraints for search space reduction when the relational database executes the graph search operation. By executing LUBM queries 2, 8, and 9 over LUBM (10,0), it is shown that the proposed querying scheme returns the complete result set.
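
One plausible reading of "summarizing the RDF graph" is to collapse instance-level triples into class-level edges, which a user interface could then offer as building blocks for queries. The sketch below follows that reading only; the `ex:` identifiers, the tuple encoding of triples, and the `summarize` function are hypothetical and are not taken from the paper.

```python
RDF_TYPE = "rdf:type"

# Hypothetical instance data, loosely modelled on a university domain.
triples = [
    ("ex:alice", RDF_TYPE, "ex:Student"),
    ("ex:bob", RDF_TYPE, "ex:Professor"),
    ("ex:cs", RDF_TYPE, "ex:Department"),
    ("ex:alice", "ex:advisor", "ex:bob"),
    ("ex:bob", "ex:worksFor", "ex:cs"),
    ("ex:alice", "ex:memberOf", "ex:cs"),
]

def summarize(triples):
    """Collapse instance-level triples into (class, predicate, class) schema edges."""
    classes = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            classes.setdefault(s, set()).add(o)
    schema = set()
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        # Every typed subject/object pair contributes one class-level edge.
        for s_class in classes.get(s, ()):
            for o_class in classes.get(o, ()):
                schema.add((s_class, p, o_class))
    return schema

schema = summarize(triples)
for edge in sorted(schema):
    print(edge)
```

A chain of such edges, for example Student to Professor to Department, corresponds to what the abstract calls a class path.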

A Path Partitioning Technique for Indexing XML Data (XML 데이타 색인을 위한 경로 분할 기법)

  • 김종익;김형주
    • Journal of KIISE:Databases / v.31 no.3 / pp.320-330 / 2004
  • Query languages for XML use paths in a data graph to represent queries; paths are the basic constructor of an XML query. Users can write more expressive queries by using patterns (e.g. regular expressions) for paths. Because of the nature of semi-structured data, a data graph contains many identical paths. Current research on indexing XML exploits these identical paths, but such an index can grow larger than the source data graph and cannot guarantee an efficient access path. In this paper we propose a technique that partitions all the paths in a data graph. We develop an index graph that can efficiently find the appropriate partitions for a path query. The size of our index graph can be adjusted regardless of the source data, so we can significantly reduce the cost of index graph traversals. In the performance study, we show that our index is much faster than other graph-based indexes.

ValueRank: Keyword Search of Object Summaries Considering Values

  • Zhi, Cai;Xu, Lan;Xing, Su;Kun, Lang;Yang, Cao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.12 / pp.5888-5903 / 2019
  • The relational ranking method applies authority-based ranking to relational datasets that can be modeled as graphs, also considering their tuples' values. Authority flows from tuples that contain the given keywords and transfers to their neighboring nodes in accordance with their values and semantic connections. As in our previous work, ObjectRank is extended to ValueRank, which also takes into account the values of tuples in authority transfer flows. In marked contrast to ObjectRank, which considers only authority flows through relationships and is therefore valid only for bibliographic databases such as the DBLP dataset, ValueRank facilitates the estimation of importance for any database, e.g. trading databases. A relational keyword search paradigm, Object Summary (OS), was proposed recently: given a set of keywords, it returns a group of Object Summaries as the query result. An OS is a multilevel tree data structure in which the root node is the tuple containing the keywords and the surrounding nodes summarize all related data on the graph. Because some of these trees contain a very large number of tuples, size-l OSs, which are OS snippets, have also been investigated using ValueRank. We evaluated our proposed approach on a real bibliographic dataset and Microsoft business databases.

Hierarchical Structure in Semantic Networks of Japanese Word Associations

  • Miyake, Maki;Joyce, Terry;Jung, Jae-Young;Akama, Hiroyuki
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.321-329 / 2007
  • This paper reports on the application of network analysis approaches to investigate the characteristics of graph representations of Japanese word associations. Two semantic networks are constructed from two separate Japanese word association databases. The basic statistical features of the networks indicate that they have scale-free and small-world properties and that they exhibit hierarchical organization. A graph clustering method is also applied to the networks with the objective of generating hierarchical structures within the semantic networks. The method is shown to be an efficient tool for analyzing large-scale structures within corpora. As an application of the network clustering results, we briefly introduce two web-based systems: the first is a search system that highlights various possible relations between words according to association type, while the second presents the hierarchical architecture of a semantic network. The systems realize dynamic representations of network structures based on the relationships between words and concepts.

ShEx Schema Generator for RDF Graphs Created by Direct Mapping

  • Choi, Ji-Woong
    • Journal of the Korea Society of Computer and Information / v.23 no.10 / pp.33-43 / 2018
  • In this paper, we propose a method to automatically generate the description of an RDF graph structure. The description is expressed in Shape Expression Language (ShEx), which is developed by W3C and provides the syntax for describing the structure of RDF data. The RDF graphs to which this method can be applied are limited to those generated by the direct mapping, an algorithm defined by W3C for transforming relational data into RDF. A relational database consists of its schema, including integrity constraints, and its instance data. While the instance data can be published in RDF by standard methods such as the direct mapping, a translation of the schema has been missing so far. Unlike users of relational databases, users of RDF datasets have been forced to repeatedly write vague SPARQL queries over the datasets to obtain exact results, because no schema for the RDF data has been provided to them. The ShEx documents generated by our method can be consulted as a schema when writing SPARQL queries. They can also be used with ShEx validators to validate data on RDF graph update operations. In other words, they can work like the integrity constraints in relational databases.
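
To make the idea concrete, here is a heavily simplified sketch, not the paper's generator. It assumes a hypothetical table description held in a Python dictionary, a two-entry SQL-to-XSD type mapping, and the direct mapping's convention that a row of table `People` is typed `<People>` and each column becomes a predicate `<People#column>`. Nullable columns are marked optional because the direct mapping emits no triple for a NULL value.

```python
# Hypothetical, simplified table description: column -> (SQL type, nullable).
TABLE = "People"
COLUMNS = {
    "ID": ("INTEGER", False),
    "fname": ("VARCHAR", True),
}

# Partial SQL-to-XSD datatype mapping (assumption; only the types used here).
XSD = {"INTEGER": "xsd:integer", "VARCHAR": "xsd:string"}

def table_to_shex(table, columns):
    """Emit a ShExC shape for rows of `table` as exposed by the direct mapping."""
    lines = [
        "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>",
        f"<{table}Shape> {{",
        f"  a [<{table}>] ;",
    ]
    constraints = []
    for name, (sql_type, nullable) in columns.items():
        # A nullable column may be absent from the RDF, so it is optional ("?").
        cardinality = " ?" if nullable else ""
        constraints.append(f"  <{table}#{name}> {XSD[sql_type]}{cardinality}")
    lines.append(" ;\n".join(constraints))
    lines.append("}")
    return "\n".join(lines)

shex = table_to_shex(TABLE, COLUMNS)
print(shex)
```

A full generator would also have to cover primary keys, foreign keys (which the direct mapping turns into reference predicates), and the remaining SQL datatypes.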

A Study on the Biomedical Information Processing for Biomedicine and Healthcare (의료보건을 위한 의료정보처리에 관한 연구)

  • Jeong, Hyun-Cheol;Park, Byung-Jun;Bae, Sang-Hyun
    • Journal of Integrative Natural Science / v.2 no.4 / pp.243-251 / 2009
  • This paper surveys research in bioinformatics. The surveyed work proposes a database architecture that combines a general view of bioinformatics data as a graph of data objects and data relationships with the efficiency and robustness of data management and querying provided by indexing and generic programming techniques. The approach inverts the role of the index and makes it a first-class citizen in the query language. This can be done in a structured way, allowing users to mention indexes explicitly without yielding to a procedural query model, by converting functional relations into explicit functions. In the limit, the database becomes a graph in which the edges are these indexes. Function composition can be specified either explicitly or implicitly as path queries. The net effect of the inversion is to convert the database into a hyperdatabase: a database of databases, connected by indexes or functions. The inversion approach was motivated by work on biological databases, for which hyperdatabases are a good model. The lack of a good model has slowed progress in bioinformatics.

Network Graph Analysis of Gene-Gene Interactions in Genome-Wide Association Study Data

  • Lee, Sungyoung;Kwon, Min-Seok;Park, Taesung
    • Genomics & Informatics / v.10 no.4 / pp.256-262 / 2012
  • Most common complex traits, such as obesity, hypertension, diabetes, and cancers, are known to be associated with multiple genes, environmental factors, and their epistasis. Recently, the development of advanced genotyping technologies has allowed us to perform genome-wide association studies (GWASs). For detecting the effects of multiple genes on complex traits, many approaches have been proposed for GWASs. Multifactor dimensionality reduction (MDR) is one of the powerful and efficient methods for detecting high-order gene-gene ($G{\times}G$) interactions. However, the biological interpretation of $G{\times}G$ interactions identified by MDR analysis is not easy. In order to aid the interpretation of MDR results, we propose a network graph analysis to elucidate the meaning of identified $G{\times}G$ interactions. The proposed network graph analysis consists of three steps. The first step is for performing $G{\times}G$ interaction analysis using MDR analysis. The second step is to draw the network graph using the MDR result. The third step is to provide biological evidence of the identified $G{\times}G$ interaction using external biological databases. The proposed method was applied to Korean Association Resource (KARE) data, containing 8838 individuals with 327,632 single-nucleotide polymorphisms, in order to perform $G{\times}G$ interaction analysis of body mass index (BMI). Our network graph analysis successfully showed that many identified $G{\times}G$ interactions have known biological evidence related to BMI. We expect that our network graph analysis will be helpful to interpret the biological meaning of $G{\times}G$ interactions.
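
The second of the three steps, drawing a network graph from the interaction results, can be sketched as follows. The SNP identifiers, scores, and threshold are made up for illustration, and the sketch covers neither the MDR analysis itself nor the lookup in external biological databases.

```python
from collections import defaultdict

# Hypothetical MDR output: (SNP A, SNP B, score). Identifiers and scores are made up.
interactions = [
    ("snp1", "snp2", 0.61),
    ("snp1", "snp3", 0.58),
    ("snp2", "snp3", 0.52),
    ("snp4", "snp5", 0.57),
]

def build_network(pairs, threshold):
    """Keep interactions scoring at least `threshold` as undirected edges."""
    graph = defaultdict(set)
    for a, b, score in pairs:
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph

network = build_network(interactions, threshold=0.55)

# Rank nodes by degree (ties broken by name); well-connected SNPs come first.
hubs = sorted(network, key=lambda node: (-len(network[node]), node))
print(hubs)
```

Nodes with many retained interactions are natural starting points for the third step, checking the identified interactions against known biological evidence.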

A Study on Effective Real Estate Big Data Management Method Using Graph Database Model (그래프 데이터베이스 모델을 이용한 효율적인 부동산 빅데이터 관리 방안에 관한 연구)

  • Ju-Young, KIM;Hyun-Jung, KIM;Ki-Yun, YU
    • Journal of the Korean Association of Geographic Information Studies / v.25 no.4 / pp.163-180 / 2022
  • Real estate data can be regarded as big data, because its volume is growing rapidly and it interacts with various fields such as the economy, law, and crowd psychology, while being structured in complex data layers. Existing relational databases tend to have difficulty handling the various relationships involved in managing real estate big data, because they have a fixed schema and can only be scaled vertically. To overcome these limitations, this study builds the real estate data in a graph database and verifies its usefulness. As the research method, we modeled various real estate data in MySQL, one of the most widely used relational databases, and in Neo4j, one of the most widely used graph databases. We then collected real estate questions asked in real life and selected 9 of them to compare query times on each database. As a result, Neo4j showed constant performance even in queries involving multiple JOIN statements and inferences over various relationships, whereas MySQL showed a rapid increase in query time. From this result, we conclude that a graph database such as Neo4j is more efficient for real estate big data with various relationships. We expect the real estate graph database to be used for predicting real estate price factors and for answering real estate queries through AI speakers.