Combining Local and Global Features to Reduce 2-Hop Label Size of Directed Acyclic Graphs

  • Ahn, Jinhyun (Dept. of Management Information Systems, Jeju National University)
  • Im, Dong-Hyuk (Dept. of Computer and Information Engineering, Hoseo University)
  • Received : 2018.06.18
  • Accepted : 2019.10.21
  • Published : 2020.02.29

Abstract

The graph data structure is popular because it can intuitively represent real-world knowledge. Graph databases have attracted attention in academia and industry because they can be used to maintain graph data and allow users to mine knowledge. Determining whether one node in a graph can reach another, termed reachability query processing, is an important functionality of graph databases. Online traversals, such as breadth-first and depth-first search, are inefficient for processing reachability queries on large-scale graphs. Labeling schemes have been proposed to overcome this inefficiency. The state of the art is the 2-hop labeling scheme, in which each node has an in-label and an out-label containing the integer IDs of reachable nodes. Unfortunately, existing 2-hop labeling schemes generate very large labels because they consider only local features, such as node degrees. In this paper, we propose an approach that reduces 2-hop label size more effectively. We consider the topological sort index, which is a global feature, and suggest a linear combination that utilizes both local and global features. We conduct experiments on real-world and synthetic directed acyclic graph datasets and show that the proposed approach generates smaller labels than existing approaches.
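To make the idea concrete, the following is a minimal Python sketch of 2-hop labeling for a directed acyclic graph in which the order of hub selection is driven by a linear combination of a local and a global feature. The abstract does not state the exact formula, so everything specific here is an assumption made for illustration: the weight `alpha`, the degree-product local score, the "closeness to the middle of the topological order" global score, and the use of pruned breadth-first searches to build the labels. It should not be read as the authors' algorithm.

```python
from collections import deque


def topological_index(n, out_adj):
    """Return each node's position in one topological order (Kahn's algorithm)."""
    indegree = [0] * n
    for u in range(n):
        for v in out_adj[u]:
            indegree[v] += 1
    queue = deque(u for u in range(n) if indegree[u] == 0)
    index = [0] * n
    position = 0
    while queue:
        u = queue.popleft()
        index[u] = position
        position += 1
        for v in out_adj[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if position != n:
        raise ValueError("input graph is not acyclic")
    return index


def build_two_hop_labels(n, edges, alpha=0.5):
    """Build in/out labels; `alpha` (hypothetical) weights local vs. global score."""
    out_adj = [[] for _ in range(n)]
    in_adj = [[] for _ in range(n)]
    for u, v in edges:
        out_adj[u].append(v)
        in_adj[v].append(u)
    topo = topological_index(n, out_adj)

    degree_product = [(len(in_adj[v]) + 1) * (len(out_adj[v]) + 1) for v in range(n)]
    max_product = max(degree_product, default=1)
    middle = (n - 1) / 2

    def score(v):
        # Assumed local feature: degree product scaled to (0, 1].
        local_score = degree_product[v] / max_product
        # Assumed global feature: 1 at the middle of the topological order,
        # decreasing towards both ends.
        global_score = 1 - abs(topo[v] - middle) / max(middle, 1)
        return alpha * local_score + (1 - alpha) * global_score

    order = sorted(range(n), key=score, reverse=True)
    label_in = [set() for _ in range(n)]
    label_out = [set() for _ in range(n)]

    def pruned_bfs(hub, adj, is_covered, labels):
        # Visit nodes from `hub`; stop expanding where existing labels
        # already answer the query, otherwise record `hub` in the label.
        seen = {hub}
        queue = deque([hub])
        while queue:
            u = queue.popleft()
            if is_covered(u):
                continue
            labels[u].add(hub)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)

    for hub in order:
        # Forward search: hub reaches u, so hub goes into u's in-label.
        pruned_bfs(hub, out_adj,
                   lambda u: not label_out[hub].isdisjoint(label_in[u]), label_in)
        # Backward search: u reaches hub, so hub goes into u's out-label.
        pruned_bfs(hub, in_adj,
                   lambda u: not label_out[u].isdisjoint(label_in[hub]), label_out)
    return label_in, label_out


def reachable(label_in, label_out, source, target):
    """source reaches target iff the two labels share a hub node."""
    return not label_out[source].isdisjoint(label_in[target])


def label_size(label_in, label_out):
    """Total number of node IDs stored across all labels."""
    return sum(len(s) for s in label_in) + sum(len(s) for s in label_out)


# Small example DAG (made up for illustration).
example_edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (3, 5)]
example_in, example_out = build_two_hop_labels(6, example_edges, alpha=0.5)
print(reachable(example_in, example_out, 0, 5), label_size(example_in, example_out))
```

Setting `alpha=1.0` in this sketch orders hubs by the local feature only, and `alpha=0.0` by the global feature only, so `label_size` can be compared across weights on the same graph. Which weight gives the smallest labels depends on the graph and on the scoring choices assumed here.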

