A Study on Spatial Data Integration using Graph Database: Focusing on Real Estate

  • Ju-Young KIM (Department of Civil and Environmental Engineering, Seoul National University) ;
  • Seula PARK (Department of Civil and Environmental Engineering, Seoul National University) ;
  • Ki-Yun YU (Department of Civil and Environmental Engineering, Seoul National University)
  • Received : 2023.06.27
  • Accepted : 2023.08.14
  • Published : 2023.09.30

Abstract

Graph databases store heterogeneous data and the relationships among them as a graph, which makes them well suited to managing and analyzing real estate spatial data linked by complex relationships. However, they are not widely used for this purpose because their spatial functionality is limited. In this study, we propose a uniform grid-based approach to managing real estate spatial data in a graph database so that it can answer a variety of real estate-related spatial questions. We analyzed questions from the real estate community to select the core data, adopted national point numbers as the unit grid, constructed a graph schema that links diverse real estate data, and built a test database. We then tested basic topological relationships and spatial functions using the Jackpine benchmark and ran scenario-based queries to verify the appropriateness of the proposed method. The proposed method successfully executed 25 of the 29 spatial topological relationships and spatial functions, and achieved about 97% accuracy across those 25 functions and 15 scenarios. The significance of this study lies in proposing an efficient data integration method that can answer real estate-related spatial questions despite the limited spatial operation capabilities of graph databases. Its limitations include incorrect spatial topological relationships produced by the grid-based index and inefficient queries caused by list comparisons, both of which should be addressed in follow-up studies.
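The grid-based idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each feature is reduced to the set of uniform grid cells it touches, and that topological predicates are then answered by comparing those cell collections. The function names, the 10-unit cell size, and the sample boxes are hypothetical, and plain Python sets stand in for the graph database and the national point number grid.

```python
import math

def cells_for_bbox(min_x, min_y, max_x, max_y, cell_size):
    """Return the set of (col, row) grid cells touched by an axis-aligned box."""
    col_lo = math.floor(min_x / cell_size)
    col_hi = math.floor(max_x / cell_size)
    row_lo = math.floor(min_y / cell_size)
    row_hi = math.floor(max_y / cell_size)
    return {
        (col, row)
        for col in range(col_lo, col_hi + 1)
        for row in range(row_lo, row_hi + 1)
    }

def grid_intersects(cells_a, cells_b):
    """Approximate 'intersects': the two features share at least one cell."""
    return not cells_a.isdisjoint(cells_b)

def grid_within(cells_a, cells_b):
    """Approximate 'within': every cell of A is also a cell of B."""
    return cells_a <= cells_b

CELL = 10.0  # hypothetical cell size, in arbitrary map units

parcel = cells_for_bbox(0, 0, 25, 25, CELL)      # spans cols 0-2, rows 0-2
building = cells_for_bbox(12, 12, 18, 18, CELL)  # falls inside cell (1, 1)
shop = cells_for_bbox(26, 2, 29, 8, CELL)        # falls inside cell (2, 0)
school = cells_for_bbox(40, 40, 45, 45, CELL)    # falls inside cell (4, 4)

print(grid_within(building, parcel))    # building's only cell belongs to the parcel
print(grid_intersects(parcel, school))  # no shared cell
print(grid_intersects(parcel, shop))    # shared cell (2, 0), although the boxes do not overlap
```

The last call shows the kind of error the abstract lists as a limitation: the parcel ends at x = 25 and the shop starts at x = 26, yet both touch cell (2, 0), so the cell comparison reports an intersection that the exact geometries do not have. A finer grid reduces such cases at the cost of longer cell lists to compare.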

Acknowledgement

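This study was supported by the Ministry of Land, Infrastructure and Transport / Korea Agency for Infrastructure Technology Advancement (grant number RS-2022-00143336).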
