Dynamic Load Management Method for Spatial Data Stream Processing on MapReduce Online Frameworks

맵리듀스 온라인 프레임워크에서 공간 데이터 스트림 처리를 위한 동적 부하 관리 기법

  • Jeong, Weonil (Division of Computer and Information Engineering, Hoseo University)
  • 정원일 (호서대학교 컴퓨터정보공학부)
  • Received : 2018.04.13
  • Accepted : 2018.08.03
  • Published : 2018.08.31


As mobile devices equipped with various sensors and high-quality wireless networking become widespread, the amount of spatio-temporal data generated by mobile devices across service fields is increasing rapidly. Hadoop-based spatial big data systems are designed as batch processing platforms, so it is very difficult to apply them to real-time services over large spatio-temporal data streams. This paper extends the MapReduce Online framework to support real-time query processing over continuously arriving spatio-temporal data streams, and proposes a load management method that distributes overload for efficient query processing. The proposed scheme dynamically balances load across nodes based on the inflow rate and the load factor of the input data, computed over the space partition. Experiments show that, when load management is required in a specific area, efficient query processing can be supported by distributing that area's spatial data stream to shared resources.
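The idea of balancing nodes by inflow rate and load factor over a space partition can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so everything here is an assumption made for illustration: the uniform grid partition, the per-node `capacity` in tuples per second, the `threshold` of 0.8, the greedy "move the busiest cell" policy, and all function names. It is not the paper's implementation.

```python
from collections import defaultdict


def cell_of(x, y, cell_size):
    """Map a point to a grid cell id (assumed uniform grid partition)."""
    return (int(x // cell_size), int(y // cell_size))


def inflow_rates(points, cell_size, window_seconds):
    """Tuples per second that arrived in each cell during one window."""
    counts = defaultdict(int)
    for x, y in points:
        counts[cell_of(x, y, cell_size)] += 1
    return {cell: n / window_seconds for cell, n in counts.items()}


def load_factors(assignment, rates, capacity):
    """Load factor per node: assigned inflow rate divided by node capacity."""
    loads = {node: 0.0 for node in capacity}
    for cell, node in assignment.items():
        loads[node] += rates.get(cell, 0.0)
    return {node: loads[node] / capacity[node] for node in capacity}


def rebalance(assignment, rates, capacity, threshold=0.8):
    """Greedily move cells from the most loaded node to the least loaded one.

    Hypothetical policy: while the hottest node exceeds the threshold, move
    its busiest cell that still leaves the receiver below the sender's
    current load factor. Returns a new cell-to-node assignment.
    """
    assignment = dict(assignment)
    for _ in range(len(assignment)):  # bounded number of moves
        factors = load_factors(assignment, rates, capacity)
        hot = max(factors, key=factors.get)
        cold = min(factors, key=factors.get)
        if factors[hot] <= threshold or hot == cold:
            break
        cells = [c for c, n in assignment.items() if n == hot]
        cells.sort(key=lambda c: rates.get(c, 0.0), reverse=True)
        moved = False
        for cell in cells:
            rate = rates.get(cell, 0.0)
            if rate > 0 and factors[cold] + rate / capacity[cold] < factors[hot]:
                assignment[cell] = cold
                moved = True
                break
        if not moved:
            break  # no single move improves the balance
    return assignment
```

In use, `inflow_rates` would be recomputed once per window from the arriving points, and `rebalance` would be called with the current cell-to-node assignment whenever some node's load factor exceeds the threshold. A real deployment would also have to migrate per-cell query state, which this sketch leaves out.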


Spatial Big Data; Spatial Data Stream Processing; Load Management; Load Balance; MapReduce Online

