Application of Open Data Framework for Real-Time Data Processing

  • Received : 2019.08.25
  • Accepted : 2019.09.25
  • Published : 2019.10.31

Abstract

Most big data-based applications and solutions today rely on real-time processing of streaming data, so real-time processing and analysis of big data streams play an important role in their development. The maritime domain is no exception: as data volumes grow explosively, the need for technology that can rapidly process and analyze large amounts of real-time data is increasing. This paper therefore analyzes the characteristics of NiFi, Kafka, and Druid, which we identify as suitable open-source options among the various open-source technologies for big data processing. The aim is to lay the foundation for applying open data framework technology to real-time data processing, so that the Korean e-Navigation service can always provide up-to-date versions of the externally linked information needed for maritime service analysis.

Acknowledgement

This research is a part of the project titled "SMART-Navigation project," funded by the Ministry of Oceans and Fisheries.
