BigCrawler: Implementation and Performance Analysis of an Efficient Big Data Processing System Through Dynamic Configuration of Edge Server Computing and Storage Modules

  • Received : 2021.10.22
  • Accepted : 2021.12.09
  • Published : 2021.12.31

Abstract

Edge computing enables real-time big data processing by performing computation close to the physical location of the user or data source. In an edge computing environment, however, temporary service requirements or changes in on-site physical resources can give rise to situations that affect big data processing performance. In this paper, we propose BigCrawler, a system that dynamically configures its computing module and storage module according to the big data collection status and the computing resource usage in the edge computing environment. We also analyze how the characteristics of big data processing workloads vary with the arrangement of the computing and storage modules.

Acknowledgement

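This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (Ministry of Science and ICT) in 2021 (No. 2020-0-00844, Development of Lightweight System Software Technology for Resource Management and Control of Edge Server Systems).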
