Performance Evaluation and Analysis of Multiple Scenarios of Big Data Stream Computing on Storm Platform

  • Sun, Dawei (School of Information Engineering, China University of Geosciences)
  • Yan, Hongbin (School of Information Engineering, China University of Geosciences)
  • Gao, Shang (School of Information Technology, Deakin University)
  • Zhou, Zhangbing (School of Information Engineering, China University of Geosciences)
  • Received: 2017.10.25
  • Accepted: 2018.01.22
  • Published: 2018.07.31

Abstract

In the big data era, fresh data grows rapidly every day. More than 30,000 gigabytes of data are created every second, and the rate is accelerating. Many organizations rely heavily on real-time streaming, and big data stream computing helps them spot opportunities and risks in real-time big data. Storm, one of the most common online stream computing platforms, has been used for big data stream computing, with response times ranging from milliseconds to sub-seconds. The performance of Storm plays a crucial role in different application scenarios; however, few studies have evaluated it. In this paper, we investigate the performance of Storm under different application scenarios. Our experimental results show that the throughput and latency of Storm are greatly affected by the number of instances of each vertex in the task topology and by the amount of available resources in the data center. The fault-tolerance mechanism of Storm works well in most big data stream computing environments. We therefore suggest that a dynamic topology, an elastic scheduling framework, and a memory-based fault-tolerance mechanism are necessary for providing high-throughput, low-latency services on the Storm platform.
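
The dependence of throughput on the number of instances of each vertex can be illustrated with a toy bottleneck model. The sketch below is not the paper's evaluation method. It assumes a linear pipeline in which each vertex scales perfectly with its instance count. The vertex names and per-instance rates are hypothetical, not measurements from the paper, and the function `pipeline_throughput` is introduced here for illustration only.

```python
from typing import Dict


def pipeline_throughput(instances: Dict[str, int],
                        rate_per_instance: Dict[str, float]) -> float:
    """Toy bottleneck model (an assumption, not Storm's actual behavior):
    a linear pipeline sustains only as many tuples/s as its slowest vertex,
    and each vertex's capacity is instances * per-instance rate."""
    return min(instances[v] * rate_per_instance[v] for v in instances)


# Hypothetical per-instance processing rates in tuples/s.
rates = {"spout": 1000.0, "split": 400.0, "count": 250.0}

# One instance per vertex: the slowest vertex ("count") limits the pipeline.
baseline = pipeline_throughput({"spout": 1, "split": 1, "count": 1}, rates)

# Adding instances to the slower vertices removes that bottleneck.
scaled = pipeline_throughput({"spout": 1, "split": 3, "count": 4}, rates)

print(baseline, scaled)
```

Under these assumptions, raising the instance count of the slowest vertices raises end-to-end throughput until another vertex becomes the limit. A real cluster would also be limited by its available resources and by communication cost between instances, which this model ignores.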

Acknowledgement

Supported by: National Natural Science Foundation of China, Central Universities
