An Adaptively Speculative Execution Strategy Based on Real-Time Resource Awareness in a Multi-Job Heterogeneous Environment

  • Liu, Qi (Nanjing University of Information Science and Technology)
  • Cai, Weidong (Nanjing University of Information Science and Technology)
  • Liu, Qiang (School of Computer, Hunan University of Technology)
  • Shen, Jian (Nanjing University of Information Science and Technology)
  • Fu, Zhangjie (Nanjing University of Information Science and Technology)
  • Liu, Xiaodong (School of Computing, Edinburgh Napier University)
  • Linge, Nigel (School of Computing, Science and Engineering, The University of Salford)

  • Received: 2016.06.18
  • Accepted: 2016.12.23
  • Published: 2017.02.28

Abstract

MapReduce (MRV1), a popular programming model proposed by Google, has been widely used to process large datasets in Hadoop, an open-source cloud platform. Its successor, MapReduce 2.0 (MRV2), developed alongside the emergence of YARN, achieves clear improvements over MRV1. However, MRV2 still suffers from long completion times on certain types of jobs. Speculative Execution (SE) addresses this problem by backing up delayed tasks from low-performance machines onto higher-performance ones. In this paper, an adaptive SE strategy (ASE) is presented in Hadoop-2.6.0. Experimental results show that ASE duplicates tasks according to real-time resource usage among the worker nodes in a cloud. In addition, the ASE strategy substantially improves the performance of MRV2 in terms of job execution time and resource consumption, including in a multi-job environment.
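
The general idea of resource-aware speculative execution can be illustrated with a minimal sketch. This is not the ASE algorithm from the paper, whose details are not given in the abstract. It assumes a LATE-style remaining-time estimate (time left = (1 - progress) / progress rate) and a hypothetical per-node "free resource share" between 0 and 1. The names `Task`, `pick_backup`, `free_share`, and `slow_factor` are illustrative assumptions, not Hadoop APIs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Task:
    task_id: str
    node: str         # node the task currently runs on
    progress: float   # fraction of work done, between 0 and 1
    elapsed_s: float  # seconds since the task started


def remaining_time(task: Task) -> float:
    """LATE-style estimate: time left = (1 - progress) / progress rate."""
    rate = task.progress / task.elapsed_s
    return (1.0 - task.progress) / rate


def pick_backup(tasks: List[Task], free_share: Dict[str, float],
                slow_factor: float = 1.5) -> Optional[Tuple[str, str]]:
    """Return (task_id, target_node) for one backup copy, or None.

    free_share maps node name to its currently free resource share
    (hypothetical metric; a real system would combine CPU, memory, I/O).
    slow_factor is an assumed threshold, not a value from the paper.
    """
    # Only tasks that have started and not finished can be estimated.
    running = [t for t in tasks if 0.0 < t.progress < 1.0 and t.elapsed_s > 0]
    if len(running) < 2:
        return None

    estimates = {t.task_id: remaining_time(t) for t in running}
    mean_left = sum(estimates.values()) / len(estimates)

    # The straggler is the task expected to finish last.
    straggler = max(running, key=lambda t: estimates[t.task_id])
    if estimates[straggler.task_id] <= slow_factor * mean_left:
        return None  # nobody is slow enough to justify a duplicate

    # Place the backup on the node with the most free resources,
    # excluding the node that is already running the slow copy.
    candidates = {n: f for n, f in free_share.items() if n != straggler.node}
    if not candidates:
        return None
    target = max(candidates, key=candidates.get)
    return straggler.task_id, target
```

Called with three running tasks, one of which is far behind the others, `pick_backup` returns the slow task's ID together with the least-loaded other node. When all tasks progress at a similar rate it returns `None`, so no resources are spent on unnecessary duplicates.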
