Load Balancing Scheme for Machine Learning Distributed Environment

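  • 김영관 (Graduate School, Department of Computer Science, Soongsil University)
  • 이주석 (Department of Computer Science, Soongsil University)
  • 김아정 (Department of Computer Science, Soongsil University)
  • 홍지만 (School of Computer Science, Soongsil University)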
  • Received : 2020.12.08
  • Accepted : 2021.03.08
  • Published : 2021.03.31

Abstract

As machine learning becomes more common, applications that use it are being developed at an increasing pace, and research on machine learning platforms that support such development is growing as well. Despite this growth, load balancing suited to machine learning platforms has received little attention. In this paper, we therefore propose a load balancing scheme for distributed machine learning environments. The proposed scheme organizes the distributed servers in a level hash table structure and assigns each machine learning task to a server according to that server's performance. We implemented the distributed servers and compared the scheme experimentally with an existing hashing scheme. Relative to the existing hashing scheme, the proposed scheme was about 26% faster on average and reduced the number of tasks waiting to be assigned to a server by more than 38%.
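
The abstract gives only the outline of the scheme, so the following Python sketch is an illustration under stated assumptions rather than the authors' implementation. It assumes a two-level layout in which the top level has twice as many server slots as the bottom level, two hash functions per task, a simplified mapping in which top slots 2i and 2i+1 share bottom slot i, and a hypothetical per-server `performance` score and `capacity` limit. All class, method, and field names are invented for this example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    performance: float  # hypothetical relative speed score; higher is faster
    capacity: int       # hypothetical limit on concurrently assigned tasks
    tasks: list = field(default_factory=list)

    def has_room(self) -> bool:
        return len(self.tasks) < self.capacity

    def projected_load(self) -> float:
        # Estimated cost of taking one more task, scaled by server speed.
        return (len(self.tasks) + 1) / self.performance


def slot(task_id: str, seed: str, size: int) -> int:
    """Deterministic hash of a task id into the range [0, size)."""
    digest = hashlib.sha256((seed + task_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % size


class LevelHashBalancer:
    def __init__(self, top: list, bottom: list):
        # Assumption: the top level holds twice as many slots as the bottom.
        if len(top) != 2 * len(bottom):
            raise ValueError("top level must be twice the size of bottom level")
        self.top = top
        self.bottom = bottom
        self.waiting = []  # tasks that could not be placed on any candidate

    def candidates(self, task_id: str):
        """Return the two top-level and two bottom-level candidate servers."""
        i1 = slot(task_id, "h1", len(self.top))
        i2 = slot(task_id, "h2", len(self.top))
        top = [self.top[i1], self.top[i2]]
        bottom = [self.bottom[i1 // 2], self.bottom[i2 // 2]]
        return top, bottom

    def assign(self, task_id: str):
        """Place a task on the best candidate, or queue it if all are full."""
        for level in self.candidates(task_id):
            free = [s for s in level if s.has_room()]
            if free:
                chosen = min(free, key=Server.projected_load)
                chosen.tasks.append(task_id)
                return chosen
        self.waiting.append(task_id)
        return None


if __name__ == "__main__":
    top = [Server(f"top-{i}", performance=1.0 + i, capacity=2) for i in range(4)]
    bottom = [Server(f"bottom-{i}", performance=1.0, capacity=2) for i in range(2)]
    balancer = LevelHashBalancer(top, bottom)
    for k in range(10):
        server = balancer.assign(f"task-{k}")
        print(f"task-{k}", "->", server.name if server else "waiting")
```

Giving each task up to four candidate servers instead of one is what lets a performance-aware choice reduce the number of waiting tasks compared with a single-hash assignment; the specific load metric used here is only one plausible choice.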
