Design and Implementation of a Monitor for Hadoop Cluster

  • Keum, Tae-Hoon (LG Electronics, Mobile Communications Company)
  • Lee, Won-Joo (Department of Computer Science, Inha Technical College)
  • Jeon, Chang-Ho (Department of Computer Science & Engineering, Hanyang University ERICA Campus)
  • Received : 2011.08.09
  • Accepted : 2011.12.30
  • Published : 2012.01.25

Abstract

In this paper, we propose a new monitor that collects node information and job information from a Hadoop cluster in real time. The monitor consists of two programs, Agent and Collector. Agent collects the node and job information of the Hadoop cluster, and Collector analyzes the collected information and saves it in a database. Collector runs on a separate node that does not participate in the Hadoop cluster, so the overhead of analysis does not delay Hadoop jobs. We implemented the proposed monitor and applied it to a testbed cluster, where it detected the occurrence of dead nodes in real time. In addition, it let us find inefficient steps in the execution of Hadoop jobs, and improving them reduced job execution time.
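
The Agent/Collector split described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how reports are transported, what the database schema looks like, or how dead nodes are identified, so everything below is an assumption made for illustration: the `AgentReport` fields, the SQLite table, and the rule that a node counts as dead once it has not reported for `dead_after` seconds are all hypothetical and are not taken from the paper.

```python
import sqlite3
import time
from dataclasses import dataclass


@dataclass
class AgentReport:
    """One report sent by an Agent. The fields are hypothetical."""
    node: str
    timestamp: float
    running_jobs: int


class Collector:
    """Stores Agent reports in a database and flags silent nodes.

    Assumption: a node is treated as dead when its latest report is
    older than `dead_after` seconds. The paper may use another rule.
    """

    def __init__(self, db_path=":memory:", dead_after=30.0):
        self.dead_after = dead_after
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS reports "
            "(node TEXT, ts REAL, running_jobs INTEGER)"
        )

    def receive(self, report):
        # Persist every report so it can be analyzed later.
        self.db.execute(
            "INSERT INTO reports VALUES (?, ?, ?)",
            (report.node, report.timestamp, report.running_jobs),
        )
        self.db.commit()

    def dead_nodes(self, now=None):
        # A node is dead if its most recent report is too old.
        now = time.time() if now is None else now
        rows = self.db.execute(
            "SELECT node FROM reports GROUP BY node "
            "HAVING MAX(ts) < ? ORDER BY node",
            (now - self.dead_after,),
        )
        return [node for (node,) in rows]
```

In this sketch the Collector would run on a machine outside the cluster and each Agent would call `receive` over some transport (not modeled here); calling `dead_nodes()` periodically then lists the nodes that have stopped reporting.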
