• Title/Summary/Keyword: Reuse JVM

Search Result 1, Processing Time 0.023 seconds

A Study on the Improving Performance of Massively Small File Using the Reuse JVM in MapReduce (MapReduce에서 Reuse JVM을 이용한 대규모 스몰파일 처리성능 향상 방법에 관한 연구)

  • Choi, Chul Woong;Kim, Jeong In;Kim, Pan Koo
    • Journal of Korea Multimedia Society
    • /
    • v.18 no.9
    • /
    • pp.1098-1104
    • /
    • 2015
  • With the widespread use of smartphones and IoT (Internet of Things), data are being generated on a large scale, and there is increased for the analysis of such data. Hence, distributed processing systems have gained much attention. Hadoop, which is a distributed processing system, saves the metadata of stored files in name nodes; in this case, the main problems are as follows: the memory becomes insufficient; load occurs because of massive small files; scheduling and file processing time increases because of the increased number of small files. In this paper, we propose a solution to address the increase in processing time because of massive small files, and thus improve the processing performance, using the Reuse JVM method provided by Hadoop. Through environment setting, the Reuse JVM method modifies the JVM produced conventionally for every task, so that multiple tasks are reused sequentially in one JVM. As a final outcome, the Reuse JVM method showed the best processing performance when used together with CombineFileInputFormat.