초록
To identify the cause of the error and maintain the health of system, an administrator usually analyzes event log data since it contains useful information to infer the cause of the error. However, because today's systems are huge and complex, it is almost impossible for administrators to manually analyze event log files to identify the cause of an error. In particular, as OpenStack, which is being widely used as cloud management system, operates with various service modules being linked to multiple servers, it is hard to access each node and analyze event log messages for each service module in the case of an error. For this, in this paper, we propose a novel message-based log analysis method that enables the administrator to find the cause of an error quickly. Specifically, the proposed method 1) consolidates event log data generated from system level and application service level, 2) clusters the consolidated data based on messages, and 3) analyzes interrelations among message groups in order to promptly identify the cause of a system error. This study has great significance in the following three aspects. First, the root cause of the error can be identified by collecting event logs of both system level and application service level and analyzing interrelations among the logs. Second, administrators do not need to classify messages for training since unsupervised learning of event log messages is applied. Third, using Dynamic Time Warping, an algorithm for measuring similarity of dynamic patterns over time increases accuracy of analysis on patterns generated from distributed system in which time synchronization is not exactly consistent.