• Title/Summary/Keyword: Log processing

An Efficient Design and Implementation of an MdbULPS in a Cloud-Computing Environment

  • Kim, Myoungjin;Cui, Yun;Lee, Hanku
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.9 no.8
    • /
    • pp.3182-3202
    • /
    • 2015
  • Flexibly expanding the storage capacity required to process a large amount of rapidly increasing unstructured log data is difficult in a conventional computing environment. In addition, implementing a log processing system providing features that categorize and analyze unstructured log data is extremely difficult. To overcome such limitations, we propose and design a MongoDB-based unstructured log processing system (MdbULPS) for collecting, categorizing, and analyzing log data generated from banks. The proposed system includes a Hadoop-based analysis module for reliable parallel-distributed processing of massive log data. Furthermore, because the Hadoop distributed file system (HDFS) stores data by generating replicas of collected log data in block units, the proposed system offers automatic system recovery against system failures and data loss. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. To evaluate the proposed system, we conducted three different performance tests on a local test bed of twelve nodes: comparing our system with a MySQL-based approach, comparing it with an HBase-based approach, and varying the chunk size option. The experiments showed that our system performs better in processing unstructured log data.
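
As an editorial illustration of the storage idea in this abstract (not the paper's code), the sketch below stores schema-free bank log records in MongoDB and categorizes them with an aggregation pipeline; the host, database, collection, and field names are assumptions.

```python
# Illustrative sketch: storing unstructured bank log records in MongoDB and
# categorizing them by type. Connection string and field names are assumed.
from datetime import datetime, timezone
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")   # assumed MongoDB endpoint
logs = client["bank_logs"]["raw"]                   # hypothetical database/collection

# Unstructured log entries: documents need not share a fixed schema.
logs.insert_many([
    {"ts": datetime.now(timezone.utc), "type": "transfer", "branch": "A01", "msg": "wire ok"},
    {"ts": datetime.now(timezone.utc), "type": "login", "user": "u123", "result": "fail"},
])

# Categorize the collected logs by type, analogous to the system's classification step.
for row in logs.aggregate([{"$group": {"_id": "$type", "count": {"$sum": 1}}}]):
    print(row["_id"], row["count"])
```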

Messaging System Analysis for Effective Embedded Tester Log Processing (효과적인 Embedded Tester Log 처리를 위한 Messaging System 분석)

  • Nam, Ki-ahn;Kwon, Oh-young
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2017.05a
    • /
    • pp.645-648
    • /
    • 2017
  • The existing embedded tester used TCP and a shared file system for log processing, and logs were handled in a 1-N structure. This approach wastes tester resources on exception handling. We implemented a log processing message layer that can be distributed by a messaging system, and compared transmission through the message layer with transmission using TCP and the shared file system. The comparison showed that transmission through the message layer achieved higher transmission bandwidth than TCP. In terms of CPU usage, the message layer was slightly less efficient than TCP, but the difference was not significant. These results indicate that log processing using the message layer is more efficient overall.
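
The abstract does not name the messaging system used for the message layer; as a minimal editorial sketch only, the example below assumes a Kafka-style broker reached through kafka-python and shows how a tester log line could be published through a message layer instead of TCP or shared files. The broker address and topic name are placeholders.

```python
# Minimal sketch of a message-layer log sender (broker, topic, and message
# format are illustrative assumptions, not the paper's actual system).
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",              # assumed broker endpoint
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def publish_tester_log(tester_id: str, line: str) -> None:
    """Publish one tester log line through the message layer."""
    producer.send("tester-logs", {"tester": tester_id, "line": line})

publish_tester_log("tester-07", "BOOT OK, firmware 1.4.2")
producer.flush()  # ensure buffered messages reach the broker
```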

A Method for Analyzing Web Logs of the Hadoop System for Analyzing an Effective Pattern of Web Users (효과적인 웹 사용자의 패턴 분석을 위한 하둡 시스템의 웹 로그 분석 방안)

  • Lee, Byungju;Kwon, Jungsook;Go, Gicheol;Choi, Yonglak
    • Journal of Information Technology Services
    • /
    • v.13 no.4
    • /
    • pp.231-243
    • /
    • 2014
  • Among the various data available to corporations, web log data are important for the analysis that supports customer relationship management strategies. As the volume of accessible data has grown exponentially with the Internet and the popularization of smartphones, web log data have also increased sharply. As a result, it has become difficult to expand storage flexibly enough to process large amounts of web log data, and extremely hard to implement a system capable of categorizing, analyzing, and processing web logs accumulated over a long period of time. This study therefore applies Hadoop, a distributed processing system that has recently drawn attention for its capacity to process large volumes of data, and proposes an efficient analysis plan for large amounts of web log data. The study examines the forms and levels of web logs obtained through effective collection methods, and proposes corresponding analysis techniques and Hadoop cluster designs. The study resolves the difficulty of processing large amounts of web log data and derives user activity patterns through web log analysis, demonstrating its value as a new means of marketing.
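
To make the Hadoop-based analysis concrete, here is a minimal Hadoop Streaming sketch (an editorial example, not the paper's design) that counts requests per URL from combined-log-format lines; the field layout and job invocation are assumptions.

```python
# Minimal Hadoop Streaming sketch for one web log analysis step: the mapper
# emits one count per requested URL and the reducer sums the counts.
# Assumed invocation:
#   hadoop jar hadoop-streaming.jar \
#       -mapper "weblog_mr.py map" -reducer "weblog_mr.py reduce" \
#       -input /weblogs -output /weblog-url-counts
import sys

def mapper() -> None:
    for line in sys.stdin:
        parts = line.split('"')
        if len(parts) < 2:
            continue                          # skip lines without a request field
        request = parts[1].split()            # e.g. ['GET', '/index.html', 'HTTP/1.1']
        if len(request) >= 2:
            print(f"{request[1]}\t1")

def reducer() -> None:
    current, total = None, 0
    for line in sys.stdin:                    # reducer input arrives sorted by key
        url, count = line.rstrip("\n").split("\t")
        if url != current and current is not None:
            print(f"{current}\t{total}")
            total = 0
        current = url
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    mapper() if len(sys.argv) > 1 and sys.argv[1] == "map" else reducer()
```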

Accuracy of the Automating Program of Log Scaling (통나무 자로재기의 자동화 프로그램에 대한 정확성 평가)

  • Kim, Chan-Hoe;Byun, Sang-Woo
    • Journal of Information Technology Services
    • /
    • v.12 no.4
    • /
    • pp.165-174
    • /
    • 2013
  • Log scaling, which determines the quality grade of a log, influences the price of the log at market. It remains one of the important tasks in the field and is still traditionally performed with a ruler. This study evaluated an automating program by comparing it with ruler-based measurement of logs. The automating program uses OpenCV image-processing libraries to measure log diameters for scaling. In addition, two checkered-pattern panels are placed beside a pile of logs and tape is applied across the end face of each log to obtain accurate values. We statistically analyzed the mean differences in both log diameter and volume. In conclusion, after the check panels and taping were applied, the results of the automating program did not differ significantly from those obtained with a ruler. Therefore, its application should be considered to improve forest administration.
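
The following is an editorial OpenCV sketch of the measurement idea described here: derive a pixel-to-centimeter scale from a checkered reference panel and estimate the diameter of a taped log end face. The square size, pattern size, and thresholding choices are assumptions, not the study's actual program.

```python
# Illustrative sketch: estimate a log's end-face diameter from an image,
# converting pixels to centimeters with a scale factor taken from a checkered
# panel of known square size. Values below are assumptions for the example.
import cv2
import numpy as np

CHECKER_SQUARE_CM = 5.0       # assumed real-world size of one checker square

def pixels_per_cm(panel_img: np.ndarray, pattern_size=(7, 7)) -> float:
    """Derive the pixel/cm scale from the inner corners of the checker panel."""
    gray = cv2.cvtColor(panel_img, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern_size)
    if not found:
        raise RuntimeError("checker panel not detected")
    side_px = np.linalg.norm(corners[0, 0] - corners[1, 0])  # one square edge in pixels
    return side_px / CHECKER_SQUARE_CM

def log_diameter_cm(log_img: np.ndarray, scale_px_per_cm: float) -> float:
    """Threshold the taped end face, fit an enclosing circle, return diameter in cm."""
    gray = cv2.cvtColor(log_img, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    largest = max(contours, key=cv2.contourArea)
    _, radius_px = cv2.minEnclosingCircle(largest)
    return 2.0 * radius_px / scale_px_per_cm
```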

Microbial Contamination of Seasoned and Dried Squid Dosidicus gigas during Processing (조미오징어(Dosidicus gigas)의 가공 공정 중 미생물 오염도 및 오염원에 관한 연구)

  • Choi, Kyoo-Duck;Park, Uk-Yeon;Shin, Il-Shik
    • Korean Journal of Fisheries and Aquatic Sciences
    • /
    • v.45 no.5
    • /
    • pp.445-453
    • /
    • 2012
  • This study examined microbial contamination during the processing of seasoned and dried squid Dosidicus gigas, including the apparatus, machines, and employees' gloves at each processing step at two companies. The numbers of bacteria floating in the air of each processing area were also examined. The numbers of Staphylococcus aureus (3.6-6.0 log CFU/g) and Escherichia coli (1.3-1.4 log MPN/100 g) in domestic and imported daruma (a semi-processed product of seasoned and dried squid) at companies A and B exceeded the regulatory limits of the Food Sanitary Law of Korea (S. aureus, ≤2.0 log CFU/g; E. coli, negative). S. aureus in both daruma was reduced to below the detection limit or to 3.6 log CFU/g after the roasting step, but increased again to 3.3 and 5.5 log CFU/g after the mechanical tearing step at companies A and B, respectively. E. coli showed similar tendencies at both companies. The surfaces of the apparatus, machines, and employees' gloves that contacted daruma were also contaminated with S. aureus (1.0-5.5 log CFU/m²) and E. coli (negative to 3.5 log MPN/m²). The numbers of bacteria floating in the air were high (1.7-5.1 log CFU/m³) at both companies. These results suggest that sanitation standard operating procedures (SSOP) must be developed to control microbial contamination in seasoned and dried squid.

Development of Log Processing Module and Log Server for Ethernet Shipboard Integration Networks (이더넷 기반 선박 통합 네트워크를 위한 로그 처리 모듈 및 로그 서버의 개발)

  • Hwang, Hun-Gyu;Yoon, Jin-Sik;Seo, Jeong-Min;Lee, Seong-Dae;Jang, Kil-Woong;Park, Hyu-Chan;Lee, Jang-Se
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.2
    • /
    • pp.331-338
    • /
    • 2011
  • The objective of shipboard integration networks is to exchange and manage integrated information. Shipboard integration networks use UDP (User Datagram Protocol) multicast to exchange information. However, such information can be lost or damaged because UDP cannot guarantee reliability. The shipboard integration network standard defines error log functions for lost or damaged information. In this paper, we analyze the internal and external log functions: the internal log function records errors internally, and the external log function sends error messages to a log server and records them in a database. We also develop a log processing module and log server for the external log function.
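
As a minimal editorial sketch of the external log function described above, the example below listens for error messages over UDP and records them in a local SQLite database. The port number, message format, and table schema are assumptions, not the paper's implementation.

```python
# Minimal sketch of an external log server: receive error reports over UDP and
# persist them in a database. Port, message format, and schema are assumed.
import socket
import sqlite3
from datetime import datetime, timezone

db = sqlite3.connect("shipnet_errors.db")
db.execute("CREATE TABLE IF NOT EXISTS error_log (ts TEXT, sender TEXT, message TEXT)")

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 50555))            # hypothetical log-server port

while True:
    data, addr = sock.recvfrom(4096)     # one error report per datagram
    db.execute(
        "INSERT INTO error_log VALUES (?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), addr[0], data.decode("utf-8", "replace")),
    )
    db.commit()
```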

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services
    • /
    • v.14 no.6
    • /
    • pp.71-84
    • /
    • 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized optimization for users. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, realizing flexible storage expansion functions for processing a massive amount of unstructured log data, and executing the considerable number of functions needed to categorize and analyze the stored unstructured log data, is difficult in existing computing environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses an IaaS (Infrastructure as a Service) cloud environment to provide flexible expansion of computing resources, including the ability to flexibly expand resources such as storage space and memory when storage needs grow or log data increase rapidly. Moreover, to overcome the processing limits of existing analysis tools when real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. Furthermore, because HDFS (Hadoop Distributed File System) stores data by generating copies of the block units of the aggregated log data, the proposed system offers automatic recovery functions that allow the system to continue operating after a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are inappropriate for processing unstructured log data. Further, the strict schemas of relational databases make it difficult to expand by distributing stored data across additional nodes when the amount of data increases rapidly. NoSQL does not provide the complex computations that relational databases may provide, but it can easily expand the database through node dispersion when the amount of data increases rapidly; it is a non-relational database with an appropriate structure for processing unstructured data. NoSQL data models are usually classified into key-value, column-oriented, and document-oriented types. Of these, the proposed system uses MongoDB, a representative document-oriented database with a free schema structure. MongoDB is introduced to the proposed system because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an Auto-Sharding function that automatically expands storage.
The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module. When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies the data according to the type of log data and distributes them to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, the Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require real-time analysis are stored in the MySQL module and provided in real time by the log graph generator module. The aggregated log data per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are parallel-distributed and processed by the Hadoop-based analysis module. A comparative evaluation of log data insert and query performance is carried out against a log data processing system that uses only MySQL; this evaluation demonstrates the proposed system's superiority. Moreover, an optimal chunk size is confirmed through a log data insert performance evaluation of MongoDB for various chunk sizes.
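
As an editorial sketch of the routing idea performed by the log collector module, the example below sends records flagged for real-time analysis to MySQL and everything else to MongoDB. The log type names, field layout, and connection settings are assumptions, not the paper's implementation.

```python
# Illustrative routing sketch for a log collector: real-time log types go to
# MySQL, bulk unstructured logs go to MongoDB. All names are assumptions.
import pymysql
from pymongo import MongoClient

mongo_logs = MongoClient("mongodb://localhost:27017")["bank_logs"]["bulk"]
mysql_conn = pymysql.connect(host="localhost", user="loguser",
                             password="change-me", database="bank_logs")

REALTIME_TYPES = {"auth_failure", "transaction_error"}   # hypothetical categories

def route(record: dict) -> None:
    """Send a classified log record to the store matching its analysis needs."""
    if record.get("type") in REALTIME_TYPES:
        with mysql_conn.cursor() as cur:
            cur.execute("INSERT INTO realtime_log (ts, type, payload) VALUES (%s, %s, %s)",
                        (record["ts"], record["type"], record["payload"]))
        mysql_conn.commit()
    else:
        mongo_logs.insert_one(record)     # schema-free storage for bulk analysis
```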

Auto Configuration Module for Logstash in Elasticsearch Ecosystem

  • Ahmed, Hammad;Park, Yoosang;Choi, Jongsun;Choi, Jaeyoung
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2018.10a
    • /
    • pp.39-42
    • /
    • 2018
  • Log analysis and monitoring are of significant importance in most systems. Log management is of core importance in applications such as distributed applications, cloud-based applications, and applications designed for big data. These applications produce a large number of log files that contain essential information, which can be used for log analytics to understand relevant patterns in varying log data. However, tools are needed for parsing, storing, and visualizing log information. "Elasticsearch, Logstash, and Kibana" (ELK Stack) is one of the most popular tool sets for log management. For ingesting log files, configuration files are of key importance, as they cover all the services needed to input, process, and output the log files. However, creating configuration files is often complicated and time-consuming, as it requires domain expertise and manual work. In this paper, an auto-configuration module for Logstash is proposed, which aims to auto-generate the configuration files for Logstash. The primary purpose of this paper is to provide a mechanism that can be used to auto-generate the configuration files for the corresponding log files in less time. The proposed module aims to improve the overall efficiency of the log management system.
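
To show the kind of output such a module would produce, here is a minimal editorial sketch that renders a Logstash pipeline configuration (file input, grok filter, Elasticsearch output) from a few parameters. The paths, pattern, and host values are placeholders, and the module's actual generation logic is not described in the abstract.

```python
# Minimal sketch: render a Logstash pipeline config from a template.
# The file/grok/elasticsearch blocks use standard Logstash plugin options;
# the concrete values passed in below are placeholders.
LOGSTASH_TEMPLATE = """\
input {{
  file {{
    path => "{log_path}"
    start_position => "beginning"
  }}
}}
filter {{
  grok {{
    match => {{ "message" => "{grok_pattern}" }}
  }}
}}
output {{
  elasticsearch {{
    hosts => ["{es_host}"]
    index => "{index_name}"
  }}
}}
"""

def write_config(log_path: str, grok_pattern: str, es_host: str,
                 index_name: str, out_file: str = "generated.conf") -> None:
    """Render the template and write a ready-to-use Logstash pipeline config."""
    with open(out_file, "w", encoding="utf-8") as f:
        f.write(LOGSTASH_TEMPLATE.format(log_path=log_path, grok_pattern=grok_pattern,
                                         es_host=es_host, index_name=index_name))

write_config("/var/log/app/*.log", "%{COMBINEDAPACHELOG}", "localhost:9200", "app-logs")
```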

UX Analysis for Mobile Devices Using MapReduce on Distributed Data Processing Platform (MapReduce 분산 데이터처리 플랫폼에 기반한 모바일 디바이스 UX 분석)

  • Kim, Sungsook;Kim, Seonggyu
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.9
    • /
    • pp.589-594
    • /
    • 2013
  • As the concept of web characteristics represented by openness and mind sharing grows more and more popular, device log data generated by both users and developers have become increasingly complicated. For such reasons, a log data processing mechanism that automatically produces meaningful data sets from large amounts of log records has become necessary for mobile device UX (User eXperience) analysis. In this paper, we define the attributes of the log data to be analyzed, reflecting the characteristics of a mobile device, and collect real log data from mobile device users. Using the MapReduce programming paradigm on the Hadoop platform, we perform a mobile device user experience analysis in a distributed processing environment on the collected real log data. We then demonstrate the effectiveness of the proposed analysis mechanism by applying various combinations of Map and Reduce steps to produce a simple data schema from the large amount of complex log records.
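
The sketch below is an editorial example of one such Map/Reduce combination: it collapses raw device log records into a simple per-(device, event) schema with a count and average duration. It is written with the mrjob library for brevity, and the "device_id&lt;TAB&gt;event&lt;TAB&gt;duration" record layout is an assumed example, not the paper's actual log schema.

```python
# Sketch of one Map/Reduce step: reduce complex device log records to a simple
# per-(device, event) usage summary. Record layout is an assumption.
from mrjob.job import MRJob

class MRUxEventCount(MRJob):
    def mapper(self, _, line):
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            device_id, event, duration = fields[0], fields[1], float(fields[2])
            yield (device_id, event), (1, duration)

    def reducer(self, key, values):
        count, total = 0, 0.0
        for n, dur in values:
            count += n
            total += dur
        # simple schema: device, event, how often, and average dwell time
        yield key, {"count": count, "avg_duration": total / count}

if __name__ == "__main__":
    MRUxEventCount.run()
```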

Application of ATP Bioluminescence Method for Measurement of Microbial Contamination in Raw Meat, Meat and Dairy Processing Line (식육 및 육가공·유가공 생산라인의 환경미생물오염도 측정을 위한 ATP 방법의 이용)

  • 강현미;엄양섭;안흥석;김천제;최경환;정충일
    • Journal of Food Hygiene and Safety
    • /
    • v.15 no.3
    • /
    • pp.252-255
    • /
    • 2000
  • This study was conducted to investigate the application of ATP bioluminescence to measuring the degree of microbial contamination in raw meat and in meat and milk processing lines. Samples collected from a slaughterhouse and from meat and milk processing plants were tested to estimate bacterial numbers using both the ATP bioluminescence method and the conventional method. The former result was expressed as an R-mATP value (log RLU/ml) and the latter as CFU (log/ml). The correlation coefficient (r) between aerobic counts (CFU, log/ml) and R-mATP values (log RLU/ml) was 0.93 (n=408). The correlation for beef, pork, and chicken samples was 0.93 (n=220), and that for the meat processing and dairy processing plants was 0.93 (n=187). In addition, the correlation coefficient between aerobic counts and R-mATP was 0.87 (n=252) below a bacterial count of 1×10⁵/ml and 0.74 (n=152) above 10⁵/ml.
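
For readers unfamiliar with how the reported r values are obtained, the sketch below computes a Pearson correlation coefficient between log-transformed plate counts and ATP readings. The paired values are hypothetical and serve only to show the calculation, not to reproduce the study's data.

```python
# Sketch of the correlation computed in the study: Pearson r between
# log-transformed plate counts (log CFU/ml) and ATP readings (log RLU/ml).
# The paired values below are hypothetical examples.
import numpy as np

log_cfu = np.array([3.1, 4.0, 4.8, 5.5, 6.2])   # hypothetical aerobic counts, log CFU/ml
log_rlu = np.array([2.9, 3.8, 4.9, 5.3, 6.0])   # hypothetical R-mATP values, log RLU/ml

r = np.corrcoef(log_cfu, log_rlu)[0, 1]          # Pearson correlation coefficient
print(f"r = {r:.2f}")
```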
