Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)
-
- Journal of Internet Computing and Services
- /
- v.14 no.6
- /
- pp.71-84
- /
- 2013
Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, the realization of flexible storage expansion functions for processing a massive amount of unstructured log data and executing a considerable number of functions to categorize and analyze the stored unstructured log data is difficult in existing computer environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for processing unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses the IaaS (Infrastructure as a Service) cloud environment to provide a flexible expansion of computing resources and includes the ability to flexibly expand resources such as storage space and memory under conditions such as extended storage or rapid increase in log data. Moreover, to overcome the processing limits of the existing analysis tool when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. Furthermore, because the HDFS (Hadoop Distributed File System) stores data by generating copies of the block units of the aggregated log data, the proposed system offers automatic restore functions for the system to continually operate after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based Mongo DB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as the MySQL databases have complex schemas that are inappropriate for processing unstructured log data. Further, strict schemas like those of relational databases cannot expand nodes in the case wherein the stored data are distributed to various nodes when the amount of data rapidly increases. NoSQL does not provide the complex computations that relational databases may provide but can easily expand the database through node dispersion when the amount of data increases rapidly; it is a non-relational database with an appropriate structure for processing unstructured data. The data models of the NoSQL are usually classified as Key-Value, column-oriented, and document-oriented types. Of these, the representative document-oriented data model, MongoDB, which has a free schema structure, is used in the proposed system. MongoDB is introduced to the proposed system because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an Auto-Sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module. When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies data according to the type of log data and distributes it to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require a real-time log data analysis are stored in the MySQL module and provided real-time by the log graph generator module. The aggregated log data per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are parallel-distributed and processed by the Hadoop-based analysis module. A comparative evaluation is carried out against a log data processing system that uses only MySQL for inserting log data and estimating query performance; this evaluation proves the proposed system's superiority. Moreover, an optimal chunk size is confirmed through the log data insert performance evaluation of MongoDB for various chunk sizes.
Recently, as economic property it has become necessary to acquire and utilize the framework for water resource measurement and performance management as the property of water resources changes to hold "public property". To date, the evaluation of water technology has been carried out by feasibility study analysis or technology assessment based on net present value (NPV) or benefit-to-cost (B/C) effect, however it is not yet systemized in terms of valuation models to objectively assess an economic value of technology-based business to receive diffusion and feedback of research outcomes. Therefore, K-water (known as a government-supported public company in Korea) company feels the necessity to establish a technology valuation framework suitable for technical characteristics of water resources fields in charge and verify an exemplified case applied to the technology. The K-water evaluation technology applied to this study, as a public interest goods, can be used as a tool to measure the value and achievement contributed to society and to manage them. Therefore, by calculating the value in which the subject technology contributed to the entire society as a public resource, we make use of it as a basis information for the advertising medium of performance on the influence effect of the benefits or the necessity of cost input, and then secure the legitimacy for large-scale R&D cost input in terms of the characteristics of public technology. Hence, K-water company, one of the public corporation in Korea which deals with public goods of 'water resources', will be able to establish a commercialization strategy for business operation and prepare for a basis for the performance calculation of input R&D cost. In this study, K-water has developed a web-based technology valuation model for public interest type water resources based on the technology evaluation system that is suitable for the characteristics of a technology in water resources fields. In particular, by utilizing the evaluation methodology of the Institute of Advanced Industrial Science and Technology (AIST) in Japan to match the expense items to the expense accounts based on the related benefit items, we proposed the so-called 'K-water's proprietary model' which involves the 'cost-benefit' approach and the FCF (Free Cash Flow), and ultimately led to build a pipeline on the K-water research performance management system and then verify the practical case of a technology related to "desalination". We analyze the embedded design logic and evaluation process of web-based valuation system that reflects characteristics of water resources technology, reference information and database(D/B)-associated logic for each model to calculate public interest-based and profit-based technology values in technology integrated management system. We review the hybrid evaluation module that reflects the quantitative index of the qualitative evaluation indices reflecting the unique characteristics of water resources and the visualized user-interface (UI) of the actual web-based evaluation, which both are appended for calculating the business value based on financial data to the existing web-based technology valuation systems in other fields. K-water's technology valuation model is evaluated by distinguishing between public-interest type and profitable-type water technology. First, evaluation modules in profit-type technology valuation model are designed based on 'profitability of technology'. For example, the technology inventory K-water holds has a number of profit-oriented technologies such as water treatment membranes. On the other hand, the public interest-type technology valuation is designed to evaluate the public-interest oriented technology such as the dam, which reflects the characteristics of public benefits and costs. In order to examine the appropriateness of the cost-benefit based public utility valuation model (i.e. K-water specific technology valuation model) presented in this study, we applied to practical cases from calculation of benefit-to-cost analysis on water resource technology with 20 years of lifetime. In future we will additionally conduct verifying the K-water public utility-based valuation model by each business model which reflects various business environmental characteristics.
The wall shear stress in the vicinity of end-to end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer(FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flow with low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE: graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor of ANFH formation and the graft failure. The present study suggests a correlation between regions of the low wall shear stress and the development of anastomotic neointimal fibrous hyperplasia(ANPH) in end-to-end anastomoses. 30523 T00401030523 ^x Air pressure decay(APD) rate and ultrafiltration rate(UFR) tests were performed on new and saline rinsed dialyzers as well as those roused in patients several times. C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) reused dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas saline rinsed and reused dialyzers showed considerable amount of decay. C-DAH dialyzers had a larger APD(11.70
The wall shear stress in the vicinity of end-to end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer(FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flow with low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE: graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor of ANFH formation and the graft failure. The present study suggests a correlation between regions of the low wall shear stress and the development of anastomotic neointimal fibrous hyperplasia(ANPH) in end-to-end anastomoses. 30523 T00401030523 ^x Air pressure decay(APD) rate and ultrafiltration rate(UFR) tests were performed on new and saline rinsed dialyzers as well as those roused in patients several times. C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) reused dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas saline rinsed and reused dialyzers showed considerable amount of decay. C-DAH dialyzers had a larger APD(11.70