• Title/Summary/Keyword: Function generator


Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin; Han, Seungho; Cui, Yun; Lee, Hanku
    • Journal of Internet Computing and Services / v.14 no.6 / pp.71-84 / 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from computer system inspection and process optimization to customized optimization for users. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amounts of log data generated by banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, existing computing environments make it difficult both to realize flexible storage expansion for a massive amount of unstructured log data and to execute the many functions needed to categorize and analyze the stored data. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for unstructured log data that are difficult to process with the analysis tools and management systems of the existing computing infrastructure.
The proposed system uses an IaaS (Infrastructure as a Service) cloud environment to provide flexible expansion of computing resources, including the ability to flexibly expand resources such as storage space and memory when storage must be extended or log data increase rapidly. Moreover, to overcome the processing limits of existing analysis tools when real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. Furthermore, because HDFS (Hadoop Distributed File System) stores data by replicating blocks of the aggregated log data, the proposed system offers automatic restore functions that allow it to continue operating after recovering from a malfunction.
Finally, by establishing a distributed database with the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as MySQL have complex, strict schemas that are inappropriate for unstructured log data, and they cannot easily add nodes to distribute the stored data when the amount of data increases rapidly. NoSQL databases do not provide the complex operations that relational databases offer, but they can easily be expanded through node dispersion when the amount of data increases rapidly; their non-relational structure is appropriate for processing unstructured data. NoSQL data models are usually classified as key-value, column-oriented, and document-oriented types. Of these, the proposed system uses MongoDB, a representative document-oriented database with a free schema structure. MongoDB is adopted because its flexible schema makes it easy to process unstructured log data, it facilitates node expansion when the amount of data is rapidly increasing, and it provides an Auto-Sharding function that automatically expands storage.
The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module. When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies them according to the type of log data and distributes them to the MongoDB module and the MySQL module. The log graph generator module generates the log analysis results of the MongoDB module, the Hadoop-based analysis module, and the MySQL module according to the analysis time and the type of the aggregated log data, and provides them to the user through a web interface. Log data that require real-time analysis are stored in the MySQL module and provided in real time by the log graph generator module. The log data aggregated per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are processed in a parallel-distributed manner by the Hadoop-based analysis module. A comparative evaluation of log data insertion and query performance is carried out against a log data processing system that uses only MySQL; this evaluation demonstrates the proposed system's superiority. Moreover, an optimal chunk size is confirmed through a MongoDB log data insert performance evaluation for various chunk sizes.
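A minimal illustrative sketch of the log-routing idea in this abstract, written in Python with pymongo: incoming log records are classified by type, unstructured records go into a schema-free MongoDB collection, and records needing real-time analysis go to a stand-in for the MySQL module. All names (route_log, bank_logs, realtime_buffer) and the record fields are hypothetical; the paper does not publish its actual interfaces, and the MySQL and Hadoop sides are only stubbed here.

```python
# Sketch of the log collector's classify-and-distribute step (hypothetical names).
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumed local MongoDB instance
logs = client["bank"]["bank_logs"]                 # illustrative database/collection

realtime_buffer = []  # stand-in for the MySQL module used for real-time analysis

def route_log(record: dict) -> None:
    """Classify a log record by type and send it to the appropriate store."""
    if record.get("type") == "realtime":
        # In the paper's design these records would go to the MySQL module.
        realtime_buffer.append(record)
    else:
        # Unstructured logs go to MongoDB; documents need not share a schema.
        logs.insert_one(record)

# Documents with different fields can coexist in one collection, which is the
# "flexible schema" property the abstract relies on. Storage expansion would be
# handled server-side by MongoDB's Auto-Sharding (e.g. sh.shardCollection in
# the mongo shell), not by this client code.
route_log({"type": "transaction", "client": "A-1021", "amount": 150000})
route_log({"type": "error", "module": "atm-07", "trace": ["E42", "E43"]})
route_log({"type": "realtime", "metric": "tps", "value": 312})
```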

Climate Change Impact on Nonpoint Source Pollution in a Rural Small Watershed (기후변화에 따른 농촌 소유역에서의 비점오염 영향 분석)

  • Hwang, Sye-Woon; Jang, Tae-Il; Park, Seung-Woo
    • Korean Journal of Agricultural and Forest Meteorology / v.8 no.4 / pp.209-221 / 2006
  • The purpose of this study is to analyze the effects of climate change on nonpoint source pollution in a small watershed using a mid-range model. The study area is a rural basin that covers 384 ha, composed of 50% forest and 19% paddy. Hydrologic and water quality data were monitored from 1996 to 2004, and the feasibility of the GWLF (Generalized Watershed Loading Function) model was examined for the agricultural small watershed using the data obtained from the study area. As one of the studies on climate change, KEI (Korea Environment Institute) has presented the monthly variation ratio of rainfall in Korea based on a climate change scenario for rainfall and temperature. These values and forty-one years of observed daily rainfall data from 1964 to 2004 at Suwon were used to generate daily weather data with the stochastic weather generator model (WGEN). Stream runoff was calibrated with data from 1996-1999 and verified with data from 2002-2004, yielding coefficients of determination (R²) of 0.70-0.91 and root mean square errors (RMSE) of 2.11-5.71. Water quality simulations for SS, TN, and TP showed R² values of 0.58, 0.47, and 0.62, respectively. The results for the impact of climate change on nonpoint source pollution show that, if the watershed factors are maintained under present circumstances, TN and TP pollutant loads would be expected to increase markedly during the rainy season over the next fifty years.
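As an aside, the calibration statistics quoted above (R² and RMSE between observed and simulated runoff) can be computed as in the short Python sketch below. The arrays are placeholder values, not the study's data, and the R² shown is the common 1 - SS_res/SS_tot form, which may differ slightly from the regression-based definition the authors used.

```python
# Illustrative computation of calibration statistics for a model run.
import numpy as np

def r_squared(observed: np.ndarray, simulated: np.ndarray) -> float:
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((observed - simulated) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(observed: np.ndarray, simulated: np.ndarray) -> float:
    """Root mean square error between observed and simulated series."""
    return float(np.sqrt(np.mean((observed - simulated) ** 2)))

# Hypothetical observed vs simulated daily runoff (mm/day), for illustration only.
obs = np.array([3.1, 5.4, 2.2, 8.9, 4.0])
sim = np.array([2.8, 5.9, 2.5, 8.1, 4.4])
print(f"R^2 = {r_squared(obs, sim):.2f}, RMSE = {rmse(obs, sim):.2f}")
```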

Study of Motion-induced Dose Error Caused by Irregular Tumor Motion in Helical Tomotherapy (나선형 토모테라피에서 불규칙적인 호흡으로 발생되는 움직임에 의한 선량 오차에 대한 연구)

  • Cho, Min-Seok; Kim, Tae-Ho; Kang, Seong-Hee; Kim, Dong-Su; Kim, Kyeong-Hyeon; Cheon, Geum Seong; Suh, Tae Suk
    • Progress in Medical Physics / v.26 no.3 / pp.119-126 / 2015
  • The purpose of this study is to analyze the motion-induced dose error generated by each tumor motion parameter of irregular tumor motion in helical tomotherapy. To understand the effect of irregular tumor motion, a simple analytical model was simulated. Moving cases with tumor motion were divided into a slightly irregular tumor motion case, a large irregular tumor motion case, and a patient case. The slightly irregular tumor motion case was simulated with a variability of 10% in the tumor motion parameters of amplitude (amplitude case), period (period case), and baseline (baseline case), while the large irregular tumor motion case was simulated with a variability of 40%. In the phase case, the initial phase of the tumor motion was divided into end inhale, mid exhale, end exhale, and mid inhale, and the simulated dose profiles for each case were compared. The patient case was also investigated to verify the motion-induced dose error under 'clinical-like' conditions. According to the simulation process, the dose profile was calculated, and each moving case was compared with a static case that has no tumor motion. In the amplitude, period, and baseline cases, the results show that the motion-induced dose error in the large irregular tumor motion case was larger than that in the slightly irregular or regular tumor motion cases. Because the offset effect was inversely proportional to the irregularity of the tumor motion, the offset effect was smaller in the large irregular tumor motion case than in the slightly irregular or regular tumor motion cases. In the phase case, a larger dose discrepancy was observed in the irregular tumor motion case than in the regular tumor motion case. A larger motion-induced dose error was also observed in the patient case than in the regular tumor motion case. This study analyzed the motion-induced dose error as a function of each tumor motion parameter of irregular tumor motion during helical tomotherapy. The analysis showed that controlling the variability of irregular tumor motion is important; we believe that this variability can be reduced by using abdominal compression and respiratory training.
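The irregular motion described above (amplitude, period, and baseline varied cycle by cycle with 10% or 40% variability around a regular breathing trace) can be sketched as below in Python. The sinusoidal waveform, the nominal parameter values, and the choice to scale baseline drift by the nominal amplitude are assumptions made for illustration; they are not taken from the paper's analytical model.

```python
# Sketch of a breathing-like tumor motion trace with cycle-to-cycle variability.
import numpy as np

def irregular_motion(n_cycles=10, amplitude=10.0, period=4.0, baseline=0.0,
                     variability=0.10, dt=0.1, seed=0):
    """Return (time, position) for a sinusoidal trace whose amplitude, period,
    and baseline each vary by +/- `variability` per cycle (0.10 or 0.40 here,
    mirroring the abstract's "slightly" and "large" irregular cases)."""
    rng = np.random.default_rng(seed)
    times, positions, t0 = [], [], 0.0
    for _ in range(n_cycles):
        a = amplitude * (1 + rng.uniform(-variability, variability))
        p = period * (1 + rng.uniform(-variability, variability))
        # Baseline drift scaled by the nominal amplitude (an assumption).
        b = baseline + amplitude * rng.uniform(-variability, variability)
        t = np.arange(0.0, p, dt)
        times.append(t0 + t)
        positions.append(b + a * np.sin(2 * np.pi * t / p))
        t0 += p
    return np.concatenate(times), np.concatenate(positions)

# 10% variability -> slightly irregular case; 40% -> large irregular case.
t_slight, x_slight = irregular_motion(variability=0.10)
t_large, x_large = irregular_motion(variability=0.40, seed=1)
```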