• Title/Summary/Keyword: Graph Optimization

Search Results: 199

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin; Han, Seungho; Cui, Yun; Lee, Hanku
    • Journal of Internet Computing and Services / v.14 no.6 / pp.71-84 / 2013
  • Log data, which record the wealth of information created while operating computer systems, are used in many processes, from computer system inspection and process optimization to customized optimization for users. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amounts of log data generated by banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, existing computing environments make it difficult to realize flexible storage expansion for massive amounts of unstructured log data and to execute the many functions needed to categorize and analyze the stored data. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for unstructured log data that are difficult to handle with the analysis tools and management systems of existing computing infrastructure. The proposed system uses an IaaS (Infrastructure as a Service) cloud environment to provide flexible expansion of computing resources, including storage space and memory, when storage must be extended or log data increase rapidly. Moreover, to overcome the processing limits of existing analysis tools when real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel distributed processing of the massive amount of log data. Furthermore, because HDFS (Hadoop Distributed File System) stores data by replicating blocks of the aggregated log data, the proposed system offers automatic restore functions so that it can continue operating after recovering from a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are ill suited to processing unstructured log data. Further, the strict schemas of relational databases make it hard to add nodes when rapidly growing data must be distributed across multiple nodes. NoSQL databases do not provide the complex computations that relational databases offer, but they can easily expand through node dispersion when the amount of data increases rapidly; their non-relational structure is appropriate for processing unstructured data. NoSQL data models are usually classified into key-value, column-oriented, and document-oriented types. Of these, the proposed system uses MongoDB, a representative document-oriented database with a free schema structure. MongoDB is adopted because its flexible schema makes unstructured log data easy to process, it facilitates node expansion when the amount of data grows rapidly, and it provides an Auto-Sharding function that automatically expands storage.
The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module. When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies the data according to log type and distributes them to the MongoDB module and the MySQL module. The log graph generator module produces the analysis results of the MongoDB, Hadoop-based analysis, and MySQL modules per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require real-time analysis are stored in the MySQL module and served in real time by the log graph generator module. The aggregated log data per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are processed in a parallel, distributed manner by the Hadoop-based analysis module. A comparative evaluation of log insertion and query performance against a system that uses only MySQL demonstrates the proposed system's superiority. Moreover, an optimal chunk size is confirmed through a MongoDB log insertion performance evaluation for various chunk sizes (a minimal sketch of such a sharded log insert follows below).
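To make the ingestion path concrete, the following is a minimal sketch, not the authors' implementation, of a log collector that turns raw bank log lines into schema-free documents and stores them in a sharded MongoDB collection via pymongo. The connection string, database and collection names, shard key, and the "|"-delimited log format are all assumptions for illustration.

```python
# Minimal sketch (assumed names: a mongos router on localhost, database
# "banklogs", collection "transactions", shard key log_type + timestamp).
from datetime import datetime, timezone
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")   # mongos of a sharded cluster

# Enable Auto-Sharding so storage can grow by adding nodes.
client.admin.command("enableSharding", "banklogs")
client.admin.command("shardCollection", "banklogs.transactions",
                     key={"log_type": 1, "timestamp": 1})

def collect(raw_line: str) -> dict:
    """Classify one raw log line into a schema-free document."""
    fields = raw_line.rstrip("\n").split("|")        # hypothetical delimiter
    return {
        "log_type": fields[0],                       # e.g. "withdrawal"
        "timestamp": datetime.now(timezone.utc),
        "payload": fields[1:],                       # unstructured remainder
    }

# Aggregated log data per unit time go to MongoDB; the Hadoop-based module
# later reads them back for batch analysis and the graph generator plots them.
docs = [collect(line) for line in open("bank.log", encoding="utf-8")]
client["banklogs"]["transactions"].insert_many(docs)
```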

Development of a Stock Trading System Using M & W Wave Patterns and Genetic Algorithms (M&W 파동 패턴과 유전자 알고리즘을 이용한 주식 매매 시스템 개발)

  • Yang, Hoonseok; Kim, Sunwoong; Choi, Heung Sik
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.63-83 / 2019
  • Investors prefer to look for trading points in the shapes shown on a price chart rather than in complex analyses such as corporate intrinsic value analysis or technical indicator analysis. However, pattern analysis is difficult and has been computerized far less than users need. In recent years, there have been many studies of stock price patterns using various machine learning techniques, including neural networks, from the field of artificial intelligence (AI). In particular, the development of IT has made it easier to analyze huge amounts of chart data to find patterns that can predict stock prices. Although short-term price forecasting power has improved, long-term forecasting power remains limited, so such models are used for short-term trading rather than long-term investment. Other studies have focused on mechanically and accurately identifying patterns that past technology could not recognize, but such approaches can be vulnerable in practice because whether the patterns found are suitable for trading is a separate question. When such studies find a meaningful pattern, they locate a point that matches it and measure performance after n days, assuming a purchase at that point. Because this approach calculates virtual returns, it can diverge considerably from reality. Existing research tries to discover patterns with price prediction power; this study instead proposes to define the patterns first and to trade when a pattern with a high success probability appears. The M&W wave patterns published by Merrill (1980) are simple because they can be distinguished by five turning points. Although some patterns were reported to have price predictability, no performance in the actual market had been reported. The simplicity of a pattern consisting of five turning points has the advantage of reducing the cost of increasing pattern recognition accuracy. In this study, 16 upward-reversal patterns and 16 downward-reversal patterns are reclassified into ten groups so that they can be implemented easily in the system. Only one pattern with a high success rate per group is selected for trading, on the assumption that patterns that succeeded frequently in the past are likely to succeed in the future; we trade when such a pattern occurs. The evaluation is realistic because performance is measured assuming that both the buy and the sell are actually executed. We tested three ways of calculating the turning points (a sketch of the first rule appears after this abstract). In the first, the minimum change rate zig-zag method, price movements below a certain percentage are removed and the vertices are calculated. In the second, the high-low line zig-zag method, a high price that touches the n-day high line is taken as a peak, and a low price that touches the n-day low line is taken as a valley. In the third, the swing wave method, a central high price that is higher than the n high prices on each side is taken as a peak, and a central low price that is lower than the n low prices on each side is taken as a valley. The swing wave method was superior in the test results, which we interpret to mean that trading after confirming the completion of a pattern is more effective than trading while the pattern is still incomplete.
Because the number of possible cases in this simulation was too large to search exhaustively for patterns with high success rates, genetic algorithms (GA) were the most suitable solution. We also performed the simulation using the walk-forward analysis (WFA) method, which evaluates the test section and the application section separately, so the system could respond appropriately to market changes. In this study, we optimize at the portfolio level because optimizing the variables for each individual stock carries a risk of over-optimization. Therefore, we set the number of constituent stocks to 20 to increase the effect of diversified investment while avoiding over-optimization. We tested the KOSPI market by dividing it into six categories; the small-cap portfolio was the most successful, and the high-volatility portfolio was the second best. This suggests that some price volatility is needed for patterns to form, but that higher volatility is not always better.
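As a concrete illustration of the first turning-point rule mentioned above, here is a minimal sketch of a minimum-change-rate zig-zag: price moves smaller than a threshold are ignored and only the surviving reversal points survive as turning points. The 3% threshold and the sample price series are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a minimum-change-rate zig-zag (threshold and data assumed).
def zigzag_turning_points(prices, threshold=0.03):
    pivots = [0]                      # index of the last (tentative) turning point
    direction = 0                     # +1 rising leg, -1 falling leg, 0 undecided
    for i in range(1, len(prices)):
        change = (prices[i] - prices[pivots[-1]]) / prices[pivots[-1]]
        if direction >= 0 and change <= -threshold:
            direction = -1
            pivots.append(i)          # confirmed peak -> start of a falling leg
        elif direction <= 0 and change >= threshold:
            direction = +1
            pivots.append(i)          # confirmed valley -> start of a rising leg
        elif (direction > 0 and prices[i] > prices[pivots[-1]]) or \
             (direction < 0 and prices[i] < prices[pivots[-1]]):
            pivots[-1] = i            # extend the current leg to a new extreme
    return pivots

prices = [100, 103, 101, 97, 99, 104, 108, 105, 100]
print(zigzag_turning_points(prices))  # five indices forming an M/W-style skeleton
```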

Evaluation of Approximate Exposure to Low-dose Ionizing Radiation from Medical Images using a Computed Radiography (CR) System (전산화 방사선촬영(CR) 시스템을 이용한 근사적 의료 피폭 선량 평가)

  • Yu, Minsun; Lee, Jaeseung; Im, Inchul
    • Journal of the Korean Society of Radiology / v.6 no.6 / pp.455-464 / 2012
  • This study proposes an experimental model for evaluating approximate exposure to low-dose ionizing radiation from medical images acquired with a computed radiography (CR) system during standard X-ray examinations; the model can be compared against diagnostic reference levels (DRLs) and used to suggest optimized protection conditions for low-dose medical radiation. The entrance surface dose (ESD) was cross-measured with a standard dosimeter and optically stimulated luminescence dosimeters (OSLDs) under various tube voltage and current settings of the X-ray generator. The Hounsfield unit (HU) scale was also measured for each experimental condition in the CR system, and after tabulating a characteristic relationship table and graph between ESD and HU, approximate radiation doses for the head, neck, thorax, abdomen, and pelvis were estimated (a sketch of this calibration-curve estimation follows below). For the head, neck, thorax, abdomen, and pelvis, the average ESD was 2.10, 2.01, 1.13, 2.97, and 1.95 mGy, respectively, and the HU values in the CR images were 3,276 ± 3.72, 3,217 ± 2.93, 2,768 ± 3.13, 3,782 ± 5.19, and 2,318 ± 4.64, respectively. Using the characteristic relationship table and graph, the ESD was estimated at approximately 2.16, 2.06, 1.19, 3.05, and 2.07 mGy, respectively. The average error between the measured and estimated ESD values was less than about 3%, within the 5% measurement tolerance generally accepted in radiology. In conclusion, this study suggests a new experimental model that can approximately assess patient radiation dose in standard X-ray examinations and can be applied to CR examinations, digital radiography, and even film-cassette systems.
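The estimation step described above amounts to reading a dose off a characteristic ESD-versus-HU curve. The following is a minimal sketch of that idea only; the calibration pairs, the linear fit, and the measured HU value are made-up illustrative assumptions, not the paper's data or its exact relationship.

```python
# Minimal sketch: fit a characteristic ESD-vs-HU curve, then estimate dose.
import numpy as np

hu_cal  = np.array([2400, 2800, 3200, 3600])   # HU from CR images at known exposures (assumed)
esd_cal = np.array([1.0, 1.6, 2.2, 2.8])       # matching entrance surface dose in mGy (assumed)

coeffs = np.polyfit(hu_cal, esd_cal, 1)        # linear characteristic curve: ESD = a*HU + b
estimate_esd = np.poly1d(coeffs)

measured_hu = 3050                             # HU read from a patient's CR image (assumed)
print(f"approximate ESD: {estimate_esd(measured_hu):.2f} mGy")
```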

Voltage-Frequency-Island Aware Energy Optimization Methodology for Network-on-Chip Design (전압-주파수-구역을 고려한 에너지 최적화 네트워크-온-칩 설계 방법론)

  • Kim, Woo-Joong; Kwon, Soon-Tae; Shin, Dong-Kun; Han, Tae-Hee
    • Journal of the Institute of Electronics Engineers of Korea SD / v.46 no.8 / pp.22-30 / 2009
  • Due to high levels of integration and complexity, the Network-on-Chip (NoC) approach has emerged as a new design paradigm to overcome on-chip communication issues and data bandwidth limits in conventional SoC (System-on-Chip) design. In particular, the exponential growth of energy consumption caused by high frequencies, synchronization, and the distribution of a single global clock signal throughout the chip has become a major design bottleneck. To deal with these issues, a globally asynchronous, locally synchronous (GALS) design combined with low-power techniques is considered. Such a design style fits nicely with the concept of voltage-frequency islands (VFIs), which has recently been introduced to achieve fine-grained system-level power management. In this paper, we propose an efficient design methodology that minimizes energy consumption by VFI partitioning on an NoC architecture as well as by assigning supply and threshold voltage levels to each VFI. The proposed algorithm, which finds the VFIs and an appropriate supply voltage for each core (or processing element), consists of traffic-aware core graph partitioning, communication-contention-delay-aware tile mapping, power-variation-aware core dynamic voltage scaling (DVS), power-efficient VFI merging, and voltage updates on the VFIs (a toy evaluation of a candidate partition is sketched below). Simulation results show an average 10.3% improvement in energy consumption compared with existing works.
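To illustrate the traffic-aware flavor of the partitioning step named above, here is a toy sketch of evaluating one candidate VFI assignment: cores in the same island share a supply voltage, transfer energy scales roughly with V², and traffic that crosses islands pays an interface penalty. The cost model, coefficients, cores, and voltages are illustrative assumptions, not the paper's formulation.

```python
# Toy sketch: energy cost of a candidate VFI partition under an assumed model.
def vfi_energy(traffic, island_of, v_island, alpha=1.0, beta=0.2):
    """traffic: {(core_a, core_b): volume}; island_of: core -> island id;
    v_island: island id -> supply voltage."""
    energy = 0.0
    for (a, b), volume in traffic.items():
        v = max(v_island[island_of[a]], v_island[island_of[b]])
        energy += alpha * volume * v ** 2          # switching energy of the transfer
        if island_of[a] != island_of[b]:
            energy += beta * volume                # mixed-clock FIFO / level-shifter penalty
    return energy

traffic   = {("cpu", "dsp"): 40, ("dsp", "mem"): 120, ("cpu", "mem"): 10}
island_of = {"cpu": 0, "dsp": 1, "mem": 1}         # candidate partition into two VFIs
v_island  = {0: 1.2, 1: 0.9}                       # assumed supply voltages per island
print(vfi_energy(traffic, island_of, v_island))    # compare against other partitions
```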

A New Clock Routing Algorithm for High Performance ICs (고성능 집적회로 설계를 위한 새로운 클락 배선)

  • 유광기; 정정화
    • Journal of the Korean Institute of Telematics and Electronics C / v.36C no.11 / pp.64-74 / 1999
  • A new clock skew optimization for clock routing using link-edge insertion is proposed in this paper. It satisfies the given skew bound and prevents the total wire length from increasing. Because clock skew is the major constraint in high-speed synchronous ICs, it must be minimized to obtain high performance. However, clock skew minimization can increase total wire length, so clock routing is performed within a given skew bound that does not induce malfunction. Clock routing under the specified skew bound can decrease the total wire length. We propose not only an algorithm that minimizes total wire length and delay time using a merging-point relocation method, but also a clock skew reduction algorithm that inserts a link edge between two nodes whose delay difference is large. The proposed algorithm constructs a new clock routing topology based on a generalized graph model, whereas previous methods use only tree-structured routing topologies. A new cost function is designed to select the two nodes that form the link edge; using it, the delay difference, and hence the clock skew, is reduced by connecting two nodes whose delay difference is large and whose distance is short (a toy version of such a selection rule is sketched below). Furthermore, a routing topology construction and wire sizing algorithm is developed to reduce clock delay. The proposed algorithm is implemented in the C programming language, and experimental results show that delay is reduced while the given skew bound is satisfied.
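The pair-selection idea in the abstract, preferring node pairs with a large delay difference and a small wiring distance for the inserted link edge, can be illustrated with a toy rule. The ratio-style cost below and the node coordinates/delays are assumptions for illustration only, not the paper's cost function.

```python
# Toy sketch: pick the sink pair with large delay difference and short distance.
from itertools import combinations

nodes = {                      # node -> (x, y, source-to-node delay)
    "s1": (0, 0, 120.0),
    "s2": (4, 1, 95.0),
    "s3": (1, 5, 118.0),
    "s4": (6, 6, 90.0),
}

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def link_gain(a, b):
    """Larger when the delay difference is big and the nodes are close."""
    (xa, ya, da), (xb, yb, db) = nodes[a], nodes[b]
    dist = manhattan((xa, ya), (xb, yb))
    return abs(da - db) / (dist + 1e-9)

best = max(combinations(nodes, 2), key=lambda pair: link_gain(*pair))
print("insert link edge between", best)
```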


Development of Remote Measurement Method for Reinforcement Information in Construction Field Using 360 Degrees Camera (360도 카메라 기반 건설현장 철근 배근 정보 원격 계측 기법 개발)

  • Lee, Myung-Hun; Woo, Ukyong; Choi, Hajin; Kang, Su-min; Choi, Kyoung-Kyu
    • Journal of the Korea institute for structural maintenance and inspection / v.26 no.6 / pp.157-166 / 2022
  • Structural supervision on construction sites has been performed by visual inspection, which is highly labor-intensive and subjective. In this study, a remote technique was developed to improve the efficiency of rebar spacing measurements using a 360° camera and reconstructed 3D models. The proposed method was verified by measuring the spacings in a reinforced concrete structure: twelve locations on the construction site (265 m²) were scanned within 20 seconds per location, for a total of about 15 minutes. A SLAM pipeline consisting of SIFT, RANSAC, and general framework graph optimization algorithms produces an RGB-based 3D model and a 3D point cloud model. The minimum resolution of the 3D point cloud was 0.1 mm, while that of the RGB-based 3D model was 10 mm. Based on the results from both 3D models, the measurement error ranged from 0.3% to 10.8% for the 3D point cloud and from 3.1% to 28.4% for the RGB-based 3D model (the simple relative-error definition is sketched below). The results demonstrate that the proposed method has great potential for remote structural supervision with respect to accuracy and objectivity.
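The percentage errors quoted above are relative errors of model-derived spacings against a reference spacing. As a minimal sketch only, with made-up numbers rather than the paper's measurements, this is how such errors can be computed.

```python
# Minimal sketch: relative error of model-measured rebar spacing vs. reference.
def spacing_error_pct(measured_mm, reference_mm):
    return abs(measured_mm - reference_mm) / reference_mm * 100.0

reference = 200.0                               # design rebar spacing in mm (assumed)
point_cloud_spacings = [199.4, 201.0, 202.1]    # spacings read from the 3D point cloud (assumed)
for s in point_cloud_spacings:
    print(f"{s:.1f} mm -> {spacing_error_pct(s, reference):.1f}% error")
```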

A new approach to design isolation valve system to prevent unexpected water quality failures (수질사고 예방형 상수도 관망 밸브 시스템 설계)

  • Park, Kyeongjin; Shin, Geumchae; Lee, Seungyub
    • Journal of Korea Water Resources Association / v.55 no.spc1 / pp.1211-1222 / 2022
  • Abnormal conditions inevitably occur during the operation of a water distribution system (WDS) and require the isolation of certain areas using isolation valves. In general, the optimal placement of isolation valves has considered the minimization of hydraulic failures, since isolating an area changes hydraulic states (e.g., flow direction, velocity, and pressure). Water quality failures can also be induced by such hydraulic changes, which have not previously been considered in isolation valve system design. Therefore, this study proposes a new isolation valve system design methodology to prevent unexpected water quality failure events. The new methodology uses the flow direction change ratio (FDCR), which accounts for flow direction changes after an area is isolated, as a constraint, while reliability is used as the objective function (a minimal FDCR computation is sketched below). The optimal design model is applied to a synthetic grid network and the results are compared with the traditional design approach. Results show that considering the FDCR can eliminate flow direction changes, while average pressure, the coefficients of variation of pressure and velocity, and the hydraulic geodesic index (HGI) outperform those of the traditional design approach. The proposed methodology is expected to be a useful way of minimizing the unexpected consequences of traditional design approaches.
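A minimal sketch of the FDCR idea described above follows: the share of pipes whose signed flow reverses once an area is isolated. The flow values are illustrative assumptions; in practice they would come from a hydraulic solver (e.g., EPANET) run before and after valve closure, and the exact ratio definition here is a simplification rather than the paper's formula.

```python
# Minimal sketch: flow direction change ratio from pre-/post-isolation flows.
def fdcr(base_flows, isolated_flows):
    """base_flows / isolated_flows: dict pipe_id -> signed flow rate."""
    changed = sum(
        1 for p in base_flows
        if base_flows[p] * isolated_flows.get(p, 0.0) < 0   # sign flip = direction change
    )
    return changed / len(base_flows)

base     = {"p1": 12.0, "p2": -3.5, "p3": 8.0, "p4": 0.9}
isolated = {"p1": 10.5, "p2": 4.0, "p3": -1.2, "p4": 0.7}   # flows after closing the valves
print(f"FDCR = {fdcr(base, isolated):.2f}")                 # 2 of 4 pipes reversed -> 0.50
```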

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review / v.16 no.3 / pp.161-177 / 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural networks (ANN), and multiclass support vector machines (MSVM) have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, results from studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As the tool for optimizing the kernel parameters and the feature subset selection, we use the genetic algorithm (GA). GA is known as an efficient and effective search method that simulates biological evolution. By applying genetic operations such as selection, crossover, and mutation, it gradually improves the search results. In particular, the mutation operator prevents the GA from falling into local optima, so a globally optimal or near-optimal solution can be found. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques, including MSVM. For these reasons, we also adopt GA as the optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, together with their credit ratings. Using various statistical methods, including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e., credit rating, was labeled as four classes: 1 (A1); 2 (A2); 3 (A3); 4 (B and C). Eighty percent of the data for each class was used for training, and the remaining 20 percent was used for validation. In addition, to overcome the small sample size, we applied five-fold cross validation to our dataset. In order to examine the competitiveness of the proposed model, we also experimented with several comparative models, including MDA, MLOGIT, CBR, ANN, and MSVM.
For MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source library, and Evolver 5.5, a commercial GA package; the other comparative models were run using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed GAMSVM model outperformed all the comparative models. In addition, the model used fewer independent variables yet showed higher accuracy. In our experiments, five variables, X7 (total debt), X9 (sales per employee), X13 (years since founding), X15 (accumulated earnings to total assets), and X39 (an index related to cash flows from operating activities), were found to be the most important factors in predicting corporate credit ratings. However, the values of the finally selected kernel parameters were almost the same across the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level. (A toy sketch of GA-driven kernel-parameter and feature-subset optimization follows below.)
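The following toy sketch illustrates the GAMSVM idea in general terms, not the paper's LIBSVM/Evolver implementation: a chromosome encodes an RBF kernel parameter pair (C, gamma) plus a binary feature-selection mask, and a simple genetic algorithm searches for the combination with the best cross-validated accuracy of a one-against-one multiclass SVM. The synthetic dataset, encoding, and GA settings are assumptions for illustration.

```python
# Toy sketch: GA over (kernel parameters, feature mask) for a multiclass SVM.
import random
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=14, n_informative=6,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
rng = random.Random(0)

def random_chromosome():
    return {"logC": rng.uniform(-2, 4), "logG": rng.uniform(-5, 1),
            "mask": [rng.random() < 0.7 for _ in range(X.shape[1])]}

def fitness(ch):
    cols = [i for i, keep in enumerate(ch["mask"]) if keep]
    if not cols:
        return 0.0
    clf = SVC(C=10 ** ch["logC"], gamma=10 ** ch["logG"], kernel="rbf",
              decision_function_shape="ovo")        # one-against-one multiclass SVM
    return cross_val_score(clf, X[:, cols], y, cv=5).mean()

def crossover(a, b):
    child = {"logC": rng.choice([a["logC"], b["logC"]]),
             "logG": rng.choice([a["logG"], b["logG"]]),
             "mask": [rng.choice(bits) for bits in zip(a["mask"], b["mask"])]}
    if rng.random() < 0.2:                          # mutation keeps the search global
        child["logC"] += rng.gauss(0, 0.5)
        child["logG"] += rng.gauss(0, 0.5)
        i = rng.randrange(len(child["mask"]))
        child["mask"][i] = not child["mask"][i]
    return child

population = [random_chromosome() for _ in range(20)]
for _ in range(10):                                 # generations
    parents = sorted(population, key=fitness, reverse=True)[:10]
    population = parents + [crossover(rng.choice(parents), rng.choice(parents))
                            for _ in range(10)]

best = max(population, key=fitness)
print("best CV accuracy:", round(fitness(best), 3))
```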

The Evaluation of Resolution Recovery Based Reconstruction Method, Astonish (Resolution Recovery 기반의 Astonish 영상 재구성 기법의 평가)

  • Seung, Jong-Min; Lee, Hyeong-Jin; Kim, Jin-Eui; Kim, Hyun-Joo; Kim, Joong-Hyun; Lee, Jae-Sung; Lee, Dong-Soo
    • The Korean Journal of Nuclear Medicine Technology / v.15 no.1 / pp.58-64 / 2011
  • Objective: Three-dimensional reconstruction with resolution recovery modeling offers high spatial resolution and contrast because it precisely models spatial blurring as a function of distance from the detector plane. The aim of this study was to evaluate one such resolution recovery reconstruction method (Astonish, Philips Medical), compare it with other iterative reconstruction methods, and verify its clinical usefulness. Materials and Methods: NEMA IEC PET body phantom and flanged Jaszczak ECT phantom (Data Spectrum Corp., USA) studies were performed on a Skylight SPECT (Philips) system under four conditions: short or long (twice the short) radius, and half or full (40 kcts/frame) acquisition counts. The Astonish reconstruction method was compared with two other vendor-supplied iterative reconstructions, MLEM and 3D-OSEM. For quantitative analysis, the contrast ratios obtained from the IEC phantom test were compared, and reconstruction parameters were determined in an optimization study using a graph of contrast ratio versus background variability (typical definitions of both quantities are sketched below). The qualitative comparison was performed with the Jaszczak ECT phantom and human myocardial data. Results: The overall contrast ratio was higher with Astonish than with the other methods. For the largest hot sphere (37 mm diameter), Astonish showed about 27.1% and 17.4% higher contrast ratios than MLEM and 3D-OSEM, respectively, in the short-radius study; for the long radius, the improvements were about 40.5% and 32.6%. The effect of the acquired counts was insignificant. In the qualitative studies with the Jaszczak phantom and human myocardial data, Astonish showed the best image quality. Conclusion: We found that Astonish can provide more reliable clinical results through better image quality than the other iterative reconstruction methods. Although further clinical studies are required, Astonish can be used in clinics with confidence to enhance image quality.
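For reference, the two quantities used in the optimization graph above can be defined in the NEMA NU 2 style: a percent contrast for a hot sphere and a percent background variability. Whether the paper uses exactly these definitions is an assumption, and the ROI values below are made-up numbers, not the study's measurements.

```python
# Minimal sketch: NEMA-style percent contrast and background variability.
import statistics

def percent_contrast_hot(sphere_mean, background_mean, activity_ratio):
    """Hot-sphere percent contrast for a given sphere-to-background activity ratio."""
    return (sphere_mean / background_mean - 1.0) / (activity_ratio - 1.0) * 100.0

def percent_background_variability(background_roi_means):
    mean = statistics.mean(background_roi_means)
    return statistics.stdev(background_roi_means) / mean * 100.0

bg_rois = [101.0, 98.5, 102.3, 99.1, 100.4]        # background ROI means (assumed)
print(percent_contrast_hot(sphere_mean=176.0,       # 37 mm sphere ROI mean (assumed)
                           background_mean=100.0,
                           activity_ratio=4.0),     # sphere-to-background activity ratio (assumed)
      percent_background_variability(bg_rois))
```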
