• Title/Summary/Keyword: Distributed Data Analysis

An Iterative Algorithm for the Bottom Up Computation of the Data Cube using MapReduce (맵리듀스를 이용한 데이터 큐브의 상향식 계산을 위한 반복적 알고리즘)

  • Lee, Suan;Jo, Sunhwa;Kim, Jinho
    • Journal of Information Technology and Architecture / v.9 no.4 / pp.455-464 / 2012
  • Driven by the recent data explosion, methods that can meet the requirements of large-scale data analysis have been actively studied. This paper proposes the MRIterativeBUC algorithm, which enables efficient computation of large data cubes through distributed parallel processing with the MapReduce framework. MRIterativeBUC is designed to run the BUC method efficiently as an iterative MapReduce operation, and it overcomes the storage-size and processing-capacity limitations that arise in large data cube computation. It borrows the idea of the iceberg cube, which computes only the parts of the cube that interest analysts, and distributes cube computation in parallel by partitioning and sorting. As a result, it reduces data emission, which in turn reduces network load, the amount of processing on each node, and ultimately the cost of cube computation. The bottom-up cube computation and the iterative MapReduce algorithm proposed in this paper can be extended in various ways and applied in many applications.
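
The bottom-up, iceberg-style pruning described in this abstract can be illustrated with a small single-machine sketch. This is not the authors' MRIterativeBUC algorithm and it does not use MapReduce; the function name `buc_iceberg`, the `min_support` parameter, the use of a row count as the measure, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict


def buc_iceberg(rows, num_dims, min_support):
    """Return {cell: count} for every cube cell with count >= min_support.

    A cell is a tuple of length num_dims; '*' marks an aggregated dimension.
    The measure is a simple row count (an assumption for this sketch).
    """
    result = {}

    def recurse(partition, start_dim, cell):
        # Record the aggregate for the current cell.
        result[tuple(cell)] = len(partition)
        for d in range(start_dim, num_dims):
            # Partition the current rows on dimension d.
            groups = defaultdict(list)
            for row in partition:
                groups[row[d]].append(row)
            for value, group in groups.items():
                # Iceberg pruning: a partition below the threshold cannot
                # produce any qualifying finer-grained cell, so skip it.
                if len(group) >= min_support:
                    cell[d] = value
                    recurse(group, d + 1, cell)
            # Restore the dimension to "aggregated" before moving on.
            cell[d] = "*"

    if len(rows) >= min_support:
        recurse(rows, 0, ["*"] * num_dims)
    return result


if __name__ == "__main__":
    # Hypothetical two-dimensional fact rows.
    sample = [("a", "x"), ("a", "y"), ("b", "x"), ("a", "x")]
    for cell, count in buc_iceberg(sample, num_dims=2, min_support=2).items():
        print(cell, count)
```

With `min_support=2`, the partition for value `b` is dropped as soon as it is seen, so none of its finer-grained cells are ever computed. That early pruning is what keeps the emitted data small.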

Flood Runoff Simulation using Radar Rainfall and Distributed Hydrologic Model in Un-Gauged Basin : Imjin River Basin (레이더 강우와 분포형 수문모형을 이용한 미계측 유역의 홍수 유출모의: 임진강 유역)

  • Kim, Byung-Sik;Bae, Young-Hye;Park, Jung-Sool;Kim, Kyung-Tak
    • Journal of the Korean Association of Geographic Information Studies / v.11 no.3 / pp.52-67 / 2008
  • Recently, the frequent occurrence of flash floods caused by climatic change has necessitated prompt, quantitative prediction of precipitation. In particular, rainfall radar, which can observe and predict precipitation behavior in real time, has become increasingly useful. Moreover, distributed hydrologic models that enable grid-level analysis are increasingly used to make efficient use of rainfall radar, which provides grid data at 1 km resolution. A distributed hydrologic model requires grid-type spatial data for the target basin, and visible, precise data are necessary to enhance the reliability of flood runoff simulation. In this paper, the physically based Vflo™ model and ModClark, a quasi-distributed hydrologic model, were used to simulate flood runoff and to compare the simulation results with data from the Imjin River Basin, two-thirds of which is ungauged. The spatial scope of the study was divided into the whole Imjin River basin, which includes the ungauged area, and the part of the basin in South Korea, for which relatively accurate and visible data are available. Peak flow and lag time outputs from the two simulations of each region were compared to analyze the impact of uncertainty in topographical and soil parameters on flood runoff simulation, and to propose effective methods for flood runoff simulation in ungauged regions.

Evaluation of Distributed Intrusion Detection System Based on MongoDB (MongoDB 기반의 분산 침입탐지시스템 성능 평가)

  • Han, HyoJoon;Kim, HyukHo;Kim, Yangwoo
    • KIPS Transactions on Computer and Communication Systems / v.8 no.12 / pp.287-296 / 2019
  • Due to the development and increased use of Internet services such as IoT and cloud computing, a large number of packets are being generated on the Internet. To create a safe Internet environment, malicious data that may exist among these packets must be processed and detected quickly. In this paper, we apply MongoDB, which is specialized for unstructured data analysis and big data processing, to an intrusion detection system (IDS) for rapid processing of big data security events. In addition, because the IDS is built using some of the private cloud resources that are the target of protection, it can be reconfigured elastically and dynamically as the number of security events increases or decreases. To evaluate the performance of the proposed MongoDB-based IDS, we constructed prototype IDSs based on MongoDB and on an existing relational database, and compared their performance. We also increased the number of virtual machines to observe how performance changes as the IDS is distributed. The results show that, in the MongoDB environment, performance improves as the number of virtual machines is increased to distribute the IDS, while the overall system performance is kept unchanged. The security event input rate with distributed MongoDB was up to 60% faster, and the distributed MongoDB-based intrusion detection rate was up to 100% faster, than with the IDS based on a relational database.

Performance Analysis of Bio-gas Micro Gas Turbine System (바이오가스 마이크로 가스터빈 성능해석)

  • Hur, Kwang-Beom;Park, Jung-Keuk;Rhim, Sang-Gyu;Kim, Jae-Hoon
    • Proceedings of the Korean Society for New and Renewable Energy Conference / 2008.05a / pp.239-242 / 2008
  • As distributed generation becomes more reliable and economically feasible, more distributed generation units are expected to be interconnected to existing grids. In this context, micro gas turbines (MGT) fueled by biogas are being considered as a promising solution. To propose a feasible concept for these technologies, including improved environmental impact and economics, we performed a sensitivity study for a biomass-fueled MGT using a simulation model. The study consists of 1) fundamental modeling using the manufacturer's technical specifications, 2) correction with experimental data, and 3) prediction of off-design characteristics. The performance analysis model was developed with PEPSE-GT 72, a commercial steam/gas turbine simulation tool.

Distributed Air Defense Simulation Model and its Applications (방공교전모델(DADSim) 개발 및 활용사례)

  • 최상영;김의환
    • Journal of the Military Operations Research Society of Korea / v.27 no.2 / pp.134-148 / 2001
  • This paper introduces an air-defense simulation model called "DADSim". DADSim (Distributed Air Defense Simulation Model) was developed by the Modeling & Simulation Lab of the Weapon Systems Department at K.N.D.U. (Korea National Defence Univ). It is an engagement-level model intended for analysis. DADSim can simulate not only global air defense over the Korean Peninsula but also local air defense over a battlefield. It uses DTED (Digital Terrain Elevation Data) Level II to represent the terrain characteristics of the peninsula. The weapon systems incorporated in the model are low/medium-range missile systems such as HAWK, NIKE, and SAM. DADSim was designed with an object-oriented development method and implemented in C++. The simulation view is event-sequenced and object-oriented. For convenient input and output analysis, a GUI (Graphical User Interface) with menus, windows, dialog boxes, and so on is provided to the user. Running DADSim requires Silicon Graphics IRIX 6.3 or higher. DADSim can be used for effectiveness analysis of defence systems. Some illustrative examples are shown in this paper.

Measurement and Monitoring of Mechanical Loads of Wind Turbines Using Distributed Fiber Optic Sensor (분포형 광섬유 센서를 이용한 풍력발전기의 기계적 부하 측정 및 모니터링)

  • Lee, Jong-Won;Huh, Young-Cheol;Nam, Yong-Yun;Lee, Geun-Ho;Kim, Yoo-Sung;Lee, Yong-Bae
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.17 no.11 / pp.1028-1036 / 2007
  • A method is presented for measuring and monitoring mechanical loads in large slender structures, such as wind turbine blades and towers, based on continuous strain data obtained from a distributed fiber optic sensor. An experimental study was carried out on an aluminum cantilever beam. A static load test was performed, and the moment calculated from the distributed fiber optic sensor agrees well with the actual applied moment. A series of damage cases was then inflicted on the beam, and vibration tests were carried out for each case. The natural frequencies estimated from the distributed fiber optic sensor for each damage case compare well with those from a conventional accelerometer and from a numerical analysis based on an energy method.
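
As a rough illustration of how a bending moment can be obtained from measured strain, the sketch below applies the standard Euler-Bernoulli relation M = E·I·ε / c to a rectangular beam. The abstract does not state the authors' exact procedure, so the formula choice, the function name `bending_moment_from_strain`, the assumption of a fiber bonded to the beam surface, and the material and geometry values are all assumptions for illustration.

```python
def bending_moment_from_strain(strain, youngs_modulus, width, height):
    """Bending moment (N*m) at a section of a rectangular beam.

    Assumes Euler-Bernoulli bending with the strain measured on the
    outer surface, at distance c = height / 2 from the neutral axis.
    strain is dimensionless, youngs_modulus is in Pa, and width and
    height are in metres.
    """
    second_moment = width * height ** 3 / 12.0  # I for a rectangle
    c = height / 2.0                            # surface distance from neutral axis
    return youngs_modulus * second_moment * strain / c


if __name__ == "__main__":
    # Hypothetical aluminum beam: E = 70 GPa, 50 mm x 10 mm section,
    # 100 microstrain measured at one sensing point along the fiber.
    moment = bending_moment_from_strain(100e-6, 70e9, 0.05, 0.01)
    print(f"{moment:.2f} N*m")
```

Applying the same function at each sensing point along the fiber would give a moment distribution along the beam rather than a single value.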

The Effect of SMEs' Slack Resource on Internationalization: Focusing on SMEs' Subcontracting Relationship

  • KIM, Jae-Jin
    • East Asian Journal of Business Economics (EAJBE) / v.9 no.1 / pp.17-26 / 2021
  • Purpose: This study examines how the financial slack resources and subcontracting of small and medium-sized enterprises (SMEs) affect their internationalization. To identify slack resources, subcontracting, and internationalization, 1,062 SMEs in the electronics industry are used in a logistic regression analysis of their relationship with SMEs' exports. Research design, data, and methodology: This study conducted an empirical analysis of 1,062 SMEs in the electronics industry using the sample survey method. The samples were based on data selected and distributed by the Ministry of SMEs and Startups. The data analysis methods were descriptive analysis, correlation analysis, and logistic regression analysis. Result: The analysis shows that only available resources are negatively related to SMEs' internationalization. This can be interpreted as a strong tendency for SMEs to avoid relatively risky choices, such as entering overseas markets, when they have enough financial resources. Moreover, subcontracting has a negative relationship with internationalization. Conclusion: This study broadens the scope of SME research by analyzing subcontracting and slack resources together, and it provides practical implications for policymakers and managers.

Analyzing Operation Deviation in the Deasphalting Process Using Multivariate Statistics Analysis Method

  • Park, Joo-Hwang;Kim, Jong-Soo;Kim, Tai-Suk
    • Journal of Korea Multimedia Society / v.17 no.7 / pp.858-865 / 2014
  • In systems such as an MES (Manufacturing Execution System), various sensors collect data in real time and store it as big data to monitor the process. If this big data is mined in a distributed computing system, the whole process can be improved. In this paper, a system for analyzing the cause of operation deviation was built using big data collected from the deasphalting process at two different plants. By applying multivariate statistical analysis to the big data collected through the MES, the main cause of operation deviation was analyzed. We present an example of analyzing operation deviation in the deasphalting process with this method. Regression analysis using the forward stepwise method yielded a regression equation that accounts for a 52% increase in performance compared with the existing model. The suggested method can replace the existing analysis of the petrochemical process, which is manual and risks being subjective depending on the tester, with an objective analysis based on numbers and statistics.

A Study on Phone Call Big Data Analytics (전화통화 빅데이터 분석에 관한 연구)

  • Kim, Jeongrae;Jeong, Chanki
    • Journal of Information Technology and Architecture / v.10 no.3 / pp.387-397 / 2013
  • This paper proposes an approach to big data analytics for phone call data. The analytical model for phone call data is composed of the PVPF (Parallel Variable-length Phrase Finding) algorithm, which identifies verbal phrases in natural language, and a word count algorithm, which measures the usage frequency of keywords. In the proposed model, we identify words using the PVPF algorithm and measure the usage frequency of the identified words using the word count algorithm in MapReduce. The results can be interpreted from various viewpoints. We design and implement the model on HDFS (Hadoop Distributed File System) and verify the proposed approach through a case study of phone call data. We then extract useful results by analyzing keyword correlation and usage frequency.
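
The word count step mentioned in this abstract follows the usual map, shuffle, reduce pattern, which can be sketched in plain Python without Hadoop. This is not the paper's implementation and it omits the PVPF phrase-finding step. The function names, the whitespace tokenization, and the sample transcripts are assumptions for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every whitespace-separated token."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(records):
    """Run the three phases in sequence on an in-memory list of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical call transcripts.
    print(word_count(["call me back", "Call back later"]))
```

In a real MapReduce job the map and reduce functions run on different nodes and the framework performs the shuffle. Here the three phases are ordinary function calls so that the data flow is easy to follow.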

Biological Data Analysis using DDBJ Web services

  • Sugawara, Hideaki;Miyazaki, Satorn;Abe, Takashi;Shigemoto, Yasumasa
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.379-382 / 2005
  • We demonstrate workflows for biological data retrieval and analysis using the DDBJ Web Service. Specifically, we introduce a workflow for the analysis of protein or proteomics data sets. The workflow automatically extracts, from all the genes of a human genome in Ensembl (http://www.ensembl.org/), the genes whose protein structure and function are known, based on cross-references among Ensembl, Swiss-Prot (http://www.ebi.ac.uk/swissprot) and PDB (Protein Data Bank; http://www.wwpdb.org/). The workflow discovered ‘hidden’ linkages among the databases. Distributed and heterogeneous data systems can be integrated into workflows if they are provided according to standards for Web services.
