• Title/Summary/Keyword: Distributed Data Analysis


Density Aware Energy Efficient Clustering Protocol for Normally Distributed Sensor Networks

  • Su, Xin;Choi, Dong-Min;Moh, Sang-Man;Chung, Il-Yong
    • Journal of Korea Multimedia Society / v.13 no.6 / pp.911-923 / 2010
  • In wireless sensor networks (WSNs), cluster-based data routing protocols reduce energy consumption and link maintenance cost. Unfortunately, most clustering protocols have been designed for uniformly distributed sensor networks, and some urgent situations do not allow thousands of sensor nodes to be deployed uniformly. For example, air vehicles or balloons may be used to deploy the sensor nodes, leading to a normally distributed topology. To improve energy efficiency in such sensor networks, we propose a new cluster formation algorithm named DAEEC (Density Aware Energy-Efficient Clustering). The algorithm defines two kinds of clusters, Low Density (LD) clusters and High Density (HD) clusters, which are distinguished by the number of nodes participating in a cluster. During the data routing period, the HD clusters help neighboring LD clusters forward the sensed data to the central base station. By taking deployment density into account, DAEEC distributes energy dissipation evenly among all sensor nodes, improving network lifetime and average energy savings. Moreover, because the HD clusters are densely deployed, they can operate in the manner of our earlier algorithm EEVAR (Energy Efficient Variable Area Routing Protocol) to save energy. According to the performance analysis, DAEEC outperforms conventional data routing schemes in terms of energy consumption and network lifetime.
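The LD/HD distinction described in this abstract can be illustrated with a minimal sketch. The threshold rule used here (comparing each cluster's member count with the mean cluster size) and all names are assumptions made for illustration; the abstract states only that the two kinds of clusters are determined by the number of participating nodes.

```python
from statistics import mean

def classify_clusters(cluster_sizes):
    """Label each cluster 'HD' or 'LD' by its member count.

    Assumption: a cluster counts as high density when it has more
    members than the mean cluster size. The threshold actually used
    by DAEEC is not given in the abstract.
    """
    threshold = mean(cluster_sizes.values())
    return {
        cluster_id: "HD" if size > threshold else "LD"
        for cluster_id, size in cluster_sizes.items()
    }

# Hypothetical deployment: clusters near the drop point hold more nodes.
sizes = {"c1": 40, "c2": 35, "c3": 8, "c4": 5}
labels = classify_clusters(sizes)
```

Under this assumed rule, an HD cluster would be a candidate relay for the sensed data of a neighboring LD cluster.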

MAHA-FS : A Distributed File System for High Performance Metadata Processing and Random IO (MAHA-FS : 고성능 메타데이터 처리 및 랜덤 입출력을 위한 분산 파일 시스템)

  • Kim, Young Chang;Kim, Dong Oh;Kim, Hong Yeon;Kim, Young Kyun;Choi, Wan
    • KIPS Transactions on Software and Data Engineering / v.2 no.2 / pp.91-96 / 2013
  • The application domain of supercomputing systems is expanding to fields, such as bio-applications, that require large-volume data processing and high-performance computing at the same time. These applications need a high-performance distributed file system for storage management and for efficient, high-speed processing of the large amounts of data they generate. In this paper, we introduce MAHA-FS, a distributed file system for supercomputing systems that combine large-scale data processing with high-performance computing. Performance analysis shows that MAHA-FS provides excellent performance in metadata processing and random IO processing.

TeT: Distributed Tera-Scale Tensor Generator (분산 테라스케일 텐서 생성기)

  • Jeon, ByungSoo;Lee, JungWoo;Kang, U
    • Journal of KIISE / v.43 no.8 / pp.910-918 / 2016
  • A tensor is a multi-dimensional array that can represent many kinds of data, such as (user, user, time) records in a social network system. A tensor generator is an important tool for multi-dimensional data mining research, with applications including simulation, multi-dimensional data modeling and understanding, and sampling and extrapolation. However, existing tensor generators cannot generate sparse tensors that obey a power law, as real-world tensors do. In addition, they are limited in the tensor sizes they can process, and they require additional time to upload a generated tensor to a distributed system for further analysis. In this study, we propose TeT, a distributed tera-scale tensor generator that solves these problems. TeT generates sparse random tensors as well as sparse R-MAT and Kronecker tensors without any limitation on tensor size. In addition, a TeT-generated tensor is immediately ready for further tensor analysis on the same distributed system. The careful design of TeT yields nearly linear scalability with respect to the number of machines.
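To make the idea of an R-MAT style sparse tensor concrete, the following single-machine sketch extends the recursive sub-block selection of R-MAT from matrices to tensors of arbitrary order. It is not TeT's distributed implementation; the function name, parameters, and probability values are assumptions chosen for illustration.

```python
import random

def rmat_tensor(order, levels, nnz, probs, seed=0):
    """Sample a sparse tensor as a set of index tuples.

    order  : number of modes (3 for a (user, user, time) tensor)
    levels : recursion depth; each mode has 2**levels indices
    nnz    : number of samples drawn (duplicates collapse, so the
             result may hold fewer entries)
    probs  : 2**order weights, one per sub-block at each level
    """
    if len(probs) != 2 ** order:
        raise ValueError("need one weight per sub-block")
    rng = random.Random(seed)
    blocks = range(2 ** order)
    entries = set()
    for _sample in range(nnz):
        index = [0] * order
        for _level in range(levels):
            # Pick one of the 2**order sub-blocks, then append one bit
            # to the index of every mode.
            block = rng.choices(blocks, weights=probs)[0]
            for mode in range(order):
                bit = (block >> mode) & 1
                index[mode] = (index[mode] << 1) | bit
        entries.add(tuple(index))
    return entries

# Skewed weights concentrate entries in one corner of the tensor,
# which is what produces a heavy-tailed distribution of nonzeros.
tensor = rmat_tensor(order=3, levels=4, nnz=200,
                     probs=[0.44] + [0.08] * 7, seed=1)
```

Each sample is independent of the others, which suggests why this style of generator parallelizes well: different machines could draw disjoint batches of samples with different seeds.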

A Form Based Distribution Design Methodology for Distributed Databases (분산 테이타베이스를 위한 양식을 이용한 분산 설계 방법론)

  • Lee, Hui-Seok;Kim, Hui-Jin;Kim, Yeong-Sam
    • Asia pacific journal of information systems / v.5 no.2 / pp.101-129 / 1995
  • This paper proposes a form-based distributed database design methodology ($FD^3$). The methodology consists of five design phases: (i) form requirement analysis, (ii) schema integration, (iii) distribution analysis, (iv) distribution design, and (v) local logical/physical design. In $FD^3$, all the important design information for each phase is obtained from an organization's forms. Users' requirements are analyzed by using forms that contain logical and quantitative information for distribution design. $FD^3$ resolves naming conflicts in the schema integration phase by employing SQL statements based on the form field data. Furthermore, $FD^3$ enhances the quality of distributed database design by incorporating communication costs into the design model. A real-life case is presented to demonstrate the usefulness of $FD^3$.


Web-based CAE Service System for Collaborative Engineering Environment (협업 환경 기반 엔지니어링 해석 서비스 시스템 개발)

  • Kim K.I.;Kwon K.E.;Park J.H.;Choi Y.;Cho S.W.
    • Proceedings of the Korean Society of Precision Engineering Conference / 2006.05a / pp.619-620 / 2006
  • In this paper, a CAE service system for a collaborative engineering environment, based on web services and the multi-frontal method, is investigated and developed. Enabling technologies such as SOAP and the .NET Framework play major roles in the development of integrated distributed application software. In addition to the distribution of analysis modules, the numerical solution process itself is further divided into parallel processes using the multi-frontal method for computational efficiency. We believe that the proposed approach to analysis can be extended to the entire product development process for sharing and utilizing common product data in a distributed engineering environment.


UX Analysis for Mobile Devices Using MapReduce on Distributed Data Processing Platform (MapReduce 분산 데이터처리 플랫폼에 기반한 모바일 디바이스 UX 분석)

  • Kim, Sungsook;Kim, Seonggyu
    • KIPS Transactions on Software and Data Engineering / v.2 no.9 / pp.589-594 / 2013
  • As the web characteristics of openness and mind sharing grow more popular, the device log data generated by both users and developers have become increasingly complicated. For this reason, a log data processing mechanism that automatically produces a meaningful data set from a large number of log records has become necessary for mobile device UX (User eXperience) analysis. In this paper, we define the attributes of the log data to be analyzed so that they reflect the characteristics of a mobile device, and we collect real log data from mobile device users. Using the MapReduce programming paradigm on the Hadoop platform, we perform a mobile device UX analysis on the collected log data in a distributed processing environment. We then demonstrate the effectiveness of the proposed analysis mechanism by applying various combinations of Map and Reduce steps to produce a simple data schema from the large volume of complex log records.
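The Map and Reduce steps mentioned in this abstract can be sketched in a few lines. The example below runs in a single process and does not use Hadoop; the log line format (`timestamp,device,event`) and all function names are hypothetical, since the abstract does not specify the log attributes.

```python
from collections import defaultdict

def map_record(line):
    """Map step: emit ((device, event), 1) for one log line.

    Assumed line format: 'timestamp,device,event'.
    """
    _timestamp, device, event = line.strip().split(",")
    yield (device, event), 1

def reduce_counts(key, values):
    """Reduce step: sum the counts emitted for one key."""
    return key, sum(values)

def run_job(lines):
    """Run map, shuffle (group by key), and reduce in one process."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_record(line):
            groups[key].append(value)
    return dict(reduce_counts(k, v) for k, v in groups.items())

# Hypothetical log records from two devices.
logs = [
    "2013-01-01T10:00,phoneA,app_launch",
    "2013-01-01T10:01,phoneA,app_launch",
    "2013-01-01T10:02,phoneB,screen_rotate",
]
summary = run_job(logs)
```

The result is a compact table of event counts per device, a small instance of reducing complex log records to a simple schema. Chaining several such jobs, each with its own map and reduce functions, corresponds to the combinations of steps the abstract refers to.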

Distributed Video Compressive Sensing Reconstruction by Adaptive PCA Sparse Basis and Nonlocal Similarity

  • Wu, Minghu;Zhu, Xiuchang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.8 / pp.2851-2865 / 2014
  • To improve the rate-distortion performance of distributed video compressive sensing (DVCS), this paper proposes using an adaptive sparse basis together with the nonlocal similarity of video to jointly reconstruct the video signal. Because motion information between frames is lacking and the reference frames contain noise, a sparse dictionary constructed from examples extracted directly from the reference frames does not yield a good sparse representation of the interpolated block. This paper therefore proposes a new method to construct the sparse dictionary. First, an example-based data matrix is constructed using the motion information between frames. Then, principal component analysis (PCA) is used to compute the significant principal components of the data matrix. Finally, the sparse dictionary is constructed from these significant principal components. The merit of the proposed sparse dictionary is that it not only adapts to the spatial-temporal characteristics of the video but also suppresses noise. In addition, because sparse priors cannot preserve the edges and textures of video frames well, a nonlocal similarity regularization term is introduced into the reconstruction model. Experimental results show that the proposed algorithm improves the objective and subjective quality of video frames and achieves better rate-distortion performance for the DVCS system, at the cost of some additional computational complexity.

A Study on the Big Data Analysis System for Searching of the Flooded Road Areas (도로 침수영역의 탐색을 위한 빅데이터 분석 시스템 연구)

  • Song, Youngmi;Kim, Chang Soo
    • Journal of Korea Multimedia Society / v.18 no.8 / pp.925-934 / 2015
  • As the frequency of natural disasters caused by global warming gradually increases, the risk of flooding due to typhoons and torrential rain has also increased. In particular, sudden torrential rain floods roads, causing vehicle damage and personal injury. Because a road can become flooded almost instantly, rapid data collection and a quick-response system need to be studied. Our research proposes a big data analysis system, together with a variety of information collection methods, for searching for road areas flooded by torrential rain. The data related to flooded roads include SNS data, meteorological data, and road link data. The big data analysis system is implemented as a distributed processing system based on the Hadoop platform.

Big data-based piping material analysis framework in offshore structure for contract design

  • Oh, Min-Jae;Roh, Myung-Il;Park, Sung-Woo;Chun, Do-Hyun;Myung, Sehyun
    • Ocean Systems Engineering / v.9 no.1 / pp.79-95 / 2019
  • The material analysis of an offshore structure is generally conducted in the contract design phase for the price quotation of a new offshore project. This analysis is conducted manually by an engineer, which is time-consuming and can lead to inaccurate results, because the volume of data from previous projects is very large and there are many materials to consider. In this study, the piping materials in an offshore structure are analyzed for contract design using a big data framework. The big data technologies used include HDFS (Hadoop Distributed File System) for data storage, Hive and HBase as databases for handling the stored data, Spark and Kylin for data processing, and Zeppelin for the user interface and visualization. The results show that the proposed big data framework can reduce the effort required in contract design to estimate the piping material cost.

A Study on the Sequential Regenerative Simulation (순차적인 재생적 시뮬레이션에 관한 연구)

  • JongSuk R.;HaeDuck J.
    • Journal of the Korea Society for Simulation / v.13 no.2 / pp.23-34 / 2004
  • Regenerative simulation (RS) is a method of stochastic steady-state simulation in which output data are collected and analysed within regenerative cycles (RCs). Since data collected during consecutive RCs are independent and identically distributed, there is no problem with the initial transient period in simulated processes, which is a perennial issue of concern in all other types of steady-state simulation. In this paper, we address the issue of experimental analysis of the quality of sequential regenerative simulation in the sense of the coverage of the final confidence intervals of mean values. The ultimate purpose of this study is to determine the best version of RS to be implemented in Akaroa2 [1], a fully automated controller of distributed stochastic simulation in LAN environments.
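As an illustration of how a confidence interval for a steady-state mean is formed from regenerative cycles, the sketch below applies the classical ratio estimator with a normal approximation. It is not the specific RS version evaluated in the paper and is unrelated to the Akaroa2 code; the function name, the fixed `z` value, and the sample data are assumptions.

```python
import math

def regenerative_ci(cycle_sums, cycle_lengths, z=1.96):
    """Point estimate and confidence half-width for a steady-state mean.

    cycle_sums[i]    : sum of the observations collected in cycle i
    cycle_lengths[i] : number of observations in cycle i
    z                : normal quantile (1.96 for about 95% confidence)
    """
    n = len(cycle_sums)
    if n < 2 or n != len(cycle_lengths):
        raise ValueError("need at least two cycles of matching length")
    y_bar = sum(cycle_sums) / n
    a_bar = sum(cycle_lengths) / n
    r_hat = y_bar / a_bar  # ratio estimator of the steady-state mean
    # Z_i = Y_i - r_hat * alpha_i has sample mean zero by construction,
    # so its sample variance is the sum of squares over n - 1.
    s2 = sum((y - r_hat * a) ** 2
             for y, a in zip(cycle_sums, cycle_lengths)) / (n - 1)
    half_width = z * math.sqrt(s2 / n) / a_bar
    return r_hat, half_width

# Hypothetical data from four regenerative cycles.
r_hat, half_width = regenerative_ci([6.0, 4.0, 10.0, 4.0], [3, 2, 4, 3])
```

A sequential procedure would recompute this interval as further cycles accumulate and stop once the relative half-width `half_width / r_hat` falls below a chosen precision. The coverage analysis described in the abstract asks how often intervals produced this way contain the true mean.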
