• Title/Summary/Keyword: Distributed Data Analysis

A Study on Efficient Building Energy Management System Based on Big Data

  • Chang, Young-Hyun;Ko, Chang-Bae
    • International journal of advanced smart convergence / v.8 no.1 / pp.82-86 / 2019
  • We aim to use public data, distinct from the already established remote BEMS energy diagnostics technology, and to switch the conventional operation environment to a big-data-based integrated management environment, in order to build and operate a building energy management environment of maximized efficiency. In Step 1, the various network management environments of the system that integrates the big data platform with the BEMS management system are used to collect logs, which are created as various types of data, by means of the big data platform. In Step 2, the collected data are stored in HDFS (Hadoop Distributed File System) so that internal and external changes can be managed in real time on the basis of integrated analysis (for example, of relations and interrelations) for automatic and efficient management.

An Analysis of the House Purchasing Behavior According to the Housing Value and the Life-Style (주거가치와 주생활양식에 따른 주택구매행동 분석)

  • 고경필
    • Journal of the Korean housing association / v.5 no.2 / pp.65-75 / 1994
  • The purpose of this study was to analyze the influence that house purchasing behavior had on the housing value and life-style factors. For this purpose, data were collected using a questionnaire distributed to 251 respondents. The data were analyzed by factor analysis, Pearson's correlation analysis, and multiple regression analysis. The major findings of this research were as follows: 1. The housing value factors were classified into condition of location, safety, esthetic, economic and prestige, and human relation and approach. The housing life-style factors were classified into ostentation. 2. House purchasing behavior was correlated with the housing value and life-style factors. 3. House purchasing behavior had an influence on the housing value and life-style factors.

Design and Implementation of Big Data Analytics Framework for Disaster Risk Assessment (빅데이터 기반 재난 재해 위험도 분석 프레임워크 설계 및 구현)

  • Chai, Su-seong;Jang, Sun Yeon;Suh, Dongjun
    • Journal of Digital Contents Society / v.19 no.4 / pp.771-777 / 2018
  • This study proposes a big-data-based risk analysis framework for a more comprehensive analysis of disaster risk and vulnerability. We introduce a distributed and parallel framework that allows large volumes of data to be processed in a short time using an open-source disaster risk assessment tool. A performance analysis shows that the proposed system achieves a faster processing time than the existing system, which should make it possible to respond promptly with precise predictions and contribute to providing guidelines for disaster countermeasures. The proposed system can support accurate risk prediction and help mitigate severe damage; it will therefore be valuable in helping decision makers and experts prepare for emergency or disaster situations and in minimizing large-scale damage to a region.

Classification of Tidal Flat Deposits in the Cheonsu-bay using Landsat TM Data and Surface Sediment Analysis (Landsat TM 자료와 표충퇴적물 분석을 통한 천수만 간석지 퇴적물 분류)

  • Jang, Dong-Ho;Chi, Kwang-Hoon;Lee, Hyoun-Young
    • Journal of Environmental Impact Assessment / v.11 no.4 / pp.247-258 / 2002
  • This study aimed to verify the grain-size distribution of surface deposits in a tidal flat using multi-spectral Landsat TM data. We employed grain-size analysis, PCA, and unsupervised classification techniques to analyze the distribution of deposits. The unsupervised classification method using the PCA image was found to be the most useful for classifying tidal flat deposits from satellite data. This method is considerably effective in analyzing not only the distribution of accumulated deposits and erosion, but also changes in seaside topography and shoreline. The grain-size distribution analysis indicates that the mud flat is distributed on the inner side of the Cheonsu-bay tidal flat, the mixed flat in the middle, and the sand flat near the sea. The sand flat is dominant around the southern part of Seomot isle and its beach, whereas the mud and mixed flats are dominant in the western part. The western coast of Seomot isle and its beach is significantly affected by waves coming from offshore. The eastern side of the bay, however, is less affected by ocean waves and could be a site for the development of a tidal flat made of fine materials. These results show that multi-spectral satellite data are effective for classifying the distribution of deposited materials, for environmental impact assessment, and for continuous monitoring. In particular, research on deposits can provide important supporting information for decision-making on seaside development by analyzing the progress of deposition and environmental changes.

Long-term Statistical Analysis of the Simultaneity of Forbush Decrease Events at Middle Latitudes

  • Lee, Seongsuk;Oh, Suyeon;Yi, Yu;Evenson, Paul;Jee, Geonhwa;Choi, Hwajin
    • Journal of Astronomy and Space Sciences / v.32 no.1 / pp.33-38 / 2015
  • Forbush Decreases (FDs) are transient, sudden reductions of cosmic ray (CR) intensity lasting from a few days to a week. Such events are observed globally using ground neutron monitors (NMs). Most studies of FD events indicate that an FD event is observed simultaneously at NM stations located all over the Earth. However, using statistical analysis, previous researchers verified that while FD events can occur simultaneously, in some cases they occur non-simultaneously. Previous studies confirmed the statistical reality of non-simultaneous FD events, and the mechanism by which they occur, using data from high-latitude and middle-latitude NM stations. In this study, we used long-term data (1971-2006) from middle-latitude NM stations (Irkutsk, Climax, and Jungfraujoch) to enhance statistical reliability. According to the results of this analysis, the variation of cosmic ray intensity during the main phase is larger, with statistical significance, for simultaneous FD events than for non-simultaneous ones. Moreover, the distributions of main-phase onset time show statistically significant differences. While the onset times of simultaneous FDs are distributed evenly over the 24-hour interval (day and night), those of non-simultaneous FDs are mostly distributed over the 12-hour daytime interval. Thus, the existence of two kinds of FD events, distinguished by differences in their statistical properties, was verified based on data from middle-latitude NM stations.

Extreme Value Analysis of Statistically Independent Stochastic Variables

  • Choi, Yongho;Yeon, Seong Mo;Kim, Hyunjoe;Lee, Dongyeon
    • Journal of Ocean Engineering and Technology / v.33 no.3 / pp.222-228 / 2019
  • An extreme value analysis (EVA) is essential to obtain a design value for highly nonlinear variables such as long-term environmental data for wind and waves, and slamming or sloshing impact pressures. According to the extreme value theory (EVT), the extreme value distribution is derived by multiplying the initial cumulative distribution functions for independent and identically distributed (IID) random variables. However, in the position mooring rules of DNVGL, the sampled global maxima of the mooring line tension are assumed to be IID stochastic variables without checking their independence. The ITTC Recommended Procedures and Guidelines for Sloshing Model Tests do not address the independence of the sampling data. Hence, a design value estimated without the IID check may be under- or over-estimated because observations far away from a Weibull or generalized Pareto distribution (GPD) are treated as outliers. In this study, the sampling data are first checked for the IID property in an EVA. When the random variables are not IID, an automatic resampling scheme is recommended, using the block maxima approach for a generalized extreme value (GEV) distribution and the peaks-over-threshold (POT) approach for a GPD. A partial autocorrelation function (PACF) is used to check whether the variables are IID. Only one 5 h sample of sloshing test results was used for a feasibility study of the IID resampling approach. Based on this study, resampling to obtain IID variables may reduce the number of outliers, and a statistically more appropriate design value could be achieved with independent samples.
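The resampling ideas named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names are hypothetical, the independence check uses only the lag-1 sample autocorrelation (which coincides with the PACF at lag 1, so higher lags are not covered), and no GEV or GPD fitting is attempted.

```python
def block_maxima(series, block_size):
    """Return the maximum of each complete, non-overlapping block."""
    n_blocks = len(series) // block_size
    return [max(series[i * block_size:(i + 1) * block_size])
            for i in range(n_blocks)]


def peaks_over_threshold(series, threshold):
    """Return the exceedances (value minus threshold) of values above the threshold."""
    return [x - threshold for x in series if x > threshold]


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation; values near 0 are consistent with independence.

    Simplification: this equals the PACF only at lag 1.
    """
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0
    cov = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    return cov / var


def max_cdf(initial_cdf_value, n):
    """CDF of the maximum of n IID variables: F_max(x) = F(x) ** n."""
    return initial_cdf_value ** n
```

In such a sketch, one would increase `block_size` (or raise `threshold`) until the autocorrelation of the resampled values is close to zero, and only then fit an extreme value distribution.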

Analysis of generalized progressive hybrid censored competing risks data

  • Lee, Kyeong-Jun;Lee, Jae-Ik;Park, Chan-Keun
    • Journal of Advanced Marine Engineering and Technology / v.40 no.2 / pp.131-137 / 2016
  • In reliability analysis, it is quite common for the failure of any individual or item to be attributable to more than one cause. Moreover, observed data are often censored. Recently, progressive hybrid censoring schemes have become quite popular in life-testing problems and reliability analysis. However, a limitation of the progressive hybrid censoring scheme is that it cannot be applied when few failures occur before time T. Therefore, generalized progressive hybrid censoring schemes have been introduced. In this article, we derive the likelihood inference of the unknown parameters under the assumptions that the lifetime distributions of different causes are independent and exponentially distributed. We obtain the maximum likelihood estimators of the unknown parameters in exact forms. Asymptotic confidence intervals are also proposed. Bayes estimates and credible intervals of the unknown parameters are obtained under the assumption of gamma priors on the unknown parameters. Different methods are compared using Monte Carlo simulations. One real data set is analyzed for illustrative purposes.
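As a simplified illustration of exact-form maximum likelihood estimation for independent exponential competing risks, the sketch below assumes ordinary right censoring rather than the generalized progressive hybrid scheme studied in the article. Under that assumption, the estimate of each cause-specific rate is the number of failures from that cause divided by the total time on test. The function name and data layout are hypothetical.

```python
def exponential_competing_risks_mle(observations, n_causes=2):
    """Estimate cause-specific exponential failure rates.

    observations: list of (time, cause) pairs, where cause is 0 for a
    censored unit and 1..n_causes for the observed cause of failure.
    Assumes plain right censoring (a simplification of the article's scheme).
    """
    total_time = sum(time for time, _ in observations)  # total time on test
    if total_time <= 0:
        raise ValueError("total time on test must be positive")
    counts = [0] * n_causes
    for _, cause in observations:
        if cause:  # skip censored units (cause == 0)
            counts[cause - 1] += 1
    # MLE for each cause: failures from that cause / total time on test
    return [d / total_time for d in counts]
```

For example, with two failures from cause 1, one from cause 2, and a total time on test of 15, the estimates are 2/15 and 1/15.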

Frequency and Social Network Analysis of the Bible Data using Big Data Analytics Tools R (R을 이용한 성경 데이터의 빈도와 소셜 네트워크 분석)

  • Ban, ChaeHoon;Ha, JongSoo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.93-96 / 2018
  • Big data analytics technology, which can store and analyze data and obtain new knowledge, has been gaining importance in many fields of society. Big data is emerging as an important issue in the field of information and communication technology, and interest in the technology continues to rise. R, a tool that can analyze big data, is a language and environment that enables statistically based information analysis. In this paper, we use R to analyze Bible data. R is used to investigate the frequency distribution of the text and to analyze the Bible through social network analysis.

Covid 19 News Data Analysis and Visualization

  • Hur, Tai-Sung;Hwang, In-Yong
    • Journal of the Korea Society of Computer and Information / v.27 no.4 / pp.37-43 / 2022
  • In this paper, we calculate word frequencies by date and region using COVID-19-related news data distributed over about 8 months, from December 2019 to July 2020, and use the results to visualize the correlation with data on the current state of COVID-19 patients. News data were collected from BIG KINDS, a news big data system operated by the Korea Press Promotion Foundation. The visualization system proposed in this paper shows the news frequency of the selected region compared with all regions, the main keywords of the selected region, the regions associated with a main keyword, and the change over time for the selected region. Through this visualization, the main keywords and the trends in confirmed COVID-19 cases can be identified for past events.
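Counting word frequencies by date and region, as described in this abstract, might look like the following Python sketch. The record keys (`date`, `region`, `text`) and the function name are assumptions made for illustration, and whitespace tokenization is a stand-in: Korean news text would need a proper morphological tokenizer.

```python
from collections import Counter, defaultdict


def word_frequency_by_date_and_region(articles):
    """Count words per (date, region) pair.

    articles: iterable of dicts with the hypothetical keys
    'date', 'region', and 'text'.
    """
    freq = defaultdict(Counter)
    for article in articles:
        # Naive tokenization: lowercase and split on whitespace.
        words = article["text"].lower().split()
        freq[(article["date"], article["region"])].update(words)
    return freq
```

The main keywords for a given date and region can then be read with `freq[(date, region)].most_common(k)`.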

Analysis of Performance Requirement for Large-Scale InfiniBand-based DVSM System (대용량의 InfiniBand 기반 DVSM 시스템 구현을 위한 성능 요구 분석)

  • Cho, Myeong-Jin;Kim, Seon-Wook
    • The KIPS Transactions:PartA / v.14A no.4 / pp.215-226 / 2007
  • In past years, many distributed virtual shared-memory (DVSM) systems have been studied in order to develop a low-cost shared memory system with a fast interconnection network. However, a DVSM needs a large amount of data and control communication between distributed processing nodes in order to provide memory consistency in software, and this communication overhead dominates the overall performance. In general, the communication overhead also increases as the number of processing nodes increases, so it is a very important performance factor in developing a large-scale DVSM system. In this paper, we study performance scalability quantitatively and qualitatively for developing a large-scale DVSM system based on the next-generation interconnection network InfiniBand. Based on the study, we analyze the performance requirements of upcoming interconnection networks to be used for developing a performance-scalable DVSM system in the future.