• Title/Abstract/Keywords: Distributed Data Analysis

Search results: 2,340 items; processing time: 0.033 seconds

The Distribution Analysis of PM10 in Seoul Using Spatial Interpolation Methods (공간보간기법에 의한 서울시 미세먼지(PM10)의 분포 분석)

  • Cho, Hong-Lae;Jeong, Jong-Chul
    • Journal of Environmental Impact Assessment
    • /
    • Vol. 18, No. 1
    • /
    • pp.31-39
    • /
    • 2009
  • Much of the data used in environmental analysis of air pollution is continuously distributed in space. Measurements such as precipitation, temperature, altitude, pollutant concentration, and PM10 therefore have a spatial aspect. For geostatistical analysis, measuring the value at every point would be ideal, but cost and time make this impossible. It is therefore necessary to estimate the unknown values at unsampled locations from observations. In this study, spatial interpolation methods, namely the local trend surface model, IDW (inverse distance weighted), RBF (radial basis function), and Kriging, were applied to the 2005 annual average PM10 concentration in Seoul, and their accuracy was evaluated. To evaluate interpolation accuracy, the range of estimated values, RMSE, and average error were analyzed against observation data. The Kriging and RBF methods were more accurate than the others.
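
As a rough illustration of one of the methods named in this abstract, the sketch below implements IDW and scores it with RMSE. It is a minimal example under stated assumptions, not the authors' procedure: the station coordinates and PM10 values are made up, the power parameter of 2 is a common default rather than a value from the paper, and the leave-one-out scoring scheme is an assumption, since the abstract does not say how the errors were computed.

```python
import math

def idw_estimate(stations, target, power=2.0):
    """Estimate a value at `target` from (x, y, value) stations using
    inverse distance weighting: closer stations get larger weights."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

def leave_one_out_rmse(stations, power=2.0):
    """Hypothetical scoring scheme: hide each station in turn, predict it
    from the remaining stations, and take the RMSE of the errors."""
    squared_errors = []
    for i, (x, y, value) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        predicted = idw_estimate(others, (x, y), power)
        squared_errors.append((predicted - value) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up coordinates (km) and PM10 values, for illustration only.
stations = [
    (0.0, 0.0, 50.0),
    (10.0, 0.0, 60.0),
    (0.0, 10.0, 70.0),
    (10.0, 10.0, 80.0),
]
center = idw_estimate(stations, (5.0, 5.0))
print(center, leave_one_out_rmse(stations))
```

Because the query point in the usage lines is equidistant from all four stations, every weight is equal and the estimate reduces to the plain average of the station values.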

Regional Analysis of Load Loss in Power Distribution Lines Based on Smartgrid Big Data (스마트그리드 빅데이터 기반 지역별 배전선로 부하손실 분석)

  • Jae-Hun, Cho;Hae-Sung, Lee;Han-Min, Lim;Byung-Sung, Lee;Chae-Joo, Moon
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • Vol. 17, No. 6
    • /
    • pp.1013-1024
    • /
    • 2022
  • In addition to being a measure for assessing power quality, load loss is a factor that reduces the financial profits of electricity sales companies. Accurate analysis of the load losses generated in power distribution networks is therefore very important. Research institutes and power utilities around the world have long worked on accurately calculating load losses in distribution lines. However, it is becoming increasingly difficult to calculate the exact amount of loss, because the interconnection of distributed energy resources (DER) increases congestion in the distribution network. In this paper, we develop a smart grid big data infrastructure to accurately analyze the load loss of the distribution network arising from the connection of DERs. By preprocessing data selected from the smart grid big data, we develop a load loss analysis model that eliminates the 'veracity' problem, which is one of the characteristics of smart grid big data. Our analysis results can be used in facility investment plans or network operation plans to maintain stable supply reliability and power quality.

A Comparative Analysis of Recursive Query Algorithm Implementations based on High Performance Distributed In-Memory Big Data Processing Platforms (대용량 데이터 처리를 위한 고속 분산 인메모리 플랫폼 기반 재귀적 질의 알고리즘들의 구현 및 비교분석)

  • Kang, Minseo;Kim, Jaesung;Lee, Jaegil
    • Journal of KIISE
    • /
    • Vol. 43, No. 6
    • /
    • pp.621-626
    • /
    • 2016
  • Recursive query algorithms are used in many social network services, e.g., for reachability queries in social networks. As social network services have evolved, the size of social network data has grown, and it has become almost impossible to run a recursive query algorithm on a single machine. To solve this problem, we implement recursive queries on two popular in-memory distributed platforms, Spark and Twister. We evaluate the performance of the two implementations using 50 machines on Amazon EC2 and two real-world data sets, LiveJournal and ClueWeb. The results show that the recursive query algorithm performs better on Spark for the LiveJournal data set, which has a relatively high average degree but fewer vertices. In contrast, the recursive query on Twister outperforms Spark for the ClueWeb data set, which has a relatively low average degree but many vertices.
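
To make the idea of a recursive reachability query concrete, here is a minimal single-machine sketch. It is not the Spark or Twister implementation from the paper: the function name `reachable_from` and the toy edge list are hypothetical. It only shows the iterative pattern that such queries follow, where each round expands the frontier of newly discovered vertices until nothing new appears.

```python
def reachable_from(edges, source):
    """Return every vertex reachable from `source` in a directed graph.

    Each loop iteration is one round of the recursion: follow the
    outgoing edges of the vertices found in the previous round and
    keep only the vertices that have not been seen before.
    """
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)

    reached = {source}
    frontier = {source}
    while frontier:
        next_frontier = set()
        for vertex in frontier:
            next_frontier |= adjacency.get(vertex, set())
        frontier = next_frontier - reached  # newly discovered vertices only
        reached |= frontier
    return reached

# Toy "follows" graph, for illustration only.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
result = reachable_from(edges, "a")
print(sorted(result))
```

On a distributed platform, the frontier and the edge list would be partitioned across machines and each round would become a join, which is why the number of rounds and the size of each frontier affect performance.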

S-PARAFAC: Distributed Tensor Decomposition using Apache Spark (S-PARAFAC: 아파치 스파크를 이용한 분산 텐서 분해)

  • Yang, Hye-Kyung;Yong, Hwan-Seung
    • Journal of KIISE
    • /
    • Vol. 45, No. 3
    • /
    • pp.280-287
    • /
    • 2018
  • Recently, the use of recommendation systems and of tensor data analysis on high-dimensional data has been increasing, as they allow us to analyze tensors and extract latent elements and patterns. However, because tensors are large and complex, they need to be decomposed before the tensor data can be analyzed. Several tools, such as rTensor, pyTensor, and MATLAB, are used for tensor decomposition, but because they run on a single machine, they cannot handle large data. Distributed tensor decomposition tools based on Hadoop can handle large-scale tensors, but their computing speed is too slow. In this paper, we propose S-PARAFAC, a tensor decomposition tool based on Apache Spark that runs in distributed in-memory environments. We converted the PARAFAC algorithm into an Apache Spark version that enables rapid processing of tensor data. We also compared the performance of the Hadoop-based tensor tool and S-PARAFAC. The results showed that S-PARAFAC is approximately 4 to 25 times faster than the Hadoop-based tensor tool.
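
For readers unfamiliar with PARAFAC, the sketch below shows the model that the decomposition fits, not the S-PARAFAC algorithm itself. A three-way tensor is approximated as a sum of rank-one terms, so each entry is the sum over components r of A[i][r] * B[j][r] * C[k][r]. The function name `parafac_reconstruct` and the tiny factor matrices are hypothetical and chosen only for illustration.

```python
def parafac_reconstruct(A, B, C):
    """Rebuild a three-way tensor from PARAFAC factor matrices.

    A, B, and C are lists of rows with one column per component, so
    entry (i, j, k) is the sum over r of A[i][r] * B[j][r] * C[k][r].
    """
    rank = len(A[0])
    return [
        [
            [
                sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))
                for k in range(len(C))
            ]
            for j in range(len(B))
        ]
        for i in range(len(A))
    ]

# Rank-1 factors for a 2 x 2 x 2 tensor, made up for illustration.
A = [[1.0], [2.0]]
B = [[1.0], [3.0]]
C = [[2.0], [1.0]]
tensor = parafac_reconstruct(A, B, C)
print(tensor[1][1][0])  # 2.0 * 3.0 * 2.0
```

A decomposition tool works in the opposite direction: given the tensor, it searches for factor matrices whose reconstruction is as close to the tensor as possible.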

Analysis of Vegetation and Vegetation-Environment Relationships in Main Wild Vegetables of Ulleungdo in Korea -Vegetation of herb layer of the Aster glehni, Allium ochotense, and Aruncus sylvester - (울릉도 주요 산채류 자생지의 식생 및 환경과의 상관관계 분석 -섬쑥부쟁이, 울릉산마늘, 눈개승마의 초본층 식생을 중심으로-)

  • Lee, Joong-Ku;Kim, Hyoun-Sook;Lee, Sang-Myong;Park, Gwan-Soo
    • Journal of the Korean Society of Environmental Restoration Technology
    • /
    • Vol. 21, No. 6
    • /
    • pp.71-82
    • /
    • 2018
  • This study was conducted from 2016 to 2018 to provide basic ecological data for establishing environmental conditions for the cultivation of wild vegetables. To this end, we investigated the vegetation structure and the correlation between community structure and environmental factors in the natural habitats of wild vegetables (Aster glehni, Allium ochotense, and Aruncus sylvester) distributed in Ulleungdo. Population and gradient analysis classified the vegetation into an Aster glehni community, an Allium ochotense community, and an Aruncus sylvester community. The classification by population analysis was consistent with that by the TWINSPAN method, suggesting that the two complement each other. The importance value of Aster glehni was the highest in all communities, followed by Aruncus sylvester, Allium ochotense, Hydrangea petiolaris, Dryopteris crassirhizoma, Asperula odorata, Phryma leptostachya var. asiatica, Disporum viridescens, Hedera rhombea, Anthriscus sylvestris, and Hepatica maxima. According to the DCCA ordination analysis, the Aster glehni community was distributed in soil where nutrients, including T-N and O.M., were intermediate. The Allium ochotense community was distributed on slightly steep northern slopes at the highest altitude, where CEC and O.M. were the highest and other nutrients and pH were low. The Aruncus sylvester community was distributed at steep slopes and high altitudes, where exchangeable cations such as $Ca^{2+}$ and $Mg^{2+}$ and pH were high, and CEC, $P_2O_5$, and O.M. were the lowest.

Development of SWAT SD-HRU Pre-processor Module for Accurate Estimation of Slope and Slope Length of Each HRU Considering Spatial Topographic Characteristics in SWAT (SWAT HRU 단위의 경사도/경사장 산정을 위한 SWAT SD-HRU 전처리 프로세서 모듈 개발)

  • Jang, Wonseok;Yoo, Dongsun;Chung, Il-moon;Kim, Namwon;Jun, Mansig;Park, Younshik;Kim, Jonggun;Lim, Kyoung-Jae
    • Journal of Korean Society on Water Environment
    • /
    • Vol. 25, No. 3
    • /
    • pp.351-362
    • /
    • 2009
  • The Soil and Water Assessment Tool (SWAT), a semi-distributed model, first divides the watershed into multiple subwatersheds and then extracts the basic computation element, called the Hydrologic Response Unit (HRU). In the process of HRU generation, the spatial information of the land use and soil maps within each subwatershed is lost. The SWAT model estimates the HRU topographic data from the average slope of each subwatershed and then uses this topographic datum for all HRUs within the subwatershed. To improve SWAT's capabilities for various watershed scenarios, the Spatially Distributed-HRU (SD-HRU) pre-processor module was developed in this study to simulate site-specific topographic data. The SD-HRU was applied to the Hae-an watershed, where field slope lengths and slopes were measured for all agricultural fields. The analysis revealed that the SD-HRU pre-processor module needs to be applied in SWAT sediment simulation for accurate analysis of soil erosion and sediment behavior. If the SD-HRU pre-processor module is not applied in SWAT runs, the other SWAT factors may be overestimated or underestimated, resulting in errors in the physical and empirical computation modules even though the SWAT-estimated flow and sediment values match the measured data reasonably well.

A Study on the Food Service Selection Attributes and Consumption Behaviors based on Lifestyle Market Segments: Empirical Evidences from Luoyang (라이프스타일에 따른 세분시장별 외식 선택속성과 소비행동에 관한 연구: 중국 낙양지역을 대상으로)

  • Yao, Liang;Kim, Dong-Jin
    • Culinary science and hospitality research
    • /
    • Vol. 23, No. 3
    • /
    • pp.111-122
    • /
    • 2017
  • The purpose of this study was to examine the market segments of Chinese dining-out customers based on their lifestyle. The study focused on the selection and consumption behavior of dining-out customers. The subjects were diners aged 20 or older in Luoyang, China, and the data were collected over 11 days starting April 5, 2016. A total of 400 questionnaires were distributed, and 390 were collected. After excluding 9 inadequate questionnaires, 381 responses were analyzed using IBM SPSS 23.0. The data analysis included frequency analysis, cluster analysis, one-way ANOVA, and cross tabulation. The results of the empirical analysis showed significant differences in selection attributes, consumption behavior, and demographic characteristics across lifestyle market segments.

Design and Implementation of a Web-based Expert System for the Total Quality Management (종합적 품질경영을 위한 웹 기반 분산형 전문가시스템의 설계 및 구축)

  • 김성인;조정용
    • Journal of Korean Society for Quality Management
    • /
    • Vol. 32, No. 2
    • /
    • pp.168-190
    • /
    • 2004
  • In today's worldwide business environment, quality management is characterized by variety, specialization, decentralization, totality, and so on, and it is expected to incorporate these new concepts. We propose a web-based distributed expert system for this purpose. The system consists of four expert systems for design of experiments, acceptance inspection, statistical process control, and reliability management, corresponding to design quality, incoming-material quality, manufacturing quality, and usability quality, respectively, throughout the total product life cycle. Each distributed expert system at the horizontal level of the hierarchy carries out its own quality jobs independently. At the lower level of the hierarchy there is an expert system for measurement analysis that provides reliable data, and at the upper level there is an expert system for total quality management that coordinates, integrates, and makes final decisions. A prototype has been developed, and its application is presented.

MTTDL for Distributed Storage Systems with Dual Node Repair Capability (이중 노드 복구가 가능한 분산 저장 시스템의 MTTDL)

  • Kil, Yong Sung;Kim, Sang-Hyo;Park, Hosung
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • Vol. 42, No. 2
    • /
    • pp.345-348
    • /
    • 2017
  • MTTDL, a measure of the reliability of distributed storage systems, is analyzed for the case in which double node repair is possible and is compared with the single node repair case.

Alien Hitchhiker Insect Species Detected from International Vessels Entering Korea in 2022

  • Tae Hwa Kang;Sang Woong Kim;Deuk-Soo Choi
    • Proceedings of the National Institute of Ecology of the Republic of Korea
    • /
    • Vol. 5, No. 2
    • /
    • pp.60-67
    • /
    • 2024
  • Hitchhiker insect species on international vessels entering Korea in 2022 were monitored. A total of 947 samples of hitchhiker insects were collected by hand using a simple collection method. Among them, 856 individuals were classified into 374 species of 86 families in 10 orders through integrative analysis combining DNA barcoding and morphological examination. The remaining 91 individuals were identified only to the family level. An examination of the distribution of the 374 species (856 individuals) confirmed that 38 species (71 individuals) are not distributed in Korea, including six species (11 individuals) listed as 'regulated species' by the Korean Animal and Plant Quarantine Agency. Of the 38 species not distributed in Korea, 10 were detected multiple times (at least twice). Accordingly, it is necessary to strengthen monitoring of the areas around ports of entry, along with continuous surveillance, to prevent invasion by the species detected multiple times. This study provides detection information and biological data for monitoring alien hitchhiker insect species.