• Title/Summary/Keyword: Spatio-temporal Big Data

Prediction of spatio-temporal AQI data

  • KyeongEun Kim;MiRu Ma;KyeongWon Lee
    • Communications for Statistical Applications and Methods / v.30 no.2 / pp.119-133 / 2023
  • With the rapid growth of the economy and fossil fuel consumption, the concentration of air pollutants has increased significantly and the air pollution problem is no longer limited to small areas. We conduct statistical analysis in R and Python on actual air quality data covering all of South Korea. Factors such as SO2, CO, O3, NO2, PM10, precipitation, wind speed, wind direction, vapor pressure, local pressure, sea level pressure, temperature, and humidity, among others, are used as covariates. The main goal of this paper is to predict spatio-temporal air quality index (AQI) data. The observations in big spatio-temporal datasets like AQI data are correlated both spatially and temporally, and computing predictions or forecasts under this dependence structure is often infeasible. The likelihood function based on the spatio-temporal model may therefore be complicated, and specialized modeling approaches are useful for statistically reliable predictions. In this paper, we propose several methods for this big spatio-temporal AQI data. First, a random effects model with spatio-temporal basis functions, a classical statistical approach, is proposed. Next, a neural network model, a deep learning method based on artificial neural networks, is applied. Finally, a random forest model, a machine learning method that is closer to computational science, is introduced. We then compare the forecasting performance of the methods in terms of predictive diagnostics. The analysis shows that all three methods predicted normal levels of PM2.5 well, but performance appears to be poor at extreme values.

Applications of Open-source Spatio-Temporal Database Systems in Wide-field Time-domain Astronomy

  • Chang, Seo-Won;Shin, Min-Su
    • The Bulletin of The Korean Astronomical Society / v.41 no.2 / pp.53.2-53.2 / 2016
  • We present our experiences with open-source spatio-temporal database systems for managing and analyzing big astronomical data acquired by wide-field time-domain sky surveys. Considering the performance, cost, difficulty, and scalability of the database systems, we conduct comparison studies of open-source spatio-temporal databases such as GeoMesa and PostGIS that are already being used for handling big geographical data. Our experiments include ingesting, indexing, and querying millions or billions of astronomical spatio-temporal data records. As test data, we choose the public VVV (VISTA Variables in the Via Lactea) catalogs, which contain billions of measurements for hundreds of millions of objects. We discuss how these spatio-temporal database systems can be adopted in the astronomy community.

A Missing Value Replacement Method for Agricultural Meteorological Data Using Bayesian Spatio-Temporal Model (농업기상 결측치 보정을 위한 통계적 시공간모형)

  • Park, Dain;Yoon, Sanghoo
    • Journal of Environmental Science International / v.27 no.7 / pp.499-507 / 2018
  • Agricultural meteorological information is an important resource that affects farmers' income, food security, and agricultural conditions. Thus, such data are used in various fields that are responsible for planning, enforcing, and evaluating agricultural policies. The meteorological information obtained from automatic weather observation systems operated by rural development agencies contains missing values owing to temporary mechanical or communication deficiencies. Missing values are known to reduce the reliability and validity of a model. In this study, a hierarchical Bayesian spatio-temporal model is used to suggest replacements for missing values because the meteorological information is spatio-temporally correlated. The prior distribution is very important in the Bayesian approach. However, the trace plot showed that the spatial decay parameter did not converge. A suitable spatial decay parameter was therefore estimated on the basis of the root-mean-square error (RMSE), computed from the differences between the predicted and observed values. Latitude, longitude, and altitude were considered as covariates. The estimated spatial decay parameters were 0.041 for the spatio-temporal model with latitude and longitude and 0.039 for the model with latitude, longitude, and altitude. The posterior distributions were stable after the spatial decay parameter was fixed. RMSE, mean absolute error (MAE), mean absolute percentage error (MAPE), and bias were calculated for model validation. Finally, the missing values were generated using the independent Gaussian process model.
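The four validation measures named in the abstract above can be illustrated with a minimal sketch. The function name `validation_metrics` and the sample values are hypothetical, and the sign convention for bias (predicted minus observed) is an assumption; the abstract does not give the authors' exact definitions.

```python
import math


def validation_metrics(observed, predicted):
    """Return RMSE, MAE, MAPE (in percent), and bias for paired values.

    Assumption: errors are taken as predicted minus observed.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined where the observed value is zero; those pairs are skipped.
    ratios = [abs(e / o) for e, o in zip(errors, observed) if o != 0]
    mape = 100.0 * sum(ratios) / len(ratios) if ratios else float("nan")
    bias = sum(errors) / n
    return {"rmse": rmse, "mae": mae, "mape": mape, "bias": bias}


# Hypothetical observed and imputed temperatures at three stations.
metrics = validation_metrics([10.0, 20.0, 40.0], [12.0, 18.0, 40.0])
```

In a missing-value setting, such measures would typically be computed on values that were deliberately held out and then imputed, so that predictions can be compared with known observations.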

Dynamic Load Management Method for Spatial Data Stream Processing on MapReduce Online Frameworks (맵리듀스 온라인 프레임워크에서 공간 데이터 스트림 처리를 위한 동적 부하 관리 기법)

  • Jeong, Weonil
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.8 / pp.535-544 / 2018
  • As the spread of mobile devices equipped with various sensors and high-quality wireless network communications functions expands, the amount of spatio-temporal data generated from mobile devices in various service fields is rapidly increasing. In conventional research into processing large volumes of real-time spatio-temporal streams, it is very difficult to apply a Hadoop-based spatial big data system, designed to be a batch processing platform, to a real-time service for spatio-temporal data streams. This paper extends the MapReduce online framework to support real-time query processing for continuous-input, spatio-temporal data streams, and proposes a load management method to distribute overloads for efficient query processing. The proposed scheme balances load dynamically across nodes based on the inflow rate and the load factor of the input data under the spatial partitioning. Experiments show that, when load management is required in a specific area, efficient query processing can be supported by distributing the spatial data stream for that area to shared resources.

A Suggestion for Spatiotemporal Analysis Model of Complaints on Officially Assessed Land Price by Big Data Mining (빅데이터 마이닝에 의한 공시지가 민원의 시공간적 분석모델 제시)

  • Cho, Tae In;Choi, Byoung Gil;Na, Young Woo;Moon, Young Seob;Kim, Se Hun
    • Journal of Cadastre & Land InformatiX / v.48 no.2 / pp.79-98 / 2018
  • The purpose of this study is to suggest a model for analysing the spatio-temporal characteristics of civil complaints about the officially assessed land price based on big data mining. Specifically, the underlying reasons for the civil complaints were sought from spatio-temporal perspectives rather than from institutional factors, and a model was suggested for monitoring the trend in the occurrence of such complaints. The official documents of 6,481 civil complaints about the officially assessed land price in the district of Jung-gu, Incheon Metropolitan City, over the period from 2006 to 2015, along with their temporal and spatial properties, were collected and used for the analysis. Frequencies of major key words were examined by using a text mining method. Correlations among major key words were studied through social network analysis. By calculating term frequency (TF) and term frequency-inverse document frequency (TF-IDF), which correspond to the weighted values of key words, we identified the major key words behind the occurrence of civil complaints about the officially assessed land price. Then the spatio-temporal characteristics of the civil complaints were examined by hot spot analysis based on the Getis-Ord $G_i^*$ statistic. It was found that the characteristics of civil complaints about the officially assessed land price were changing, forming a cluster that is linked spatio-temporally. Using text mining and social network analysis, we found that the reasons for civil complaints about the officially assessed land price could be identified quantitatively from natural language. TF and TF-IDF, the weighted values of key words, can be used as main explanatory variables to analyze the spatio-temporal characteristics of civil complaints about the officially assessed land price, since these statistics differ over time and across regions.
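As a rough illustration of the keyword weights mentioned in the abstract above, the sketch below computes TF and TF-IDF for already tokenized documents. It assumes one common textbook variant (TF as relative frequency within a document, IDF as the natural log of the number of documents divided by the document frequency); the abstract does not say which variant the authors used, and the function name and sample tokens are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return, per document, a dict mapping term -> (tf, tf_idf).

    Assumption: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in documents for term in set(doc))
    results = []
    for doc in documents:
        counts = Counter(doc)
        scores = {}
        for term, count in counts.items():
            tf = count / len(doc)
            idf = math.log(n_docs / df[term])
            scores[term] = (tf, tf * idf)
        results.append(scores)
    return results


# Hypothetical tokenized complaint texts.
docs = [["land", "price", "price"], ["land", "tax"]]
weights = tf_idf(docs)
```

Under this variant, a term that appears in every document (here `land`) gets a TF-IDF of zero, so the score highlights terms that distinguish one group of complaints from another.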

Development of Traffic Speed Prediction Model Reflecting Spatio-temporal Impact based on Deep Neural Network (시공간적 영향력을 반영한 딥러닝 기반의 통행속도 예측 모형 개발)

  • Kim, Youngchan;Kim, Junwon;Han, Yohee;Kim, Jongjun;Hwang, Jewoong
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.19 no.1 / pp.1-16 / 2020
  • With the advent of the fourth industrial revolution era, there has been a growing interest in deep learning using big data, and studies using deep learning have been actively conducted in various fields. In the transportation sector, deep learning offers many advantages for research because large volumes of traffic big data are available. In this study, a short-term travel speed prediction model was constructed using LSTM, a deep learning technique. The LSTM model, which is suitable for time series prediction, was selected because the travel speed data used for prediction are time series data. To predict the travel speed more precisely, we constructed a model that reflects both temporal and spatial effects. The model is a short-term prediction model that predicts the speed one hour ahead. For the analysis data, the 5-minute travel speeds collected from the Seoul Transportation Information Center were used, and a congested part of Gangnam was selected as the analysis section.
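The data preparation implied by the abstract above can be sketched as follows: with 5-minute observations, a one-hour-ahead target lies 12 steps after the last input. This is only an illustration of how supervised samples for a sequence model might be built. The function name `make_windows` and the lookback length are assumptions, and the abstract does not describe the authors' actual preprocessing or network.

```python
def make_windows(series, lookback, horizon):
    """Build (input window, target) pairs from a 1-D speed series.

    lookback: number of past steps fed to the model (assumed value).
    horizon: how many steps after the last input the target lies.
    """
    samples = []
    for start in range(len(series) - lookback - horizon + 1):
        window = series[start:start + lookback]
        target = series[start + lookback + horizon - 1]
        samples.append((window, target))
    return samples


# Hypothetical 5-minute speeds; 12 steps of 5 minutes = 1 hour ahead.
speeds = [float(v) for v in range(20)]
samples = make_windows(speeds, lookback=6, horizon=12)
```

To reflect spatial effects, each time step could hold the speeds of several neighbouring road sections instead of a single value; that extension is not shown here.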

Spatial Characteristics and Driving Forces of Cultivated Land Changes by Coupling Spatial Autocorrelation Model and Spatial-temporal Big Data

  • Hua, Wang;Yuxin, Zhu;Mengyu, Wang;Jiqiang, Niu;Xueye, Chen;Yang, Zhang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.2 / pp.767-785 / 2021
  • With the rapid development of information technology, it is now possible to analyze the spatial patterns of cultivated land and its evolution by combining GIS, geostatistical analysis models and spatiotemporal big data for the dynamic monitoring and management of cultivated land resources. The spatial pattern of cultivated land and its evolutionary patterns in Luoyang City, China from 2009 to 2019 were analyzed using spatial autocorrelation and spatial autoregressive models on the basis of GIS technology. It was found that: (1) the area of cultivated land in Luoyang decreased and then increased between 2009 and 2019, with an overall increase of 0.43% in 2019 compared to 2009, and cultivated land was dominant in the overall landscape of Luoyang; (2) cultivated land holdings in Luoyang are highly spatially autocorrelated, with the 'high-high'-type area concentrated in the border area directly north and northeast of Luoyang, while the 'low-low'-type area is concentrated in the south and in the municipal area of Luoyang and is heavily influenced by topography and urbanization. The expansion determined during the study period mainly took place in Luoyang City, with most of it being transferred from the 'high-low'-type area; (3) in the bivariate spatial autocorrelation and spatial autoregressive models of the drivers, elevation, slope and industrial output values all had significant effects on the amount of cultivated land holdings, with elevation having a positive effect, and slope and industrial output having a negative effect.
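The abstract above does not name the autocorrelation statistic it uses. As an illustration only, the sketch below computes global Moran's I, a widely used measure of spatial autocorrelation, for values on a small set of areas. The function name, the binary adjacency weights, and the sample values are all assumptions.

```python
def morans_i(values, weights):
    """Global Moran's I for values with a spatial weight matrix.

    I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2,
    where W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


# Four areas in a row; each area neighbours the ones next to it (assumed layout).
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1.0, 2.0, 3.0, 4.0], adjacency)
alternating = morans_i([1.0, 4.0, 1.0, 4.0], adjacency)
```

Positive values indicate that similar values cluster in neighbouring areas, and negative values indicate that neighbouring areas tend to differ. Labels such as 'high-high' and 'low-low' usually come from a local version of the statistic, which is not shown here.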

Personalized Book Curation System based on Integrated Mining of Book Details and Body Texts (도서 정보 및 본문 텍스트 통합 마이닝 기반 사용자 맞춤형 도서 큐레이션 시스템)

  • Ahn, Hee-Jeong;Kim, Kee-Won;Kim, Seung-Hoon
    • Journal of Information Technology Applications and Management / v.24 no.1 / pp.33-43 / 2017
  • Content curation services based on big data analysis are receiving great attention in various content fields, such as films, games, music, and books. Such a service recommends personalized content to a user based on that user's preferences. Existing book curation systems recommend books to users by using bibliographic citations, user profiles, or user log data. However, these systems have difficulty recommending books related to character names or spatio-temporal information in the text. Therefore, in this paper, we suggest a personalized book curation system based on integrated mining of a book. The proposed system consists of a mining system, a recommendation system, and a visualization system. The mining system analyzes book text, user information or profiles, and SNS data. The recommendation system recommends personalized books to users based on the data analysed by the mining system. This system can recommend related books based on book keywords even when there is no user information, as with a new customer. The visualization system visualizes book bibliographic information, mining data such as keywords, characters, and character relations, and book recommendation results. In addition, this paper covers the design and implementation of the proposed mining and recommendation modules. The proposed system is expected to broaden users' selection of books and encourage balanced consumption of book content.

Spatio-Temporal Patterns of a Public Bike Sharing System in Seoul - Focusing on Yeouido District - (서울시 공공자전거 공유시스템(PBSS)의 시공간적 이용 패턴 분석 - 서울시 여의도동을 중심으로 -)

  • Yun, Seung-yong;Min, Kyung-hun;Ko, Ha-jung
    • Journal of the Korean Institute of Landscape Architecture / v.48 no.1 / pp.1-14 / 2020
  • Various policies and studies regarding the use of Public Bike Sharing Systems (PBSS) and Programs (PBSP) have been conducted worldwide as the number of systems or programs has increased. Although the use of PBSS in everyday life has generated various phenomena and demands, the majority of research and policies in South Korea have focused on commuting. This study aimed to understand various PBSS demands by classifying usage patterns and analyzing their features, using 2018 PBSS usage data for the Yeouido district. The rental stations were classified into three types based on weekday/weekend usage rates. The usage of Yeouido's PBSS accounted for 4.3% of the total usage in Seoul Metropolitan City, while the number of PBSS rental stations accounted for 2% of all rental stations in the Seoul urban areas. Rental stations with higher weekday utilization rates showed high utilization rates in all four seasons and were mainly distributed in work and residential areas. Other stations showed a concentrated usage pattern in the spring (April-May) and autumn (September-October) seasons, and they were located close to the entrances of nearby parks. In addition, at stations with high weekend utilization, renting and returning were often concentrated at certain rental stations, in contrast to the pattern at stations with high weekday usage. Therefore, PBSS management and programs should be operated to reflect various usage demands rather than as uniform PBSS operations. The results of this study are meaningful in providing basic data for effective PBSS operation by monitoring the demand for PBSS usage in spatio-temporal terms.

Update Frequency Reducing Method of Spatio-Temporal Big Data based on MapReduce (MapReduce와 시공간 데이터를 이용한 빅 데이터 크기의 이동객체 갱신 횟수 감소 기법)

  • Choi, Youn-Gwon;Baek, Sung-Ha;Kim, Gyung-Bae;Bae, Hae-Young
    • Spatial Information Research / v.20 no.2 / pp.137-153 / 2012
  • Until now, many indexing methods that reduce update cost have been proposed for managing massive numbers of moving objects, because indexes for moving objects must be updated periodically to track objects whose locations change frequently. However, these indexing methods incur a heavy load that exceeds system capacity when the number of moving objects increases dramatically. In this paper, we propose an update frequency reducing method that combines MapReduce with existing indices. We group the update requests for each moving object by using MapReduce. We decide whether to update by comparing the latest data and the oldest data in each group. We reduce update frequency by updating the latest data only. When an update is delayed, so that the data are not lost and are still updated periodically, we store the data for a certain period of time in a hash table that keeps the previous update data. The performance evaluation shows that the proposed method reduces the update frequency compared with methods that do not apply it.
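The grouping idea described in the abstract above can be sketched in plain Python, which stands in here for an actual MapReduce job. Several details are assumptions rather than the authors' design: the record layout `(object_id, timestamp, x, y)`, the Euclidean distance threshold used to decide whether an update is needed, and the names `reduce_updates` and `deferred`. The periodic forced flush of deferred data is left out.

```python
import math
from collections import defaultdict


def reduce_updates(requests, deferred, threshold):
    """Group update requests per object and emit only the latest position
    when the object has moved at least `threshold` (assumed criterion).

    requests: iterable of (object_id, timestamp, x, y).
    deferred: hash table of records held back earlier; modified in place.
    Returns the list of updates to apply to the index.
    """
    groups = defaultdict(list)
    # Carry over records deferred from earlier batches so they are not lost.
    for obj_id, records in deferred.items():
        groups[obj_id].extend(records)
    # "Map" and shuffle step: group requests by moving object.
    for obj_id, ts, x, y in requests:
        groups[obj_id].append((ts, x, y))
    deferred.clear()

    updates = []
    # "Reduce" step: compare the oldest and latest record in each group.
    for obj_id, group in groups.items():
        group.sort()  # records sort by timestamp first
        oldest, latest = group[0], group[-1]
        moved = math.hypot(latest[1] - oldest[1], latest[2] - oldest[2])
        if moved >= threshold:
            updates.append((obj_id, *latest))  # update the latest data only
        else:
            # Delay the update but keep the data in the hash table.
            deferred[obj_id] = [oldest] if oldest == latest else [oldest, latest]
    return updates


# Hypothetical batch: object "a" moves far, object "b" barely moves.
deferred = {}
batch = [("a", 1, 0.0, 0.0), ("a", 2, 0.1, 0.0), ("a", 3, 5.0, 0.0),
         ("b", 1, 1.0, 1.0), ("b", 2, 1.1, 1.0)]
updates = reduce_updates(batch, deferred, threshold=1.0)
```

In this batch, five requests lead to a single index update for object `a`, while the records for object `b` wait in `deferred` until a later batch shows enough movement.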