• Title/Summary/Keyword: Spatial random forest

Evaluation and Predicting PM10 Concentration Using Multiple Linear Regression and Machine Learning (다중선형회귀와 기계학습 모델을 이용한 PM10 농도 예측 및 평가)

  • Son, Sanghun;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.36 no.6_3 / pp.1711-1720 / 2020
  • Particulate matter (PM) generated artificially during the recent period of rapid industrialization and urbanization moves and disperses according to weather conditions and adversely affects human skin and the respiratory system. The purpose of this study is to predict the PM10 concentration in Seoul using meteorological factors as the input dataset for multiple linear regression (MLR), support vector machine (SVM), and random forest (RF) models, and to compare and evaluate the performance of the models. First, the PM10 concentration data obtained at 39 air quality monitoring sites (AQMS) in Seoul were divided into training and validation datasets (8:2 ratio). Nine meteorological factors (mean, maximum, and minimum temperature, precipitation, average and maximum wind speed, wind direction, yellow dust, and relative humidity), obtained by the automatic weather system (AWS), made up the input dataset of the models. The coefficients of determination (R2) between the observed PM10 concentration and that predicted by the MLR, SVM, and RF models were 0.260, 0.772, and 0.793, respectively, so the RF model best predicted the PM10 concentration. Among the AQMS used for model validation, the Gwanak-gu and Gangnam-daero AQMS are relatively close to an AWS, and the SVM and RF models were highly accurate there. The Jongno-gu AQMS is relatively far from the AWS, but since PM10 concentrations for the two adjacent AQMS were used for model training, both models presented high accuracy. By contrast, the Yongsan-gu AQMS was relatively far from both the other AQMS and the AWS, and both models performed poorly.
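The R2 comparison reported in this abstract can be illustrated with a minimal sketch. It assumes the common definition R2 = 1 - SS_res / SS_tot; the paper may instead use the squared correlation coefficient, and the PM10 values below are hypothetical, not taken from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared prediction errors
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical PM10 concentrations, for illustration only
observed = [30.0, 45.0, 60.0, 52.0, 38.0]
predicted = [32.0, 43.0, 57.0, 55.0, 40.0]
score = r_squared(observed, predicted)
print(f"R2 = {score:.3f}")
```

A value near 1 means the predictions track the observations closely, which is the sense in which the abstract ranks RF (0.793) above SVM (0.772) and MLR (0.260).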

Real-time flood prediction applying random forest regression model in urban areas (랜덤포레스트 회귀모형을 적용한 도시지역에서의 실시간 침수 예측)

  • Kim, Hyun Il;Lee, Yeon Su;Kim, Byunghyun
    • Journal of Korea Water Resources Association / v.54 no.spc1 / pp.1119-1130 / 2021
  • Urban flooding caused by localized heavy rainfall under an unstable climate occurs constantly, but a system that can predict spatial flood information from weather forecasts has not yet been prepared. The worst flood situations in urban areas can occur when structural measures such as river levees, urban sewer discharge capacity, stormwater storage basins, and pump facilities prove inadequate. However, identifying spatial flood information in advance can have a decisive effect on minimizing flood damage. Therefore, this study presents a methodology that can predict the urban flood map in real time by using rainfall data from the Korea Meteorological Administration (KMA), the results of two-dimensional flood analysis, and a random forest (RF) regression model. The Ujeong district in Ulsan metropolitan city, where flooding occurs frequently, was selected as the study area. The RF regression model predicted the flood maps corresponding to 50 mm, 80 mm, and 110 mm rainfall events of 6-hour duration, and the predicted results showed 63%, 80%, and 67% goodness of fit compared to the results of the two-dimensional flood analysis model. The results of this study are expected to be usable as basic data for evacuation and response to sudden urban flooding.

Machine Learning based Seismic Response Prediction Methods for Steel Frame Structures (기계학습 기반 강 구조물 지진응답 예측기법)

  • Lee, Seunghye;Lee, Jaehong
    • Journal of Korean Association for Spatial Structures / v.24 no.2 / pp.91-99 / 2024
  • In this paper, machine learning models were applied to predict the seismic response of steel frame structures. Both geometric and material nonlinearities were considered in the structural analysis, and nonlinear inelastic dynamic analysis was performed. The ground acceleration response of the El Centro earthquake was applied to obtain the displacement of the top floor, which was used as the dataset for the machine learning methods. Learning was performed using two methods: Decision Tree and Random Forest, and their efficiency was demonstrated through application to 2-story and 6-story 3-D steel frame structure examples.

Camouflage Pattern Evaluation based on Environment and Camouflage Pattern Similarity Analysis (작전환경 및 위장무늬 유사도 분석 기반 위장무늬 평가)

  • Yun, Jeongrok;Kim, Hoemin;Kim, Un Yong;Chun, Sungkuk
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.671-672 / 2021
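  • This paper proposes a new quantitative camouflage pattern evaluation method based on color and structure analysis between images of the operational environment and camouflage pattern designs. Color similarity is computed from the per-pixel mean error and color histogram comparison between the operational environment and camouflage pattern design images in the RGB and Lab color spaces. Structural similarity between the images is also computed using PSNR (Peak Signal-to-Noise Ratio), MSSIM (Mean Structural Similarity Index), UIQI, GMSD, and a deep-learning-based measure. The computed color and structural similarity parameters are then regressed with a Random Forest Regressor to produce the final camouflage pattern evaluation result. The performance of the proposed method was verified by comparing it with an existing evaluation method in a study with 20 subjects.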
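Of the structural similarity measures named in this abstract, PSNR is the simplest to write down. The sketch below assumes 8-bit pixel intensities stored as flat lists; the function name and the sample values are hypothetical and do not come from the paper.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical grayscale pixels from an environment image and a pattern image
environment = [120, 130, 125, 118]
pattern = [122, 128, 125, 114]
value = psnr(environment, pattern)
print(f"PSNR = {value:.2f} dB")
```

Higher PSNR indicates that the pattern image is closer to the environment image at the pixel level; in the method described, such a score would be one input feature among several to the regressor.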

Applications of Machine Learning Models for the Estimation of Reservoir CO2 Emissions (저수지 CO2 배출량 산정을 위한 기계학습 모델의 적용)

  • Yoo, Jisu;Chung, Se-Woong;Park, Hyung-Seok
    • Journal of Korean Society on Water Environment / v.33 no.3 / pp.326-333 / 2017
  • Lakes and reservoirs have been reported as important sources of carbon emissions to the atmosphere in many countries. Although field experiments and theoretical investigations based on fundamental gas exchange theory have quantified the Net Atmospheric Flux (NAF) in various climate regions, large uncertainties remain in global-scale estimates. Mechanistic models can be used for understanding and estimating the temporal and spatial variations of the NAFs considering complicated hydrodynamic and biogeochemical processes in a reservoir, but these models require extensive and expensive datasets and model parameters. On the other hand, data-driven machine learning (ML) algorithms are likely to be alternative tools for estimating the NAFs in response to independent environmental variables. The objective of this study was to develop random forest (RF) and multi-layer artificial neural network (ANN) models for the estimation of the daily $CO_2$ NAFs in Daecheong Reservoir, located on the Geum River in Korea, and to compare the models' performance against the multiple linear regression (MLR) model proposed in a previous study (Chung et al., 2016). As a result, the RF and ANN models showed much better performance in estimating high NAF values, while the MLR model significantly underestimated them. Cross-validation with 10-fold random sampling was applied to evaluate the performance of the three models and indicated that the ANN model was best, followed by the RF and MLR models.
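The 10-fold evaluation mentioned in this abstract can be sketched as an index splitter. This is a generic k-fold partition with random shuffling, not the authors' procedure; the function name, the seed, and the sample count are assumptions for illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment of samples to folds
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        # Training set is every sample outside the current validation fold
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, val_idx


# Hypothetical dataset of 25 daily NAF samples
splits = list(k_fold_indices(25, k=10))
for train_idx, val_idx in splits[:2]:
    print(len(train_idx), "training samples,", len(val_idx), "validation samples")
```

Each sample is held out exactly once, so averaging a score over the ten folds uses every observation for validation without ever training on it in the same fold.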

The use of MODIS atmospheric products to estimate cooling degree days at weather stations in South and North Korea (MODIS 대기자료를 활용한 남북한 기상관측소에서의 냉방도일 추정)

  • Yoo, Byoung Hyun;Kim, Kwang Soo;Lee, Jihye
    • Korean Journal of Agricultural and Forest Meteorology / v.21 no.2 / pp.97-109 / 2019
  • Degree days have been determined using temperature data measured at weather stations near a site of interest to produce information for supporting decision-making on agricultural production. Alternatively, the data products of the Moderate Resolution Imaging Spectroradiometer (MODIS) can be used for estimation of degree days in a given region, e.g., the Korean Peninsula. The objective of this study was to develop a simple tool for processing the MODIS product for estimating cooling degree days (CDD), which would help in assessing heat stress conditions for a crop as well as energy requirements for greenhouses. A set of scripts written in R was implemented to obtain temperature profile data for the region of interest. These scripts had functionalities for processing spatial data, which include reprojection, mosaicking, and cropping. A module to extract air temperature at the surface pressure level was also developed using R extension packages such as rgdal and RcppArmadillo. Random forest (RF) models, which estimate mean temperature and CDD with a different set of MODIS data, were trained at 34 sites in South Korea during 2009 - 2018. Then, the values of CDD were calculated over the Korean Peninsula during the same period using those RF models. It was found that the CDD estimates using the MODIS data explained >74% of the variation in the CDD measurements at the weather stations in North Korea as well as South Korea. These results indicate that temperature data derived from the MODIS atmospheric products would be useful for reliable estimation of CDD. Our results also suggest that the MODIS data can be used for preparation of weather input data for other temperature-based agro-ecological models such as growing degree days or chill units.
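As a reminder of what the CDD target variable is, the following minimal sketch accumulates daily exceedances of mean temperature over a base temperature. The base of 24 °C and the temperature series are assumptions for illustration; the abstract does not state the base used, and the study's own tooling is written in R rather than Python.

```python
def cooling_degree_days(daily_mean_temps, base_temp=24.0):
    """Sum of daily mean temperature excess over base_temp (in degree-days)."""
    # Days at or below the base temperature contribute nothing
    return sum(max(0.0, t - base_temp) for t in daily_mean_temps)


# Hypothetical daily mean air temperatures in degrees Celsius
temps = [22.5, 25.0, 27.5, 23.0, 26.0]
cdd = cooling_degree_days(temps)
print(f"CDD = {cdd:.1f} degree-days")
```

The same accumulation with a different threshold and sign convention gives growing degree days, which is why the abstract suggests the MODIS-derived temperatures could feed other temperature-based models.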

Data Mining based Forest Fires Prediction Models using Meteorological Data (기상 데이터를 이용한 데이터 마이닝 기반의 산불 예측 모델)

  • Kim, Sam-Keun;Ahn, Jae-Geun
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.8 / pp.521-529 / 2020
  • Forest fires are one of the most important environmental risks and have adverse effects on many aspects of life, such as the economy, environment, and health. Early detection, quick prediction, and rapid response to forest fires can play an essential role in saving property and life from forest fire risks. For the rapid discovery of forest fires, one method uses meteorological data obtained from local sensors installed in each area by the Meteorological Agency. Meteorological conditions (e.g., temperature, wind) influence forest fires. This study evaluated a Data Mining (DM) approach to predict the burned area of forest fires. Five DM models, i.e., Stochastic Gradient Descent (SGD), Support Vector Machines (SVM), Decision Tree (DT), Random Forests (RF), and Deep Neural Network (DNN), and four feature selection setups (using spatial, temporal, and weather attributes) were tested on recent real-world data collected from the Gyeonggi-do area over the last five years. In the experiment, a DNN model using only meteorological data showed the best performance. The proposed model was more effective in predicting the burned area of small forest fires, which are more frequent. The knowledge derived from the proposed prediction model is particularly useful for improving firefighting resource management.

Comparative Assessment of Linear Regression and Machine Learning for Analyzing the Spatial Distribution of Ground-level NO2 Concentrations: A Case Study for Seoul, Korea (서울 지역 지상 NO2 농도 공간 분포 분석을 위한 회귀 모델 및 기계학습 기법 비교)

  • Kang, Eunjin;Yoo, Cheolhee;Shin, Yeji;Cho, Dongjin;Im, Jungho
    • Korean Journal of Remote Sensing / v.37 no.6_1 / pp.1739-1756 / 2021
  • Atmospheric nitrogen dioxide (NO2) is mainly caused by anthropogenic emissions. It contributes to the formation of secondary pollutants and ozone through chemical reactions, and adversely affects human health. Although ground stations that monitor NO2 concentrations in real time are operated in Korea, it is difficult to analyze the spatial distribution of NO2 concentrations from them, especially over areas with no stations. Therefore, this study conducted a comparative experiment of spatial interpolation of NO2 concentrations based on two linear-regression methods (i.e., multiple linear regression (MLR) and regression kriging (RK)) and two machine learning approaches (i.e., random forest (RF) and support vector regression (SVR)) for the year 2020. The four approaches were compared using leave-one-out cross-validation (LOOCV). The daily LOOCV results showed that MLR, RK, and SVR produced an average daily index of agreement (IOA) of 0.57, which was higher than that of RF (0.50). The average daily normalized root mean square error of RK was 0.9483%, which was slightly lower than those of the other models. MLR, RK, and SVR showed similar seasonal distribution patterns, and the dynamic range of the resultant NO2 concentrations from these three models was similar, while that from RF was relatively small. The multivariate linear regression approaches are expected to be a promising method for spatial interpolation of ground-level NO2 concentrations and other parameters in urban areas.
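The index of agreement used to rank the four approaches can be written out as a short sketch. It assumes Willmott's original formulation, d = 1 - Σ(P - O)² / Σ(|P - Ō| + |O - Ō|)²; the abstract does not say which variant the authors used, and the concentrations below are hypothetical.

```python
def index_of_agreement(observed, predicted):
    """Willmott's index of agreement; 1.0 indicates perfect agreement."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Numerator: sum of squared prediction errors
    num = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    # Denominator: potential error relative to the observed mean
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(observed, predicted))
    return 1.0 - num / den


# Hypothetical NO2 concentrations at three held-out stations
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
ioa = index_of_agreement(observed, predicted)
print(f"IOA = {ioa:.3f}")
```

In a leave-one-out setting, each station's prediction would come from a model fitted without that station, and the index would then be computed over the held-out predictions.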

Estimation of Chlorophyll-a Concentration in Nakdong River Using Machine Learning-Based Satellite Data and Water Quality, Hydrological, and Meteorological Factors (머신러닝 기반 위성영상과 수질·수문·기상 인자를 활용한 낙동강의 Chlorophyll-a 농도 추정)

  • Soryeon Park;Sanghun Son;Jaegu Bae;Doi Lee;Dongju Seo;Jinsoo Kim
    • Korean Journal of Remote Sensing / v.39 no.5_1 / pp.655-667 / 2023
  • Algal bloom outbreaks are frequently reported around the world, and serious water pollution problems arise every year in Korea. It is necessary to protect the aquatic ecosystem through continuous management and rapid response. Many studies using satellite images are being conducted to estimate the concentration of chlorophyll-a (Chl-a), an indicator of algal bloom occurrence. However, machine learning models have recently been used because it is difficult to accurately calculate Chl-a due to the spectral characteristics and atmospheric correction errors that change depending on the water system. It is necessary to consider the factors affecting algal bloom as well as the satellite spectral index. Therefore, this study constructed a dataset by combining water quality, hydrological, and meteorological factors with Sentinel-2 images. The representative ensemble models random forest and extreme gradient boosting (XGBoost) were used to predict the concentration of Chl-a at eight weirs located on the Nakdong River over the past five years. The R-squared score (R2), root mean square error (RMSE), and mean absolute error (MAE) were used as model evaluation indicators, and for XGBoost the R2 was 0.80, the RMSE was 6.612, and the MAE was 4.457. Shapley additive explanations (SHAP) analysis showed that water quality factors (suspended solids, biochemical oxygen demand, and dissolved oxygen) and the band ratio using red edge bands were of high importance in both models. Using varied input data was confirmed to help improve model performance, and the approach appears applicable to domestic and international algal bloom detection.
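The RMSE and MAE indicators reported for XGBoost can be computed as in the sketch below. The Chl-a values are hypothetical and only illustrate the two formulas; they are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error: penalizes large errors more heavily."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error: average magnitude of the errors."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


# Hypothetical Chl-a concentrations, for illustration only
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
print(f"RMSE = {rmse(observed, predicted):.3f}, MAE = {mae(observed, predicted):.3f}")
```

RMSE is never smaller than MAE on the same data, so a gap like the one in the abstract (6.612 versus 4.457) suggests that a minority of samples carry comparatively large errors.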

Analysis for Dispersal and Spatial Pattern of Metcalfa pruinosa (Hemiptera: Flatidae) in Southern Sweet Persimmon Orchard (남부지방 단감원에서 미국선녀벌레의 분산 및 공간분포 분석)

  • Park, Bueyong;Kim, Min-Jung;Lee, Sang-Ku;Kim, Gil-Hah
    • Korean journal of applied entomology / v.58 no.4 / pp.291-297 / 2019
  • Since Metcalfa pruinosa was first reported in Korea, it has continually caused damage to sweet persimmon orchards in the southern part of Korea. M. pruinosa occurs not only in farmland but also in forest areas, and is difficult to control due to the influx of individuals from nearby forests. Because of its wide host range, M. pruinosa occurs both in orchards and in their surroundings. Thus, it has been difficult to decide the spatial range and timing of control for efficient management. In this study, the occurrence and dispersal pattern of M. pruinosa in a persimmon orchard were surveyed using clear sticky traps, and spatial patterns were analyzed with SADIE (Spatial Analysis by Distance IndicEs), based on the location information of the sticky traps. Spatial association between survey times was also analyzed to identify when the spatial pattern changed. In the sweet persimmon orchard, M. pruinosa mainly dispersed in mid to late May, when the first instars hatch, and in August, the season of adult emergence. The first instar nymphs hatched in mid-May were randomly distributed in the orchard, but the distribution changed to an aggregated pattern after they dispersed to the surroundings of the orchard. Adults showed a random distribution pattern after immigrating to the orchard again. This tendency was also observed in the density changes in the orchard and its surroundings, and matched the actual density of M. pruinosa on sweet persimmon trees.