• Title/Summary/Keyword: random forest regression

Search Result 82, Processing Time 0.023 seconds

Predicting the Effects of Rooftop Greening and Evaluating CO2 Sequestration in Urban Heat Island Areas Using Satellite Imagery and Machine Learning (위성영상과 머신러닝 활용 도시열섬 지역 옥상녹화 효과 예측과 이산화탄소 흡수량 평가)

  • Minju Kim;Jeong U Park;Juhyeon Park;Jisoo Park;Chang-Uk Hyun
    • Korean Journal of Remote Sensing / v.39 no.5_1 / pp.481-493 / 2023
  • In high-density urban areas, the urban heat island effect increases urban temperatures, leading to negative impacts such as worsened air pollution, increased cooling energy consumption, and increased greenhouse gas emissions. In urban environments where it is difficult to secure additional green spaces, rooftop greening is an efficient greenhouse gas reduction strategy. In this study, we analyzed the current status of the urban heat island effect and used high-resolution satellite data and spatial information to estimate the available rooftop greening area within the study area. We evaluated the mitigation of the urban heat island phenomenon and the carbon sequestration capacity through temperature predictions resulting from rooftop greening. To achieve this, we used WorldView-2 satellite data to classify land cover in the urban heat island areas of Busan city. We developed a prediction model for temperature changes before and after rooftop greening using machine learning techniques. To assess the degree of urban heat island mitigation due to changes in rooftop greening areas, we constructed a temperature change prediction model with temperature as the dependent variable using the random forest technique. In this process, we built a multiple regression model to derive high-resolution land surface temperatures for training data using Google Earth Engine, combining Landsat-8 and Sentinel-2 satellite data. Additionally, we evaluated carbon sequestration based on rooftop greening areas using the carbon absorption capacity per plant. The results of this study suggest that the developed satellite-based urban heat island assessment and temperature change prediction technology using Random Forest models can be applied to urban heat island-vulnerable areas with potential for expansion.
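
The random forest idea named in the abstract (bootstrap sampling plus random feature choice, with predictions averaged over trees) can be sketched as follows. This is a heavily simplified stand-in that uses depth-1 regression trees (stumps), not the authors' model; the feature names (green fraction, building density), the toy temperatures, and all function names are hypothetical.

```python
import random

def fit_stump(X, y, feature):
    """Best single-threshold split on one feature, minimizing squared error."""
    best = None
    values = sorted(set(row[feature] for row in X))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [t for row, t in zip(X, y) if row[feature] <= thr]
        right = [t for row, t in zip(X, y) if row[feature] > thr]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, thr, ml, mr)
    if best is None:
        # Bootstrap sample had a single distinct value: predict the mean.
        m = sum(y) / len(y)
        return (feature, 0.0, m, m)
    return (feature, best[1], best[2], best[3])

def fit_forest(X, y, n_trees=50, seed=0):
    """Train stumps on bootstrap samples, each on one randomly chosen feature."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        feature = rng.randrange(len(X[0]))           # random feature per tree
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest

def predict(forest, row):
    """Average the predictions of all stumps."""
    preds = [ml if row[f] <= thr else mr for f, thr, ml, mr in forest]
    return sum(preds) / len(preds)

# Hypothetical pixels: [green fraction, building density] -> surface temperature (deg C).
X = [[0.0, 0.9], [0.1, 0.8], [0.2, 0.8], [0.3, 0.6],
     [0.5, 0.5], [0.6, 0.4], [0.8, 0.3], [0.9, 0.2]]
y = [38.0, 37.5, 36.8, 35.9, 34.0, 33.2, 31.5, 30.9]

forest = fit_forest(X, y)
before = predict(forest, [0.0, 0.9])   # bare rooftop scenario
after = predict(forest, [0.9, 0.2])    # greened rooftop scenario
print(f"predicted change: {after - before:.2f} deg C")
```

A production model would normally use full-depth trees from an established library; the sketch only shows where the averaging and the two sources of randomness enter.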

A Case Study on Text Analysis Using Meal Kit Product Review Data (밀키트 제품 리뷰 데이터를 이용한 텍스트 분석 사례 연구)

  • Choi, Hyeseon;Yeon, Kyupil
    • The Journal of the Korea Contents Association / v.22 no.5 / pp.1-15 / 2022
  • In this study, text analysis was performed on meal kit product review data to identify factors affecting the evaluation of meal kit products. The data used for the analysis were collected by scraping 334,498 reviews of meal kit products from the Naver Shopping site. After preprocessing the text data, word clouds and sentiment analyses based on word frequency and normalized TF-IDF were performed. A logistic regression model was applied to predict the polarity of reviews on meal kit products. From the logistic regression models derived for each product category, the main factors that caused positive and negative emotions were identified. As a result, it was verified that text analysis can be a useful tool that provides a basis for maximizing positive factors for a specific category, menu, and ingredient and removing negative risk factors when developing a meal kit product.
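
The normalized TF-IDF weighting mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact variant, so the smoothed IDF and L2 normalization used here are assumptions, as are the whitespace tokenizer, the toy reviews, and the function name `tfidf`.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document, L2-normalized."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (assumed variant): ln((1 + N) / (1 + df)) + 1.
        weights = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors

reviews = [  # hypothetical reviews, not taken from the study's data
    "fresh ingredients tasty soup",
    "tasty soup fast delivery",
    "late delivery bad packaging",
]
vectors = tfidf(reviews)
for vec in vectors:
    # Show the two highest-weighted terms of each review.
    print(sorted(vec.items(), key=lambda kv: -kv[1])[:2])
```

Terms that appear in fewer reviews receive larger weights, which is why such vectors are a common input to a polarity classifier like logistic regression.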

Study on water quality prediction in water treatment plants using AI techniques (AI 기법을 활용한 정수장 수질예측에 관한 연구)

  • Lee, Seungmin;Kang, Yujin;Song, Jinwoo;Kim, Juhwan;Kim, Hung Soo;Kim, Soojun
    • Journal of Korea Water Resources Association / v.57 no.3 / pp.151-164 / 2024
  • In water treatment plants supplying potable water, the management of chlorine concentration in water treatment processes involving pre-chlorination or intermediate chlorination requires process control. To address this, research has been conducted on water quality prediction techniques utilizing AI technology. This study developed an AI-based predictive model for automating the process control of chlorine disinfection, targeting the prediction of residual chlorine concentration downstream of sedimentation basins in water treatment processes. The AI-based model, which learns from past water quality observation data to predict future water quality, offers a simpler and more efficient approach compared to complex physicochemical and biological water quality models. The model was tested by predicting the residual chlorine concentration downstream of the sedimentation basins at Plant, using multiple regression models and AI-based models such as Random Forest and LSTM, and the results were compared. For optimal prediction of residual chlorine concentration, the input-output structure of the AI model included the residual chlorine concentration upstream of the sedimentation basin, turbidity, pH, water temperature, electrical conductivity, inflow of raw water, alkalinity, NH3, etc. as independent variables, and the desired residual chlorine concentration of the effluent from the sedimentation basin as the dependent variable. The independent variables were selected from observable data at the water treatment plant that influence the residual chlorine concentration downstream of the sedimentation basin. The analysis showed that, for Plant, the model based on Random Forest had the lowest error compared to multiple regression models, neural network models, model trees, and other Random Forest models. The optimal predicted residual chlorine concentration downstream of the sedimentation basin presented in this study is expected to enable real-time control of chlorine dosing in previous treatment stages, thereby enhancing water treatment efficiency and reducing chemical costs.

Diagnosis Atherosclerosis Model Using Radiomics Approach in Carotid Vessel MRI (경동맥 혈관 MRI에서 라디오믹스를 이용한 동맥경화증 진단 모델)

  • Kim, Jong-hun;Park, Hyunjin
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.289-290 / 2022
  • Arteriosclerosis is a disease in which the carotid vessel wall becomes thick, and it is important to monitor the thickness of the vessel wall for diagnosis. In this study, we propose a model that extracts 324 radiomics features from carotid MRI images and diagnoses arteriosclerosis using machine learning techniques. We trained four classification models on the radiomics features: logistic regression, support vector machine, random forest, and XGBoost. The XGBoost model, which performed best in 5-fold cross-validation, achieved an accuracy of 0.9023, sensitivity of 0.9517, specificity of 0.8035, and AUC of 0.8776.
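
The evaluation terms used in the abstract (accuracy, sensitivity, specificity, 5-fold cross-validation) can be made concrete with a short sketch. The labels and predictions below are hypothetical, the interleaved fold assignment is only one of several possible splitting schemes, and the function names are illustrative.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall of class 1), and specificity (recall of class 0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def k_fold_indices(n, k):
    """Yield (train, test) index lists; sample i goes to fold i % k."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]

# Hypothetical labels (1 = diseased) and model predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)

splits = list(k_fold_indices(len(y_true), 5))
print(splits[0])
```

In cross-validation, a model is trained on each `train` list and scored on the matching `test` list, and the per-fold metrics are then averaged.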

Prediction Model of CNC Processing Defects Using Machine Learning (머신러닝을 이용한 CNC 가공 불량 발생 예측 모델)

  • Han, Yong Hee
    • Journal of the Korea Convergence Society / v.13 no.2 / pp.249-255 / 2022
  • This study proposes an analysis framework for real-time prediction of CNC processing defects using machine learning-based models, which have recently attracted attention as processing defect prediction methods, and applies it to CNC machines. The analysis shows that the XGBoost, CatBoost, and LightGBM models achieve the same best accuracy, precision, recall, F1 score, and AUC, and that the LightGBM model has the shortest execution time among them. This short run time has practical advantages such as reducing actual system deployment costs, reducing the probability of CNC machine damage through rapid prediction of defects, and increasing overall CNC machine utilization, confirming that the LightGBM model is the most effective machine learning model for CNC machines with only basic sensors installed. In addition, it was confirmed that classification performance was maximized when an ensemble model consisting of LightGBM, ExtraTrees, k-Nearest Neighbors, and logistic regression models was applied in situations where there are no restrictions on execution time and computing power.
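
One common way to combine several classifiers is soft voting, sketched below together with the F1 score. The abstract does not say how the ensemble was combined, so soft voting is an assumption, and the per-model probabilities, labels, and function names are hypothetical.

```python
def soft_vote(prob_lists, threshold=0.5):
    """Average per-model defect probabilities and threshold the mean."""
    n_models = len(prob_lists)
    mean_probs = [sum(ps) / n_models for ps in zip(*prob_lists)]
    return [int(p >= threshold) for p in mean_probs]

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical defect probabilities from four models for five workpieces.
model_a = [0.9, 0.2, 0.6, 0.4, 0.1]
model_b = [0.8, 0.3, 0.4, 0.7, 0.2]
model_c = [0.7, 0.1, 0.7, 0.3, 0.3]
model_d = [0.6, 0.4, 0.5, 0.2, 0.2]

y_true = [1, 0, 1, 1, 0]   # hypothetical ground truth (1 = defect)
y_pred = soft_vote([model_a, model_b, model_c, model_d])
score = f1_score(y_true, y_pred)
print(y_pred, round(score, 3))
```

Averaging requires every model to score every sample, which is one reason an ensemble costs more execution time than a single model.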

Analysis of suspended sediment mixing in a river confluence using UAV-based hyperspectral imagery (드론기반 초분광 영상을 활용한 하천 합류부 부유사 혼합 분석)

  • Kwon, Siyoon;Seo, Il Won;Lyu, Siwan
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.89-89 / 2022
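  • When a tributary flows into a river confluence, it generates a complex three-dimensional flow structure, which in turn drives active sediment mixing and morphological change. In particular, suspended sediment behavior at river confluences directly affects scour and deposition, changes in river morphology, river ecosystems, and the stability of river structures, so accurate analysis of it is essential for river management and disaster prevention. Existing suspended sediment measurements at river confluences were obtained by conventional sampling and have very low spatiotemporal resolution, which limits the analysis of suspended sediment mixing at confluences from field measurements alone; numerical models have therefore mainly been used to analyze suspended sediment mixing in large rivers. In this study, to analyze suspended sediment behavior at a river confluence with high spatial precision, we present a suspended sediment measurement methodology optimized for river confluences that uses drone-based hyperspectral imagery. To establish the relationship between hyperspectral data measured in the field and suspended sediment concentration, we combined a Random Forest regression model, a machine learning model, with a hyperspectral clustering technique based on a Gaussian Mixture Model, which accurately reflects the characteristics of two rivers with different spectral properties at the confluence. When the methodology was applied to the confluence of the Nakdong River and the Hwang River, hyperspectral clustering clearly distinguished the boundary layer between the two river flows. On this basis, separate regression models were built for the tributary and the main stream, which reproduced suspended sediment behavior in the complex near-field boundary layer of the confluence more accurately. Furthermore, based on the reproduced high-resolution spatial distribution of suspended sediment, we identified how the wake generated where the two strong flows mix at the boundary layer affects suspended sediment mixing, and confirmed that large-scale horizontal vortices in the shear layer at the river confluence have a dominant influence on the suspended sediment mixing pattern.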
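
The two-stage approach in this abstract (cluster the spectra with a Gaussian mixture, then fit a separate regression per cluster) can be sketched in simplified form. Two substitutions are made here and should not be read as the authors' method: a single hypothetical spectral index stands in for the hyperspectral bands, and ordinary least-squares lines stand in for the Random Forest regressors. All data values and function names are illustrative.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(xs, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with EM; return hard labels."""
    mu = [min(xs), max(xs)]          # deterministic initialization
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [weight[k] * gaussian_pdf(x, mu[k], var[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            var[k] = max(spread, 1e-6)   # floor avoids a collapsed component
            weight[k] = nk / len(xs)
    return [0 if r[0] >= r[1] else 1 for r in resp]

def fit_line(xs, ys):
    """Least-squares slope and intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

# Hypothetical spectral index and suspended sediment concentration per pixel.
index = [0.8, 0.9, 1.0, 1.1, 1.2, 4.8, 4.9, 5.0, 5.1, 5.2]
conc = [8.0, 9.0, 10.0, 11.0, 12.0, 19.4, 19.7, 20.0, 20.3, 20.6]

labels = fit_gmm_1d(index)
models = {}
for k in (0, 1):
    xs = [x for x, lab in zip(index, labels) if lab == k]
    ys = [y for y, lab in zip(conc, labels) if lab == k]
    models[k] = fit_line(xs, ys)   # one regression per water mass
print(labels, models)
```

Fitting one model per cluster lets each water mass keep its own relationship between spectrum and concentration, instead of forcing a single curve through both.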

Predicting Forest Fires Using Machine Learning Considering Human Factors (인적요인을 고려한 머신러닝 활용 산림화재 예측)

  • Jin-Myeong Jang;Joo-Chan Kim;Hwa-Joong Kim;Kwang-Tae Kim
    • Journal of Korea Society of Industrial Information Systems / v.28 no.5 / pp.109-126 / 2023
  • Early detection of forest fires is essential in preventing large-scale forest fires. Predicting forest fires serves as a vital early detection method, leading to various related studies. However, many previous studies focused solely on climate and geographic factors, overlooking human factors, which significantly contribute to forest fires. This study aims to develop forest fire prediction models that take into account human, weather, and geographical factors. This study conducted a comparative analysis of four machine learning models alongside the logistic regression model, using forest fire data from Gangwon-do spanning 2003 to 2020. The results indicate that the XGBoost model performed best (AUC=0.925), closely followed by Random Forest (AUC=0.920), both of which are machine learning techniques. Lastly, the study analyzed the relative importance of various factors through permutation feature importance analysis to derive operational insights. While meteorological factors showed a greater impact compared to human factors, various human factors were also found to be significant.
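
Permutation feature importance, as used in this study, measures how much a model's score drops when one feature column is shuffled. The sketch below uses a fixed toy rule in place of a trained model and accuracy in place of AUC; the feature meanings (humidity, visitor count, an unrelated noise column), the data, and the function names are hypothetical.

```python
import random

def accuracy(model, X, y):
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the data with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

def toy_model(row):
    """Stand-in for a trained classifier: fire if dry and many visitors."""
    humidity, visitors, _noise = row
    return int(humidity < 40 and visitors > 50)

# Hypothetical samples: [humidity %, visitor count, noise].
X = [[25, 90, 3], [30, 85, 7], [35, 20, 1], [38, 70, 9],
     [55, 95, 2], [60, 10, 8], [70, 60, 4], [80, 30, 6]]
y = [toy_model(row) for row in X]   # labels generated by the toy rule itself

importances = permutation_importance(toy_model, X, y)
print(importances)
```

A feature the model never reads, such as the noise column here, always scores exactly zero, while features the model depends on show a positive drop.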

Analysis of cycle racing ranking using statistical prediction models (통계적 예측모형을 활용한 경륜 경기 순위 분석)

  • Park, Gahee;Park, Rira;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.30 no.1 / pp.25-39 / 2017
  • Over 5 million people participate in cycle racing betting and its revenue is more than 2 trillion won. This study predicts the ranking of cycle racing using various statistical analyses and identifies important variables which have influence on ranking. We propose competitive ranking prediction models using various classification and regression methods. Our model can predict rankings with low misclassification rates most of the time. We found that the ranking increases as the grade of a racer decreases and as overall scores increase. Inversely, we can observe that the ranking decreases when the grade of a racer increases, race number four is given, and the ranking of the last race of a racer decreases. We also found that prediction accuracy can be improved when we use centered data per race instead of raw data. However, the real profit from the future data was not high when we applied our prediction model because our model can predict only low-return events well.
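
The per-race centering mentioned in the abstract can be sketched as subtracting each race's mean from the values of the racers in that race. The field names (`race`, `score`), the toy values, and the function name are hypothetical.

```python
from collections import defaultdict

def center_per_race(rows, feature):
    """Subtract each race's mean of `feature` from its racers' values."""
    by_race = defaultdict(list)
    for row in rows:
        by_race[row["race"]].append(row[feature])
    means = {race: sum(vals) / len(vals) for race, vals in by_race.items()}
    return [{**row, feature: row[feature] - means[row["race"]]} for row in rows]

# Hypothetical overall scores for three racers in each of two races.
rows = [
    {"race": 1, "racer": "A", "score": 90},
    {"race": 1, "racer": "B", "score": 80},
    {"race": 1, "racer": "C", "score": 70},
    {"race": 2, "racer": "D", "score": 60},
    {"race": 2, "racer": "E", "score": 50},
    {"race": 2, "racer": "F", "score": 40},
]
centered = center_per_race(rows, "score")
print([row["score"] for row in centered])
```

After centering, the strongest racer in a weak field and the strongest racer in a strong field get the same value, so a model sees standing relative to the other racers in the same race rather than the absolute score.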

Study on Detection for Cochlodinium polykrikoides Red Tide using the GOCI image and Machine Learning Technique (GOCI 영상과 기계학습 기법을 이용한 Cochlodinium polykrikoides 적조 탐지 기법 연구)

  • Unuzaya, Enkhjargal;Bak, Su-Ho;Hwang, Do-Hyun;Jeong, Min-Ji;Kim, Na-Kyeong;Yoon, Hong-Joo
    • The Journal of the Korea institute of electronic communication sciences / v.15 no.6 / pp.1089-1098 / 2020
  • In this study, we propose a method to detect Cochlodinium polykrikoides red tide using machine learning and geostationary marine satellite images. To train the machine learning models, GOCI Level 2 data and red tide location data from the National Fisheries Research and Development Institute were used. The machine learning models were a logistic regression model, a decision tree model, and a random forest model. In the performance evaluation, accuracy improved by about 13~22%p (88~98%) compared with the traditional GOCI image-based red tide detection algorithm without machine learning (Son et al., 2012) (75%). In addition, a comparison of detection performance among the machine learning models showed that the random forest model had the highest detection accuracy (98%). This machine learning-based red tide detection algorithm is expected to be useful for detecting red tide early and for tracking and monitoring its movement and spread.

A Study on the Idol Survivability Prediction Using Machine Learning Techniques : Focused on the Industrial Competitiveness (머신러닝 기법을 활용한 아이돌 생존 가능성 예측 연구 : 산업 경쟁력 증진을 중심으로)

  • Kim, Seul-ah;Ahn, Ju Hyuk;Cui, Fuquan
    • The Journal of the Korea Contents Association / v.20 no.5 / pp.291-302 / 2020
  • The Korean popular music industry, which is led by idol groups, has built fandoms all over the world. As a result, idol groups have become not only artists but also some of the most influential figures in the Korean economy. A global idol group with a strong fandom can earn more than a trillion dollars by attracting its global fans' interest in Korea. In other words, it is highly important to bring idols to success. This study attempts to predict whether idols will survive at certain points after their debut using ANN, Decision Tree, and Random Forest models. We set these points at three and eight years after debut, because they are the break-even year and the year after the average contract renewal, respectively. In addition, this study explains which features are most important to survival using feature importance and logistic regression. In conclusion, features such as the number of competing idols, the number of members at debut, and the number of genres are significant. These results shed light on the efficient management of K-Pop idols to improve industrial competitiveness.