• Title/Summary/Keyword: accurate prediction

A Grey Wolf Optimized- Stacked Ensemble Approach for Nitrate Contamination Prediction in Cauvery Delta

  • Kalaivanan K;Vellingiri J
    • Economic and Environmental Geology
    • /
    • v.57 no.3
    • /
    • pp.329-342
    • /
    • 2024
  • The exponential increase in nitrate pollution of river water poses an immediate threat to public health and the environment. This contamination is primarily due to human activities, including the overuse of nitrogenous fertilizers in agriculture and the discharge of nitrate-rich industrial effluents into rivers. As a result, the accurate prediction and identification of contaminated areas has become a crucial and challenging task for researchers. To address these problems, this work predicts nitrate contamination using machine learning approaches. This paper presents a novel Grey Wolf Optimizer (GWO) based Stacked Ensemble approach for predicting nitrate pollution in the Cauvery Delta region of Tamilnadu, India. The proposed method is evaluated using a Cauvery River dataset from the Tamilnadu Pollution Control Board. The proposed method achieves an accuracy of 93.31%, a precision of 93%, a sensitivity of 97.53%, a specificity of 94.28%, an F1-score of 95.23%, and an ROC score of 95%. These results demonstrate the ability of the proposed method to accurately predict nitrate pollution in river water and can ultimately support informed decisions on these critical environmental problems.
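The metrics quoted in this abstract (accuracy, precision, sensitivity, specificity, F1-score) all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # also called recall or true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not the paper's data).
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting sensitivity and specificity alongside accuracy matters when contaminated and clean samples are unevenly represented, since accuracy alone can hide poor performance on the rarer class.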

Development of Medical Cost Prediction Model Based on the Machine Learning Algorithm (머신러닝 알고리즘 기반의 의료비 예측 모델 개발)

  • Han Bi KIM;Dong Hoon HAN
    • Journal of Korea Artificial Intelligence Association
    • /
    • v.1 no.1
    • /
    • pp.11-16
    • /
    • 2023
  • Accurate hospital case modeling and prediction are crucial for efficient healthcare. In this study, we demonstrate the implementation of regression analysis methods in machine learning systems utilizing mathematical statistics and machine learning techniques. The developed machine learning model includes Bayesian linear, artificial neural network, decision tree, decision forest, and linear regression analysis models. Through the application of these algorithms, corresponding regression models were constructed and analyzed. The results suggest the potential of leveraging machine learning systems for medical research. The experiment aimed to create an Azure Machine Learning Studio tool for the speedy evaluation of multiple regression models. The tool facilitates the comparison of 5 types of regression models in a unified experiment and presents assessment results with performance metrics. Evaluation of regression machine learning models highlighted the advantages of boosted decision tree regression and decision forest regression in hospital case prediction. These findings could lay the groundwork for the deliberate development of new directions in medical data processing and decision making. Furthermore, potential avenues for future research may include exploring methods such as clustering, classification, and anomaly detection in healthcare systems.

Predicting Organic Matter content in Korean Soils Using Regression rules on Visible-Near Infrared Diffuse Reflectance Spectra

  • Chun, Hyen-Chung;Hong, Suk-Young;Song, Kwan-Cheol;Kim, Yi-Hyun;Hyun, Byung-Keun;Minasny, Budiman
    • Korean Journal of Soil Science and Fertilizer
    • /
    • v.45 no.4
    • /
    • pp.497-502
    • /
    • 2012
  • This study investigates the prediction of soil organic matter (OM) in Korean soils using Visible-Near Infrared (Vis-NIR) spectroscopy. The ASD Field Spec Pro was used to acquire the reflectance of soil samples in the visible to near-infrared range (350 to 2500 nm). A total of 503 soil samples from 61 Korean soil series were scanned using the instrument, and OM was measured using the Walkley and Black method. For data analysis, the spectra were resampled from 500-2450 nm with 4 nm spacing and converted to the $1^{st}$ derivative of absorbance (log (1/R)). Partial least squares regression (PLSR) and a regression rules model (Cubist) were applied to predict soil OM. The regression rules model estimates the target value by building conditional rules, and each rule contains a linear expression predicting OM from selected absorbance values. The regression rules model was shown to give a better prediction than PLSR. Although the prediction for Andisols had a larger error, soil order was not found to be useful in stratifying the prediction model. The stratification used by Cubist was mainly based on absorbance at wavelengths of 850 and 2320 nm, which correspond to the organic absorption bands. These results showed that there could be more information on soil properties useful to classify or group OM data from Korean soils. In conclusion, this study shows that it is possible to develop a good prediction model of OM for Korean soils and provides data to reexamine the existing prediction models for more accurate prediction.
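The preprocessing step described in this abstract, converting reflectance R to absorbance log(1/R) and then taking the first derivative along the wavelength axis, can be sketched as follows. This is an illustration under stated assumptions: the logarithm is taken as base 10, the derivative is a simple finite difference (the paper does not say which derivative filter was used), and the function names and sample values are hypothetical.

```python
import math


def absorbance(reflectance):
    """Convert reflectance values R (0 < R <= 1) to absorbance log10(1/R)."""
    return [math.log10(1.0 / r) for r in reflectance]


def first_derivative(values, spacing_nm):
    """Finite-difference derivative on an evenly spaced wavelength grid.

    Central differences are used for interior points and one-sided
    differences at the two ends. Requires at least two values.
    """
    n = len(values)
    derivative = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        derivative.append((values[hi] - values[lo]) / ((hi - lo) * spacing_nm))
    return derivative


# Hypothetical reflectance readings on a 4 nm grid (not measured data).
reflectance = [1.0, 0.1, 0.01]
absorb = absorbance(reflectance)
d_absorb = first_derivative(absorb, spacing_nm=4.0)
print(absorb)
print(d_absorb)
```

Working with the derivative rather than raw absorbance removes constant baseline offsets between samples, which is one common reason this transform is applied before regression.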

Purchase Prediction Model using the Support Vector Machine (Support Vector Machine을 이용한 고객구매예측모형)

  • Ahn, Hyun-Chul;Han, In-Goo;Kim, Kyoung-Jae
    • Journal of Intelligence and Information Systems
    • /
    • v.11 no.3
    • /
    • pp.69-81
    • /
    • 2005
  • As competition in business becomes severe, companies are focusing their capacity on customer relationship management (CRM) for survival. One of the important issues in CRM is to build a purchase prediction model, which classifies customers into either purchasing or non-purchasing groups. Until now, various techniques for building purchase prediction models have been proposed. However, they have been criticized because their performance is generally low, or because they require much effort to build and maintain. Thus, in this study, we propose the support vector machine (SVM) as a tool for building a purchase prediction model. The SVM is known as a technique that not only produces accurate prediction results but also enables training with small sample sizes. To validate the usefulness of SVM, we apply it and several comparative techniques to a real-world purchase prediction case. Experimental results show that SVM outperforms all the comparative models, including logistic regression and artificial neural networks.

Non-point Source Pollution Modeling Using AnnAGNPS Model for a Bushland Catchment (AnnAGNPS 모형을 이용한 관목림지의 비점오염 모의)

  • Choi Kyung-Sook
    • Journal of The Korean Society of Agricultural Engineers
    • /
    • v.47 no.4
    • /
    • pp.65-74
    • /
    • 2005
  • The AnnAGNPS model was applied to a catchment mainly occupied by bushland for modeling non-point source pollution. Since the single event model cannot handle events longer than 24 hours in duration, the event-based calibration was carried out using the continuous mode. As event flows affect sediment and nutrient generation and transport, the calibration of the model was performed in three steps: hydrologic, sediment, and nutrient calibrations. The results from the hydrologic calibration for the catchment indicate a good prediction by the model, with an average ARE (Absolute Relative Error) of $24.6\%$ for the runoff volume and $12\%$ for the peak flow. For the sediment calibration, the average ARE was $198.8\%$, indicating acceptable model performance for the sediment prediction. The predicted TN (Total Nitrogen) and TP (Total Phosphorus) were also found to be acceptable, as the average ARE values for TN and TP were $175.5\%$ and $126.5\%$, respectively. The AnnAGNPS model was therefore found to be appropriate for modeling non-point source pollution in bushland catchments. In general, the model was likely to underestimate the larger events and overestimate the smaller events for the water quality predictions. It was also observed that large errors in the hydrologic prediction produced high errors in the sediment and nutrient predictions. This was probably due to error propagation, in which the error in the hydrologic prediction influenced the error in the water quality prediction. Accurate hydrologic calibration should hence be obtained for a reliable water quality prediction.
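The abstract reports calibration quality as an average ARE (Absolute Relative Error) but does not give the formula. A common definition, assumed here, is the absolute difference between predicted and observed values divided by the observed value, expressed as a percentage and averaged over events. The function names and the event values below are hypothetical.

```python
def absolute_relative_error(observed, predicted):
    """ARE in percent for one event, assuming ARE = |pred - obs| / |obs| * 100."""
    return abs(predicted - observed) / abs(observed) * 100.0


def mean_are(observed_values, predicted_values):
    """Average ARE over a set of events."""
    errors = [
        absolute_relative_error(o, p)
        for o, p in zip(observed_values, predicted_values)
    ]
    return sum(errors) / len(errors)


# Hypothetical runoff volumes for two events (not the paper's data).
observed_runoff = [100.0, 50.0]
predicted_runoff = [120.0, 40.0]
average_error = mean_are(observed_runoff, predicted_runoff)
print(f"average ARE: {average_error:.1f}%")
```

Because the error is taken in absolute value, an overestimate on one event and an underestimate on another do not cancel, so an average ARE above 100% (as reported for sediment, TN, and TP) means predictions were off by more than the observed magnitude on average.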

Prediction of the Movement Directions of Index and Stock Prices Using Extreme Gradient Boosting (익스트림 그라디언트 부스팅을 이용한 지수/주가 이동 방향 예측)

  • Kim, HyoungDo
    • The Journal of the Korea Contents Association
    • /
    • v.18 no.9
    • /
    • pp.623-632
    • /
    • 2018
  • Both investors and researchers are attentive to the prediction of stock price movement directions, since accurate prediction plays an important role in strategic decision making on stock trading. Taken together, previous studies show that different factors are considered depending on the stock market and the prediction period. This paper aims to analyze which data mining techniques show better performance with some representative index and stock price datasets in the Korean stock market. In particular, the extreme gradient boosting technique, which has proved itself a front-runner in recent open competitions, is applied to the prediction problem. Its performance is analyzed in comparison with other data mining techniques reported to perform well in predicting stock price movement directions, such as random forests, support vector machines, and artificial neural networks. Through experiments with 12 years of index/price datasets, the gradient boosting technique is identified as the best at predicting the movement directions after 1 to 4 days, with partial equivalence to the other techniques in a few cases.

A Study on the Emission Characteristics and Prediction of VOCs (Volatile Organic Compounds) using Small Chamber Method (소형챔버법을 이용한 휘발성유기화합물(VOCs) 방출특성 및 예측에 관한 연구)

  • Pang, Seung-Ki;Sohn, Jang-Yeul;Lee, Kwang-Ho
    • KIEAE Journal
    • /
    • v.4 no.4
    • /
    • pp.11-18
    • /
    • 2004
  • In this study, a measurement system was developed for measuring pollutants emitted from building materials, and specimens were made with concrete, gypsum board, mortar, and wallpaper. The VOC and TVOC concentrations and Emission Factors were assessed as functions of time, and the following conclusions were drawn. (1) For the TVOC concentration decrease of specimen 7, in which wallpaper was attached to concrete, the graph may become linear when the y-axis values are converted to a log scale, and the prediction equation can be expressed as $y=34906 \times e^{-0.0093 \times \text{time}}$. Moreover, the chi-square value was 0.83, which is relatively high, indicating that the TVOC concentration can be properly predicted if the same materials are used indoors. (2) For the VOCs Emission Factor decrease of specimen 7, the prediction equation can be expressed as $EF=15111 \times e^{-0.0093 \times \text{time}}$, and the chi-square value was 0.83. (3) For the TVOC concentration decrease of specimen 7, the prediction equation can be taken to be $y=254323 \times (1-e^{-0.1046 \times \text{time}})$, and the chi-square value was 0.994, which is significantly high, indicating that the indoor TVOC concentration can be properly predicted if the same materials are used indoors. Furthermore, predicting the concentration decrease from the cumulative value of the hourly measured concentrations is considered more accurate than using the hourly measured values directly. (4) When the Emission Factor decrease was predicted from cumulative hourly Emission Factor data, the chi-square value was higher than when hourly Emission Factor data were used directly. Therefore, predicting the Emission Factor from cumulative hourly data can provide a more reliable prediction equation than using the hourly data directly.
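Conclusion (1) notes that an exponential decay becomes linear once the y-axis is log-transformed: if $y = A e^{-k \cdot \text{time}}$, then $\ln y = \ln A - k \cdot \text{time}$. The sketch below fits A and k by ordinary least squares on the log-transformed values. It is an illustration only: the paper does not state its fitting procedure, the function name is hypothetical, and the data are synthetic values generated from the abstract's equation rather than chamber measurements.

```python
import math


def fit_exponential_decay(times, concentrations):
    """Fit y = A * exp(-k * t) by linear least squares on ln(y).

    Returns (A, k). All concentrations must be positive.
    """
    logs = [math.log(c) for c in concentrations]
    n = len(times)
    mean_t = sum(times) / n
    mean_log = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_log) for t, v in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_log - slope * mean_t
    return math.exp(intercept), -slope


# Synthetic, noise-free data generated from y = 34906 * exp(-0.0093 * t).
sample_times = [0.0, 24.0, 48.0, 72.0, 96.0]
sample_conc = [34906.0 * math.exp(-0.0093 * t) for t in sample_times]

A_fit, k_fit = fit_exponential_decay(sample_times, sample_conc)
print(f"A = {A_fit:.1f}, k = {k_fit:.4f}")
```

With real, noisy measurements the log transform also changes how errors are weighted, so a fit on ln(y) and a direct nonlinear fit on y can give somewhat different coefficients.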

A Study of Improvement of a Prediction Accuracy about Wind Resources based on Training Period of Bayesian Kalman Filter Technique (베이지안 칼만 필터 기법의 훈련 기간에 따른 풍력 자원 예측 정확도 향상성 연구)

  • Lee, Soon-Hwan
    • Journal of the Korean earth science society
    • /
    • v.38 no.1
    • /
    • pp.11-23
    • /
    • 2017
  • The short-term predictability of wind resources is an important factor in evaluating the economic feasibility of a wind power plant. As a method of improving the predictability, a Bayesian Kalman filter is applied as postprocessing of the model data. For this, a statistical training period is needed to evaluate the correlation between the model estimates and the observation data. This study quantitatively analyzes the prediction characteristics for several different Kalman training periods. The prediction of temperature and wind speed with a 3-day short-term Bayesian Kalman training period in the Taebaek area is more reasonable than that obtained with the other training periods. In contrast, a good prediction result may be obtained at Ieodo when a training period of more than six days is applied. The prediction performance of the Bayesian Kalman filter is clearly improved in cases where the prediction performance of the Weather Research Forecast (WRF) model is poor. On the other hand, the improvement is weak at points where the WRF prediction is already accurate.
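One simple way to picture Kalman-filter postprocessing over a training period is a scalar filter that tracks the systematic bias between model forecasts and observations, then adds that bias to the next forecast. The sketch below is an assumption-laden illustration and not the paper's formulation: the random-walk bias model, the noise variances, the function name, and all numbers are hypothetical.

```python
def kalman_bias_estimate(forecasts, observations, process_var=0.01, obs_var=1.0):
    """Estimate forecast bias (observation minus forecast) with a scalar Kalman filter.

    The bias is modeled as a random walk; each training day contributes
    one predict/update cycle.
    """
    bias, variance = 0.0, 1.0  # assumed prior for the bias
    for forecast, observation in zip(forecasts, observations):
        variance += process_var                # predict: uncertainty grows
        gain = variance / (variance + obs_var)
        innovation = (observation - forecast) - bias
        bias += gain * innovation              # update with the new error
        variance *= 1.0 - gain
    return bias


# Hypothetical 3-day training window of wind speeds in m/s (not real data).
forecast_wind = [5.0, 6.0, 5.5]
observed_wind = [7.0, 8.0, 7.5]
bias_3day = kalman_bias_estimate(forecast_wind, observed_wind)

# Apply the learned bias to a new raw forecast.
corrected_forecast = 6.2 + bias_3day
print(f"bias after 3 days: {bias_3day:.3f}, corrected: {corrected_forecast:.3f}")
```

In this toy setting a longer training window lets the estimate settle closer to the true bias, while a shorter window reacts faster to recent changes, which is the kind of trade-off the abstract examines across sites.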

Improving the Accuracy of a Heliocentric Potential (HCP) Prediction Model for the Aviation Radiation Dose

  • Hwang, Junga;Yoon, Kyoung-Won;Jo, Gyeongbok;Noh, Sung-Jun
    • Journal of Astronomy and Space Sciences
    • /
    • v.33 no.4
    • /
    • pp.279-285
    • /
    • 2016
  • The space radiation dose over air routes including polar routes should be carefully considered, especially when space weather shows sudden disturbances such as coronal mass ejections (CMEs), flares, and accompanying solar energetic particle events. We recently established a heliocentric potential (HCP) prediction model for real-time operation of the CARI-6 and CARI-6M programs. Specifically, the HCP value is used as a critical input value in the CARI-6/6M programs, which estimate the aviation route dose based on the effective dose rate. The CARI-6/6M approach is the most widely used technique, and the programs can be obtained from the U.S. Federal Aviation Administration (FAA). However, HCP values are given at a one-month delay on the FAA official webpage, which makes it difficult to obtain real-time information on the aviation route dose. In order to overcome this critical limitation regarding the time delay for space weather customers, we developed an HCP prediction model based on sunspot number variations (Hwang et al. 2015). In this paper, we focus on improvements to our HCP prediction model and update it with neutron monitoring data. We found that the most accurate method to derive the HCP value involves (1) real-time daily sunspot assessments, (2) predictions of the daily HCP by our prediction algorithm, and (3) calculations of the resultant daily effective dose rate. We also derived the HCP prediction algorithm in this paper by using ground neutron counts. With the compensation stemming from the use of ground neutron count data, the newly developed HCP prediction model was improved.

Experimental Study on Long-Term Prediction of Rebar Price Using Deep Learning Recursive Prediction Method (딥러닝의 반복적 예측방법을 활용한 철근 가격 장기예측에 관한 실험적 연구)

  • Lee, Yong-Seong;Kim, Kyung-Hwan
    • Korean Journal of Construction Engineering and Management
    • /
    • v.22 no.3
    • /
    • pp.21-30
    • /
    • 2021
  • This study proposes a 5-month rebar price prediction method using the recursive prediction method of deep learning. This approach predicts a long-term point in time by repeatedly predicting all the characteristics of the input data, adding them to the original data, and then predicting the next point in time. The average prediction accuracy of the rebar prices for one to five months is approximately 97.24% with the method presented in this study. The proposed method is expected to enable more accurate cost planning than the existing method by adding systematic rigor to price estimation that relies on human experience and judgment. In addition, the method presented in this study is expected to be usable in studies that predict long-term prices from time series data, including building materials other than rebar.
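The recursive scheme described in this abstract (predict every input feature one step ahead, append the prediction to the data, and repeat) can be sketched independently of the underlying model. In the sketch below the one-step predictor is a naive linear extrapolation used purely as a stand-in for the paper's deep learning model; the function names and the sample data are hypothetical.

```python
def recursive_forecast(history, one_step_model, horizon):
    """Forecast `horizon` steps ahead by feeding each prediction back as input.

    history: list of feature rows (each a list of floats), oldest first.
    one_step_model: callable mapping the data so far to the next feature row.
    """
    data = [list(row) for row in history]  # copy so the caller's data is untouched
    forecasts = []
    for _ in range(horizon):
        next_row = one_step_model(data)    # predict all features one step ahead
        forecasts.append(next_row)
        data.append(next_row)              # the prediction becomes new input
    return forecasts


def naive_trend_model(data):
    """Stand-in predictor: extend the last observed change of every feature."""
    last, previous = data[-1], data[-2]
    return [2 * a - b for a, b in zip(last, previous)]


# Hypothetical monthly rows of [price, auxiliary feature] (not real data).
history = [[100.0, 1.0], [102.0, 1.5], [104.0, 2.0]]
forecast = recursive_forecast(history, naive_trend_model, horizon=5)
for month, row in enumerate(forecast, start=1):
    print(month, row)
```

Because every step consumes earlier predictions, errors can compound over the horizon, which is why the model must predict all input features and not only the target price.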