• Title/Summary/Keyword: Neural network prediction


A Hybrid Approach for Rainfall-Runoff Prediction in Yongdam Dam Basin in Korea (용담댐 유역의 강우-유출 예측을 위한 하이브리드 접근법)

  • Yeoung Rok Oh;Kyung Soo Jun
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.70-70 / 2023
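  • Accurately predicting the inflow from the upstream basin into Yongdam Dam during rainfall is essential for operating the dam appropriately and minimizing flood damage downstream. Physically based rainfall-runoff simulation models, which build on an understanding of physical processes, are widely used in flood forecasting. However, because it is nearly impossible to understand complex physical processes completely, such models are limited by the need to simplify those processes under various assumptions. Recently, with the accumulation of vast amounts of data and improvements in computing power, data-driven models have become powerful tools for solving practical problems and are widely used for simulation and prediction. However, as the forecast lead time increases, the correlation between the past data used as input and the future data used as output weakens, and model performance degrades. This study therefore proposes a hybrid approach that combines a physically based rainfall-runoff model with an error-correction model to predict hourly inflow to Yongdam Dam. The HEC-HMS model was used as the physically based rainfall-runoff model, and an artificial neural network (ANN), a machine learning model, was used as the error-correction model. To compare the performance of the HEC-HMS, ANN, and hybrid (HEC-HMS + ANN) models, 20 flood events were used for model building and validation. The hybrid model outperformed both the HEC-HMS and ANN models as the lead time increased. Integrating a machine-learning error-correction procedure into the physical model improved the accuracy of flood runoff prediction. The comparison shows that the hybrid model applied in this study outperforms both the physically based rainfall-runoff model and the pure machine learning model, indicating that a hybrid model can compensate for the shortcomings of each. The main purpose of this study is to provide a deeper understanding of error-correction techniques for rainfall-runoff simulation models.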

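The error-correction idea in the abstract above (a physical model forecast plus a learned estimate of its error) can be sketched as follows. This is only an illustration under stated assumptions: the study uses an ANN as the error-correction model, whereas this sketch swaps in a simple least-squares AR(1) fit so that it stays self-contained. The function names and the inflow values are hypothetical and do not come from the paper.

```python
def fit_error_model(errors):
    """Least-squares fit of e[t+1] = a * e[t] + b on past simulation errors.

    Stand-in for the ANN error-correction model described in the abstract.
    """
    x, y = errors[:-1], errors[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx if sxx else 0.0
    return a, mean_y - a * mean_x


def hybrid_forecast(simulated_next, last_error, coeffs):
    """Physical-model forecast plus the predicted error for the next step."""
    a, b = coeffs
    return simulated_next + a * last_error + b


# Hypothetical hourly inflows (m^3/s): observations vs. physical-model output.
observed = [110.0, 127.0, 145.5, 164.75]
simulated = [100.0, 120.0, 140.0, 160.0]
errors = [o - s for o, s in zip(observed, simulated)]

coeffs = fit_error_model(errors)
print(hybrid_forecast(180.0, errors[-1], coeffs))
```

The design point is that the corrector learns only the residual of the physical model, so the physical model still carries the forecast when the error signal is weak.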

Prediction of spring precipitation in the Geum River basin using global climate indices and artificial neural network model (글로벌 기후지수와 인공신경망모형을 이용한 금강권역의 봄철 강수량 예측)

  • Chul-Gyum Kim;Jeongwoo Lee;Hyeonjun Kim
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.292-292 / 2023
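  • In this study, a statistical model based on an artificial neural network was built to predict spring (March to May) precipitation in the Geum River basin. A total of 44 candidate predictors were used: 36 climate indices such as AAO, AMM, and AO provided by NOAA and other sources, and 8 meteorological factors such as precipitation and temperature for the Geum River basin. Correlations were analyzed for lead times of 1 to 18 months relative to the target period, and the 10 most highly correlated indices were selected as predictors. The prediction model was an artificial neural network with 10 inputs and one hidden layer. To minimize uncertainty in model construction and improve goodness of fit, random cross-validation was repeated: from the 40 years of data preceding the target period, 20 years were randomly selected to build the model and the remaining years were used for validation, and the 1,000 best-fitting models were selected for each target period and forecast issue time. For the historical period (1991-2022), 1,000 predictions were produced for each year at each forecast issue time, and predictability was analyzed by comparing them with the observation for that year. Predictability was evaluated by the probability that the observation fell between the minimum and maximum predictions, the probability that it fell within the 25%-75% range of the predictions, and the prediction probability based on the terciles of past observations. The probability that the observation fell within the prediction range averaged 87.5%, the probability that it fell within the 25%-75% range was 30.2%, and the tercile prediction probability was 35.6%. Although one-to-one comparison with observations shows low accuracy, a tercile prediction probability above 33.3% suggests that the model has predictive skill. However, because of the irregularity of precipitation in Korea and the nature of statistical models, patterns not observed in the past are difficult to predict, and the prediction for a particular year sometimes deviates substantially from the observation.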

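The three checks described in the abstract above (observation inside the min-max range of the ensemble, inside its 25%-75% range, and a tercile comparison against past observations) might be computed per year roughly as below. This is a sketch under assumptions: the abstract does not define how the tercile forecast is derived from the 1,000 predictions, so the most frequently predicted tercile is used here as a hypothetical rule, and all names are illustrative.

```python
import statistics


def evaluate_ensemble(members, observed, climatology):
    """Score one year's ensemble of predictions against the observation."""
    # Quartiles of the ensemble and tercile boundaries of past observations.
    q1, _, q3 = statistics.quantiles(members, n=4)
    t1, t2 = statistics.quantiles(climatology, n=3)

    def tercile(value):
        # 0 = below normal, 1 = near normal, 2 = above normal
        return 0 if value < t1 else (1 if value <= t2 else 2)

    # Assumption: the forecast category is the tercile most members fall in.
    votes = [tercile(m) for m in members]
    predicted = max(range(3), key=votes.count)
    return {
        "in_range": min(members) <= observed <= max(members),
        "in_iqr": q1 <= observed <= q3,
        "tercile_hit": predicted == tercile(observed),
    }
```

Averaging each flag over all evaluated years would give percentages comparable in kind to the 87.5%, 30.2%, and 35.6% figures reported, with 33.3% being the chance level for three equally likely categories.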

Futures Price Prediction based on News Articles using LDA and LSTM (LDA와 LSTM를 응용한 뉴스 기사 기반 선물가격 예측)

  • Jin-Hyeon Joo;Keun-Deok Park
    • Journal of Industrial Convergence / v.21 no.1 / pp.167-173 / 2023
  • Research has been published on predicting future data using regression analysis or artificial intelligence as a method of analyzing economic indicators. In this study, we designed a system that predicts futures prices using artificial intelligence fed with topic probability data obtained from past news articles through topic modeling. Topic probability distribution data for each news article were obtained using Latent Dirichlet Allocation (LDA), a method that extracts the topics of a document from past news articles via unsupervised learning. The topic probability distribution data were then used as the input to a Long Short-Term Memory (LSTM) network, a variant of the Recurrent Neural Network (RNN), to predict futures prices. The proposed method was able to predict the trend of futures prices, and it may later be extended to predict price trends for derivative products such as options. However, because statistical errors occurred for certain data, further research is required to improve accuracy.

An EEG-fNIRS Hybridization Technique in the Multi-class Classification of Alzheimer's Disease Facilitated by Machine Learning (기계학습 기반 알츠하이머성 치매의 다중 분류에서 EEG-fNIRS 혼성화 기법)

  • Ho, Thi Kieu Khanh;Kim, Inki;Jeon, Younghoon;Song, Jong-In;Gwak, Jeonghwan
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.305-307 / 2021
  • Alzheimer's Disease (AD) is a cognitive disorder characterized by memory impairment that can be assessed at early stages by administering clinical tests. However, the pathophysiological mechanism of AD is still poorly understood because different levels of AD severity are difficult to distinguish, even with a variety of brain imaging modalities. Therefore, in this study, we present a hybrid EEG-fNIRS approach in which the two modalities compensate for each other's weaknesses, using Machine Learning (ML) techniques to classify four subject groups: healthy controls (HC) and three distinguishable groups of AD severity. A concurrent EEG-fNIRS setup was used to record data from 41 subjects during Oddball and 1-back tasks. We employed a traditional neural network (NN) for fNIRS and a CNN-LSTM hybrid model for EEG. The final prediction was then obtained by majority voting over those models. Classification results indicated that the hybrid EEG-fNIRS feature set, by combining complementary properties, achieved a higher accuracy (71.4%) than EEG alone (67.9%) or fNIRS alone (68.9%). These findings demonstrate the potential of an EEG-fNIRS hybridization technique coupled with ML-based approaches for further AD studies.

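The majority-voting step mentioned in the abstract above can be illustrated with a few lines of code. This is a generic sketch, not the authors' implementation: the class labels and the tie-breaking rule (prefer the earliest model's vote) are assumptions made for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most models.

    Ties are broken in favor of the label that appears first in the list
    (an assumption; the abstract does not state a tie-breaking rule).
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


# Hypothetical per-model outputs for one subject.
print(majority_vote(["HC", "AD-mild", "HC"]))
```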

Enhancing Recommender Systems by Fusing Diverse Information Sources through Data Transformation and Feature Selection

  • Thi-Linh Ho;Anh-Cuong Le;Dinh-Hong Vu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.5 / pp.1413-1432 / 2023
  • Recommender systems aim to recommend items to users by taking into account their probable interests. This study focuses on creating a model that utilizes multiple sources of information about users and items by employing a multimodality approach. The study addresses how to gather information from different sources (modalities) and transform it into a uniform format, resulting in a multi-modal feature description for users and items. This work also aims to transform and represent the features extracted from different modalities so that the information is in a compatible format for integration and contains important, useful information for the prediction model. To achieve this goal, we propose a novel multi-modal recommendation model, which extracts latent features of users and items from a utility matrix using matrix factorization techniques. Various transformation techniques are used to extract features from other sources of information such as user reviews, item descriptions, and item categories. We also propose the use of Principal Component Analysis (PCA) and feature selection techniques to reduce the data dimension, extract important features, and remove noisy features in order to increase the accuracy of the model. We ran experiments with several models built on different subsets of modalities, using the MovieLens and Amazon sub-category datasets. According to the experimental results, the proposed model significantly enhances the accuracy of recommendations when compared to SVD, which is acknowledged as one of the most effective models for recommender systems. Specifically, the proposed model reduces the RMSE by 4.8% to 21.43% and increases the Precision by 2.07% to 26.49% on the Amazon datasets. Similarly, on the MovieLens dataset, the proposed model reduces the RMSE by 45.61% and increases the Precision by 14.06%. Additionally, the experimental results on both datasets demonstrate that combining information from multiple modalities in the proposed model leads to better outcomes than relying on a single type of information.

Effective Drought Prediction Based on Machine Learning (머신러닝 기반 효과적인 가뭄예측)

  • Kim, Kyosik;Yoo, Jae Hwan;Kim, Byunghyun;Han, Kun-Yeun
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.326-326 / 2021
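  • Many researchers have made technical and academic attempts to predict droughts, which develop over long periods and wide areas. Among the methods for forecasting drought, which has complex time-series behavior, this study produced future drought outlooks using both scenario-based methods and non-scenario-based methods that predict drought in real time. As a scenario-based method, the 2009 PDSI (Palmer Drought Severity Index) was computed from three-month GCM (General Circulation Model) forecasts to make short-term predictions of drought severity. Non-scenario-based drought prediction used statistical methods and deterministic numerical methods based on physical models. To overcome the forecasting limitations of the ARIMA (Autoregressive Integrated Moving Average) model, a representative statistical approach to drought prediction, support vector regression (SVR) and wavelet neural networks were used to estimate the SPI. The optimal model structure was selected using RMSE (root mean square error), MAE (mean absolute error), and R (correlation coefficient), and drought was forecast with lead times of 1 to 6 months. In addition, Markov chain and log-linear models were applied to the SPI to verify the accuracy of SPI-based drought prediction, and a neuro-fuzzy model was applied to the Anatolia region of Turkey to predict drought from monthly mean precipitation and SPI for 1964-2006. As drought frequency and patterns change irregularly and regional disparities in precipitation deepen, the need for more accurate drought prediction is growing. This study aims to develop an improved prediction model by applying monthly and daily SPEI (Standardized Precipitation Evapotranspiration Index), which indicates the severity of meteorological drought, to machine learning models in order to capture complex, nonlinear drought patterns.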

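The three selection criteria named in the abstract above (RMSE, MAE, and the correlation coefficient R) follow standard definitions and can be computed as below. The function name and the sample index values are hypothetical; the abstract does not say how the three scores were weighed against each other.

```python
import math


def forecast_scores(observed, predicted):
    """Return (RMSE, MAE, Pearson R) for paired observations and forecasts."""
    n = len(observed)
    diffs = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    mae = sum(abs(d) for d in diffs) / n

    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    # R is undefined when either series is constant; return 0.0 in that case.
    r = cov / math.sqrt(var_o * var_p) if var_o and var_p else 0.0
    return rmse, mae, r


# Hypothetical drought-index values for three months.
print(forecast_scores([-1.2, -0.4, 0.3], [-1.0, -0.5, 0.1]))
```

Note that R alone can look perfect for a forecast with a constant bias, which is one reason to report it alongside RMSE and MAE.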

Demand Forecast For Empty Containers Using MLP (MLP를 이용한 공컨테이너 수요예측)

  • DongYun Kim;SunHo Bang;Jiyoung Jang;KwangSup Shin
    • The Journal of Bigdata / v.6 no.2 / pp.85-98 / 2021
  • The COVID-19 pandemic further aggravated the imbalance in container import and export volumes among countries, which worsened the shortage of empty containers. Because stable and efficient port operation depends on securing enough empty containers to meet demand, various techniques for predicting demand for empty containers have been studied. However, these studies relied on long-term monthly or annual forecasts rather than demand forecasts that ports and shipping companies could use directly. In this study, a daily and weekly prediction method using an artificial neural network is presented. Specifically, demand forecasting models were developed using a multi-layer perceptron and a multiple linear regression model. To overcome the limited amount of data, the data were processed to reflect the business process by which a fully loaded container is converted into an empty container. The numerical experiments yielded a practically applicable forecasting model, although its accuracy was not perfect.

Predictive modeling algorithms for liver metastasis in colorectal cancer: A systematic review of the current literature

  • Isaac Seow-En;Ye Xin Koh;Yun Zhao;Boon Hwee Ang;Ivan En-Howe Tan;Aik Yong Chok;Emile John Kwong Wei Tan;Marianne Kit Har Au
    • Annals of Hepato-Biliary-Pancreatic Surgery / v.28 no.1 / pp.14-24 / 2024
  • This study aims to assess the quality and performance of predictive models for colorectal cancer liver metastasis (CRCLM). A systematic review was performed to identify relevant studies from various databases. Studies that described or validated predictive models for CRCLM were included. The methodological quality of the predictive models was assessed. Model performance was evaluated by the reported area under the receiver operating characteristic curve (AUC). Of the 117 articles screened, seven studies comprising 14 predictive models were included. The distribution of included predictive models was as follows: radiomics (n = 3), logistic regression (n = 3), Cox regression (n = 2), nomogram (n = 3), support vector machine (SVM, n = 2), random forest (n = 2), and convolutional neural network (CNN, n = 2). Age, sex, carcinoembryonic antigen, and tumor staging (T and N stage) were the most frequently used clinicopathological predictors for CRCLM. The mean AUCs ranged from 0.697 to 0.870, with 86% of the models demonstrating clear discriminative ability (AUC > 0.70). A hybrid approach combining clinical and radiomic features with SVM provided the best performance, achieving an AUC of 0.870. The overall risk of bias was identified as high in 71% of the included studies. This review highlights the potential of predictive modeling to accurately predict the occurrence of CRCLM. Integrating clinicopathological and radiomic features with machine learning algorithms demonstrates superior predictive capabilities.
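The review above compares models by AUC. As a reminder of what that number measures, here is a small sketch that computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (ties count as one half). The labels and scores are made up for illustration and are unrelated to the reviewed studies.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for cases with the outcome, 0 for cases without it.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks for four patients.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to chance-level ranking, which is why the review treats values above 0.70 as evidence of discriminative ability.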

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review / v.16 no.3 / pp.161-177 / 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive because domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural network (ANN), and multiclass support vector machine (MSVM), have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Likewise, the results of studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest the genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate biological evolution. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results. In particular, the mutation operator helps prevent GA from falling into local optima, so a globally optimal or near-optimal solution can be found. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, and their credit ratings. Using various statistical methods including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e., credit rating, was labeled as four classes: 1 (A1); 2 (A2); 3 (A3); 4 (B and C). Eighty percent of the data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross-validation to our dataset. In order to examine the competitiveness of the proposed model, we also experimented with several comparative models including MDA, MLOGIT, CBR, ANN, and MSVM. For MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source package, and Evolver 5.5, a commercial package that supports GA. The other comparative models were run using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the competing models. In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, namely X7 (total debt), X9 (sales per employee), X13 (years after founded), X15 (accumulated earning to total asset), and X39 (the index related to the cash flows from operating activity), were found to be the most important factors in predicting corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same across the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
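The McNemar test used in the abstract above compares two classifiers on the same validation cases by looking only at the cases where they disagree. A minimal sketch of the continuity-corrected chi-square version is shown below; the abstract does not say which variant the authors used, and the disagreement counts in the example are hypothetical.

```python
import math


def mcnemar(b, c):
    """Continuity-corrected McNemar test.

    b: cases model A classified correctly and model B incorrectly.
    c: cases model B classified correctly and model A incorrectly.
    Returns (chi-square statistic, p-value) with 1 degree of freedom.
    """
    if b + c == 0:
        return 0.0, 1.0
    stat = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical disagreement counts between two models.
stat, p = mcnemar(15, 5)
print(stat, p)
```

A p-value below 0.05 or 0.01 would correspond to the 5% and 1% significance levels quoted in the abstract.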

VKOSPI Forecasting and Option Trading Application Using SVM (SVM을 이용한 VKOSPI 일 중 변화 예측과 실제 옵션 매매에의 적용)

  • Ra, Yun Seon;Choi, Heung Sik;Kim, Sun Woong
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.177-192 / 2016
  • Machine learning is a field of artificial intelligence. It refers to an area of computer science concerned with giving machines the ability to perform their own data analysis, decision making, and forecasting. One representative machine learning model is the artificial neural network, a statistical learning algorithm inspired by biological neural networks. Other machine learning models include the decision tree, naive Bayes, and SVM (support vector machine) models. Among these, we use the SVM model in this study because it is mainly used for classification and regression analysis, which suits our problem. The core principle of SVM is to find a reasonable hyperplane that separates different groups in the data space. Given information about the data in any two groups, the SVM model judges which group new data belongs to based on the hyperplane obtained from the given data set. Thus, the more meaningful data there is, the better the model can learn. In recent years, many financial experts have focused on machine learning, seeing the possibility of combining it with the financial field, where vast amounts of financial data exist. Machine learning techniques have proved powerful in describing non-stationary and chaotic stock price dynamics, and many studies have successfully forecast stock prices using machine learning algorithms. Recently, financial companies have begun to provide Robo-Advisor services (a compound of "robot" and "advisor"), which perform various financial tasks through advanced algorithms using huge amounts of rapidly changing data. A Robo-Advisor's main tasks are to advise investors according to their personal investment propensity and to manage their portfolios automatically. In this study, we propose a method of forecasting the Korean volatility index, VKOSPI, using the SVM model and applying it to real option trading to increase trading performance. VKOSPI is a measure of the future volatility of the KOSPI 200 index based on KOSPI 200 index option prices. It is similar to the VIX index in the United States, which is based on S&P 500 option prices. The Korea Exchange (KRX) calculates and announces the VKOSPI index in real time. Like ordinary volatility, VKOSPI affects option prices. VKOSPI and option prices move in the same direction regardless of the option type (call and put options with various strike prices). If volatility increases, the premiums of all call and put options increase because the probability of exercise increases. An investor can track in real time how much the option price rises for a given rise in volatility through Vega, the Black-Scholes measure of an option's sensitivity to changes in volatility. Therefore, accurate forecasting of VKOSPI movements is one of the important factors that can generate profit in option trading. In this study, we verified with real option data that an accurate forecast of VKOSPI can generate a large profit in real option trading. To the best of our knowledge, there have been no studies that predict the direction of VKOSPI using machine learning and apply the prediction to actual option trading. We predicted daily VKOSPI changes with the SVM model and then took an intraday option strangle position, which profits as option prices fall, only when VKOSPI was expected to decline during the day. We analyzed the results and tested whether trading based on the SVM's prediction is applicable to real option trading. The results showed that the prediction accuracy for VKOSPI was 57.83% on average, and the number of position entries was 43.2, which is less than half of the benchmark (100). A small number of trades is an indicator of trading efficiency. In addition, the experiment showed that the trading performance was significantly higher than the benchmark.
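The abstract above refers to Vega, the Black-Scholes sensitivity of an option's price to volatility. For a European option on a non-dividend-paying underlying, Vega is S * phi(d1) * sqrt(T) and is identical for calls and puts, which matches the statement that both premiums rise with volatility. The sketch below uses that textbook formula; the input values are hypothetical, and the no-dividend assumption is a simplification that may not match how KOSPI 200 options are priced in practice.

```python
import math


def black_scholes_vega(spot, strike, rate, vol, years):
    """Vega of a European call or put (no dividends assumed).

    Returns the price change per 1.00 change in volatility;
    divide by 100 for the change per volatility point.
    """
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (
        vol * math.sqrt(years)
    )
    # Standard normal density evaluated at d1.
    pdf_d1 = math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)
    return spot * pdf_d1 * math.sqrt(years)


# Hypothetical at-the-money option: index 100, 20% volatility, one year.
print(black_scholes_vega(100.0, 100.0, 0.0, 0.2, 1.0))
```

Because Vega is always positive, a position that is short options (such as the strangle described above, which profits as option prices fall) benefits when volatility declines.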