• Title/Summary/Keyword: Statistical Forecasting

A hidden Markov model for long term drought forecasting in South Korea

  • Chen, Si;Shin, Ji-Yae;Kim, Tae-Woong
    • Proceedings of the Korea Water Resources Association Conference / 2015.05a / pp.225-225 / 2015
  • Drought events usually evolve slowly and their impacts generally span a long period, which indicates that drought sequences are not completely random. The Hidden Markov Model (HMM) is a probabilistic model that represents dependences between unobserved hidden states, which in turn generate the observations. Drought characteristics depend on the underlying generating mechanism, which the HMM can model well. This study employed an HMM with Gaussian emissions to fit the Standardized Precipitation Index (SPI) series and to make multi-step predictions of future drought characteristics. To estimate the HMM parameters, we used a Bayesian model computed via Markov Chain Monte Carlo (MCMC). Because the true number of hidden states is unknown, we fit the model with varying numbers of hidden states and used reversible jump MCMC to allow transdimensional moves between models with different numbers of states. We applied the HMM to SPI data from several stations in South Korea. The monthly SPI data from January 1973 to December 2012 were divided into two parts: the first 30 years (January 1973 to December 2002) were used for model calibration and the last 10 years (January 2003 to December 2012) for model validation. All SPI data were preprocessed with wavelet denoising and used as the visible output of the HMM. Forecasting performance at different lead times (T = 1, 3, 6, 12 months) was compared with that of conventional forecasting techniques (e.g., ANN and ARMA). Based on the statistical evaluation, the HMM performed significantly better than the conventional models, with much larger forecasting skill scores (about 0.3-0.6) and lower Root Mean Square Error (RMSE) values (about 0.5-0.9).
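
To illustrate the multi-step prediction mentioned in this abstract, here is a minimal sketch of how an HMM with Gaussian emissions yields a forecast mean at a given lead time: the current hidden-state distribution is pushed through the transition matrix once per step and then weighted by the emission means. The function name, the two-state setup, and all numbers are hypothetical; the abstract does not report fitted parameters, and the paper's Bayesian estimation via reversible jump MCMC is not reproduced here.

```python
def forecast_hmm_mean(state_probs, transition, means, steps):
    """Expected observation `steps` months ahead for a Gaussian-emission HMM.

    state_probs: current probability of each hidden state (sums to 1)
    transition:  transition[i][j] = P(next state j | current state i)
    means:       emission mean of each hidden state
    """
    probs = list(state_probs)
    n = len(probs)
    for _ in range(steps):
        # Propagate the state distribution one step ahead: p' = p A
        probs = [sum(probs[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # The forecast mean is the probability-weighted emission mean
    return sum(p * m for p, m in zip(probs, means))


# Hypothetical two-state example: state 0 = "dry", state 1 = "wet"
A = [[0.9, 0.1],
     [0.2, 0.8]]
mu = [-1.0, 0.5]  # illustrative SPI means per state, not fitted values

for lead in (1, 3, 6, 12):  # lead times listed in the abstract
    print(lead, round(forecast_hmm_mean([1.0, 0.0], A, mu, lead), 3))
```

As the lead time grows, the forecast drifts toward the mean implied by the chain's stationary distribution, which is one reason long-lead forecasts carry less information than short-lead ones.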

Development of Demand Forecasting Model for Seoul Shared Bicycle (서울시 공유자전거의 수요 예측 모델 개발)

  • Lim, Heejong;Chung, Kwanghun
    • The Journal of the Korea Contents Association / v.19 no.1 / pp.132-140 / 2019
  • Recently, many cities around the world have introduced shared bicycle systems to reduce traffic and air pollution. Seoul has provided a shared bicycle service called "Ddareungi" since 2015. As the use of shared bicycles increases, the demand for bicycles at each station is also increasing. In addition to budget restrictions, however, there are managerial issues caused by the differing demand at each station. While bicycle rebalancing is currently used to resolve the large imbalance of demand among stations, forecasting uncertain future demand is the more important problem in practice. In this paper, we develop a demand forecasting model for Seoul's shared bicycles using statistical time series analysis and apply the model to real data. In particular, we apply the Holt-Winters method, which has been used to forecast electricity demand, and perform a sensitivity analysis on the parameters that affect the demand forecast.
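
The Holt-Winters method named in this abstract can be sketched as follows. This is a minimal additive variant with a deliberately simple initialisation; the abstract does not say whether the authors used the additive or multiplicative form, so that choice, the function name, and the toy demand series are all assumptions.

```python
def holt_winters_additive(y, season_len, alpha, beta, gamma, horizon):
    """Additive Holt-Winters smoothing; returns `horizon` forecasts.

    alpha, beta, gamma are the level, trend and seasonal smoothing
    parameters, each between 0 and 1.
    """
    m = season_len
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    # Simple initial values: level = mean of season 1, trend = average
    # per-period change between seasons 1 and 2, seasonal = deviations.
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    seasonal = [y[i] - level for i in range(m)]

    for t in range(m, len(y)):
        s_old = seasonal[t % m]
        prev_level = level
        level = alpha * (y[t] - s_old) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (y[t] - level) + (1 - gamma) * s_old

    n = len(y)
    return [level + h * trend + seasonal[(n - 1 + h) % m]
            for h in range(1, horizon + 1)]


# Hypothetical rentals with a period-4 seasonal pattern (not real Ddareungi data)
demand = [12, 30, 45, 20, 14, 33, 49, 22, 15, 36, 52, 25]

# A crude sensitivity check: vary alpha and watch the next-period forecast
for a in (0.1, 0.3, 0.5, 0.9):
    print(a, holt_winters_additive(demand, 4, a, 0.1, 0.1, 1))
```

Repeating the loop for beta and gamma gives a basic picture of how sensitive the forecast is to each smoothing parameter.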

Impact of Activation Functions on Flood Forecasting Model Based on Artificial Neural Networks (홍수량 예측 인공신경망 모형의 활성화 함수에 따른 영향 분석)

  • Kim, Jihye;Jun, Sang-Min;Hwang, Soonho;Kim, Hak-Kwan;Heo, Jaemin;Kang, Moon-Seong
    • Journal of The Korean Society of Agricultural Engineers / v.63 no.1 / pp.11-25 / 2021
  • The objective of this study was to analyze the impact of activation functions on a flood forecasting model based on artificial neural networks (ANNs). The traditional activation functions, sigmoid and tanh, were compared with functions recently recommended for deep neural networks: ReLU, leaky ReLU, and ELU. The ANN-based flood forecasting model was designed to predict real-time runoff for 1 to 6-h lead times using the rainfall and runoff data of the past nine hours. Statistical measures such as R², Nash-Sutcliffe Efficiency (NSE), Root Mean Squared Error (RMSE), the error of peak time (ETp), and the error of peak discharge (EQp) were used to evaluate model accuracy. The tanh and ELU functions were most accurate, with R²=0.97 and RMSE=30.1 m³/s for the 1-h lead time and R²=0.56 and RMSE=124.6~124.8 m³/s for the 6-h lead time. We also evaluated learning speed using the number of epochs that minimizes the error. The sigmoid function had the slowest learning speed because of the 'vanishing gradient problem' and the limited direction of weight updates. The ELU function learned 1.2 times faster than the tanh function. As a result, the ELU function most effectively improved the accuracy and speed of the ANN model, so it was determined to be the best activation function for ANN-based flood forecasting.
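
For reference, the five activation functions compared in this abstract can be written out as below, with derivatives for sigmoid, tanh, and ELU to show why sigmoid gradients shrink. The leaky ReLU slope of 0.01 and the ELU alpha of 1.0 are common defaults assumed here; the abstract does not state the values the authors used.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return math.tanh(x)

def relu(x):
    return max(0.0, x)

def leaky_relu(x, slope=0.01):      # slope is an assumed default
    return x if x > 0 else slope * x

def elu(x, alpha=1.0):              # alpha is an assumed default
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1.0 - s)            # never larger than 0.25

def d_tanh(x):
    return 1.0 - math.tanh(x) ** 2  # reaches 1.0 at x = 0

def d_elu(x, alpha=1.0):
    return 1.0 if x > 0 else alpha * math.exp(x)

# Compare gradient sizes at a few inputs
for x in (-4.0, 0.0, 4.0):
    print(x, d_sigmoid(x), d_tanh(x), d_elu(x))
```

Because the sigmoid derivative is at most 0.25, gradients passed back through several sigmoid layers shrink multiplicatively. Sigmoid outputs are also always positive, which is one common reading of the "limited direction of weight update" the abstract refers to.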

Water demand forecasting at the DMA level considering sociodemographic and waterworks characteristics (사회인구통계 및 상수도시설 특성을 고려한 소블록 단위 물 수요예측 연구)

  • Saemmul Jin;Dooyong Choi;Kyoungpil Kim;Jayong Koo
    • Journal of Korean Society of Water and Wastewater / v.37 no.6 / pp.363-373 / 2023
  • Numerous studies have established a correlation between sociodemographic characteristics and water usage, identifying population as a primary independent variable in mid- to long-term demand forecasting. Recent dramatic sociodemographic changes, including urban concentration and rural depopulation, low birth rates and population aging, and the rise in single-person households, are expected to affect water demand and supply patterns. This underscores the need for operational and managerial changes in existing water supply systems. While sociodemographic characteristics are surveyed regularly, the surveys use aggregate units that do not align with the actual system. Consequently, many water demand forecasts have been conducted at the administrative district level without adequately considering the water supply system. This study presents an upward water demand forecasting model that accurately reflects real water facilities and consumers. The model comprises three key steps. First, Statistics Korea's SGIS (Statistical Geographic Information Service) data were reorganized at the DMA level. Second, DMAs were classified using the SOM (Self-Organizing Map) algorithm to account for differences in water facilities and consumer characteristics. Finally, water demand was forecast with the PCR (Principal Component Regression) method to address multicollinearity and overfitting. Because of the insufficient number of DMAs, the model's performance was evaluated for DMAs classified as rural areas. The estimation results indicate that the correlation coefficients exceeded 0.9 and the MAPE remained within approximately 10% for the test dataset. This method is expected to be useful for reorganization plans, such as the expansion and contraction of existing facilities.

Forecasting KOSPI 200 Volatility by Volatility Measurements (변동성 측정방법에 따른 KOSPI200 지수의 변동성 예측 비교)

  • Choi, Young-Soo;Lee, Hyun-Jung
    • Communications for Statistical Applications and Methods / v.17 no.2 / pp.293-308 / 2010
  • In this paper, we compare forecasts of KOSPI 200 realized volatility across volatility measurement methods. The empirical investigation uses KOSPI 200 daily returns from 3 January 2003 to 29 June 2007. Since the Korea Exchange (KRX) will launch a VKOSPI futures contract in 2010, forecasting VKOSPI can be an important issue, so we analyze which volatility measurements forecast VKOSPI better. To do so, we use 5-minute interval returns to measure realized volatilities. We also propose a new methodology that reflects synchronized bidding and takes into account the difference between overnight volatility and intra-daily volatility. The t-test and F-test show that our new realized volatility is not only different from the conventional realized volatility at a significance level below 0.01%, but also more stable in its summary statistics. We use correlation analysis, regression analysis, and a cross-validation test to investigate forecast performance. The empirical results show that the realized volatility we propose is better than other volatilities, including historical volatility, implied volatility, and conventional realized volatility, for forecasting VKOSPI. In addition, the regression analysis of predictive ability for realized volatility, measured both by our new methodology and by the conventional one, shows that VKOSPI is an efficient estimator compared to historical volatility and CRR implied volatility.
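
As background for the 5-minute realized volatility used in this abstract, the sketch below computes the conventional measure: the square root of the sum of squared intraday log returns. The optional overnight term is one common way to include the close-to-open move and is an assumption here; it is not the authors' new methodology, which the abstract does not specify. All prices are made up.

```python
import math

def realized_variance(prices):
    """Sum of squared log returns between consecutive intraday prices."""
    returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return sum(r * r for r in returns)

def realized_volatility(prices, prev_close=None):
    """Conventional realized volatility for one trading day.

    If prev_close is given, the squared overnight (close-to-open) log
    return is added. This is an illustrative choice, not the paper's method.
    """
    rv = realized_variance(prices)
    if prev_close is not None:
        rv += math.log(prices[0] / prev_close) ** 2
    return math.sqrt(rv)


# Hypothetical 5-minute index levels for part of one session
ticks = [250.0, 250.4, 249.9, 250.1, 250.8, 250.5]
print(realized_volatility(ticks))
print(realized_volatility(ticks, prev_close=249.2))
```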

Electricity Demand Forecasting for Daily Peak Load with Seasonality and Temperature Effects (계절성과 온도를 고려한 일별 최대 전력 수요 예측 연구)

  • Jung, Sang-Wook;Kim, Sahm
    • The Korean Journal of Applied Statistics / v.27 no.5 / pp.843-853 / 2014
  • Accurate electricity demand forecasting for daily peak load is essential for management and planning at electrical facilities. In this paper, we first introduce several time series models that forecast daily peak load and compare their forecasting performance based on Mean Absolute Percentage Error (MAPE). The results show that the Reg-AR-GARCH model outperforms other competing models that consider Cooling Degree Day (CDD) and Heating Degree Day (HDD) as well as seasonal components.
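
The temperature covariates and the error measure named in this abstract are simple to compute. The sketch below assumes a base temperature of 18 °C for both CDD and HDD, which is a common convention but is not stated in the abstract. The load figures are hypothetical.

```python
def cooling_degree_day(mean_temp_c, base_c=18.0):
    """Degrees by which the daily mean temperature exceeds the base."""
    return max(0.0, mean_temp_c - base_c)

def heating_degree_day(mean_temp_c, base_c=18.0):
    """Degrees by which the daily mean temperature falls below the base."""
    return max(0.0, base_c - mean_temp_c)

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical daily peak loads (MW) and forecasts
observed = [61000.0, 63500.0, 70200.0]
predicted = [60200.0, 64100.0, 68900.0]
print(mape(observed, predicted))
print(cooling_degree_day(27.5), heating_degree_day(27.5))
```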

Forecasting probabilities of earthquake in Korea based on seismological data (지진 관측자료를 기반으로 한 한반도 지진 발생 확률 예측)

  • Choi, Seowon;Jang, Woncheol
    • The Korean Journal of Applied Statistics / v.30 no.5 / pp.759-774 / 2017
  • Earthquake concerns have grown after a remarkable earthquake incident on September 12th, 2016 in Gyeongju, Korea. Earthquake forecasting is gaining in importance in order to guarantee infrastructure safety and develop protection policies. In this paper, we adopt a power-law distribution model to fit past earthquake occurrences in Korea with various historical and modern seismological records. We estimated power-law distribution parameters using empirical distributions and calculated the future probabilities for large earthquake events based on our model. We provide the probability that a future event has a larger magnitude than given levels, and the probability that a future event over certain levels will occur in a given period of time. This model contributes to the assessment of latent seismological risk in Korea by estimating future earthquake probabilities.
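
The two probabilities described in this abstract can be illustrated with a generic continuous power law. The sketch below uses the standard maximum-likelihood estimator for the exponent, whereas the abstract says the authors estimated parameters from empirical distributions, so this is a swapped-in technique. The Poisson arrival assumption for the time-window probability, the function names, and all inputs are likewise assumptions.

```python
import math

def fit_alpha(sizes, x_min):
    """Maximum-likelihood exponent of a continuous power law, p(x) ~ x**(-alpha).

    Uses only observations >= x_min; at least one must exceed x_min.
    """
    tail = [x for x in sizes if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)

def exceedance_prob(x, x_min, alpha):
    """P(X > x) for x >= x_min under the fitted power law."""
    return (x / x_min) ** (1.0 - alpha)

def prob_in_period(x, x_min, alpha, rate_per_year, years):
    """P(at least one event larger than x within `years`).

    Assumes events of size >= x_min arrive as a Poisson process with
    `rate_per_year` events per year (an assumption, not from the abstract).
    """
    expected = rate_per_year * years * exceedance_prob(x, x_min, alpha)
    return 1.0 - math.exp(-expected)


# Hypothetical event sizes in arbitrary units, not Korean seismic records
sizes = [1.2, 1.5, 2.0, 2.4, 3.1, 4.8, 7.5]
alpha = fit_alpha(sizes, x_min=1.0)
print(alpha, exceedance_prob(5.0, 1.0, alpha))
print(prob_in_period(5.0, 1.0, alpha, rate_per_year=2.0, years=10))
```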

Deciding the Optimal Shutdown Time Incorporating the Accident Forecasting Model (원자력 발전소 사고 예측 모형과 병합한 최적 운행중지 결정 모형)

  • Yang, Hee Joong
    • Journal of Korean Society of Industrial and Systems Engineering / v.41 no.4 / pp.171-178 / 2018
  • Recently, the continued operation of nuclear power plants has become a major controversial issue in Korea. Whether to continue operating nuclear power plants must be determined by considering many factors, including social and political as well as economic ones. In this paper, however, we concentrate only on the economic factors in making an optimal decision on operating nuclear power plants. Decisions should be based on forecasts of plant accident risks and on data on large and small accidents at power plants. We outline the structure of a decision model that incorporates accident risks. We formulate the decision of whether to shut down permanently, shut down temporarily for maintenance, or operate for one period, and then periodically repeat the analysis and decision process with additional information about new costs and risks. A forecasting model that predicts nuclear power plant accidents is incorporated to improve decision making. We first build a one-period decision model and then extend it to a multi-period model. We use influence diagrams as well as decision trees for modeling, together with a Bayesian statistical approach. Many of the parameter values in this model may be set fairly subjectively by decision makers. Once the parameter values have been determined, the model presents the optimal decision for those values.
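
A one-period version of the decision described in this abstract can be sketched as choosing the action with the lowest expected cost, where each action has a direct cost and an accident probability. The cost structure, action names, and every number below are hypothetical placeholders; the abstract gives no parameter values, and the paper's influence-diagram and Bayesian updating steps are not reproduced.

```python
def expected_cost(action_cost, accident_prob, accident_cost):
    """Expected one-period cost of an action: direct cost plus risk-weighted loss."""
    return action_cost + accident_prob * accident_cost

def best_action(options):
    """Return the name of the action with the lowest expected cost.

    options maps an action name to (action_cost, accident_prob, accident_cost).
    """
    return min(options, key=lambda name: expected_cost(*options[name]))


# Hypothetical one-period inputs in arbitrary cost units
options = {
    "operate": (0.0, 0.010, 1000.0),
    "temporary_shutdown": (5.0, 0.001, 1000.0),
    "permanent_shutdown": (20.0, 0.0, 0.0),
}
for name, params in options.items():
    print(name, expected_cost(*params))
print("choose:", best_action(options))
```

In a multi-period setting, the accident probability would be re-estimated each period from new accident data before the comparison is repeated.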

Deep learning forecasting for financial realized volatilities with aid of implied volatilities and internet search volumes (금융 실현변동성을 위한 내재변동성과 인터넷 검색량을 활용한 딥러닝)

  • Shin, Jiwon;Shin, Dong Wan
    • The Korean Journal of Applied Statistics / v.35 no.1 / pp.93-104 / 2022
  • In forecasting the realized volatility of the major US stock price indexes (S&P 500, Russell 2000, DJIA, Nasdaq 100), internet search volume, which reflects investors' interest, and implied volatility are used to improve forecasts via the LSTM deep learning method. The LSTM method combined with the search volume index produces better forecasts than the existing standard vector autoregressive (VAR) and vector error correction (VEC) models. It also beats the recently proposed vector error correction heterogeneous autoregressive (VECHAR) model, which takes advantage of the cointegration relation between realized volatility and implied volatility.
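
The "heterogeneous autoregressive" part of the VECHAR benchmark refers to regressors built from realized volatility averaged over different horizons. A minimal sketch of those regressors follows, assuming the conventional 1-day, 5-day, and 22-day windows; the abstract does not confirm the window lengths used, and the input series here is synthetic.

```python
def har_features(rv, day=1, week=5, month=22):
    """Daily, weekly and monthly averages of the most recent realized volatilities.

    rv is ordered oldest to newest; window lengths are assumed defaults.
    """
    if len(rv) < month:
        raise ValueError("need at least `month` observations")

    def mean(xs):
        return sum(xs) / len(xs)

    return mean(rv[-day:]), mean(rv[-week:]), mean(rv[-month:])


# Synthetic realized-volatility history (not market data)
history = [0.010 + 0.001 * (i % 7) for i in range(30)]
print(har_features(history))
```

These three averages would serve as predictors for the next day's realized volatility in a HAR-type regression, or as additional inputs alongside implied volatility and search volume in a sequence model.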

A Comparison Study for Mortality Forecasting Models by Average Life Expectancy (평균수명을 이용한 사망률 예측모형 비교연구)

  • Jeong, Seung-Hwan;Kim, Kee-Whan
    • The Korean Journal of Applied Statistics / v.24 no.1 / pp.115-125 / 2011
  • Forecasting average life expectancy with a mortality forecasting model and a life table is an effective way to evaluate the future mortality level. There are differences between the current actual values of average life expectancy and the values forecast in population projection 2006 from Statistics Korea. The reason is that the forecasts did not reflect the speed at which actual life expectancy has increased. The main causes of the problem may be errors of judgment in the projection or errors in the choice or use of a mortality forecasting model. In this paper, we focus on the choice of the mortality forecasting model. Statistics Korea should select a mortality forecasting model after considerable investigation so that population projection 2011 avoids the errors observed in population projection 2006. We compare five mortality forecasting models: the widely used LC (Lee and Carter) model and its variants, and the HP8 (Heligman and Pollard 8-parameter) model for handling death probability. We forecast average life expectancy by sex from 2010 to 2030 using the modeling results and compare the forecasts with those of population projection 2006 for the same period. The average life expectancy forecasts from all five models are higher than those of population projection 2006. Therefore, we show that the new average life expectancy forecasts are relatively suitable for the future mortality level.
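
The step from forecast death probabilities to average life expectancy goes through a life table. The sketch below is a simplified period life table that assumes deaths are spread evenly within each year of age; the function name and the toy probabilities are hypothetical, and the LC and HP8 models that would supply the probabilities are not implemented here.

```python
def life_expectancy_at_birth(q):
    """Life expectancy at birth from one-year death probabilities.

    q[x] is the probability of dying between exact ages x and x+1.
    The final entry should be 1.0 so that the table closes.
    """
    alive = 1.0   # survivors at exact age x, per person born
    years = 0.0   # person-years lived, accumulated over ages
    for qx in q:
        survivors = alive * (1.0 - qx)
        # Assume deaths are spread evenly over the year, so the average
        # number alive is the mean of the start and end counts.
        years += 0.5 * (alive + survivors)
        alive = survivors
    return years


# Toy table: nobody dies for two years, then everyone dies in the third
print(life_expectancy_at_birth([0.0, 0.0, 1.0]))
```

Running the same function on death probabilities forecast for each future year, separately by sex, would give a life expectancy path of the kind compared in this abstract.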