• Title/Summary/Keyword: Statistical Forecasting

Statistical Modelling and Forecasting of Cervix Cancer Cases in Radiation Oncology Treatment: A Hospital Based Study from Western Nepal

  • Sathian, Brijesh;Fazil, Abul;Sreedharan, Jayadevan;Pant, Sadip;Kakria, Anjali;Sharan, Krishna;Rajesh, E.;Vishrutha, K.V.;Shetty, Soumya B.;Shahnavaz, Shameema;Rao, Jyothi H.;Marakala, Vijaya
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.14 no.3
    • /
    • pp.2097-2100
    • /
    • 2013
  • Background: To estimate the numbers and trends in cervix cancer cases visiting the Radiotherapy Department at Manipal Teaching Hospital, Pokhara, Nepal, statistical modelling from retrospective data was applied. Materials and Methods: A retrospective study was carried out on data for a total of 159 patients treated for cervix cancer at Manipal Teaching Hospital, Pokhara, Nepal, between 28 September 2000 and 31 December 2008. Theoretical statistics were used for statistical modelling and forecasting. Results: Using the curve fitting method, Linear, Logarithmic, Inverse, Quadratic, Cubic, Compound, Power and Exponential growth models were validated. With the constant term included, none of the models fit the data well. With the constant term excluded, the cubic model demonstrated the best fit, with $R^2$ = 0.871 (p = 0.004). In 2008, the observed and estimated numbers of cases were the same (12). According to our model, 273 patients with cervical cancer are expected to visit the hospital in 2015. Conclusions: Our data predict a significant increase in cervical cancer cases in this region in the near future. This observation suggests the need for more focus and resource allocation on cervical cancer screening and treatment.
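
    As a rough illustration of the kind of model the abstract describes, the sketch below fits a cubic curve without a constant term, y = b1·t + b2·t² + b3·t³, by ordinary least squares. It is only a sketch under stated assumptions: the function names, the time index t = 1..9 and the synthetic counts are hypothetical and are not the study's data or code. The R² shown uses the uncentered total sum of squares, a common convention for models fitted through the origin; the abstract does not say which convention the authors used.

    ```python
    def solve_linear_system(a, b):
        """Solve a x = b by Gaussian elimination with partial pivoting."""
        n = len(b)
        m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
        for col in range(n):
            # Move the row with the largest entry in this column to the pivot position.
            pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
            m[col], m[pivot] = m[pivot], m[col]
            for r in range(col + 1, n):
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
        x = [0.0] * n
        for i in reversed(range(n)):  # back substitution
            s = sum(m[i][j] * x[j] for j in range(i + 1, n))
            x[i] = (m[i][n] - s) / m[i][i]
        return x


    def fit_cubic_through_origin(t, y):
        """Least-squares coefficients [b1, b2, b3] of y = b1*t + b2*t^2 + b3*t^3."""
        powers = (1, 2, 3)
        # Normal equations (X^T X) b = X^T y with columns t, t^2, t^3 and no intercept.
        xtx = [[sum(ti ** (p + q) for ti in t) for q in powers] for p in powers]
        xty = [sum(yi * ti ** p for ti, yi in zip(t, y)) for p in powers]
        return solve_linear_system(xtx, xty)


    def predict(coeffs, t):
        """Evaluate the fitted no-constant cubic at time index t."""
        return sum(c * t ** p for c, p in zip(coeffs, (1, 2, 3)))


    def uncentered_r2(y, fitted):
        """R^2 relative to the uncentered total sum of squares (assumed convention)."""
        ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
        ss_tot = sum(yi ** 2 for yi in y)
        return 1.0 - ss_res / ss_tot


    # Synthetic, made-up yearly counts generated from a known cubic (NOT the study's data).
    t = list(range(1, 10))
    y = [2.0 * ti - 0.5 * ti ** 2 + 0.1 * ti ** 3 for ti in t]

    coeffs = fit_cubic_through_origin(t, y)
    print("coefficients:", coeffs)
    print("forecast at t = 16:", predict(coeffs, 16))
    ```

    Extrapolating a cubic several years beyond the observed range, as in a 2015 forecast from data ending in 2008, is very sensitive to the fitted coefficients, so such a forecast is best read together with its uncertainty.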

Prediction of spatio-temporal AQI data

  • KyeongEun Kim;MiRu Ma;KyeongWon Lee
    • Communications for Statistical Applications and Methods
    • /
    • v.30 no.2
    • /
    • pp.119-133
    • /
    • 2023
  • With the rapid growth of the economy and fossil fuel consumption, the concentration of air pollutants has increased significantly, and the air pollution problem is no longer limited to small areas. We conduct a statistical analysis of actual air quality data covering the whole of South Korea using R and Python. Factors such as SO2, CO, O3, NO2, PM10, precipitation, wind speed, wind direction, vapor pressure, local pressure, sea level pressure, temperature, humidity, and others are used as covariates. The main goal of this paper is to predict spatio-temporal air quality index (AQI) data. The observations in big spatio-temporal datasets such as AQI data are correlated both spatially and temporally, and computing predictions or forecasts under this dependence structure is often infeasible. The likelihood function based on the spatio-temporal model may be complicated, so specialized modeling approaches are useful for statistically reliable predictions. In this paper, we propose several methods for these big spatio-temporal AQI data. First, a random effects model with spatio-temporal basis functions, a classical statistical approach, is proposed. Next, a neural network model, a deep learning method based on artificial neural networks, is applied. Finally, a random forest model, a machine learning method that is closer to computational science, is introduced. We then compare the forecasting performance of the three methods in terms of predictive diagnostics. In the analysis, all three methods predicted the normal level of PM2.5 well, but performance appears to be poor at extreme values.

The Forecasting Model of the Repair Cost in Apartment Housing - Focused roof water proofing and Elevator work - (공동주택 공종별 수선비용 예측모델 연구 - 옥상방수 공사와 승강기 공사를 중심으로 -)

  • Lee, KangHee;Chae, ChangU
    • KIEAE Journal
    • /
    • v.15 no.6
    • /
    • pp.63-68
    • /
    • 2015
  • Purpose: Most buildings need various repair works after construction to prevent or delay the deterioration that affects living conditions or building function. Therefore, a long-term repair schedule should be planned and the repair cost estimated. This paper aims to provide a statistical forecasting model for the repair cost of roof waterproofing work and elevator work, using three variables: the number of households, the management area, and the elapsed years. Data were collected from apartment housing located in the Seoul area through interviews and questionnaire sheets. Each analyzed work is divided into partial work and full work. The results show that, first, the regression model takes a multiplicative form like a Cobb-Douglas function and is transformed into a log-linear form to include the three variables simultaneously. Second, the repair cost forecasting model shows good fit in terms of the coefficient of determination and the Durbin-Watson statistic. Third, the management area is a stronger factor than the number of households and the elapsed years in both roof waterproofing work and elevator work.
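
    The log-linear transformation mentioned in this abstract can be written out as follows. The symbols are assumptions chosen for illustration ($C$ for repair cost, $H$ for number of households, $A$ for management area, $T$ for elapsed years, $\beta_i$ for coefficients, $\varepsilon$ for the error term); the abstract does not give the authors' notation or fitted values.

    ```latex
    C = \beta_0 \, H^{\beta_1} A^{\beta_2} T^{\beta_3} e^{\varepsilon}
    \quad\Longrightarrow\quad
    \ln C = \ln \beta_0 + \beta_1 \ln H + \beta_2 \ln A + \beta_3 \ln T + \varepsilon
    ```

    In this form each $\beta_i$ ($i = 1, 2, 3$) is the elasticity of repair cost with respect to the corresponding variable, and the coefficients can be estimated by ordinary linear regression on the logged data.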

A study on forecasting of consumers' choice using artificial neural network (인공신경망을 이용한 소비자 선택 예측에 관한 연구)

  • 송수섭;이의훈
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.26 no.4
    • /
    • pp.55-70
    • /
    • 2001
  • Artificial neural network (ANN) models have been widely used for classification problems in business such as bankruptcy prediction and credit evaluation. Although the application of ANN to the classification of consumers' choice behavior is a promising research area, there have been only a few studies. In general, most studies have reported that the classification performance of ANN models was better than that of conventional statistical models. Because survey data on consumer behavior may include much noise and missing data, ANN models are expected to be more robust than conventional statistical models, which need various assumptions. The purpose of this paper is to study the potential of the ANN model for forecasting consumers' choice behavior based on survey data. The data were collected by questionnaires given to shoppers at department stores and discount stores. The correct classification rates of the ANN models for the training and test samples were then compared with those of multiple discriminant analysis (MDA) and logistic regression (Logit) models. The ANN models performed better than the MDA and Logit models with respect to the correct classification rate. Using input variables identified as significant in the stepwise MDA improved the performance of the ANN models.

Regression models based on cumulative data for forecasting of new product (신제품 수요예측을 위하여 누적자료를 활용한 회귀모형에 관한 연구)

  • Park, Sang-Gue;Oh, Jung-Hyun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.20 no.1
    • /
    • pp.117-124
    • /
    • 2009
  • If time series data with seasonal effects exist, various statistical models such as Winters' method can be used for successful forecasts. But if the data are not sufficient to estimate the seasonal effect, few methods are available. This paper proposes a statistical forecasting method based on cumulative data for cases where the data are not sufficient to estimate the seasonal effect. We apply this method to real cosmetic sales data and show that it performs better than the moving average method.

The Comparison of Imputation Methods in Time Series Data with Missing Values (시계열자료에서 결측치 추정방법의 비교)

  • Lee, Sung-Duck;Choi, Jae-Hyuk;Kim, Duck-Ki
    • Communications for Statistical Applications and Methods
    • /
    • v.16 no.4
    • /
    • pp.723-730
    • /
    • 2009
  • Missing values in time series can be treated as unknown parameters and estimated by maximum likelihood, or as random variables and predicted by the expectation of the unknown values given the data. The purpose of this study is to impute missing values in incomplete data, treating them either as maximum likelihood estimates or as random variables, and to compare the two methods using an ARMA model. For illustration, the mumps data reported monthly from the national capital region over the years 2001-2006 are used, and the results from the two methods are compared using the SSF (sum of squares of forecasting errors).

Forecasting Korean housing price index: application of the independent component analysis (부동산 매매지수와 전세지수 예측: 독립성분분석을 활용한 분석)

  • Pak, Ro Jin
    • The Korean Journal of Applied Statistics
    • /
    • v.30 no.2
    • /
    • pp.271-280
    • /
    • 2017
  • Real-estate values and related economics are often the first newspaper category people read, and the opinions of experts on real estate price forecasts attract wide attention. The Box-Jenkins ARIMA model is a commonly used statistical method for predicting housing prices. In this article, we tried to predict housing prices by combining independent component analysis (ICA), a multivariate data analysis technique, with the Box-Jenkins ARIMA model. Two independent components were extracted from the selling price index and the long-term rental price index and used to predict the future values of both indices. In conclusion, the forecast indices obtained using ICA were closer to the actual indices than the forecasts of the ARIMA model alone.

Development of Marine Casualty Forecasting System (III): Implementation of Three-Dimensional Visualization System (해양사고 예보 시스템 개발 (III): 3차원 통계 가시화 시스템 구축)

  • Yim, Jeong-Bin
    • Journal of Navigation and Port Research
    • /
    • v.28 no.1
    • /
    • pp.17-22
    • /
    • 2004
  • The paper describes the implementation of a three-dimensional visualization system intended to convey the meaning of statistical prediction results on marine casualties in a comprehensible way. Graphical User Interface (GUI) and Web-based Virtual Reality (VR) technologies are the main technologies introduced in the system development. To provide daily forecasting, a time-based casualty prediction model and a risk level index are developed in this work. Operating tests of the system show that complicated statistical meaning can be presented in three-dimensional virtual space using simple colors. In addition, daily risk levels can be shown on a bar graph.

Comparison of Different Multiple Linear Regression Models for Real-time Flood Stage Forecasting (실시간 수위 예측을 위한 다중선형회귀 모형의 비교)

  • Choi, Seung Yong;Han, Kun Yeun;Kim, Byung Hyun
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.32 no.1B
    • /
    • pp.9-20
    • /
    • 2012
  • Recently, to overcome the limitations of conceptual, hydrological and physics-based models for flood stage forecasting, multiple linear regression models, a type of data-driven model, have been widely adopted for forecasting flood streamflow (stage). The objectives of this study are to compare the performance of different multiple linear regression models according to the regression coefficient estimation method and to determine the most effective multiple linear regression model for flood stage forecasting. To do this, the time scale was determined through autocorrelation analysis of the input data, and flood stage forecasting models developed using the regression coefficient estimation methods LS (least squares), WLS (weighted least squares), and SPW (stepwise) were applied to flood events in Jungrang stream. To evaluate the performance of the established models, four statistical indices were used: root mean square error (RMSE), Nash-Sutcliffe efficiency coefficient (NSEC), mean absolute error (MAE), and adjusted coefficient of determination ($R^{*2}$). The results show that the model using SPW parameter estimation predicts the river flood stage better than the others, and that the model using LS parameter estimation is slightly better than the model using WLS parameter estimation.
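
    Three of the four indices named in this abstract have standard textbook definitions, sketched below in plain Python. This is an illustrative sketch only: the function names and the sample stage values are hypothetical, and the abstract does not state the exact formulas the authors used. The adjusted coefficient of determination is left out because it also depends on the number of predictors in each model.

    ```python
    import math


    def rmse(observed, simulated):
        """Root mean square error between observed and simulated values."""
        n = len(observed)
        return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


    def mae(observed, simulated):
        """Mean absolute error between observed and simulated values."""
        n = len(observed)
        return sum(abs(o - s) for o, s in zip(observed, simulated)) / n


    def nash_sutcliffe(observed, simulated):
        """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
        mean_obs = sum(observed) / len(observed)
        ss_err = sum((o - s) ** 2 for o, s in zip(observed, simulated))
        ss_obs = sum((o - mean_obs) ** 2 for o in observed)
        return 1.0 - ss_err / ss_obs


    # Made-up flood stages in metres, for illustration only.
    observed_stage = [1.0, 2.0, 3.0, 4.0]
    forecast_stage = [2.0, 3.0, 4.0, 5.0]

    print("RMSE:", rmse(observed_stage, forecast_stage))
    print("MAE:", mae(observed_stage, forecast_stage))
    print("NSE:", nash_sutcliffe(observed_stage, forecast_stage))
    ```

    RMSE and MAE are in the units of the stage itself, while the Nash-Sutcliffe coefficient is dimensionless, which makes it easier to compare across flood events of different magnitude.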

A comparison of deep-learning models to the forecast of the daily solar flare occurrence using various solar images

  • Shin, Seulki;Moon, Yong-Jae;Chu, Hyoungseok
    • The Bulletin of The Korean Astronomical Society
    • /
    • v.42 no.2
    • /
    • pp.61.1-61.1
    • /
    • 2017
  • As deep-learning methods have been applied successfully in various fields, they have a high potential for application to space weather forecasting. The convolutional neural network, one of the deep learning methods, is specialized in image recognition. In this study, we apply the AlexNet architecture, the winner of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012, to the forecast of daily solar flare occurrence using the MatConvNet software for MATLAB. Our input images are SOHO/MDI, EIT 195 Å, and 304 Å images from January 1996 to December 2010, and the outputs are yes or no for flare occurrence. We also consider other inputs consisting of the last two images and their difference image. We select the training dataset from Jan 1996 to Dec 2000 and from Jan 2003 to Dec 2008. The testing dataset is chosen from Jan 2001 to Dec 2002 and from Jan 2009 to Dec 2010 in order to consider the solar cycle effect. Within the training dataset, we randomly select one fifth of the data as a validation dataset to avoid the over-fitting problem. Our model successfully forecasts flare occurrence with a probability of detection (POD) of about 0.90 for common flares (C-, M-, and X-class). While the POD for major flares (M- and X-class) is 0.96, the false alarm rate (FAR) is also relatively high (0.60). We also present several statistical parameters such as the critical success index (CSI) and the true skill statistic (TSS). None of the statistical parameters depends strongly on the number of input data sets. Our model can immediately be applied to an automatic forecasting service when image data are available.
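
    The verification scores named in this abstract are usually derived from a 2×2 contingency table of yes/no forecasts against observed events. The sketch below shows the common definitions. It rests on assumptions: the function name and the example counts are hypothetical, and FAR is computed here as the false alarm ratio, false alarms / (hits + false alarms), because the abstract does not state which definition the authors used.

    ```python
    def forecast_scores(hits, misses, false_alarms, correct_negatives):
        """Compute POD, FAR, CSI and TSS from 2x2 contingency table counts.

        Assumes every denominator is non-zero; no guarding is done here.
        """
        pod = hits / (hits + misses)                           # probability of detection
        far = false_alarms / (hits + false_alarms)             # false alarm ratio (assumed definition)
        csi = hits / (hits + misses + false_alarms)            # critical success index
        pofd = false_alarms / (false_alarms + correct_negatives)  # probability of false detection
        tss = pod - pofd                                       # true skill statistic
        return {"POD": pod, "FAR": far, "CSI": csi, "TSS": tss}


    # Made-up counts for illustration only (not the paper's results).
    scores = forecast_scores(hits=9, misses=1, false_alarms=6, correct_negatives=84)
    for name, value in scores.items():
        print(f"{name}: {value:.3f}")
    ```

    A high POD combined with a high FAR, as reported for major flares, is typical of rare events: a model that says "yes" often catches most events but also raises many false alarms, which is why CSI and TSS are reported alongside POD.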
