• Title/Abstract/Keywords: Cross validation technique

Search Result 126, Processing Time 0.09 seconds

Airline In-flight Meal Demand Forecasting with Neural Networks and Time Series Models

  • Lee, Young-Chan
    • Proceedings of the Korea Association of Information Systems Conference / 2000.11a / pp.36-44 / 2000
  • The purpose of this study is to introduce a more efficient forecasting technique that could help reduce costs by eliminating the waste of airline in-flight meals. We use a neural network approach, known to many researchers as the “Outstanding Forecasting Technique”. We employed a multi-layer perceptron neural network trained with the backpropagation algorithm. We also suggest using other related information to improve the forecasting performance of neural networks. We divided the data into three sets: a training set, a cross-validation set, and a test set. Time-lag variables are also employed in our model, in line with the general practice in time series forecasting. We measured the accuracy of our model by the “Mean Square Error” (MSE). The suggested model performed best in forecasting economy-class in-flight meals. Forecasting the exact number of meals needed by each airline could reduce meal waste and therefore lower costs. It could also enhance the cost competitiveness of each airline, keep schedules on time, and lead to better service.
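The data preparation described in this abstract (time-lag variables, a three-way split, and MSE scoring) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the demand figures, the split sizes, and the naive "repeat the last value" baseline are all assumptions made for the example, and no neural network is trained here.

```python
def make_lagged(series, n_lags):
    """Build (X, y): each row of X holds the n_lags values that precede y."""
    X = [series[i - n_lags:i] for i in range(n_lags, len(series))]
    y = [series[i] for i in range(n_lags, len(series))]
    return X, y


def chronological_split(X, y, n_train, n_val):
    """Split in time order into training, cross-validation, and test sets."""
    train = (X[:n_train], y[:n_train])
    val = (X[n_train:n_train + n_val], y[n_train:n_train + n_val])
    test = (X[n_train + n_val:], y[n_train + n_val:])
    return train, val, test


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up meal demand figures, for illustration only.
demand = [120, 132, 128, 141, 150, 147, 158, 163, 160, 171, 175, 182]

X, y = make_lagged(demand, n_lags=2)
train, val, test = chronological_split(X, y, n_train=6, n_val=2)

# Naive baseline (an assumption, not the paper's model): predict the last lag.
naive_pred = [row[-1] for row in test[0]]
test_mse = mse(test[1], naive_pred)
```

In a setup like this, the cross-validation set would typically be used to decide when to stop training or which configuration to keep, while the test set is held back for the final error estimate.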

Boosting Multifactor Dimensionality Reduction Using Pre-evaluation

  • Hong, Yingfu;Lee, Sangbum;Oh, Sejong
    • ETRI Journal / v.38 no.1 / pp.206-215 / 2016
  • The detection of gene-gene interactions in genetic studies of common human diseases is important, and the technique of multifactor dimensionality reduction (MDR) has been widely applied to this end. However, this technique is not free from the "curse of dimensionality": it works well for two- or three-way interactions but requires a long execution time and extensive computing resources to detect, for example, a 10-way interaction. Here, we propose a boosting method to reduce MDR execution time. Using pre-evaluation measurements, gene sets with low levels of interaction can be removed before MDR is applied. The problem space is thus reduced, and considerable time can be saved in the execution of MDR.

Forecasting of Seasonal Inflow to Reservoir Using Multiple Linear Regression (다중선형회귀분석에 의한 계절별 저수지 유입량 예측)

  • Kang, Jaewon
    • Journal of Environmental Science International / v.22 no.8 / pp.953-963 / 2013
  • Reliable long-term streamflow forecasting is invaluable for water resource planning and management, which allocates water supply according to the demand of water users. Seasonal inflow to Andong dam is forecast and assessed using statistical methods based on hydrometeorological data. The predictors used to forecast seasonal inflow to Andong dam are selected from the southern oscillation index, sea surface temperature, and 500 hPa geopotential height data in the northern hemisphere. Predictors are selected in two steps: primary predictor sets are obtained first, and the final predictors are then determined from those sets. The primary predictor sets for each season are identified using cross correlation and mutual information. The final predictors are identified using partial cross correlation and partial mutual information. Three predictors are selected for each season. The values are determined using a bootstrapping technique with a specific significance level for predictor selection. Seasonal inflow is forecast by multiple linear regression analysis using the selected predictors for each season, and the forecasts are assessed using cross validation. The multiple linear regression analysis is performed using SAS, and its results are assessed by mean squared error and mean absolute error. A contingency table is also established and assessed by the Heidke skill score. The assessment reveals that the forecasts by multiple linear regression analysis are better than the reference forecasts.
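The Heidke skill score mentioned in this abstract compares the fraction of correct categorical forecasts with the fraction expected by chance. A minimal sketch follows, assuming a square contingency table with forecast categories as rows and observed categories as columns; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def heidke_skill_score(table):
    """Heidke skill score for a square contingency table.

    table[i][j] is the number of cases forecast in category i and
    observed in category j. A score of 1 is a perfect forecast and
    0 means no skill beyond chance agreement.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    proportion_correct = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected_correct = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (proportion_correct - expected_correct) / (1 - expected_correct)


# Hypothetical counts for three inflow categories (below, near, above normal).
example_table = [
    [5, 2, 1],
    [2, 4, 2],
    [1, 2, 5],
]
hss = heidke_skill_score(example_table)
```

For the example table, 14 of 24 forecasts are correct and one third would be correct by chance, which gives a score of 0.375.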

A Basic Study on the Improvement of Leakage Error of the Acoustic Intensity (음향 인텐시티의 누설오차 개선에 관한 기초적 연구)

  • 정의봉;정호경;안세진;윤상돈
    • The Journal of the Acoustical Society of Korea / v.22 no.5 / pp.345-350 / 2003
  • Acoustic intensity is usually estimated from the cross-spectrum of the acoustic pressure at two adjacent microphones. A cross-spectrum calculated with the discrete Fourier transform unavoidably contains leakage error, since the signal period usually does not coincide with the record length. The acoustic intensity estimated by a conventional FFT analyzer is therefore distorted. In this paper, an expression is formulated for the Fourier-transformed data of a single-frequency harmonic signal in the presence of leakage error. A method to eliminate the effect of leakage error from the contaminated data is also proposed. Numerical examples demonstrate the validity of the proposed method.
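The leakage effect this abstract refers to can be reproduced with a plain discrete Fourier transform. The sketch below only demonstrates the problem and does not implement the paper's correction method; the record length, the frequencies, and the helper names are assumptions chosen for the illustration.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitudes of the DFT bins 0..N/2, scaled by 1/N."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) / n
        for k in range(n // 2 + 1)
    ]


def sine(cycles, n):
    """Sine wave that completes `cycles` periods within an n-sample record."""
    return [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]


def off_peak_energy(mags):
    """Sum of squared magnitudes in every bin except the largest one."""
    peak = max(range(len(mags)), key=mags.__getitem__)
    return sum(m ** 2 for i, m in enumerate(mags) if i != peak)


n = 64
coherent = dft_magnitudes(sine(8, n))   # record holds exactly 8 periods
leaky = dft_magnitudes(sine(8.5, n))    # record holds 8.5 periods

coherent_spread = off_peak_energy(coherent)
leaky_spread = off_peak_energy(leaky)
```

When the record holds a whole number of periods, all the energy sits in one bin (`coherent_spread` is at rounding-error level). With 8.5 periods the peak is lower and energy spreads into neighbouring bins, which is the distortion the abstract attributes to leakage.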

Modified parity space averaging approaches for online cross-calibration of redundant sensors in nuclear reactors

  • Kassim, Moath;Heo, Gyunyoung
    • Nuclear Engineering and Technology / v.50 no.4 / pp.589-598 / 2018
  • To maintain the safety and reliability of reactors, redundant sensors are usually used to measure critical variables and estimate their averaged time-dependency. Unhealthy sensors can badly influence the estimate of the process variable. Since online condition monitoring was introduced, the online cross-calibration method has been widely used to detect any anomaly in sensor readings among the redundant group. The cross-calibration method has four main averaging techniques: simple averaging, band averaging, weighted averaging, and parity space averaging (PSA). PSA weighs redundant signals based on their error bounds and their band consistency. Using the consistency weighting factor (C), PSA assigns more weight to consistent signals that have shared bands, based on how many bands they share, and gives inconsistent signals very low weight. In this article, three approaches are introduced for improving the PSA technique: the first adds another consistency factor, the so-called trend consistency (TC), to account for preserving any characteristic edge that reflects the behavior of the equipment/component measured by the process parameter; the second proposes replacing the error bound/accuracy-based weighting factor ($W^a$) with a weighting factor based on the Euclidean distance ($W^d$); and the third proposes applying $W^d$, TC, and C all together. Cold neutron source data sets of four redundant hydrogen pressure transmitters from a research reactor were used to perform the validation and verification. Results showed that the second and third modified approaches lead to reasonable improvement of the PSA technique. All approaches implemented in this study were similar in that they have the capability to (1) identify and isolate a drifted sensor that should undergo calibration, (2) identify a faulty sensor/s due to a long and continuous missing data range, and (3) identify a healthy sensor.

Three-dimensional geostatistical modeling of subsurface stratification and SPT-N Value at dam site in South Korea

  • Mingi Kim;Choong-Ki Chung;Joung-Woo Han;Han-Saem Kim
    • Geomechanics and Engineering / v.34 no.1 / pp.29-41 / 2023
  • The 3D geospatial modeling of geotechnical information can aid in understanding the geotechnical characteristic values of the continuous subsurface at construction sites. In this study, a geostatistical optimization model for the three-dimensional (3D) mapping of subsurface stratification and the SPT-N value based on a trial-and-error rule was developed and applied to a dam emergency spillway site in South Korea. Geospatial database development for a geotechnical investigation, reconstitution of the target grid volume, and detection of outliers in the borehole dataset were implemented prior to the 3D modeling. For the site-specific subsurface stratification of the engineering geo-layer, we developed an integration method for the borehole and geophysical survey datasets based on the geostatistical optimization procedure of ordinary kriging and sequential Gaussian simulation (SGS) by comparing their cross-validation-based prediction residuals. We also developed an optimization technique based on SGS for estimating the 3D geometry of the SPT-N value. This method involves quantitatively testing the reliability of SGS and selecting the realizations with a high estimation accuracy. Boring tests were performed for validation, and the proposed method yielded more accurate prediction results and reproduced the spatial distribution of geotechnical information more effectively than the conventional geostatistical approach.

Non-destructive quality prediction of truss tomatoes using hyperspectral reflectance imagery (초분광 영상을 이용한 송이토마토의 비파괴 품질 예측)

  • Kim, Dae-Yong;Cho, Byoung-Kwan;Kim, Young-Sik
    • Korean Journal of Agricultural Science / v.39 no.3 / pp.413-420 / 2012
  • Spectroscopic measurement based on visible and near-infrared (Vis/NIR) wavelengths is a prominent technology for rapid and non-destructive evaluation of the internal quality of fruits. Reflectance measurements were performed with a hyperspectral reflectance imaging system to evaluate the firmness, soluble solid content, and acid content of truss tomatoes. The Vis/NIR reflectance spectra were acquired from truss tomatoes sorted into 6 ripening stages. Multivariate analysis based on partial least squares (PLS) was used to develop regression models with several preprocessing methods, such as smoothing, normalization, multiplicative scatter correction (MSC), and standard normal variate (SNV). The best model was selected in terms of the coefficient of determination of calibration ($R_c^2$) and full cross validation ($R_{cv}^2$), and the root mean square error of calibration (RMSEC) and full cross validation (RMSECV). The results of the selected models were 0.8976 ($R_p^2$), 6.0207 kgf (RMSEP) with Gaussian filter smoothing, 0.8379 ($R_p^2$), $0.2674\,^{\circ}\mathrm{Bx}$ (RMSEP) with mean normalization, and 0.7779 ($R_p^2$), 0.1033% (RMSEP) with median filter smoothing for firmness, soluble solid content (SSC), and acid content, respectively. The results show that the Vis/NIR hyperspectral reflectance imaging technique has good potential for measuring the internal quality of truss tomatoes.
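The difference between RMSEC and RMSECV can be shown with a small example. The sketch below deliberately swaps PLS for a one-predictor least-squares line so that it stays self-contained, and it reads "full cross validation" as leave-one-out, which is a common meaning in chemometrics but is an assumption here. The data values and function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmsec(xs, ys):
    """Root mean square error of calibration: fit and score on all samples."""
    slope, intercept = fit_line(xs, ys)
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


def rmsecv(xs, ys):
    """Leave-one-out RMSECV: each sample is predicted by a model fit without it."""
    squared = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        squared.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return math.sqrt(sum(squared) / len(squared))


# Made-up single-band reflectance values and firmness readings.
reflectance = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
firmness = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

calibration_error = rmsec(reflectance, firmness)
cross_validation_error = rmsecv(reflectance, firmness)
```

For a least-squares fit, the leave-one-out error is never smaller than the calibration error, which is why RMSECV is the more conservative of the two figures when comparing preprocessing methods. Note that some conventions divide the calibration sum of squares by the degrees of freedom rather than by the sample count used here.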

Power Consumption Forecasting Scheme for Educational Institutions Based on Analysis of Similar Time Series Data (유사 시계열 데이터 분석에 기반을 둔 교육기관의 전력 사용량 예측 기법)

  • Moon, Jihoon;Park, Jinwoong;Han, Sanghoon;Hwang, Eenjun
    • Journal of KIISE / v.44 no.9 / pp.954-965 / 2017
  • A stable power supply is very important for the maintenance and operation of the power infrastructure, so accurate power consumption prediction is needed. In particular, a university campus is among the institutions with the highest power consumption, and its electrical load tends to vary widely with time and environment. For this reason, a model that can accurately predict power consumption is required for the effective operation of the power system. The disadvantage of existing time series prediction techniques is that prediction performance degrades greatly, because the width of the prediction interval increases as the gap between the learning time and the prediction time grows. In this paper, we first classify power data with similar time series patterns, considering the date, day of the week, holiday, and semester. Next, an ARIMA model is constructed for each classified data set, and a daily power consumption forecasting method for the university campus is proposed through time series cross-validation over the prediction period. To evaluate the accuracy of the prediction, we confirmed the validity of the proposed method by applying performance indicators.
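Time series cross-validation differs from ordinary k-fold cross-validation in that every test window lies strictly after its training data. A minimal sketch of one common variant (an expanding training window, sometimes called rolling-origin evaluation) is shown below. Whether the paper uses exactly this scheme is not stated in the abstract; the function name and parameters are assumptions, and no ARIMA model is fitted here.

```python
def rolling_origin_splits(n_samples, initial_train, horizon=1):
    """Return (train_indices, test_indices) pairs with an expanding window.

    Each test window covers `horizon` consecutive points that come
    immediately after the training window, so no future data leaks
    into training.
    """
    splits = []
    end = initial_train
    while end + horizon <= n_samples:
        train = list(range(0, end))
        test = list(range(end, end + horizon))
        splits.append((train, test))
        end += horizon
    return splits


# Ten daily observations, six used for the first fit, two-day forecasts.
splits = rolling_origin_splits(n_samples=10, initial_train=6, horizon=2)
```

With these settings the first model is fit on days 0 to 5 and tested on days 6 and 7, and the second is fit on days 0 to 7 and tested on days 8 and 9. A forecasting model would be refit on each training window and scored on the matching test window.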

Prediction of box office using data mining (데이터마이닝을 이용한 박스오피스 예측)

  • Jeon, Seonghyeon;Son, Young Sook
    • The Korean Journal of Applied Statistics / v.29 no.7 / pp.1257-1270 / 2016
  • This study deals with the prediction of the total number of movie audiences as a measure of box office performance. Prediction is performed with data mining classification techniques (decision tree, multilayer perceptron (MLP) neural network, multinomial logit model, and support vector machine) at four time points: before release, on release day, one week after release, and two weeks after release. The predictors are online word-of-mouth (OWOM) variables, such as the portal movie rating, the number of portal movie raters, and blogs, together with variables describing the inherent properties of the film (such as nationality, grade, release month, release season, directors, actors, distributors, the number of audiences, and screens). Under 10-fold cross validation, the neural network model showed a prediction accuracy of more than 90% before movie release. In addition, the accuracy of the prediction increases when estimates of the final OWOM variables are added as predictors.
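The 10-fold cross validation used to report classification accuracy can be sketched as follows. The majority-class classifier stands in for the models named in the abstract purely to keep the example self-contained; it, the labels, and all function names are assumptions for illustration.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for shuffled k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for fold_id, fold in enumerate(folds) if fold_id != i for j in fold]
        yield train, folds[i]


def fit_majority(train_x, train_y):
    """Stand-in classifier: always predicts the most common training label."""
    majority = Counter(train_y).most_common(1)[0][0]
    return lambda x: majority


def cv_accuracy(features, labels, fit, k=10, seed=0):
    """Fraction of samples classified correctly while held out of training."""
    correct = 0
    for train, test in k_fold_indices(len(labels), k, seed):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct += sum(model(features[i]) == labels[i] for i in test)
    return correct / len(labels)


# Hypothetical data: 30 films, 7 labelled "hit" and 23 labelled "flop".
labels = ["hit"] * 7 + ["flop"] * 23
features = [[i] for i in range(len(labels))]
accuracy = cv_accuracy(features, labels, fit_majority, k=10)
```

Every sample is held out exactly once, so the reported accuracy is computed only on data the model did not see during fitting. Here the stand-in classifier always predicts "flop", so it scores 23/30, which also illustrates why accuracy should be read against the class balance.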

A New Support Vector Machine Model Based on Improved Imperialist Competitive Algorithm for Fault Diagnosis of Oil-immersed Transformers

  • Zhang, Yiyi;Wei, Hua;Liao, Ruijin;Wang, Youyuan;Yang, Lijun;Yan, Chunyu
    • Journal of Electrical Engineering and Technology / v.12 no.2 / pp.830-839 / 2017
  • The support vector machine (SVM) has been introduced as an effective fault diagnosis technique with maximum generalization ability, based on dissolved gas analysis (DGA), for oil-immersed transformers; however, the applicability of the SVM is severely limited by the difficulty of selecting its parameters appropriately. This paper therefore proposes a novel approach that combines the SVM with an improved imperialist competitive algorithm (IICA) for fault diagnosis of oil-immersed transformers. The improved ICA, which has proved to be an effective optimization approach, is employed to optimize the SVM parameters. Cross validation and normalization were applied in the SVM training process, and the trained SVM model with the optimized parameters was established for fault diagnosis of oil-immersed transformers. Three classification benchmark sets were studied using particle swarm optimization SVM (PSOSVM) and IICASVM with four multi-class classification schemes to select the best scheme for transformer fault diagnosis. The results show that the proposed model obtains higher diagnosis accuracy than the other methods. The comparisons confirm that the proposed model is an effective approach for classification problems.