• Title/Abstract/Keywords: Monte-Carlo least squares

61 search results, processing time 0.028 s

A Criterion for the Selection of Principal Components in the Robust Principal Component Regression

  • 김부용
    • Communications for Statistical Applications and Methods / Vol. 18, No. 6 / pp. 761-770 / 2011
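  • When a regression model includes highly correlated explanatory variables, multicollinearity arises; if the data also contain regression outliers, statistical inference based on the least squares estimator becomes seriously flawed. Both phenomena are common in data mining. This paper proposes robust principal component regression as a way to address the two problems simultaneously. In particular, a new criterion is developed for selecting the optimal principal components: it is based on the MVE estimator rather than the sample covariance of the explanatory variables, and on the magnitude of the condition index rather than the eigenvalues. For estimation in the principal component model, LTS estimation, which is robust to regression outliers, is adopted. Simulation experiments confirm that the proposed criterion resolves the problems caused by multicollinearity and outliers better than existing criteria.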
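
The idea of selecting principal components by condition index rather than by eigenvalue can be sketched as follows. This is only an illustration under stated assumptions, not the paper's criterion: the classical sample covariance stands in for the MVE estimator, the condition index is taken as sqrt(largest eigenvalue / j-th eigenvalue), and the cutoff of 30 and the function name `select_components` are hypothetical choices.

```python
import numpy as np

def select_components(X, max_condition_index=30.0):
    """Keep principal components whose condition index is below a cutoff.

    Assumptions (not from the paper): the classical covariance replaces the
    MVE estimate, and the cutoff value is an arbitrary illustrative choice.
    """
    cov = np.cov(X, rowvar=False)            # stand-in for a robust (MVE) covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    eigvals = np.clip(eigvals, 1e-12, None)  # guard against zero or negative round-off
    cond_index = np.sqrt(eigvals[0] / eigvals)
    keep = np.flatnonzero(cond_index < max_condition_index)
    return keep, cond_index, eigvecs

# Toy data: the third column is almost an exact sum of the first two.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + 1e-4 * rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

keep, cond_index, eigvecs = select_components(X)
scores = (X - X.mean(axis=0)) @ eigvecs[:, keep]  # retained component scores
```

In this toy example the near-collinear direction produces a very large condition index and is dropped, while the two well-conditioned components are kept. A robust regression (such as LTS) would then be fitted on `scores`.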

Reliability-based combined high and low cycle fatigue analysis of turbine blade using adaptive least squares support vector machines

  • Ma, Juan;Yue, Peng;Du, Wenyi;Dai, Changping;Wriggers, Peter
    • Structural Engineering and Mechanics / Vol. 83, No. 3 / pp. 293-304 / 2022
  • In this work, a novel reliability approach for combined high and low cycle fatigue (CCF) estimation is developed by combining an active learning strategy with a least squares support vector machine (LS-SVM) surrogate model (named ALS-SVM) to address multi-source uncertainties, including working loads, material properties, and the model itself. Initially, a new active learning function combining the LS-SVM approach with Monte Carlo simulation (MCS) is presented to improve computational efficiency with fewer calls to the performance function. To account for the uncertainty of the surrogate model at candidate sample points, the learning function employs k-fold cross-validation and introduces the predicted variance to select samples sequentially. Following that, low cycle fatigue (LCF) loads and high cycle fatigue (HCF) loads are first estimated based on the training samples extracted from finite element (FE) simulations, and their simulated responses, together with the sample points of the model parameters in the Coffin-Manson formula, are selected as the MC samples to establish the ALS-SVM model. In this analysis, the MC samples are substituted to predict the CCF reliability of turbine blades using the built ALS-SVM model. A comparison of the two approaches indicates that the reliability model based on the linear cumulative damage rule gives a non-conservative result compared with the proposed one. In addition, the results demonstrate that ALS-SVM is an effective analysis method that achieves high computational efficiency and accurate fatigue reliability with small training samples.

Unit Root Test for Temporally Aggregated Autoregressive Process

  • Shin, Dong-Wan;Kim, Sung-Chul
    • Journal of the Korean Statistical Society / Vol. 22, No. 2 / pp. 271-282 / 1993
  • A unit root test for temporally aggregated first-order autoregressive processes is considered. The temporal aggregate of a first-order autoregression is an autoregressive moving average of order (1,1) whose moving average parameter is a function of the autoregressive parameter. One-step Gauss-Newton estimators are proposed and are shown to have the same limiting distribution as the ordinary least squares estimator for a unit root when complete observations are available. A Monte Carlo simulation shows that temporal aggregation has no effect on the size. The power of the suggested test is nearly the same as that of the test based on complete observations.

A Robust Test for the Parallelism of Two Regression Lines

  • 남호수;송문섭;신봉섭
    • 응용통계연구 / Vol. 8, No. 2 / pp. 77-86 / 1995
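  • This paper proposes a robust test for the parallelism of two regression lines and compares it with existing methods, through simulation and an example, in terms of the stability of the significance level and the power. The proposed test is based on the one-step GM-estimator proposed by Song et al. (1994b), which uses the least trimmed squares estimator as its initial value. This estimator is known to have the maximum breakdown point and a bounded influence function.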

A Comparison of Alternative Approaches to Determinants of DEA Efficiency Scores

  • 김성호
    • 한국경영과학회지 / Vol. 35, No. 2 / pp. 19-35 / 2010
  • Many papers have used a two-stage approach of first calculating DEA efficiency scores and then seeking to correlate these scores with various environmental variables. Most of these studies have not checked whether such a two-stage approach is statistically valid for identifying significant environmental variables. Recently, Simar and Wilson (2007) (SW) introduced a sensible data generating process and a bootstrap procedure based on truncated regression for the two-stage approach. Banker and Natarajan (2008) (BN) provided a statistical foundation for the two-stage approach comprising DEA followed by ordinary least squares or maximum likelihood estimation. Researchers have to identify an approach suitable for their research circumstances in terms of properties, merits, demerits, and robustness to plausible departures from the chosen data generating process. We summarize the foundations and properties of the two-stage procedures suggested by SW and BN and discuss their merits and demerits. Using Monte Carlo simulation, we also assess their relative performance under several misspecified settings.

The Sequential Testing of Multiple Outliers in Linear Regression

  • Park, Jinpyo;Park, Heechang
    • Communications for Statistical Applications and Methods / Vol. 8, No. 2 / pp. 337-346 / 2001
  • In this paper we consider the problem of identifying and testing outliers in linear regression. First, we consider the problem of testing the null hypothesis of no outliers. A test based on the ratio of two scale estimates is proposed. We obtain the asymptotic distribution of the test statistic by Monte Carlo simulation and investigate its properties. Next, we consider the problem of identifying the outliers. A forward sequential procedure based on the suggested test is proposed and shown to perform fairly well. The forward sequential procedure is unaffected by masking and swamping effects because the test statistic is based on a robust estimate.

Dual Generalized Maximum Entropy Estimation for Panel Data Regression Models

  • Lee, Jaejun;Cheon, Sooyoung
    • Communications for Statistical Applications and Methods / Vol. 21, No. 5 / pp. 395-409 / 2014
  • Data that are limited, partial, or incomplete pose what is known as an ill-posed problem. If data with ill-posed problems are analyzed by traditional statistical methods, the results are unreliable and lead to erroneous interpretations. To overcome these problems, we propose a dual generalized maximum entropy (dual GME) estimator for panel data regression models based on an unconstrained dual Lagrange multiplier method. Monte Carlo simulations for panel data regression models with exogeneity, endogeneity, and/or collinearity show that the dual GME estimator outperforms several other estimators, such as those using least squares and instruments, even in small samples. We believe that our dual GME procedure developed for the panel data regression framework will be useful for analyzing ill-posed and endogenous data sets.

Nonlinearities and Forecasting in the Economic Time Series

  • Lee, Woo-Rhee
    • Communications for Statistical Applications and Methods / Vol. 10, No. 3 / pp. 931-954 / 2003
  • It is widely recognized that economic time series involve not only linearities but also nonlinearities. In this paper, we propose a forecasting method for economic time series with nonlinear characteristics that combines forecasts from linear and nonlinear models. In an empirical study, we compare the forecasting performance of four exchange rate models (AR, GARCH, AR+GARCH, and bilinear) and combinations of their forecasts for daily Won/Dollar exchange rate returns. The combination method is selected using the individual forecast errors estimated by Monte Carlo simulations. The study shows that combined forecasts using the unrestricted least squares method perform substantially better than any other combined or individual forecasts.
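
A least squares forecast combination of the kind mentioned above can be sketched as below. The abstract does not spell out its exact specification, so this sketch assumes "unrestricted" means regressing the realized values on the individual forecasts with an intercept and with weights that need not sum to one. The data are simulated purely for illustration, and the names `combination_weights` and `combine` are hypothetical.

```python
import numpy as np

def combination_weights(actual, forecasts):
    """OLS weights for y = b0 + b1*f1 + ... + bk*fk (no constraint on the b's)."""
    design = np.column_stack([np.ones(len(actual)), forecasts])
    coef, *_ = np.linalg.lstsq(design, actual, rcond=None)
    return coef

def combine(coef, forecasts):
    """Apply fitted weights to a (n, k) array of individual forecasts."""
    design = np.column_stack([np.ones(len(forecasts)), forecasts])
    return design @ coef

# Simulated example: two noisy forecasts of the same target series.
rng = np.random.default_rng(1)
y = rng.normal(size=300)
f1 = y + 0.5 * rng.normal(size=300)
f2 = 0.8 * y + 0.7 * rng.normal(size=300)
F = np.column_stack([f1, f2])

coef = combination_weights(y, F)
combined = combine(coef, F)

mse = lambda pred: float(np.mean((y - pred) ** 2))
mse_f1, mse_f2, mse_combined = mse(f1), mse(f2), mse(combined)
```

In-sample, the combined forecast cannot have a larger mean squared error than either individual forecast, because each individual forecast is a special case of the fitted regression. Out-of-sample performance is a separate question that this sketch does not address.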

Inversion of Geophysical Data Using Genetic Algorithms

  • 김희준
    • 자원환경지질 / Vol. 28, No. 4 / pp. 425-431 / 1995
  • Genetic algorithms are so named because they are analogous to biological processes. The model parameters are coded in binary form. The algorithm then starts with a randomly chosen population of models called chromosomes. The second step is to evaluate the fitness values of these models, measured by the correlation between the observed and synthetic data for a particular model. Then, the three genetic processes of selection, crossover, and mutation are performed upon the models in sequence. Genetic algorithms share the favorable characteristics of random Monte Carlo methods over local optimization methods in that they do not require linearizing assumptions or the calculation of partial derivatives, are independent of the misfit criterion, and avoid numerical instabilities associated with matrix inversion. An additional advantage over conventional methods such as iterative least squares is that the sampling is global rather than local, thereby reducing the tendency to become trapped in local minima and avoiding dependence on an assumed starting model.
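
The steps described in this abstract (binary coding, a random initial population, fitness evaluation, then selection, crossover, and mutation) can be sketched on a toy one-parameter inversion. Everything specific here is an assumption for illustration: the forward model is a straight line through the origin, fitness is the negative squared misfit rather than the correlation measure used in the paper, and the population size, mutation rate, elitism step, and all function names are hypothetical.

```python
import random

BITS = 8  # bits per chromosome (assumed resolution)

def decode(chrom, lo=0.0, hi=10.0):
    """Map a binary chromosome (list of 0/1) to a real parameter in [lo, hi]."""
    value = int("".join(map(str, chrom)), 2)
    return lo + (hi - lo) * value / (2 ** len(chrom) - 1)

def forward(slope, xs):
    """Toy forward model: synthetic data for a given model parameter."""
    return [slope * x for x in xs]

def fitness(chrom, xs, observed):
    """Higher is better: negative squared misfit between data and synthetics."""
    synthetic = forward(decode(chrom), xs)
    return -sum((o - s) ** 2 for o, s in zip(observed, synthetic))

def evolve(xs, observed, pop_size=40, generations=40, p_mut=0.02, seed=0):
    rng = random.Random(seed)
    # Random initial population of binary-coded models
    pop = [[rng.randint(0, 1) for _ in range(BITS)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda c: fitness(c, xs, observed), reverse=True)
        parents = ranked[: pop_size // 2]   # selection: keep the fitter half
        children = [ranked[0][:]]           # elitism: carry the best model over
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, BITS - 1)  # single-point crossover
            child = a[:cut] + b[cut:]
            # mutation: flip each bit with probability p_mut
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children
    best = max(pop, key=lambda c: fitness(c, xs, observed))
    return decode(best)

xs = [1.0, 2.0, 3.0, 4.0]
observed = forward(3.0, xs)  # noise-free data generated with slope 3
estimate = evolve(xs, observed)
```

No derivatives or matrix inversions appear anywhere in the loop, which is the property the abstract highlights. The accuracy of `estimate` is limited by the 8-bit coding of the parameter range.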

A Modification of the W Test for Exponentiality

  • Kim, Nam-Hyun
    • Communications for Statistical Applications and Methods / Vol. 8, No. 1 / pp. 159-171 / 2001
  • Shapiro and Wilk (1972) developed a test for exponentiality with origin and scale unknown. The procedure consists of comparing the generalized least squares estimate of scale with the estimate of scale given by the sample variance. However, the test statistic is inconsistent; that is, the power of the test does not approach 1 as the sample size increases. Hence we give a test based on the ratio of two asymptotically efficient estimates of scale. We also conducted a power study to compare the test procedures, using Monte Carlo samples from a wide range of alternatives. The suggested statistics are found to have higher power for alternatives with a coefficient of variation greater than or equal to 1.
