• Title/Abstract/Keywords: Regression Statistical Analysis

Search results: 3,457 items (processing time 0.033 s)

How to identify fake images? : Multiscale methods vs. Sherlock Holmes

  • Park, Minsu;Park, Minjeong;Kim, Donghoh;Lee, Hajeong;Oh, Hee-Seok
    • Communications for Statistical Applications and Methods / Vol. 28, No. 6 / pp. 583-594 / 2021
  • In this paper, we propose wavelet-based procedures to identify differences between images, including portraits and handwriting. The proposed methods are based on a novel combination of multiscale methods with a regularization technique. The multiscale method extracts the local characteristics of an image, and the distinct features are obtained through regularized regression of the local characteristics. The regularized regression approach copes with the high-dimensional problem of building the relation between the local characteristics. Lytle and Yang (2006) introduced a method for detecting forged handwriting via wavelets and summary statistics. We expand the scope of their method to general images and significantly improve the results. We present promising empirical evidence for the proposed method through various experiments.

Regression Analysis of Longitudinal Data Based on M-estimates

  • Jung, Sin-Ho;Terry M. Therneau
    • Journal of the Korean Statistical Society / Vol. 29, No. 2 / pp. 201-217 / 2000
  • The method of generalized estimating equations (GEE) has become very popular for the analysis of longitudinal data. We extend this work to the use of M-estimators; the resultant regression estimates are robust to heavy-tailed errors and to outliers. The proposed method does not require correct specification of the dependence structure between observations, and allows for heterogeneity of the error. However, an estimate of the dependence structure may be incorporated, and if it is correct this guarantees a higher efficiency for the regression estimators. A goodness-of-fit test for checking the adequacy of the assumed M-estimation regression model is also provided. Simulation studies are conducted to show the finite-sample performance of the new methods. The proposed methods are applied to a real-life data set.

A Comparative Study on the Performance of Bayesian Partially Linear Models

  • Woo, Yoonsung;Choi, Taeryon;Kim, Wooseok
    • Communications for Statistical Applications and Methods / Vol. 19, No. 6 / pp. 885-898 / 2012
  • In this paper, we consider Bayesian approaches to partially linear models, in which a regression function is represented by a semiparametric additive form of a parametric linear regression function and a nonparametric regression function. We make a comparative study of the performance of widely used Bayesian partially linear models in terms of empirical analysis. Specifically, we deal with three Bayesian methods to estimate the nonparametric regression function: one using a Fourier series representation, another based on a Gaussian process regression approach, and a third based on the smoothness of the function and differencing. We compare the numerical performance of the three methods by the root mean squared error (RMSE). For empirical analysis, we consider synthetic data with simulation studies and a real data application, fitting each with the three Bayesian methods and comparing the RMSEs.
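
The RMSE criterion named in the abstract above can be sketched as follows. This is a minimal illustration only: the function name `rmse` and the toy values are assumptions made here, not code or data from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and fitted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy values, not data from the paper.
observed = [1.0, 2.0, 3.0, 4.0]
fitted_a = [1.0, 2.0, 3.0, 4.0]  # reproduces every observation
fitted_b = [2.0, 3.0, 4.0, 5.0]  # every prediction is off by exactly 1

print(rmse(observed, fitted_a), rmse(observed, fitted_b))
```

A smaller RMSE indicates fitted values closer to the observations, which is how competing fits would be ranked under this criterion.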

Regression discontinuity for survival data

  • Youngjoo Cho
    • Communications for Statistical Applications and Methods / Vol. 31, No. 1 / pp. 155-178 / 2024
  • Regression discontinuity (RD) design is one of the most widely used methods in causal inference for estimating a treatment effect when the treatment is determined by a cutpoint in the covariate of interest. RD design has received little attention for censored data, although it provides a very useful tool for analyzing treatment effects in that setting. In this paper, we define the causal effect for the survival function in RD design when the treatment is assigned deterministically by the covariate of interest. We propose estimators of this causal effect for survival data by using a transformation, which leads to an unbiased estimator of the survival function with local linear regression. Simulation studies show the validity of our approach. We also illustrate our proposed method using the prostate, lung, colorectal and ovarian (PLCO) dataset.
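
As a rough illustration of the local linear regression step in a sharp RD design, the sketch below estimates the jump at a cutpoint for an uncensored outcome. It does not implement the paper's transformation for censored survival data; the triangular kernel, the function names, and the toy data are all assumptions made here.

```python
def local_linear_at(xs, ys, x0, bandwidth):
    """Fit a kernel-weighted least-squares line around x0 and return its value at x0."""
    sw = swd = swy = swdd = swdy = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = max(0.0, 1.0 - abs(d) / bandwidth)  # triangular kernel weight
        sw += w
        swd += w * d
        swy += w * y
        swdd += w * d * d
        swdy += w * d * y
    denom = sw * swdd - swd * swd
    if denom == 0.0:
        raise ValueError("not enough distinct points inside the bandwidth")
    slope = (sw * swdy - swd * swy) / denom
    return (swy - slope * swd) / sw  # intercept, i.e. the fitted value at x0


def sharp_rd_effect(xs, ys, cutoff, bandwidth):
    """Difference of the right and left local linear limits at the cutoff."""
    left = [(x, y) for x, y in zip(xs, ys) if x < cutoff]
    right = [(x, y) for x, y in zip(xs, ys) if x >= cutoff]
    left_limit = local_linear_at([x for x, _ in left], [y for _, y in left],
                                 cutoff, bandwidth)
    right_limit = local_linear_at([x for x, _ in right], [y for _, y in right],
                                  cutoff, bandwidth)
    return right_limit - left_limit


# Hypothetical noiseless data: a linear trend plus a jump of 3 at x = 0.
xs = [i / 10 for i in range(-10, 11)]
ys = [2.0 + 0.5 * x + (3.0 if x >= 0 else 0.0) for x in xs]
effect = sharp_rd_effect(xs, ys, cutoff=0.0, bandwidth=0.5)
print(round(effect, 6))
```

Fitting separate lines on each side of the cutoff, rather than comparing local averages, is the usual way to reduce boundary bias; the bandwidth choice here is arbitrary.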

DD-Plot for ANCOVA Models

  • 장대흥
    • The Korean Journal of Applied Statistics / Vol. 27, No. 2 / pp. 227-237 / 2014
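  • In regression analysis, indicator variables are used when some of the explanatory variables are qualitative. In the analysis of covariance (ANCOVA) model, nuisance factors given as continuous covariates are first removed by regression before testing the significance of the effect of the factor of interest. Before confirmatory data analysis for regression models with indicator variables or for ANCOVA models, a DD-plot based on data depth, used as a tool for exploratory data analysis, makes differences between groups easy to see. Because this method does not assume a statistical model for the error term, it can be a useful exploratory method. Through several examples, we show that the DD-plot is useful as a graphical exploratory data analysis method for regression models with indicator variables and for ANCOVA models.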

Optimal fractions in terms of a prediction-oriented measure

  • Lee, Won-Woo
    • Journal of the Korean Statistical Society / Vol. 22, No. 2 / pp. 209-217 / 1993
  • The multicollinearity problem in a multiple linear regression model may have deleterious effects on predictions. Thus, it is desirable to consider the optimal fractions with respect to the unbiased estimate of the mean squared errors of the predicted values. Interestingly, the optimal fractions can also be illuminated by the Bayesian interpretation of the general James-Stein estimators.

Estimation of Tunnel Convergence Using Statistical Analysis

  • 김종우
    • Tunnel and Underground Space / Vol. 13, No. 2 / pp. 108-116 / 2003
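  • We analyzed convergence measurement data from tunnels constructed in ground composed mainly of Cretaceous Gyeongsang-system andesite and Bulguksa granites. The rock mass around the tunnels was divided into five rock mass classes by the RMR method, and the measurement data in each class were processed statistically to carry out a regression analysis of convergence for each class. The results show that the correlation coefficient of the exponential function is larger than that of the logarithmic function, and that the weaker the rock mass class, the larger the magnitude and standard deviation of the convergence. In addition, relationships of the maximum displacement rate and the initial convergence to the final convergence were derived; among these, the correlation coefficient between the final convergence and the maximum displacement rate was 0.87, confirming a relatively high correlation.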

Bayesian quantile regression analysis of Korean Jeonse deposit

  • Nam, Eun Jung;Lee, Eun Kyung;Oh, Man-Suk
    • Communications for Statistical Applications and Methods / Vol. 25, No. 5 / pp. 489-499 / 2018
  • Jeonse is a unique property rental system in Korea in which a tenant pays part of the price of a leased property as a fixed security deposit and gets the entire deposit back when moving out at the end of the tenancy. The Jeonse deposit is very important in the Korean real estate market because it is directly related to residential property sales prices and is a key indicator for predicting future real estate market trends. Jeonse deposit data show a skewed and heteroscedastic distribution, so the commonly used mean regression model may be inappropriate for analyzing them. In this paper, we apply a Bayesian quantile regression model, which is non-parametric and does not require any distributional assumptions, to analyze Jeonse deposit data. The results show that the quantile regression coefficients of most explanatory variables change dramatically across quantiles. The regression coefficients of some variables have different signs at different quantiles, implying that the same variable may affect the Jeonse deposit in opposite directions depending on the amount of the deposit.
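
To see why quantile-based fitting can suit skewed data better than mean regression, the sketch below uses the check (pinball) loss that underlies quantile regression. It is a plain loss-minimisation illustration, not the Bayesian model of the paper; the function names and the toy values are assumptions made here.

```python
def check_loss(residual, tau):
    """Quantile (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_check_loss(ys, q, tau):
    """Sum of check losses when every observation is predicted by the constant q."""
    return sum(check_loss(y - q, tau) for y in ys)


def best_constant(ys, tau):
    """Among the observed values, return the one with the smallest total check loss."""
    return min(sorted(ys), key=lambda q: total_check_loss(ys, q, tau))


# Hypothetical right-skewed toy values, not Jeonse data.
deposits = [1, 2, 3, 4, 100]
mean_value = sum(deposits) / len(deposits)

print(mean_value)                    # pulled upward by the single large value
print(best_constant(deposits, 0.5))  # tau = 0.5 recovers the median
print(best_constant(deposits, 0.9))  # a high quantile tracks the upper tail
```

Replacing the constant `q` with a linear predictor gives quantile regression, so coefficients can differ from one value of `tau` to another, which is the behaviour the abstract reports.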

A Statistical Approach to Examine the Impact of Various Meteorological Parameters on Pan Evaporation

  • Pandey, Swati;Kumar, Manoj;Chakraborty, Soubhik;Mahanti, N.C.
    • The Korean Journal of Applied Statistics / Vol. 22, No. 3 / pp. 515-530 / 2009
  • Evaporation from surface water bodies is influenced by a number of meteorological parameters. The rate of evaporation is primarily controlled by incoming solar radiation, air and water temperature, wind speed, and relative humidity. In the present study, the influence of weekly meteorological variables such as air temperature, relative humidity, bright sunshine hours, wind speed, wind velocity, and rainfall on the rate of evaporation has been examined using 35 years (1971-2005) of meteorological data. Statistical analysis was carried out employing linear regression models. The developed regression models were tested for goodness of fit and multicollinearity, along with a normality test and a constant variance test. These regression models were subsequently validated using the observed and predicted parameter estimates with the meteorological data of the year 2005. Further, these models were checked with time-ordered residual plots to identify any trend in the scatter, and new standardized regression models were then developed using standardized equations. The highest significant positive correlation was observed between pan evaporation and maximum air temperature. Mean air temperature and wind velocity have a highly significant influence on pan evaporation, whereas minimum air temperature, relative humidity and wind direction have no such significant influence.

Residuals Plots for Repeated Measures Data

  • 박태성
    • Proceedings of the Korean Statistical Society Conference / Proceedings of the Korean Statistical Society 2000 Autumn Conference / pp. 187-191 / 2000
  • In the analysis of repeated measurements, multivariate regression models that account for the correlations among the observations from the same subject are widely used. Like the usual univariate regression models, these multivariate regression models also need model diagnostic procedures. In this paper, we propose a simple graphical method to detect outliers and to investigate the goodness of model fit in repeated measures data. The graphical method is based on quantile-quantile (Q-Q) plots of the $\chi^2$ distribution and the standard normal distribution. We also propose diagnostic measures to detect influential observations. The proposed method is illustrated using two examples.
