• Title/Summary/Keyword: 응용절사법 (modified cut-off method)


Robust Response Transformation Using Outlier Detection in Regression Model (회귀모형에서 이상치 검색을 이용한 로버스트 변수변환방법)

  • Seo, Han-Son;Lee, Ga-Yoen;Yoon, Min
    • The Korean Journal of Applied Statistics / v.25 no.1 / pp.205-213 / 2012
  • Transforming the response variable is a common way to adapt data to a linear regression model. However, response transformations in linear regression are well known to be very sensitive to one or a few outliers. Many methods have been suggested for developing transformations that are not influenced by potential outliers. Recently, Cheng (2005) suggested using a trimmed likelihood estimator based on the idea of the least trimmed squares (LTS) estimator. However, that method requires presetting the number of outliers and is computationally intensive. A new method is proposed that addresses these problems and improves the robustness of the estimates. It uses the stepwise procedure suggested by Hadi and Simonoff (1993) to detect the outliers that determine the response transformation.
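
As background for the least trimmed squares (LTS) idea mentioned in this abstract, the following is a minimal sketch of an LTS fit for a single-predictor line. It is not Cheng's trimmed likelihood estimator and not the method proposed in the paper. The function names (`ols`, `lts_line`), the random two-point starts, and the concentration ("C-step") iteration are illustrative assumptions chosen to keep the example short.

```python
import random


def ols(points):
    """Ordinary least squares line (intercept, slope) through (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0:
        # All x values identical: fall back to a horizontal line.
        return my, 0.0
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return my - slope * mx, slope


def lts_line(points, h, n_starts=50, n_csteps=20, seed=0):
    """Approximate LTS fit: minimize the sum of the h smallest squared residuals.

    Hypothetical helper: random two-point starts refined by C-steps.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        subset = rng.sample(points, 2)
        for _ in range(n_csteps):
            a, b = ols(subset)
            # C-step: keep the h points with the smallest squared residuals.
            ranked = sorted(points, key=lambda p: (p[1] - a - b * p[0]) ** 2)
            new_subset = ranked[:h]
            if new_subset == subset:
                break
            subset = new_subset
        a, b = ols(subset)
        squared = sorted((y - a - b * x) ** 2 for x, y in points)
        objective = sum(squared[:h])
        if best is None or objective < best[0]:
            best = (objective, a, b)
    return best[1], best[2]
```

The trimming size `h` must be fixed in advance, which mirrors the drawback the abstract attributes to trimmed approaches: the number of outliers has to be preset.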

A Sample Design for the Statistical Survey of E-commerce (전자상거래 통계조사 표본설계)

  • 이기성;홍기학;손창균
    • The Korean Journal of Applied Statistics / v.17 no.3 / pp.393-402 / 2004
  • We summarize a sample design for the statistical survey of e-commerce, intended to estimate the general trend as well as the market size and investment scale of e-commerce, based on the Census on Basic Characteristics of Establishments (CBCE). The sample design combines systematic sampling with a modified cut-off method.
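
The abstract does not spell out the design, so the following is only a sketch of one common reading of a modified cut-off design: units at or above a size threshold form a take-all stratum, and the remaining units are sorted by size and sampled systematically. The function name `modified_cutoff_sample` and the parameters `cutoff`, `interval`, and `seed` are hypothetical.

```python
import random


def modified_cutoff_sample(sizes, cutoff, interval, seed=0):
    """Return (take_all, sampled) lists of unit indices.

    Assumed design, not taken from the paper:
    - units with size >= cutoff are all selected (take-all stratum);
    - the remaining units are sorted by size and every `interval`-th
      unit is selected after a random start (systematic sampling).
    """
    take_all = [i for i, s in enumerate(sizes) if s >= cutoff]
    rest = sorted(
        (i for i, s in enumerate(sizes) if s < cutoff),
        key=lambda i: sizes[i],
    )
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start in [0, interval)
    sampled = rest[start::interval]
    return take_all, sampled


# Example with made-up establishment sizes.
sizes = [5, 120, 8, 300, 12, 7, 95, 3, 150, 20]
take_all, sampled = modified_cutoff_sample(sizes, cutoff=100, interval=3)
```

Sorting the take-some units by size before systematic selection spreads the sample across the size range, which is one reason the two techniques are often paired.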

A robust test for the parallelism of two regression lines (두 회귀직선의 평행성에 대한 로버스트 검정)

  • 남호수;송문섭;신봉섭
    • The Korean Journal of Applied Statistics / v.8 no.2 / pp.77-86 / 1995
  • For the problem of testing the parallelism of two regression lines, a robust procedure is proposed and examined. The proposed test statistic is based on the one-step GM-estimators of the slope parameters proposed by Song et al. (1994b). These GM-estimators use the least trimmed squares estimates as initial values so as to obtain a high breakdown point. Through a small-sample Monte Carlo simulation, the empirical levels and powers of the proposed test are compared with those of other tests under various error distributions.


A procedure for simultaneous variable selection, variable transformation and outlier identification in linear regression (선형회귀에서 변수선택, 변수변환과 이상치 탐지의 동시적 수행을 위한 절차)

  • Seo, Han Son;Yoon, Min
    • The Korean Journal of Applied Statistics / v.33 no.1 / pp.1-10 / 2020
  • We propose a unified approach to variable selection, variable transformation, and outlier identification in the linear model. The procedure includes a sequential method for outlier detection and a least trimmed squares estimator for variable transformation. It uses all-possible-subsets regression for model selection. Real data analyses and simulation results are provided to show the efficiency of the method in terms of correct variable selection and the fit of the estimated model.

Shrinkage Small Area Estimation Using a Semiparametric Mixed Model (준모수혼합모형을 이용한 축소소지역추정)

  • Jeong, Seok-Oh;Choo, Manho;Shin, Key-Il
    • The Korean Journal of Applied Statistics / v.27 no.4 / pp.605-617 / 2014
  • Small area estimation is a statistical inference method for overcoming the large variance caused by the small sample size allocated to a small area. A shrinkage estimator obtained by minimizing relative error (RE) instead of MSE has been suggested. This estimator offers good interpretability when the data range is large. A semiparametric estimator has also been studied for small area estimation. In this study, we suggest a semiparametric shrinkage small area estimator and compare small area estimators using labor statistics.

A study on non-response bias adjusted estimation for take-all stratum (전수층 무응답 편향보정 추정법에 관한 연구)

  • Chung, Hee Young;Shin, Key-Il
    • The Korean Journal of Applied Statistics / v.33 no.4 / pp.409-420 / 2020
  • In business surveys, modified cut-off sampling is commonly used to greatly increase the accuracy of estimation while reducing the sample size. However, the non-response rate in the take-all stratum has increased significantly and, because sample substitution is not possible there, non-response in the take-all stratum affects the accuracy of the estimation. It is important to adjust the bias appropriately when non-response depends on the variable of interest. In this study, a bias-adjusted estimation is proposed as an appropriate way to deal with non-response in the take-all stratum. In particular, the estimator proposed by Chung and Shin (2020) is applied to bias adjustment for the take-all stratum, giving a new adjustment method for that stratum. The superiority of the proposed estimator was examined through simulation studies and confirmed through actual data analysis.

Analyzing Influence of Outlier Elimination on Accuracy of Software Effort Estimation (소프트웨어 공수 예측의 정확성에 대한 이상치 제거의 영향 분석)

  • Seo, Yeong-Seok;Yoon, Kyung-A;Bae, Doo-Hwan
    • Journal of KIISE:Software and Applications / v.35 no.10 / pp.589-599 / 2008
  • Accurate software effort estimation has always been a challenge for both the software industry and the academic software engineering community. Many studies have focused on effort estimation methods to improve estimation accuracy. Although data quality is one of the important factors for accurate effort estimation, most of this work has not considered it. In this paper, we investigate the influence of outlier elimination on the accuracy of software effort estimation through empirical studies that combine two outlier elimination methods (least trimmed squares regression and K-means clustering) with three effort estimation methods (least squares regression, neural network, and Bayesian network). The empirical studies are performed on two industry data sets (ISBSG Release 9 and the Bank data set, which consists of project data collected from a bank in Korea), with and without outlier elimination.