Integrated Search | Korea Science

Simultaneous Identification of Multiple Outliers and High Leverage Points in Linear Regression

Rahmatullah Imon, A.H.M.;Ali, M. Masoom
- Journal of the Korean Data and Information Science Society / Vol. 16, No. 2 / pp. 429-444 / 2005
The identification of unusual observations such as outliers and high leverage points has drawn a great deal of attention for many years. Most of these identification techniques are based on case deletion, which focuses more on the outliers than on the high leverage points. But residuals together with leverage values may cause masking and swamping, so that a good number of unusual observations remain undetected in the presence of multiple outliers and multiple high leverage points. In this paper we propose a new procedure to identify outliers and high leverage points simultaneously. We suggest an additive form of the residuals and the leverages that gives almost equal focus to outliers and leverages. We analyze several well-referenced data sets and discover a few outliers and high leverage points that were undetected by the existing diagnostic techniques.
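As a rough illustration of the two ingredients this abstract refers to, the sketch below computes leverage values and internally studentized residuals for a simple one-predictor regression. It is not the authors' additive measure, which the abstract does not specify; the function name `simple_regression_diagnostics` and the data are hypothetical, and the 2p/n leverage cutoff is only a common rule of thumb.

```python
import math

def simple_regression_diagnostics(x, y):
    """Return (leverages, internally studentized residuals) for y = a + b*x
    fitted by ordinary least squares. Hypothetical helper for illustration."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Leverage h_ii: diagonal of the hat matrix for intercept plus one slope
    leverages = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Residual variance estimate with n - 2 degrees of freedom
    s2 = sum(e ** 2 for e in residuals) / (n - 2)
    studentized = [e / math.sqrt(s2 * (1.0 - h))
                   for e, h in zip(residuals, leverages)]
    return leverages, studentized

# Made-up data: the last point lies far from the others in x
x = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
y = [1.1, 2.0, 2.9, 4.2, 5.0, 8.0]
leverages, studentized = simple_regression_diagnostics(x, y)

# Rule-of-thumb cutoff 2p/n with p = 2 fitted parameters (an assumption here)
cutoff = 2 * 2 / len(x)
high_leverage = [i for i, h in enumerate(leverages) if h > cutoff]
```

Here the point with x = 20 is flagged for high leverage. Its raw residual is small because it pulls the fitted line toward itself, which is the kind of situation in which residual-based diagnostics alone can miss an unusual observation.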

Detection of Multiple Outliers Using a Genetic Algorithm

고영현;이혜선;전치혁
- Proceedings of the Korean Statistical Society Conference / Proceedings of the 2000 Autumn Conference of the Korean Statistical Society / pp. 173-179 / 2000
A genetic algorithm (GA) is applied to detect multiple outliers. GA is a heuristic optimization tool that searches for near-optimal solutions. We compare the performance of GA with that of other diagnostic measures commonly used for detecting outliers in regression models. The results suggest that GA performs better than the others in detecting multiple outliers.

Computational Methods for Detection of Multiple Outliers in Nonlinear Regression

Myung-Wook Kahng
- Communications for Statistical Applications and Methods / Vol. 3, No. 2 / pp. 1-11 / 1996
The detection of multiple outliers in nonlinear regression models can be computationally infeasible. As a compromise, we consider the simulated annealing algorithm, an approximate approach to combinatorial optimization. We show that this method ensures convergence and works well in locating multiple outliers while reducing computational time.

A Confirmation of Identified Multiple Outliers and Leverage Points in Linear Model

유종영;안기수
- The Korean Journal of Applied Statistics / Vol. 15, No. 2 / pp. 269-279 / 2002
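Identifying multiple outliers and multiple leverage points is difficult because it is affected by the masking effect and the swamping effect. Rousseeuw and van Zomeren (1990) identified multiple outliers and multiple leverage points using LMS (Least Median of Squares) regression and the MVE (Minimum Volume Ellipsoid) statistic. However, because of the strong robustness of LMS and MVE, their method tends to flag points that are neither outliers nor leverage points. Fung (1993) proposed a confirmation method for the identified outliers and leverage points, but analysis shows that this method is affected by the adjacent effect, which causes problems in identifying outliers and leverage points. This paper points out these problems and proposes a new method for confirming the identified outliers and leverage points.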
https://doi.org/10.5351/KJAS.2002.15.2.269

MULTIPLE OUTLIER DETECTION IN LOGISTIC REGRESSION BY USING INFLUENCE MATRIX

Lee, Gwi-Hyun;Park, Sung-Hyun
- Journal of the Korean Statistical Society / Vol. 36, No. 4 / pp. 457-469 / 2007
Many procedures are available to identify a single outlier or an isolated influential point in linear regression and logistic regression. But the detection of influential points or multiple outliers is more difficult, owing to masking and swamping problems. Multiple outlier detection methods for logistic regression have not yet been studied from the standpoint of direct procedures. In this paper we consider direct methods for logistic regression by extending the Peña and Yohai (1995) influence matrix algorithm. We define the influence matrix in logistic regression using Cook's distance, and test for multiple outliers using the mean shift model. To show the accuracy of the proposed multiple outlier detection algorithm, we simulate artificial data that include multiple outliers with masking and swamping.

Detecting Multiple Outliers Using the Gaps of Order Statistics

Kim, Hyun Chul
- Communications for Statistical Applications and Methods / Vol. 2, No. 2 / pp. 184-197 / 1995
An objective, one-step procedure for detecting multiple outliers is suggested, based on the gaps of the order statistics. The procedure can serve as a routine outlier detection method in a statistical analysis computer program. It is applied to some examples, including the data selected by Kitagawa.
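The general idea of working with gaps between order statistics can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the paper's test statistic or critical values, so the helper `largest_gap` (a hypothetical name) simply locates the largest gap in made-up data and treats the observations above it as a candidate group, with no significance test.

```python
def largest_gap(values):
    """Sort the data and locate the largest gap between consecutive order
    statistics. Returns (ordered data, gaps, index of the largest gap)."""
    ordered = sorted(values)
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    k = max(range(len(gaps)), key=lambda i: gaps[i])
    return ordered, gaps, k

# Made-up sample with two suspiciously large observations
data = [9.8, 10.1, 10.3, 9.9, 10.0, 10.2, 14.7, 15.1]
ordered, gaps, k = largest_gap(data)

# Observations above the largest gap form a candidate outlier group;
# a real procedure would also need a rule for judging whether the gap
# is large enough to matter.
candidates = ordered[k + 1:]
```

Because the split is made at a gap rather than at one extreme value, a cluster of several outliers is picked up in a single step instead of one observation at a time.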

The Identification Of Multiple Outliers

Park, Jin-Pyo
- Journal of the Korean Data and Information Science Society / Vol. 11, No. 2 / pp. 201-215 / 2000
The classical method for regression analysis is the least squares method. However, if the data contain significant outliers, the least squares estimator can break down. To remedy this problem, robust methods are an important complement to the least squares method. Robust methods downweight or completely ignore the outliers. This is not always best, because the outliers can contain very important information about the population. If they can be detected, the outliers can be inspected further and appropriate action can be taken based on the results. In this paper, I propose a sequential outlier test to identify outliers. It is based on the nonrobust estimate and the robust estimate of scatter of the robust regression residuals and is applied in a forward procedure, removing the most extreme observation at each step, until the test fails to detect outliers. Unlike other forward procedures, the present one is unaffected by swamping or masking effects because the statistic is based on the robust regression residuals. I derive the asymptotic distribution of the test statistic and apply the test to several real and simulated data sets, where it is shown to perform fairly well.

Procedures for Detecting Multiple Outliers in Linear Regression Using R

Kwon, Soon-Sun;Lee, Gwi-Hyun;Park, Sung-Hyun
- Proceedings of the Korean Statistical Society Conference / Proceedings of the 2005 Autumn Conference of the Korean Statistical Society / pp. 13-17 / 2005
In recent years, many people have come to use R as a statistical system. R is frequently updated by many R project teams. We are interested in methods of multiple outlier detection and note that R does not supply such methods. In this talk, we review procedures for detecting multiple outliers and provide more efficient procedures in R that combine direct and indirect methods.

Multiple Deletions in Logistic Regression Models

Jung, Kang-Mo
- Communications for Statistical Applications and Methods / Vol. 16, No. 2 / pp. 309-315 / 2009
We extend the results of Roy and Guria (2008) to multiple deletions in logistic regression models. Because single deletions may fail to detect outliers or influential observations owing to swamping and masking effects, multiple deletions are needed. We develop conditional deletion diagnostics designed to overcome the problems of masking effects. We derive closed forms for several statistics in logistic regression models, which provide useful diagnostics.
https://doi.org/10.5351/CKSS.2009.16.2.309

DETECTION OF OUTLIERS IN WEIGHTED LEAST SQUARES REGRESSION

Shon, Bang-Yong;Kim, Guk-Boh
- Journal of applied mathematics & informatics / Vol. 4, No. 2 / pp. 501-512 / 1997
In the multiple linear regression model, we make assumptions about the error term (independence, normality, variance homogeneity, and so on). When case weights are given because of variance heterogeneity, the regression parameters can be estimated efficiently using the weighted least squares estimator. Unfortunately, this estimator is sensitive to outliers, like the ordinary least squares estimator. In this paper we therefore propose some statistics for detecting outliers in weighted least squares regression.
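The sensitivity of weighted least squares to outliers can be seen in a minimal sketch for a single predictor. The function `weighted_line_fit`, the weights, and the data are hypothetical and chosen only for illustration. The paper's proposed detection statistics are not given in the abstract and are not reproduced here.

```python
def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b*x with case weights w.
    Returns (intercept, slope). Hypothetical helper for illustration."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

x = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [1.0, 1.0, 2.0, 2.0, 4.0]           # assumed case weights
y_clean = [2.0, 4.0, 6.0, 8.0, 10.0]    # exactly y = 2x
y_outlier = [2.0, 4.0, 6.0, 8.0, 30.0]  # last response replaced by a gross outlier

clean_fit = weighted_line_fit(x, y_clean, w)
outlier_fit = weighted_line_fit(x, y_outlier, w)
```

With the clean responses the fit recovers the line y = 2x, while a single contaminated response, here at the most heavily weighted case, pulls the estimated slope far from 2.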

79 search results (processing time: 0.016 s)
