• Title/Abstract/Keywords: outlying cell

6 search results (processing time: 0.025 s)

Outlying Cell Identification Method Using Interaction Estimates of Log-linear Models

  • Hong, Chong Sun;Jung, Min Jung
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 10, No. 2
    • /
    • pp.291-303
    • /
    • 2003
  • This work proposes an alternative method for identifying outlying cells, an important issue in categorical data analysis. There is a strong relationship between the location of an outlying cell and the corresponding parameter estimates of a well-fitted log-linear model. Among the parameters of the log-linear model, an outlying cell is affected by the interaction terms rather than the main-effect terms. Hence an outlying cell can be identified by investigating the parameter estimates of an appropriate log-linear model.
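The idea in this abstract can be illustrated with a minimal sketch. It assumes a two-way table with strictly positive counts and uses the saturated log-linear model under sum-to-zero constraints, which may differ from the "well-fitted" model the authors select. The function names and the example table are hypothetical, not taken from the paper.

```python
import math

def interaction_estimates(table):
    """Interaction estimates of the saturated log-linear model for a
    two-way table, under sum-to-zero constraints (assumption of this sketch).
    All counts must be strictly positive."""
    logs = [[math.log(n) for n in row] for row in table]
    n_rows, n_cols = len(logs), len(logs[0])
    grand = sum(sum(row) for row in logs) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in logs]
    col_means = [sum(logs[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    # lambda_ij = log n_ij - (row mean) - (column mean) + (grand mean)
    return [[logs[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n_cols)] for i in range(n_rows)]

def largest_interaction_cell(table):
    """Return the (row, column) index of the largest absolute interaction."""
    lam = interaction_estimates(table)
    cells = [(i, j) for i in range(len(lam)) for j in range(len(lam[0]))]
    return max(cells, key=lambda ij: abs(lam[ij[0]][ij[1]]))

# Hypothetical table: rows are proportional except for the inflated cell (2, 2).
table = [[30, 20, 10],
         [60, 40, 20],
         [15, 10, 50]]
print(largest_interaction_cell(table))
```

In this sketch the main-effect terms absorb the row and column totals, so only the interaction terms carry the deviation of a single inflated cell, which is the relationship the abstract describes.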

An Identification of Outlying Cells in Contingency Table via Correspondence Analysis Map

  • Hong, Chong Sun;Lee, Jong Cheol
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 8, No. 1
    • /
    • pp.39-49
    • /
    • 2001
  • When an appropriate model is fitted to categorical data, detecting outlying cells plays a very important role in reducing the lack of fit. Many statistical methods exist for identifying outlying cells in a contingency table. In this paper, correspondence analysis is applied to identify one or two outlying cells. By exploring the correspondence between the row and column categories, we find that outlying cells can be identified from the correspondence analysis map.

Identification of Multiple Outlying Cells in Multi-way Tables

  • Lee, Jong Cheol;Hong, Chong Sun
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 7, No. 3
    • /
    • pp.687-698
    • /
    • 2000
  • An identification method is proposed for detecting more than one outlying cell in multi-way contingency tables. The iterative proportional fitting method is applied to obtain expected values for several suspected outlying cells. Since the proposed method uses minimal sufficient statistics under quasi log-linear models, expected counts of outlying cells can be estimated under any hierarchical log-linear model. This method is an extension of the backwards-stepping method of Simonoff (1988) and requires fewer iterations to identify outlying cells.
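A rough sketch of the fitting step described here, simplified to a two-way table and the quasi-independence model (the paper itself covers multi-way tables and general hierarchical models). Suspected cells are excluded from the margins, the remaining cells are fitted by iterative proportional fitting, and the fitted row and column factors then give an expected count for each suspected cell. The function name, arguments, and example table are hypothetical, and the sketch assumes every row and column keeps at least one non-suspected cell.

```python
def quasi_independence_fit(table, suspects, n_iter=200, tol=1e-10):
    """Fit m_ij = a_i * b_j to all cells except those in `suspects` by
    iterative proportional fitting, then return a_i * b_j for every cell,
    including the suspected ones."""
    n_rows, n_cols = len(table), len(table[0])
    keep = [[(i, j) not in suspects for j in range(n_cols)]
            for i in range(n_rows)]
    a = [1.0] * n_rows  # row factors
    b = [1.0] * n_cols  # column factors
    for _ in range(n_iter):
        max_change = 0.0
        # Match the observed row totals over the kept cells.
        for i in range(n_rows):
            observed = sum(table[i][j] for j in range(n_cols) if keep[i][j])
            new = observed / sum(b[j] for j in range(n_cols) if keep[i][j])
            max_change = max(max_change, abs(new - a[i]))
            a[i] = new
        # Match the observed column totals over the kept cells.
        for j in range(n_cols):
            observed = sum(table[i][j] for i in range(n_rows) if keep[i][j])
            new = observed / sum(a[i] for i in range(n_rows) if keep[i][j])
            max_change = max(max_change, abs(new - b[j]))
            b[j] = new
        if max_change < tol:
            break
    return [[a[i] * b[j] for j in range(n_cols)] for i in range(n_rows)]

# Hypothetical table with one suspected outlying cell at (2, 2).
table = [[30, 20, 10],
         [60, 40, 20],
         [15, 10, 50]]
fitted = quasi_independence_fit(table, suspects={(2, 2)})
print(round(fitted[2][2], 3), "expected vs", table[2][2], "observed")
```

Comparing the observed count of a suspected cell with its expected count under the model fitted to the other cells gives a residual that is not contaminated by the suspected cell itself.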

Maximum Trimmed Likelihood Estimator for Categorical Data Analysis

  • 최현집
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 16, No. 2
    • /
    • pp.229-238
    • /
    • 2009
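  • Models considered for categorical data analysis are generally fitted by maximum likelihood estimation, so they can easily be affected by outliers. This study proposes an estimation method for obtaining maximum trimmed likelihood estimates that are not affected by outlying cells in contingency table data. Because the proposed method searches for the maximum of the trimmed likelihood by removing cells from the contingency table based on the likelihood, it requires far less computation than complete enumeration. It can therefore be applied easily to the analysis of general multi-dimensional contingency tables. The proposed estimation method is illustrated with a real data example, and its problems and characteristics are discussed through a simulation study.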

A Study on Diagnostics Method for Categorical Data

  • 이선규;조범석
    • 산업경영시스템학회지
    • /
    • Vol. 18, No. 33
    • /
    • pp.93-102
    • /
    • 1995
  • This study is concerned with diagnostic methods for cross-classified categorical data using a logistic regression model, a binary response model for cell proportions. Under this model, goodness of fit can be examined using Pearson's $\chi^2$ test statistic and the likelihood ratio statistic. These statistics assume that the sample was drawn by sampling with replacement, so they are often inappropriate for analysing contingency tables built from sample survey data collected under complex sampling schemes. This study examines diagnostic procedures for detecting outlying cell proportions and influential observations in the design space of a logistic regression model that takes the survey design effects into account.
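For reference, the two goodness-of-fit statistics named in this abstract can be computed as in the sketch below. It uses the standard textbook forms and assumes simple random sampling, so it does not include the survey design-effect adjustments the paper is concerned with. The function names and the example counts are hypothetical.

```python
import math

def pearson_chi2(observed, expected):
    """Pearson statistic: sum over cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def likelihood_ratio_g2(observed, expected):
    """Likelihood ratio statistic: 2 * sum over cells of O * log(O / E).
    Cells with a zero observed count contribute nothing to the sum."""
    return 2.0 * sum(o * math.log(o / e)
                     for o, e in zip(observed, expected) if o > 0)

# Hypothetical observed and model-expected cell counts.
observed = [10, 20, 30]
expected = [20.0, 20.0, 20.0]
print(pearson_chi2(observed, expected), likelihood_ratio_g2(observed, expected))
```

Both statistics compare observed counts with the counts expected under the fitted model. Under complex sampling schemes their usual reference distribution no longer applies, which is the problem the abstract raises.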

Improving data reliability on oligonucleotide microarray

  • Yoon, Yeo-In;Lee, Young-Hak;Park, Jin-Hyun
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, 2004: The 3rd Annual Conference for The Korean Society for Bioinformatics Association of Asian Societies for Bioinformatics 2004 Symposium
    • /
    • pp.107-116
    • /
    • 2004
  • The advent of microarray technologies provides an opportunity to monitor the expression of tens of thousands of genes simultaneously. Such microarray data can be degraded by experimental errors and image artifacts, which generate non-negligible outliers estimated at 15% of typical microarray data. Thus, it is important to detect and correct these faulty probes prior to high-level data analysis such as classification or clustering. In this paper, we propose a systematic procedure for detecting and properly correcting faulty probes in Genechip arrays, based on multivariate statistical approaches. Principal component analysis (PCA), one of the most widely used multivariate statistical approaches, is applied to construct a statistical correlation model with 20 pairs of probes for each gene. The faulty probes are identified by inspecting the squared prediction error (SPE) of each probe from the PCA model. The outlying probes are then reconstructed by an iterative optimization approach that minimizes the SPE. We used public data from the gene chip project on human fibroblast cells. In the application study, the proposed approach showed good performance in correcting probes without removing the faulty ones, which may be desirable for making maximum use of the information in the data.
