• Title/Abstract/Keywords: Variable Statistics

1,333 search results (processing time: 0.026 seconds)

Variable Selection Theorems in General Linear Model

  • Yoon, Sang-Hoo;Park, Jeong-Soo
    • The Korean Statistical Society: Conference Proceedings / Proceedings of the Korean Statistical Society 2005 Autumn Conference / pp.187-192 / 2005
  • For the problem of variable selection in linear models, we consider the case where the errors are correlated with covariance matrix V. Hocking's theorems on the effects of overfitting and underfitting in the linear model are extended to the less-than-full-rank and correlated-error model, and to the ANCOVA model.

Bayesian Parameter Estimation and Variable Selection in Random Effects Generalised Linear Models for Count Data

  • Oh, Man-Suk;Park, Tae-Sung
    • Journal of the Korean Statistical Society / Vol. 31, No. 1 / pp.93-107 / 2002
  • Random effects generalised linear models are useful for analysing clustered count data in which responses are usually correlated. We propose a Bayesian approach to parameter estimation and variable selection in random effects generalised linear models for count data. A simple Gibbs sampling algorithm for parameter estimation is presented and a simple and efficient variable selection is done by using the Gibbs outputs. An illustrative example is provided.

대용변수를 이용한 가변형 부분군 크기 ${\bar{X}}$ 관리도의 경제적 설계 (Economic Design of Variable Sample Size ${\bar{X}}$ Control Chart Using a Surrogate Variable)

  • 이태훈;이민구;권혁무;홍성훈;이주호
    • Journal of Korean Society for Quality Management / Vol. 45, No. 4 / pp.943-956 / 2017
  • Purpose: This paper proposes a VSS (variable sample size) ${\bar{X}}$ control chart using a surrogate variable and shows its effectiveness compared with FSS (fixed sample size) ${\bar{X}}$ control charts using either a performance variable or a surrogate variable. Methods: The expected cost function of the VSS ${\bar{X}}$ control chart is derived. The optimal designs are then found for numerical examples using a GA (genetic algorithm) and compared with those of the FSS ${\bar{X}}$ control charts. Results: Computational results show that the VSS ${\bar{X}}$ control chart using surrogate variables is superior to the FSS ${\bar{X}}$ control charts using either a performance variable or a surrogate variable from an economic viewpoint. Conclusion: The proposed VSS ${\bar{X}}$ control chart will be useful in industries where a performance variable is not available or is too costly to measure.

가변모수를 갖는 EWMA 관리도 (EWMA Control Charts with Variable Parameter)

  • 이재헌;한정희
    • Journal of Korean Society for Quality Management / Vol. 33, No. 4 / pp.117-122 / 2005
  • A variable sampling rate (VSR) scheme varies the sampling rate for the current sample depending on the previous value of the control statistic. In this paper, we propose EWMA control charts with a variable parameter (VP) scheme, which allows both the sampling rate (the sample size or the sampling interval) and the weight to vary. We investigate the effectiveness of the VP scheme relative to the fixed parameter (FP) scheme and the VSR scheme in EWMA control charts. The VP scheme is shown to improve the detection of small and moderate shifts in the mean of a normal process.
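
For background on the abstract above, the following is a minimal sketch of the textbook fixed-parameter (FP) EWMA chart, which is the baseline the paper compares against. It is not the authors' VP scheme. The function name `ewma_chart` and the defaults `lam=0.2` and `L=3.0` are illustrative assumptions, not values taken from the paper.

```python
import math


def ewma_chart(xs, mu0, sigma, lam=0.2, L=3.0):
    """Fixed-parameter EWMA chart (textbook form, not the paper's VP scheme).

    xs    : sequence of observations (or subgroup means)
    mu0   : in-control process mean
    sigma : standard deviation of each plotted observation
    lam   : smoothing weight in (0, 1]  (assumed default)
    L     : control-limit width multiplier (assumed default)

    Returns a list of (z, lcl, ucl, signal) tuples, one per observation.
    """
    z = mu0  # the EWMA statistic starts at the in-control mean
    results = []
    for t, x in enumerate(xs, start=1):
        # EWMA recursion: z_t = lam * x_t + (1 - lam) * z_{t-1}
        z = lam * x + (1 - lam) * z
        # Exact (time-varying) half-width of the control limits
        half = L * sigma * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        )
        results.append((z, mu0 - half, mu0 + half, abs(z - mu0) > half))
    return results
```

A VP scheme as described in the abstract would additionally change `lam` and the sampling rate from one sample to the next based on the previous value of `z`. That logic is deliberately left out because the abstract does not specify it.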

A Variable Selection Procedure for K-Means Clustering

  • Kim, Sung-Soo
    • The Korean Journal of Applied Statistics / Vol. 25, No. 3 / pp.471-483 / 2012
  • One of the most important problems in cluster analysis is the selection of variables that truly define the cluster structure, while eliminating noisy variables that mask such structure. Brusco and Cradit (2001) present the VS-KM (variable-selection heuristic for K-means clustering) procedure for selecting true variables for K-means clustering based on the adjusted Rand index. This procedure starts with a fixed number of clusters in K-means and adds variables sequentially based on the adjusted Rand index. This paper presents an updated procedure combining VS-KM with the automated K-means procedure provided by Kim (2009). This automated variable selection procedure for K-means clustering calculates the number of clusters and the initial cluster centers whenever a new variable is added, and adds a variable based on the adjusted Rand index. Simulation results indicate that the proposed procedure is very effective at selecting true variables and at eliminating noisy variables. The program, implemented in R, can be obtained from the website "http://faculty.knou.ac.kr/sskim/nvarkm.r and vnvarkm.r".
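
The adjusted Rand index that drives the selection step in the abstract above can be sketched as follows. This is the standard pair-counting definition written in Python for illustration. It is not the authors' R program, and the function name `adjusted_rand_index` is an assumption.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: partitions agree trivially

    # Contingency counts and their row/column marginals
    cells = Counter(zip(labels_a, labels_b))
    rows = Counter(labels_a)
    cols = Counter(labels_b)

    # Number of item pairs placed together in both / in each partition
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are one cluster
    return (sum_cells - expected) / (maximum - expected)
```

In a forward selection loop of the kind the abstract describes, each candidate variable would be scored by comparing clusterings with this index and the best-scoring variable added. The exact comparison rule used by VS-KM is not given in the abstract and is not reproduced here.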

Design of the Variable Sampling Rates X-chart with Average Time to Signal Adjusted by the Sampling Cost

  • Park, Chang-Soon;Song, Moon-Sup
    • Journal of the Korean Statistical Society / Vol. 26, No. 2 / pp.181-198 / 1997
  • A variable sampling rates scheme is proposed in which the sample size and the sampling interval are taken at random during the process. The performance of the scheme is measured in terms of the average time to signal adjusted by the sampling cost when the process is out of control. This measure evaluates the effectiveness of the scheme in terms of the cost incurred due to nonconformance as well as sampling. The variable sampling rates scheme is shown to be effective, especially for small and moderate shifts of the mean, when compared with the standard scheme.

선형회귀에서 변수선택, 변수변환과 이상치 탐지의 동시적 수행을 위한 절차 (A procedure for simultaneous variable selection, variable transformation and outlier identification in linear regression)

  • 서한손;윤민
    • The Korean Journal of Applied Statistics / Vol. 33, No. 1 / pp.1-10 / 2020
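  • In this study, we deal with a variable selection algorithm for linear regression models that takes outliers and variable transformation into account. The proposed method detects and removes potential outliers, applies least trimmed squares estimation to estimate the variable transformation, and finally selects variables by comparing all possible regression models. Real data analysis and simulation results are presented to show the efficiency of the method in terms of accurate variable selection and the goodness of fit of the estimated model.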

대용변수를 이용한 가변형 부분군 채취 간격 X 관리도의 경제적 설계 (Economic Design of Variable Sampling Interval X Control Chart Using a Surrogate Variable)

  • 이태훈;이주호;이민구
    • Journal of the Korean Institute of Industrial Engineers / Vol. 39, No. 5 / pp.422-428 / 2013
  • In many cases, an $\bar{X}$ control chart based on the performance variable is used in industrial fields. However, if the performance variable is too costly or impossible to measure and a less expensive surrogate variable is available, the process may be controlled more efficiently using the surrogate variable. In this paper, we propose a model for the economic design of a VSI (Variable Sampling Interval) $\bar{X}$ control chart using a surrogate variable that is linearly correlated with the performance variable. The total average profit model is constructed, which involves the profit per cycle time, the cost of sampling and testing, the cost of detecting and eliminating an assignable cause, and the cost associated with production during the out-of-control state. The VSI $\bar{X}$ control charts using surrogate variables are expected to be superior to the Shewhart FSI (Fixed Sampling Interval) $\bar{X}$ control charts using surrogate variables with respect to the expected profit per unit cycle time from an economic viewpoint.

수정 결정계수를 사용한 로지스틱 회귀모형에서의 변수선택법 (Variable Selection for Logistic Regression Model Using Adjusted Coefficients of Determination)

  • 홍종선;함주형;김호일
    • The Korean Journal of Applied Statistics / Vol. 18, No. 2 / pp.435-443 / 2005
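  • In logistic regression models, the coefficient of determination is defined in more varied ways than in linear regression models, and its values are very small, so it cannot be regarded as a statistic for evaluating logistic regression models. Liao and McGee (2003) proposed two kinds of adjusted coefficients of determination that are insensitive to the addition of inappropriate explanatory variables or to changes in sample size. In this study, for logistic regression models applied to real data, four kinds of coefficients of determination, including the adjusted ones, are used as variable selection criteria and compared with existing variable selection methods such as forward selection, backward elimination, stepwise selection, and the AIC statistic, and their adequacy and efficiency are discussed.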

Learning fair prediction models with an imputed sensitive variable: Empirical studies

  • Kim, Yongdai;Jeong, Hwichang
    • Communications for Statistical Applications and Methods / Vol. 29, No. 2 / pp.251-261 / 2022
  • As AI has a wide range of influence on human social life, issues of transparency and ethics of AI are emerging. In particular, it is widely known that, because historical data contain bias that runs against ethics or regulatory frameworks for fairness, AI models trained on such biased data could also impose bias or unfairness against a certain sensitive group (e.g., non-white, women). Demographic disparities due to AI, which refer to socially unacceptable bias in which an AI model favors certain groups (e.g., white, men) over other groups (e.g., black, women), have been observed frequently in many applications of AI, and many studies have been done recently to develop AI algorithms that remove or alleviate such demographic disparities in trained AI models. In this paper, we consider the problem of using the information in the sensitive variable for fair prediction when using the sensitive variable as one of the input variables is prohibited by laws or regulations to avoid unfairness. As a way of reflecting the information in the sensitive variable in prediction, we consider a two-stage procedure. First, the sensitive variable is fully included in the learning phase to obtain a prediction model that depends on the sensitive variable, and then an imputed sensitive variable is used in the prediction phase. The aim of this paper is to evaluate this procedure by analyzing several benchmark datasets. We illustrate that using an imputed sensitive variable helps to improve prediction accuracy without hampering the degree of fairness much.