• Title/Abstract/Keywords: Variable Bias

238 search results (processing time: 0.03 s)

A Study on the Bias Reduction in Split Variable Selection in CART

  • Song, Hyo-Im;Song, Eun-Tae;Song, Moon Sup
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 11, No. 3
    • /
    • pp.553-562
    • /
    • 2004
  • In this short communication we discuss the bias problems of CART in split variable selection and suggest a method to reduce the variable selection bias. Penalties proportional to the number of categories or distinct values are applied to the splitting criteria of CART. The results of empirical comparisons show that the proposed modification of CART reduces the bias in variable selection.
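
The penalty idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the penalty, so the subtractive form `gain - lam * (number of distinct values)`, the parameter `lam`, and all function names below are assumptions made for illustration only. The sketch also handles ordered predictors only, whereas CART treats categorical predictors by searching over subsets of categories.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_gini_gain(x, y):
    """Largest impurity decrease over all binary splits x <= t."""
    n = len(y)
    parent = gini(y)
    best = 0.0
    values = sorted(set(x))
    for t in values[:-1]:  # the largest value cannot be a split point
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        gain = parent - len(left) / n * gini(left) - len(right) / n * gini(right)
        best = max(best, gain)
    return best


def penalized_gain(x, y, lam=0.01):
    """Hypothetical penalty: subtract lam times the number of distinct values."""
    return best_gini_gain(x, y) - lam * len(set(x))


def select_split_variable(X, y, lam=0.01):
    """Pick the predictor (a key of dict X) with the largest penalized gain."""
    return max(X, key=lambda name: penalized_gain(X[name], y, lam))


y = [0, 0, 1, 1]
X = {"many": [1, 2, 3, 4], "few": [0, 0, 1, 1]}
print(select_split_variable(X, y))
```

Both predictors separate the classes equally well here, so the unpenalized gains tie; with the assumed penalty, the predictor with fewer distinct values is preferred.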

Learning fair prediction models with an imputed sensitive variable: Empirical studies

  • Kim, Yongdai;Jeong, Hwichang
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 29, No. 2
    • /
    • pp.251-261
    • /
    • 2022
  • As AI has a wide range of influence on human social life, issues of transparency and ethics of AI are emerging. In particular, it is widely known that, because historical data can contain bias that conflicts with ethics or with regulatory frameworks for fairness, AI models trained on such biased data can also impose bias or unfairness against a certain sensitive group (e.g., non-white, women). Demographic disparities due to AI, which refer to socially unacceptable bias in which an AI model favors certain groups (e.g., white, men) over other groups (e.g., black, women), have been observed frequently in many applications of AI, and many studies have been done recently to develop AI algorithms which remove or alleviate such demographic disparities in trained AI models. In this paper, we consider the problem of using the information in the sensitive variable for fair prediction when using the sensitive variable as one of the input variables is prohibited by laws or regulations to avoid unfairness. As a way of reflecting the information in the sensitive variable in prediction, we consider a two-stage procedure. First, the sensitive variable is fully included in the learning phase to obtain a prediction model depending on the sensitive variable, and then an imputed sensitive variable is used in the prediction phase. The aim of this paper is to evaluate this procedure by analyzing several benchmark datasets. We illustrate that using an imputed sensitive variable helps improve prediction accuracy without much loss in the degree of fairness.
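
The two-stage procedure described in this abstract can be sketched in a toy form. Everything concrete below is an assumption for illustration, not the authors' method: a single numeric feature `x`, one least-squares line per sensitive group as the stage-one model, and a nearest-centroid rule as the imputer. The abstract does not specify the models, and fairness constraints are omitted entirely.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b * x."""
    xbar, ybar = mean(xs), mean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return ybar - slope * xbar, slope


def fit_stage_one(xs, ss, ys):
    """Learning phase: the model depends on the observed sensitive variable s."""
    model = {}
    for s in set(ss):
        gx = [x for x, si in zip(xs, ss) if si == s]
        gy = [y for y, si in zip(ys, ss) if si == s]
        model[s] = fit_line(gx, gy)
    return model


def fit_imputer(xs, ss):
    """Toy imputer (assumption): one centroid of x per sensitive group."""
    return {s: mean([x for x, si in zip(xs, ss) if si == s]) for s in set(ss)}


def impute(centroids, x):
    """Guess s for a new x as the group with the nearest centroid."""
    return min(centroids, key=lambda s: abs(x - centroids[s]))


def predict(model, centroids, x):
    """Prediction phase: s is not observed, so the imputed value is used."""
    intercept, slope = model[impute(centroids, x)]
    return intercept + slope * x


xs = [0, 1, 2, 10, 11, 12]
ss = [0, 0, 0, 1, 1, 1]
ys = [0, 1, 2, 20, 22, 24]
model = fit_stage_one(xs, ss, ys)
centroids = fit_imputer(xs, ss)
print(predict(model, centroids, 1.5), predict(model, centroids, 10.5))
```

The point of the sketch is the separation of phases: `fit_stage_one` sees the true sensitive variable, while `predict` only ever receives `x`.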

Bias Reduction in Split Variable Selection in C4.5

  • Shin, Sung-Chul;Jeong, Yeon-Joo;Song, Moon Sup
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 10, No. 3
    • /
    • pp.627-635
    • /
    • 2003
  • In this short communication we discuss the bias problem of C4.5 in split variable selection and suggest a method to reduce the variable selection bias among categorical predictor variables. A penalty proportional to the number of categories is applied to the splitting criterion gain of C4.5. The results of empirical comparisons show that the proposed modification of C4.5 reduces the size of classification trees.

Variable Bias Techniques for High Efficiency Power Amplifier Design

  • 이영민;김경민;구경헌
    • 한국항행학회논문지
    • /
    • Vol. 13, No. 3
    • /
    • pp.358-364
    • /
    • 2009
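  • In this paper, we show that applying a variable bias technique to a designed power amplifier can increase its power-added efficiency. To obtain high efficiency in a dual mode with two different output power levels, the variable bias technique was used and the effect of bias variation was simulated. With the gate voltage fixed, the optimal drain bias was found by simulation, and varying the drain bias accordingly improved the efficiency of the power amplifier. In addition, the nonlinear characteristics of the power amplifier were analyzed, and a digital predistortion technique was used to improve the ACPR of the dual-band amplifier transmitter by up to 10 dB.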

Regression Trees with Unbiased Variable Selection

  • 김진흠;김민호
    • 응용통계연구
    • /
    • Vol. 17, No. 3
    • /
    • pp.459-473
    • /
    • 2004
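  • In this paper, we propose an algorithm that overcomes the variable selection bias of the exhaustive search method of Breiman et al. (1984). The proposed algorithm consists of two steps: selecting the split variable of a node, and then finding the split point for a binary split on the selected variable only. For a continuous predictor, a test based on Spearman's rank correlation coefficient is performed; for a categorical predictor, a test based on the Kruskal-Wallis statistic is performed. The most statistically significant variable is chosen as the split variable, and the exhaustive search of Breiman et al. (1984) is applied to that variable alone to determine the split rule of the node. Through a simulation study, we compared the CART of Breiman et al. (1984) and the proposed algorithm in terms of variable selection bias, variable selection power, and mean squared error. We also applied both algorithms to real data and compared their efficiency.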
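
The two-step structure in this abstract (select the variable first, then search for a split point on that variable only) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: it covers continuous predictors only, it ranks predictors by the absolute Spearman coefficient rather than by a test p-value (for a fixed sample size the two orderings agree), and it omits the Kruskal-Wallis step for categorical predictors. All function names are hypothetical.

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    ma, mb = mean(a), mean(b)
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    if saa == 0 or sbb == 0:
        return 0.0
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (saa * sbb) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def select_variable(X, y):
    """Step 1: choose the predictor most strongly rank-correlated with y."""
    return max(X, key=lambda name: abs(spearman(X[name], y)))


def sse(v):
    """Sum of squared deviations from the mean."""
    m = mean(v)
    return sum((vi - m) ** 2 for vi in v)


def best_split_point(x, y):
    """Step 2: exhaustive search over midpoints, minimizing total SSE."""
    best_t, best_cost = None, float("inf")
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t


X = {"signal": [1, 2, 3, 4, 5, 6], "noise": [3, 1, 2, 3, 1, 2]}
y = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]
chosen = select_variable(X, y)
print(chosen, best_split_point(X[chosen], y))
```

Because the exhaustive search runs only on the selected variable, a predictor with many candidate split points gains no advantage at the selection step.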

The Relationship between Optimistic Bias about Health Crisis and Health Behavior

  • 박수호;이설희;함은미
    • 대한간호학회지
    • /
    • Vol. 38, No. 3
    • /
    • pp.403-409
    • /
    • 2008
  • Purpose: This study was performed to identify the relationship between optimistic bias about health crisis and health behavior of Korean adults in a crisis of health, and to prepare baseline data for developing a health education and promotion program. Methods: Study subjects were 595 adults aged 19 to 64 living in Korea. Data were collected through questionnaires administered by one interviewer. Descriptive statistics and Pearson's correlation coefficient were calculated using the SPSS program. Results: The average score for optimistic bias about health crisis was 2.69, and that for health behavior was 107.05. The optimistic bias about health crisis showed a significantly positive correlation with health behavior (r=.187, p=.000). Conclusion: To make our results more useful, it is necessary to identify the causal relationship between health attitudes as an explanatory variable and optimistic bias as an outcome variable. In addition, the relatively low score in optimistic bias in this research compared to other studies must be explained through further studies considering the unique Korean cultural background. Moreover, research on the relationship between optimistic bias about health crisis and health behavior among people who do not have good health behaviors is needed.

Impact of Diverse Configuration in Multivariate Bias Correction Methods on Large-Scale Climate Variable Simulations under Climate Change

  • de Padua, Victor Mikael N.;Ahn Kuk-Hyun
    • 한국수자원학회:학술대회논문집
    • /
    • Korea Water Resources Association 2023 Conference
    • /
    • pp.161-161
    • /
    • 2023
  • Bias correction of values is a necessary step in downscaling coarse and systematically biased global climate models for use in local climate change impact studies. In addition to univariate bias correction methods, many multivariate methods which correct multiple variables jointly, each with its own mathematical design, have been developed recently. While some studies have focused on the inter-comparison of these multivariate bias correction methods, none has focused extensively on the effect of diverse configurations (i.e., different combinations of input variables to be corrected) of climate variables, particularly high-dimensional ones, on the ability of the different methods to remove biases in uni- and multivariate statistics. This study evaluates the impact of three configurations (inter-variable, inter-spatial, and full dimensional dependence configurations) on four state-of-the-art multivariate bias correction methods in a national-scale domain over South Korea using a gridded approach. An inter-comparison framework was created to evaluate the performance of the different combinations of configurations and bias correction methods in adjusting various climate variable statistics. Precipitation and maximum and minimum temperatures were corrected across 306 high-resolution (0.2°) grid cells and were evaluated. Results show improvements in most methods in correcting various statistics when implementing high-dimensional configurations. However, some instabilities were observed, likely tied to the mathematical designs of the methods, indicating that some multivariate bias correction methods are incompatible with high-dimensional configurations. This highlights the potential for further improvements in the field, as well as the importance of selecting the correction method according to the needs of the user.

A Study on Unbiased Methods in Constructing Classification Trees

  • Lee, Yoon-Mo;Song, Moon Sup
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 9, No. 3
    • /
    • pp.809-824
    • /
    • 2002
  • We propose two methods which separate the variable selection step and the split-point selection step. We call these two algorithms the CHITES method and the F&CHITES method. They adopt some of the best characteristics of CART, CHAID, and QUEST. In the first step, the variable that is most significant for predicting the target class values is selected. In the second step, the exhaustive search method is applied to find the splitting point on the variable selected in the first step. We compared the proposed methods, CART, and QUEST in terms of variable selection bias and power, error rates, and training times. The proposed methods are not only unbiased in the null case, but also powerful for selecting correct variables in non-null cases.

A Study on Selection of Split Variable in Constructing Classification Tree

  • 정성석;김순영;임한필
    • 응용통계연구
    • /
    • Vol. 17, No. 2
    • /
    • pp.347-357
    • /
    • 2004
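  • Selecting the split variable is a crucial task in building a decision tree. C4.5 suffers from a serious variable selection bias toward continuous variables, and QUEST loses variable selection power when the normality assumption is violated for continuous variables. In this paper, we propose an algorithm based on statistically robust tests and compare the efficiency of C4.5, QUEST, and the proposed algorithm through simulation experiments. The results show that the proposed algorithm is robust in terms of both variable selection bias and variable selection power.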

A GHz-Level RSFQ Clock Distribution Technique with Bias Current Control in JTLs

  • Cho W.;Lim J.H.;Moon G.
    • 한국초전도ㆍ저온공학회논문지
    • /
    • Vol. 8, No. 2
    • /
    • pp.17-19
    • /
    • 2006
  • A novel clock distribution technique for pipelined RSFQ logic is proposed that uses the variable bias currents of JTLs as a delay medium. RSFQ logic consists of several logic gates or blocks connected in a pipeline structure, and each block has a different delay. In this structure, the clock distribution method generates a set of clock signals for each logic block with suitably matched delays. These delays, on the order of a few to tens of ps, can be adjusted by controlling the bias current of the JTL used as the delay medium. While delays set by resistor value and JJ size are fixed at the fabrication stage, delay set through bias current can be controlled externally, and is therefore investigated in detail for its range as well as for correct operation within the current margin. Possible ways of building a standard delay library with a modular structure are sought to further modularize pipelined RSFQ applications. Simulations and verifications are done through WRSpice with Hypres 3-um process parameters.