Proceedings of the Korean Operations and Management Science Society Conference
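- Proceedings of the 2000 Spring Joint Conference of the Korean Institute of Industrial Engineers and the Korean Operations and Management Science Society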
- Pages 601-604
- 2000
Comparison of Data Fusion and Clustering Classification Performance Using Taguchi Design
Comparison Study for Data Fusion and Clustering Classification Performances
Abstract
In this paper, we compare the classification performance of data fusion and clustering algorithms (Data Bagging, Variable Selection Bagging, Parameter Combining, and Clustering) with that of logistic regression under various characteristics of the input data. Four factors are used to simulate the logistic model: (1) correlation among input variables, (2) variance of observations, (3) training data size, and (4) the input-output function. Since the relationship between input and output is typically not known in practice, we use a Taguchi design that treats the input-output function as a noise factor, which makes the results more practical. The experimental results indicate the following. Clustering-based logistic regression provides the highest classification accuracy when the input variables are weakly correlated and the variance of the data is high. When the input variables are highly correlated, variable bagging performs better than logistic regression. When there is strong correlation among input variables and high variance between observations, bagging appears marginally better than logistic regression, but the difference is not significant.
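As a rough illustration of the simulation setup described in the abstract, the following Python sketch draws data for a single design cell defined by the four factors and summarizes accuracy across noise-factor levels. The abstract does not give the number of input variables, the factor levels, the input-output functions, or the summary statistic used, so everything here is an assumption: two inputs, the hypothetical names `simulate_cell`, `sn_larger_is_better`, and `links`, the two example link functions, and the use of Taguchi's larger-the-better signal-to-noise ratio.

```python
import math
import random


def simulate_cell(n, rho, sigma, link, seed=0):
    """Draw n observations (x1, x2, y) for one design cell (assumed setup).

    n     : training data size
    rho   : correlation between the two input variables
    sigma : standard deviation of noise added to the score (observation variance)
    link  : function (x1, x2) -> score, standing in for the input-output function
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        # x2 has unit variance and correlation rho with x1
        x2 = rho * x1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        score = link(x1, x2) + rng.gauss(0.0, sigma)
        p = 1.0 / (1.0 + math.exp(-score))  # logistic response probability
        y = 1 if rng.random() < p else 0
        rows.append((x1, x2, y))
    return rows


def sn_larger_is_better(accuracies):
    """Taguchi larger-the-better signal-to-noise ratio, in decibels."""
    mean_inv_sq = sum(1.0 / a ** 2 for a in accuracies) / len(accuracies)
    return -10.0 * math.log10(mean_inv_sq)


# Hypothetical input-output functions, treated as levels of the noise factor.
links = {
    "linear": lambda x1, x2: x1 + x2,
    "interaction": lambda x1, x2: x1 * x2,
}

# One cell: strong input correlation, moderate noise, 200 training observations.
cell = simulate_cell(n=200, rho=0.8, sigma=1.0, link=links["linear"])
```

Under these assumptions, each classifier would be trained once per entry in `links` for a given combination of `rho`, `sigma`, and `n`, and the resulting accuracies passed to `sn_larger_is_better`. A method that scores well on this ratio is accurate regardless of which input-output function generated the data, which is the sense in which the unknown relationship acts as a noise factor.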
Keywords