• Title/Abstract/Keywords: Multivariate Data


Principles of Multivariate Data Visualization

  • Huh, Moon Yul;Cha, Woon Ock
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 11, No. 3
    • /
    • pp.465-474
    • /
    • 2004
  • Data visualization applies automation and discovery processes to data sets in an effort to uncover underlying information in the data. It provides rich visual depictions of the data. It has distinct advantages over traditional data analysis techniques, such as exploring the structure of data sets that are large in both the number of observations and the number of variables, by allowing extensive interaction between the data and the end user. We discuss the principles of data visualization and evaluate the characteristics of various visualization tools according to these principles.

다변수 분석법에 의한 조선시대 동전의 분류연구 (Multivariate Classification of Choson Coins)

  • 이창근;강형태;고성희
    • 보존과학연구
    • /
    • Serial No. 8
    • /
    • pp.1-12
    • /
    • 1987
  • Fifty ancient Korean coins from the Choson dynasty were analyzed for 9 elements (Sn, Fe, As, Ag, Co, Sb, Ir, Ru and Ni) by instrumental neutron activation analysis and for 3 elements (Cu, Pb, and Zn) by atomic absorption spectrometry. Bronze coins from the early days of the dynasty contain Cu, Pb and Sn as major constituents, approximately in the ratio 90 : 4 : 3, whereas those from the later days contain them in the ratio 7 : 2 : 0. Brass coins, which first appeared in the 17th century, contain Cu, Zn and Pb as major constituents, approximately in the ratio 7 : 1 : 1. The multivariate data were analyzed for the relations among elemental contents through the variance-covariance matrix. The data were further analyzed by a principal component mapping method. As a result, a training set of 8 classes was chosen, based on the spread of sample points in an eigenvector plot and on archaeological data such as age and the office of minting.


PROCESS ANALYSIS OF AUTOMOTIVE PARTS USING GRAPHICAL MODELLING

  • IRIKURA Norio;KUZUYA Kazuyoshi;NISHINA Ken
    • Korean Society for Quality Management: Conference Proceedings
    • /
    • Korean Society for Quality Management, 1998, The 12th Asia Quality Management Symposium: Total Quality Management for Restoring Competitiveness
    • /
    • pp.295-300
    • /
    • 1998
  • Graphical modelling has recently been studied as a useful process analysis tool for exploratory causal analysis. It is a presentation method that uses graphs to describe statistical models of the structure of multivariate data. This paper describes an application of graphical modelling to two cases from the automotive parts industry. The first case is the unbalance problem of the pulley, an automotive generator part. Multivariate data on the product are available from each of the processes, which are connected in series. Exploratory causal analysis between the variables using graphical modelling clarified the key processes that cause the variation in the final characteristics and the mechanism of the causal relationship. The second case is also an unbalance problem, in automotive starter parts, which consist of many parts and are manufactured by complex machinery and assembly processes. Using a similar technique, the key processes are identified easily, and the results are reasonable in light of technical knowledge.


Multioutput LS-SVR based residual MCUSUM control chart for autocorrelated process

  • Hwang, Changha
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 27, No. 2
    • /
    • pp.523-530
    • /
    • 2016
  • Most classical control charts assume that processes are serially independent, and autocorrelation among variables makes them unreliable. To address this issue, a variety of statistical approaches has been employed to estimate the serial structure of the process. In this paper, we propose a multioutput least squares support vector regression and apply it to construct a residual multivariate cumulative sum control chart for detecting changes in the process mean vector. Numerical studies demonstrate that the proposed multioutput least squares support vector regression based control chart provides more satisfying results in detecting small shifts in the process mean vector.
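The charting step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which MCUSUM variant is used, so the code below assumes Crosier's form of the statistic, applied to residual vectors that some regression model has already produced. The function names, the reference value `k`, the control limit `h`, and the inverse covariance matrix `sigma_inv` are illustrative assumptions, not taken from the paper.

```python
import math


def quad_form(v, sigma_inv):
    """Return v' * sigma_inv * v for a vector and a square matrix (nested lists)."""
    p = len(v)
    return sum(v[i] * sigma_inv[i][j] * v[j] for i in range(p) for j in range(p))


def mcusum(residuals, sigma_inv, k, h):
    """Crosier-style MCUSUM on residual vectors (assumed in-control mean zero).

    residuals : list of p-dimensional residual vectors, one per time point
    sigma_inv : inverse of the residual covariance matrix (hypothetical input)
    k, h      : reference value and control limit (illustrative parameters)
    Returns the chart statistics and the index of the first signal (or None).
    """
    p = len(sigma_inv)
    s = [0.0] * p          # cumulative sum vector, reset to zero when small
    stats = []
    signal_at = None
    for t, r in enumerate(residuals):
        d = [s[i] + r[i] for i in range(p)]
        c = math.sqrt(quad_form(d, sigma_inv))
        if c <= k:
            s = [0.0] * p
        else:
            # Shrink the accumulated vector toward zero by the reference value
            s = [di * (1.0 - k / c) for di in d]
        y = math.sqrt(quad_form(s, sigma_inv))
        stats.append(y)
        if signal_at is None and y > h:
            signal_at = t
    return stats, signal_at


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    shifted = [[1.0, 0.0]] * 5   # residuals with a sustained shift in one component
    print(mcusum(shifted, identity, k=0.5, h=1.8))
```

In this sketch the residuals stand in for the output of the fitted regression model. How those residuals are obtained, and how `k` and `h` are tuned, is outside what the abstract states.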

Simple Compromise Strategies in Multivariate Stratification

  • Park, Inho
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 20, No. 2
    • /
    • pp.97-105
    • /
    • 2013
  • Stratification (among other applications) is a popular technique used in survey practice to improve the accuracy of estimators. Its full potential benefit can be gained by the effective use, in stratification, of auxiliary variables related to the survey variables. This paper focuses on the problem of stratum formation when multiple stratification variables are available. We first review a variance reduction strategy in the case of univariate stratification. We then discuss its use in multivariate situations in convenient and efficient ways using three methods: compromised measures of size, principal components analysis and a K-means clustering algorithm. We also consider three types of compromising factors applied to the data when using these three methods. Finally, we compare their efficiency using data from the MU281 Swedish municipality population.

Order-Restricted Inference with Linear Rank Statistics in Microarray Data

  • Kang, Moon-Su
    • The Korean Journal of Applied Statistics (응용통계연구)
    • /
    • Vol. 24, No. 1
    • /
    • pp.137-143
    • /
    • 2011
  • The classification of subjects with unknown distribution in a small sample often involves order-restricted constraints in multivariate parameter setups. These problems make the optimality of conventional likelihood ratio based statistical inference infeasible. Fortunately, Roy (1953) introduced the union-intersection principle (UIP), which provides an alternative avenue. Multivariate linear rank statistics, used along with that principle, yield an appropriate and robust testing procedure. Furthermore, a conditionally distribution-free test based on exact permutation theory is used to generate p-values, even in a small sample. Applications of this method are illustrated in a real microarray data example (Lobenhofer et al., 2002).

회귀나무를 이용한 무응답 가중치 조정 (Unit Nonresponse Weighting Adjustment Using Regression Tree)

  • 김세미;이석훈
    • Korean Association for Survey Research: Conference Proceedings
    • /
    • Korean Association for Survey Research, 2005 Fall Conference Proceedings
    • /
    • pp.169-183
    • /
    • 2005
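  • For handling unit nonresponse through weighting adjustment, we propose a multivariate regression tree technique that considers the response variable and the variables of interest simultaneously in order to build a model for estimating propensity scores. To construct effective nonresponse adjustment cells, we present two approaches, one using only the responding units and one using all units, and compare the two methods in terms of bias.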


Racial and Social Economic Factors Impact on the Cause Specific Survival of Pancreatic Cancer: A SEER Survey

  • Cheung, Rex
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 14, No. 1
    • /
    • pp.159-163
    • /
    • 2013
  • Background: This study used Surveillance, Epidemiology and End Results (SEER) pancreatic cancer data to identify predictive models and potential socio-economic disparities in pancreatic cancer outcome. Materials and Methods: For risk modeling, the Kaplan-Meier method was used for cause-specific survival analysis. The Kolmogorov-Smirnov test was used to compare survival curves. The Cox proportional hazards method was applied for multivariate analysis. The area under the ROC curve was computed for predictors of absolute risk of death, optimized to improve efficiency. Results: This study included 58,747 patients. The mean follow-up time (S.D.) was 7.6 (10.6) months. SEER stage and grade were strongly predictive univariates. Sex, race, and three socio-economic factors (county-level family income, rural-urban residence status, and county-level educational attainment) were independent multivariate predictors. Racial and socio-economic factors were associated with about a 2% difference in absolute cause-specific survival. Conclusions: This study found significant effects of socio-economic factors on pancreatic cancer outcome. These data may generate hypotheses for trials to eliminate these outcome disparities.
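As a minimal illustration of the Kaplan-Meier method named in this abstract, the sketch below computes the product-limit survival estimate from follow-up times and event indicators. The function name and the toy data are hypothetical; they are not SEER data and do not reproduce the study's analysis.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    times  : follow-up time for each subject
    events : 1 if the event (e.g. cause-specific death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set after time t
        n_at_risk -= removed
    return curve


if __name__ == "__main__":
    # Toy follow-up times in months; 0 marks a censored observation
    print(kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]))
```

Censored subjects contribute to the risk set up to their last follow-up time but never trigger a drop in the curve, which is what distinguishes this estimate from a simple proportion of survivors.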

성인의 건강행태와 근골격계질환과의 관련요인 (Factors Associated with Health Behaviors and Musculoskeletal Disease among Adults)

  • 변기진;홍해숙;김윤경
    • Journal of Korean Biological Nursing Science
    • /
    • Vol. 13, No. 3
    • /
    • pp.262-268
    • /
    • 2011
  • Purpose: The purpose of this study was to identify the relationship between health behaviors and musculoskeletal disease in adults. Methods: Data on 7,421 adults, covering health behaviors (smoking, drinking, exercise, weight, sleeping, stress) and musculoskeletal disease, were drawn from the 2009 Korea National Health and Nutrition Examination Survey. Data were collected through self-report questionnaires from January to December 2009. Data were analyzed using the chi-square test and multivariate logistic regression with SPSS/WIN 17.0. Results: The prevalence rate of musculoskeletal disease was 30.6%. In multivariate analyses, sex, age, BMI, stress and education were strongly associated with most musculoskeletal diseases among adults. Conclusion: The results of this study suggest that musculoskeletal disease prevention and nursing intervention programs should be established and continuously managed in order to address musculoskeletal disease.

Unmasking Multiple Outliers in Multivariate Data

  • Yoo Jong-Young
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 13, No. 1
    • /
    • pp.29-38
    • /
    • 2006
  • We propose a procedure for detecting multiple outliers in multivariate data. Rousseeuw and van Zomeren (1990) suggested the robust distance $RD_i$, computed using the Resampling Algorithm. However, $RD_i$ is based on the assumption that X is in general position. (X is said to be in general position when every subsample of size p+1 has rank p.) From a practical point of view, this is clearly unrealistic. In this paper, we propose a computing method for approximating the MVE that is not subject to these problems. The procedure is easy to compute and works well even if a subsample is a singular or nearly singular matrix.
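For reference, the robust distance mentioned in this abstract is usually written as below. The symbols $T(X)$ and $C(X)$, denoting the minimum volume ellipsoid (MVE) estimates of location and scatter, are notation assumed here and do not appear in the abstract itself.

$$RD_i = \sqrt{\left(x_i - T(X)\right)^{\top} C(X)^{-1} \left(x_i - T(X)\right)}$$

It has the same form as the classical Mahalanobis distance, with the sample mean and covariance replaced by robust estimates, so that a group of outliers cannot mask itself by distorting the estimates.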