• Title/Summary/Keyword: Multivariate survey

Statistical Methods for Multivariate Missing Data in Health Survey Research (보건조사연구에서 다변량결측치가 내포된 자료를 효율적으로 분석하기 위한 통계학적 방법)

  • Kim, Dong-Kee;Park, Eun-Cheol;Sohn, Myong-Sei;Kim, Han-Joong;Park, Hyung-Uk;Ahn, Chae-Hyung;Lim, Jong-Gun;Song, Ki-Jun
    • Journal of Preventive Medicine and Public Health / v.31 no.4 s.63 / pp.875-884 / 1998
  • Missing observations are common in medical research and health survey research, and several statistical methods have been proposed to handle them. The EM (Expectation-Maximization) algorithm is one efficient way of handling the missing data problem, based on sufficient statistics. In this paper, we developed statistical models and methods for survey data with multivariate missing observations, adopting the EM algorithm in particular to handle them. We assume that the multivariate observations follow a multivariate normal distribution, where the mean vector and the covariance matrix are primarily of interest. We applied the proposed method to data from a health survey: a physician survey on the Resource-Based Relative Value Scale (RBRVS). In addition to the EM algorithm, we applied complete-case analysis, which uses only completely observed cases, and available-case analysis, which uses all available information. Residual and normal probability plots were evaluated to assess the assumption of normality. We found that the residual sum of squares from the EM algorithm was smaller than those of the complete-case and available-case analyses.
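
As a rough illustration of the EM idea described in this abstract, the sketch below fits a bivariate normal distribution when some values of the second variable are missing. It is not the authors' implementation: the function name `em_bivariate_normal`, the restriction to two variables with `x` fully observed, and the fixed iteration count are all assumptions made to keep the example small.

```python
def em_bivariate_normal(x, y, iters=500):
    """EM estimates for a bivariate normal when some y values are None.

    Assumes x is fully observed and y is missing at random given x.
    Returns (mu_x, mu_y, s_xx, s_xy, s_yy) with maximum-likelihood (1/n) scaling.
    """
    n = len(x)
    observed_y = [yi for yi in y if yi is not None]
    # x is complete, so its mean and variance never change during EM.
    mu_x = sum(x) / n
    s_xx = sum((xi - mu_x) ** 2 for xi in x) / n
    # Starting values for the y parameters come from the observed y values.
    mu_y = sum(observed_y) / len(observed_y)
    s_yy = sum((yi - mu_y) ** 2 for yi in observed_y) / len(observed_y)
    s_xy = 0.0
    for _ in range(iters):
        # E-step: expected sufficient statistics for the missing y values.
        beta = s_xy / s_xx
        resid_var = s_yy - beta * s_xy
        sum_y = sum_yy = sum_xy = 0.0
        for xi, yi in zip(x, y):
            if yi is None:
                ey = mu_y + beta * (xi - mu_x)   # E[y | x]
                eyy = ey ** 2 + resid_var        # E[y^2 | x]
            else:
                ey, eyy = yi, yi ** 2
            sum_y += ey
            sum_yy += eyy
            sum_xy += xi * ey
        # M-step: update parameters from the completed sufficient statistics.
        mu_y = sum_y / n
        s_yy = sum_yy / n - mu_y ** 2
        s_xy = sum_xy / n - mu_x * mu_y
    return mu_x, mu_y, s_xx, s_xy, s_yy


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, None, None, 6.1]
print(em_bivariate_normal(x, y))
```

Unlike complete-case analysis, the estimate of the mean of `y` here also draws on the `x` values of the incomplete cases, which is the sense in which EM uses all available information.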

On inference of multivariate means under ranked set sampling

  • Rochani, Haresh;Linder, Daniel F.;Samawi, Hani;Panchal, Viral
    • Communications for Statistical Applications and Methods / v.25 no.1 / pp.1-13 / 2018
  • In many studies, a researcher attempts to describe a population where units are measured for multiple outcomes, or responses. In this paper, we present an efficient procedure based on ranked set sampling to estimate and perform hypothesis testing on a multivariate mean. The method is based on ranking on an auxiliary covariate, which is assumed to be correlated with the multivariate response, in order to improve the efficiency of the estimation. We showed that the proposed estimators developed under this sampling scheme are unbiased, have smaller variance in the multivariate sense, and are asymptotically Gaussian. We also demonstrated that the efficiency of the multivariate regression estimator can be improved by using ranked set sampling. A bootstrap routine is developed in the statistical software R to perform inference when the sample size is small. We use a simulation study to investigate the performance of the method under known conditions and apply the method to the biomarker data collected in the China Health and Nutrition Survey (CHNS 2009).
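
The ranking step described in this abstract can be sketched as follows. This is only an illustrative balanced ranked set sample with a plain sample mean, not the authors' estimator or their R bootstrap routine; the names `ranked_set_sample` and `mean_vector`, and the simulated population, are assumptions.

```python
import random


def ranked_set_sample(units, set_size, cycles, rng):
    """Draw a balanced ranked set sample.

    units: list of (auxiliary_value, response_vector) pairs.
    In each cycle, set_size sets of set_size units are drawn at random.
    From the r-th set, only the unit with the r-th smallest auxiliary
    value is kept, so ranking never needs the (costly) response itself.
    """
    sample = []
    for _ in range(cycles):
        for r in range(set_size):
            ranked = sorted(rng.sample(units, set_size), key=lambda u: u[0])
            sample.append(ranked[r])
    return sample


def mean_vector(sample):
    """Component-wise mean of the response vectors in the sample."""
    responses = [resp for _, resp in sample]
    return [sum(col) / len(responses) for col in zip(*responses)]


# Hypothetical population: two responses correlated with the auxiliary value.
rng = random.Random(0)
population = []
for _ in range(1000):
    aux = rng.gauss(0.0, 1.0)
    population.append((aux, (aux + rng.gauss(0.0, 0.5), 2 * aux + rng.gauss(0.0, 0.5))))

rss = ranked_set_sample(population, set_size=3, cycles=10, rng=rng)
print(len(rss), mean_vector(rss))
```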

Complex sample design effects and inference for Korea National Health and Nutrition Examination Survey data (국민건강영양조사 자료의 복합표본설계효과와 통계적 추론)

  • Chung, Chin-Eun
    • Journal of Nutrition and Health / v.45 no.6 / pp.600-612 / 2012
  • Nutrition researchers worldwide are using large-scale sample survey methods to study nutritional health epidemiology and services utilization in general, non-clinical populations. This article reviews important statistical methods and software that apply to descriptive and multivariate analysis of data collected in sample surveys, such as national health and nutrition examination surveys. A comparative analysis of data from the Korea National Health and Nutrition Examination Survey (KNHANES) was used to illustrate analytical procedures and design effects for survey estimates of population statistics, model parameters, and test statistics. The article focuses on three points: how to approach the analysis of complex sample survey data, which software tools are available to perform these analyses, and which survey analysis methods are needed to interpret survey data correctly. The latest developments in software tools for analysis of complex sample survey data are covered, and empirical examples are presented that illustrate the impact of survey sample design effects on parameter estimates, test statistics, and significance probabilities (p values) for univariate and multivariate analyses.
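
A design effect compares the variance of an estimate under the actual sample design with the variance under simple random sampling of the same size. The sketch below computes Kish's well-known approximation for the part of the design effect that comes from unequal weights alone. It is not the procedure used in the article, it ignores clustering and stratification, and the function names are assumptions.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect due to unequal weighting.

    deff = n * sum(w^2) / (sum(w))^2, which equals 1 when all weights are equal.
    """
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def effective_sample_size(weights):
    """Sample size of a simple random sample with roughly the same precision."""
    return len(weights) / kish_design_effect(weights)


weights = [1.0, 1.0, 2.0, 2.0]
print(kish_design_effect(weights), effective_sample_size(weights))
```

Full design-based analysis of a survey such as KNHANES would also use the strata and cluster identifiers supplied with the data, typically through dedicated survey software.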

GEOSTATISTICAL INTEGRATION OF HIGH-RESOLUTION REMOTE SENSING DATA IN SPATIAL ESTIMATION OF GRAIN SIZE

  • Park, No-Wook;Chi, Kwang-Hoon;Jang, Dong-Ho
    • Proceedings of the KSRS Conference / v.1 / pp.406-408 / 2006
  • Various geological thematic maps, such as grain size or groundwater level maps, have been generated by interpolating sparsely sampled ground survey data. When sampled data are available at only a limited number of locations, secondary information that is correlated with the primary variable can help estimate the attribute values of the primary variable at unsampled locations. This paper applies two multivariate geostatistical algorithms to integrate remote sensing imagery with sparsely sampled ground survey data for spatial estimation of grain size: simple kriging with local means and kriging with an external drift. High-resolution IKONOS imagery, which is well correlated with the grain size, is used as secondary information. The algorithms are evaluated in a case study with grain size observations measured at 53 locations on Baramarae beach, Anmyeondo, Korea. Leave-one-out cross validation is used to compare the estimation performance of the two multivariate geostatistical algorithms with that of traditional ordinary kriging.
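
The leave-one-out cross validation mentioned in this abstract can be sketched independently of the interpolator being evaluated. In the example below, inverse distance weighting stands in for the kriging variants, which would need a variogram model and a linear solver. The stand-in predictor, the function names, and the toy coordinates are all assumptions, not the paper's setup.

```python
import math


def idw_predict(train, target_xy, power=2.0):
    """Inverse distance weighted prediction (a simple stand-in for kriging).

    train: list of ((x, y), value) pairs; target_xy: (x, y) to predict at.
    """
    num = den = 0.0
    for (x, y), value in train:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d == 0.0:
            return value  # exact hit on a sampled location
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def leave_one_out_rmse(samples, predict):
    """Hold out each sample in turn, predict it from the rest, and return the RMSE."""
    squared_errors = []
    for i, (xy, value) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        squared_errors.append((predict(train, xy) - value) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


samples = [((0.0, 0.0), 1.0), ((1.0, 0.0), 2.0), ((2.0, 0.0), 3.0)]
print(leave_one_out_rmse(samples, idw_predict))
```

Because `leave_one_out_rmse` takes the predictor as an argument, competing interpolators can be compared on the same held-out points.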

A Survey on Unsupervised Anomaly Detection for Multivariate Time Series (다변량 시계열 이상 탐지 과업에서 비지도 학습 모델의 성능 비교)

  • Juwan Lim;Jaekoo Lee
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.1 / pp.1-12 / 2023
  • Obtaining labeled data for anomaly detection tasks on multivariate time series is very time-intensive. Therefore, several studies have been conducted on unsupervised learning, which does not require any labels. However, no thorough integrative survey has yet discussed in depth the learning architectures and their properties for multivariate time series anomaly detection. This study aims to explore the characteristics of well-known architectures in anomaly detection of multivariate time series. Additionally, the architectures were categorized using top-down and bottom-up approaches. In order to consider real-world anomaly detection situations, we trained models with datasets, such as those from power grids or Cyber Physical Systems, that contain realistic anomalies. From the experimental results, we compared and analyzed the comprehensive performance of each architecture. Quantitative performance was measured using precision, recall, and F1 scores.
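
The three metrics named at the end of this abstract can be computed from binary labels as in the sketch below. It treats each time step independently, which is an assumption. The survey's exact evaluation protocol is not given in the abstract, and the function name is hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Point-wise precision, recall, and F1 for binary anomaly labels (1 = anomaly)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 1, 1]
print(precision_recall_f1(y_true, y_pred))
```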

A Post-stratified Estimation in Multivariate Stratified Sampling Surveys

  • Park, Jinwoo
    • Communications for Statistical Applications and Methods / v.6 no.3 / pp.755-760 / 1999
  • In multivariate stratified sampling surveys, it is common at the design stage to use a few stratification variables that are highly correlated with the important study variables. However, there may be secondary study variables that are not so highly correlated with those stratification variables. In that case, it is not efficient to use for the secondary variables the same type of estimator as the one based on the important variables. A post-stratified estimator is proposed to increase the efficiency of estimation for such secondary variables. The proposed method is illustrated with a set of fishery household population survey data.
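
For orientation, the textbook post-stratified mean weights each post-stratum's sample mean by its known population share. The sketch below shows only that basic form, not the specific estimator proposed in the paper; the function name, the stratum labels, and the shares are assumptions.

```python
from collections import defaultdict


def post_stratified_mean(sample, stratum_shares):
    """Post-stratified mean: sum over post-strata of W_h * (sample mean in h).

    sample: list of (post_stratum, value) pairs.
    stratum_shares: known population proportion W_h for each post-stratum.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for h, value in sample:
        totals[h] += value
        counts[h] += 1
    empty = [h for h in stratum_shares if counts[h] == 0]
    if empty:
        raise ValueError(f"no sampled units in post-strata: {empty}")
    return sum(share * totals[h] / counts[h] for h, share in stratum_shares.items())


sample = [("a", 10.0), ("a", 20.0), ("b", 30.0)]
print(post_stratified_mean(sample, {"a": 0.6, "b": 0.4}))
```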

Multivariate Analysis of Covariance on Characteristics Influencing Technological and Managerial Barriers of Technology Startups

  • Geonil Ko;Namjae Cho
    • Journal of Information Technology Applications and Management / v.31 no.1 / pp.27-43 / 2024
  • This study investigated technological and managerial barriers in technology startups through a survey of 151 companies, yielding 118 responses (78.1% response rate). Factor and multivariate analyses identified two distinct barriers: technological and managerial. Reliability analysis validated the measurement tool. Using MANCOVA, 12 hypotheses were tested, incorporating six independent variables. Results revealed significant disparities in technological and managerial barriers based on establishment type, commercialization goals, growth stage, and commercialization stage, with 5 hypotheses supported. This study highlights the crucial role of these variables in understanding barriers within technology-based startups.

Simple Compromise Strategies in Multivariate Stratification

  • Park, Inho
    • Communications for Statistical Applications and Methods / v.20 no.2 / pp.97-105 / 2013
  • Stratification (among other applications) is a popular technique used in survey practice to improve the accuracy of estimators. Its full potential benefit can be gained by the effective use of auxiliary variables in stratification related to survey variables. This paper focuses on the problem of stratum formation when multiple stratification variables are available. We first review a variance reduction strategy in the case of univariate stratification. We then discuss its use for multivariate situations in convenient and efficient ways using three methods: compromised measures of size, principal components analysis, and a K-means clustering algorithm. We also consider three types of compromising factors applied to the data when using these three methods. Finally, we compare their efficiency using data from the MU281 Swedish municipality population.
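
Of the three methods listed in this abstract, K-means is the easiest to sketch: units are grouped into strata by clustering their stratification variables. The example below is a minimal K-means with a deterministic start, not the paper's procedure. The function names and toy data are assumptions, and in practice the variables would usually be put on a common scale first.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_strata(points, k, iters=50):
    """Assign each unit to one of k strata by K-means on its stratification variables.

    Uses the first k points as starting centers (a simplifying assumption).
    Returns (labels, centers).
    """
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every unit.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep an empty stratum's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0), (1.0, 0.0), (11.0, 10.0)]
labels, centers = kmeans_strata(points, k=2)
print(labels, centers)
```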

Unit Nonresponse Weighting Adjustment Using Regression Tree (회귀나무를 이용한 무응답 가중치 조정)

  • Kim, Se-Mi;Lee, Seok-Hun
    • Proceedings of the Korean Association for Survey Research Conference / 2005.12a / pp.169-183 / 2005
  • This paper considers the formation of nonresponse weighting adjustment cells for handling unit nonresponse in sample surveys. We propose a multivariate regression tree method for segmentation that uses the variable of interest and the estimated response probability simultaneously to construct effective nonresponse adjustment cells. Two cases are considered: one uses only response data, and the other uses both response and nonresponse data. The two cases are compared in terms of bias.
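
Once adjustment cells have been formed, the standard weighting-class adjustment inflates respondents' weights so that they also represent the nonrespondents in the same cell. The sketch below shows only that final step with the cells taken as given. The regression tree that the paper uses to build the cells is not reproduced, and the function name and record layout are assumptions.

```python
from collections import defaultdict


def adjust_for_nonresponse(units):
    """Weighting-class nonresponse adjustment within given cells.

    units: list of dicts with keys 'cell', 'weight', and 'responded'.
    Each respondent's weight is multiplied by
    (sum of weights of all sampled units in the cell) /
    (sum of weights of respondents in the cell).
    Nonrespondents receive weight 0.0. Every cell is assumed to
    contain at least one respondent.
    """
    cell_total = defaultdict(float)
    cell_respondents = defaultdict(float)
    for u in units:
        cell_total[u["cell"]] += u["weight"]
        if u["responded"]:
            cell_respondents[u["cell"]] += u["weight"]
    adjusted = []
    for u in units:
        if u["responded"]:
            factor = cell_total[u["cell"]] / cell_respondents[u["cell"]]
            adjusted.append(u["weight"] * factor)
        else:
            adjusted.append(0.0)
    return adjusted


units = [
    {"cell": "A", "weight": 1.0, "responded": True},
    {"cell": "A", "weight": 1.0, "responded": False},
    {"cell": "B", "weight": 2.0, "responded": True},
    {"cell": "B", "weight": 2.0, "responded": True},
]
print(adjust_for_nonresponse(units))
```

The adjusted weights sum to the original total, so the weighted sample still represents the same population size.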
