• Title/Summary/Keyword: Statistical matching

Statistical micro matching using a multinomial logistic regression model for categorical data

  • Kim, Kangmin;Park, Mingue
    • Communications for Statistical Applications and Methods / v.26 no.5 / pp.507-517 / 2019
  • Statistical matching combines multiple data sources that are extracted or surveyed from the same population. It can be used when the variables of interest are not jointly observed, and it is a low-cost, high-benefit way to create synthetic data from existing sources. In this paper, we propose several statistical micro matching methods that use a multinomial logistic regression model when all variables of interest are categorical or categorized, which is common in sample surveys. Under the conditional independence assumption (CIA), we propose a mixed statistical matching method that is useful when auxiliary information is not available. We also propose a statistical matching method with auxiliary information that reduces the bias of the conventional matching methods suggested under the CIA. A simulation study compares the proposed micro matching methods with conventional ones and shows that the proposed methods outperform the existing ones, especially when the CIA does not hold.
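
To make the conditional independence assumption concrete, the sketch below estimates the joint distribution of two never-jointly-observed categorical variables Y and Z from a file A holding (X, Y) and a file B holding (X, Z). It is not the paper's method: plain frequency tables stand in for the multinomial logistic regression model, the function name `cia_joint` and its arguments are hypothetical, and every category of X is assumed to occur in both files.

```python
import numpy as np

def cia_joint(xa, ya, xb, zb, n_x, n_y, n_z):
    """Estimate P(x, y, z) under the CIA from file A (x, y) and file B (x, z).

    All inputs are integer category codes starting at 0. Assumes each
    category of x appears at least once in both files.
    """
    xa, ya, xb, zb = (np.asarray(v) for v in (xa, ya, xb, zb))
    # Pool both files to estimate the marginal distribution of the common variable X.
    x_all = np.concatenate([xa, xb])
    p_x = np.bincount(x_all, minlength=n_x) / x_all.size
    # Contingency tables: (x, y) counts from file A and (x, z) counts from file B.
    n_xy = np.zeros((n_x, n_y))
    np.add.at(n_xy, (xa, ya), 1)
    n_xz = np.zeros((n_x, n_z))
    np.add.at(n_xz, (xb, zb), 1)
    # Row-normalize to get the conditionals P(y | x) and P(z | x).
    p_y_x = n_xy / n_xy.sum(axis=1, keepdims=True)
    p_z_x = n_xz / n_xz.sum(axis=1, keepdims=True)
    # CIA: P(x, y, z) = P(x) * P(y | x) * P(z | x).
    return p_x[:, None, None] * p_y_x[:, :, None] * p_z_x[:, None, :]
```

Summing the returned array over its first axis gives the implied joint table of Y and Z. When Y and Z remain dependent after conditioning on X, that table is biased, which is the situation the abstract's auxiliary-information method targets.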

On the Development of Probability Matching Priors for Non-regular Pareto Distribution

  • Lee, Woo Dong;Kang, Sang Gil;Cho, Jang Sik
    • Communications for Statistical Applications and Methods / v.10 no.2 / pp.333-339 / 2003
  • In this paper, we develop probability matching priors for the parameters of the non-regular Pareto distribution. We prove the propriety of the joint posterior distribution induced by the probability matching priors. Through a simulation study, we show that the proposed probability matching prior matches the coverage probabilities in a frequentist sense. A real data example is given.

Statistical Matching Techniques Using the Robust Regression Model (로버스트 회귀모형을 이용한 자료결합방법)

  • Jhun, Myoung-Shic;Jung, Ji-Song;Park, Hye-Jin
    • The Korean Journal of Applied Statistics / v.21 no.6 / pp.981-996 / 2008
  • Statistical matching techniques aim to construct a complete data file from different sources. Since the statistical matching method proposed by Rubin (1986) assumes multivariate normality of the data, applying it to data that violate this assumption can be problematic. This research proposes a statistical matching method that uses robust regression as an alternative to linear regression. Furthermore, we carry out a simulation study to compare the performance of the robust regression model and the linear regression model for statistical matching.
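
The robust regression step can be sketched as follows. The abstract does not say which robust estimator is used, so this example assumes a Huber M-estimator fitted by iteratively reweighted least squares; the function names, the tuning constant `c=1.345`, and the simple predict-from-donor step are illustrative choices, and the sketch does not reproduce Rubin's (1986) full procedure.

```python
import numpy as np

def huber_irls(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimate of (intercept, slopes) by iteratively reweighted least squares."""
    y = np.asarray(y, dtype=float)
    X1 = np.column_stack([np.ones(len(y)), X])      # design matrix with intercept
    beta = np.linalg.lstsq(X1, y, rcond=None)[0]    # ordinary least squares start
    for _ in range(n_iter):
        r = y - X1 @ beta
        # Robust residual scale: normalized median absolute deviation.
        scale = np.median(np.abs(r - np.median(r))) / 0.6745
        if scale < 1e-12:                           # residuals essentially zero
            break
        u = np.abs(r) / scale
        # Huber weights: 1 inside the threshold, c/u outside it.
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        new_beta = np.linalg.lstsq(X1 * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta

def impute_from_donor(x_donor, z_donor, x_recipient):
    """Predict the missing variable z for recipient records from a robust donor-file fit."""
    beta = huber_irls(x_donor, z_donor)
    X1 = np.column_stack([np.ones(len(x_recipient)), x_recipient])
    return X1 @ beta
```

With a single gross outlier in the donor file, the ordinary least squares slope is pulled noticeably while the Huber fit stays close to the trend of the clean records, which is the motivation the abstract gives for replacing linear regression.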

Image Description and Matching Scheme Using Synthetic Features for Recommendation Service

  • Yang, Won-Keun;Cho, A-Young;Oh, Weon-Geun;Jeong, Dong-Seok
    • ETRI Journal / v.33 no.4 / pp.589-599 / 2011
  • This paper presents an image description and matching scheme using synthetic features for a recommendation service. The recommendation service is an example of smart search because it offers content before the user requests it. In the proposed extraction scheme, an image is described by synthesized spatial and statistical features. The spatial feature is designed to increase discriminability by reflecting delicate variations. The statistical feature is designed to increase robustness by absorbing small variations. To extract spatial features, we partition the image into concentric circles and extract four characteristics using a spatial relation. To extract statistical features, we apply three transforms to the image and compose a 3D histogram as the final statistical feature. The matching schemes are designed hierarchically using the proposed spatial and statistical features. The results show that each feature outperforms the compared algorithms that use spatial or statistical features. Additionally, when the complete proposed extraction and matching scheme is applied, the overall performance reaches 98.44% in terms of the correct search ratio.

Association Rule Mining by Environmental Data Fusion

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.18 no.2 / pp.279-287 / 2007
  • Data fusion is the process of combining multiple data in order to produce information of tactical value to the user. Data fusion is generally defined as the use of techniques that combine data from multiple sources and gather that information in order to achieve inferences. Data fusion is also called data combination or data matching. Data fusion is divided into five branch types: exact matching, judgemental matching, probability matching, statistical matching, and data linking. In this paper, we develop a macro program for statistical matching, which is one of the five branch types of data fusion. We then apply data fusion and association rule techniques to environmental data.

A Robust Approach of Regression-Based Statistical Matching for Continuous Data

  • Sohn, Soon-Cheol;Jhun, Myoung-Shic
    • The Korean Journal of Applied Statistics / v.25 no.2 / pp.331-339 / 2012
  • Statistical matching is a methodology used to merge microdata from two (or more) files into a single matched file, the variants of which have been extensively studied. Among existing studies, we focused on Moriarity and Scheuren's (2001) method, which is a representative method of statistical matching for continuous data. We examined this method and proposed a revision to it by using a robust approach in the regression step of the procedure. We evaluated the efficiency of our revised method through simulation studies using both simulated and real data, which showed that the proposed method has distinct advantages over existing alternatives.

A New Statistical Linearization Technique of Nonlinear System (비선형시스템의 새로운 통계적 선형화방법)

  • Lee, Jang-Gyu;Lee, Yeon-Seok
    • Proceedings of the KIEE Conference / 1990.07a / pp.72-76 / 1990
  • This paper proposes a new statistical linearization technique for nonlinear systems, called the covariance matching method. The covariance matching method makes the mean and variance of the approximated output identical to those of the real functional output, and gives the distribution of the approximated output the same shape as that of the given random input. The method can also be easily implemented for the statistical analysis of nonlinear systems in combination with linear system covariance analysis.
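
The idea of matching output moments can be illustrated for a scalar nonlinearity with Gaussian input. The sketch below is one reading of the abstract, not the paper's algorithm: it approximates f(x) by mean + gain * (x - m), computes the classical minimum-mean-square-error gain for comparison, and computes a "covariance matching" gain chosen so that the approximated output has the same variance as f(x). The function name, the quadrature order, and the sign convention for the gain are assumptions.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def linearize(f, mean, std, n_nodes=20):
    """Linearize y = f(x) for x ~ N(mean, std^2).

    Returns (output mean, minimum-MSE gain, variance-matching gain).
    """
    # Gauss-Hermite nodes/weights for the weight exp(-z^2 / 2); normalize to a N(0, 1) expectation.
    z, w = hermegauss(n_nodes)
    w = w / np.sqrt(2.0 * np.pi)
    x = mean + std * z
    fx = f(x)
    out_mean = np.sum(w * fx)                        # E[f(x)]
    out_var = np.sum(w * (fx - out_mean) ** 2)       # Var[f(x)]
    # Classical statistical linearization: gain minimizing the mean square error.
    gain_mse = np.sum(w * fx * (x - mean)) / std**2
    # Variance matching: gain^2 * std^2 = Var[f(x)], with the sign taken from gain_mse.
    gain_cov = np.sign(gain_mse) * np.sqrt(out_var) / std
    return out_mean, gain_mse, gain_cov
```

For f(x) = x^3 with standard normal input, the minimum-MSE gain is E[x^4] = 3, which understates the output variance, while the variance-matching gain is sqrt(E[x^6]) = sqrt(15). Because the approximation is linear in a Gaussian input, its output is again Gaussian, consistent with the abstract's remark about the output distribution sharing the input's shape.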

NONINFORMATIVE PRIORS FOR LINEAR COMBINATION OF THE INDEPENDENT NORMAL MEANS

  • Kang, Sang-Gil;Kim, Dal-Ho;Lee, Woo-Dong
    • Journal of the Korean Statistical Society / v.33 no.2 / pp.203-218 / 2004
  • In this paper, we develop matching priors and reference priors for a linear combination of the means of normal populations with equal variances. We prove that the matching priors are in fact second order matching priors, and show that the second order matching priors match the alternative coverage probabilities up to the second order (Mukerjee and Reid, 1999) and are also HPD matching priors. It turns out that, among all of the reference priors, the one-at-a-time reference prior satisfies a second order matching criterion. Our simulation study indicates that the one-at-a-time reference prior performs better than the other reference priors in terms of matching the target coverage probabilities in a frequentist sense. We compute Bayesian credible intervals for the linear combination of the means based on the reference priors.
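
For readers unfamiliar with the terminology, the matching criterion is usually stated as follows. This is the standard textbook formulation for regular models rather than a formula quoted from the paper, and the notation (parameter of interest \theta_1, prior \pi, sample size n, posterior quantile \theta_1^{1-\alpha}) is assumed here.

```latex
% \theta_1^{1-\alpha}(\pi; X) denotes the (1-\alpha)th posterior quantile of \theta_1 under prior \pi.
% \pi is a probability matching prior if its one-sided credible bound has frequentist coverage
\[
  P_{\theta}\!\left[\, \theta_1 \le \theta_1^{1-\alpha}(\pi; X) \,\right]
  = 1 - \alpha + o\!\left(n^{-u}\right), \qquad u > 0 .
\]
% First order matching corresponds to u = 1/2, second order matching to u = 1.
```

In this sense a second order matching prior gives credible bounds whose frequentist coverage error vanishes faster than that of a first order matching prior.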

A Statistical Matching Method with k-NN and Regression

  • Chung, Sung-S.;Kim, Soon-Y.;Lee, Seung-S.;Lee, Ki-H.
    • Journal of the Korean Data and Information Science Society / v.18 no.4 / pp.879-890 / 2007
  • Statistical matching is a method of data integration for data sources that do not share the same units. It can rapidly produce a large amount of new information at low cost and reduce the response burden that affects data quality. This paper proposes a statistical matching technique that combines k-NN (k-nearest neighbor) and regression methods. For a given observation in the recipient file, we select the k records in the donor file whose values of the common variable are most similar, and estimate an imputed value for the recipient file using a regression model fitted in the donor file. An empirical comparison study is conducted to show the properties of the proposed method.
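
A minimal sketch of this combination, under assumptions the abstract does not settle: a single numeric common variable x, absolute difference as the similarity measure, and a simple linear regression of the target variable z on x fitted within the k selected donor records. The function name `knn_regression_match` and the default `k=5` are illustrative.

```python
import numpy as np

def knn_regression_match(x_donor, z_donor, x_recipient, k=5):
    """Impute z for each recipient record from a regression fitted on its k nearest donors."""
    x_donor = np.asarray(x_donor, dtype=float)
    z_donor = np.asarray(z_donor, dtype=float)
    x_recipient = np.asarray(x_recipient, dtype=float)
    imputed = np.empty(len(x_recipient))
    for i, x0 in enumerate(x_recipient):
        # Indices of the k donor records closest to x0 on the common variable.
        idx = np.argsort(np.abs(x_donor - x0))[:k]
        # Local least squares fit z = b0 + b1 * x on those donors only.
        A = np.column_stack([np.ones(len(idx)), x_donor[idx]])
        coef = np.linalg.lstsq(A, z_donor[idx], rcond=None)[0]
        # The fitted value at x0 is the imputed value for this recipient record.
        imputed[i] = coef[0] + coef[1] * x0
    return imputed
```

Compared with copying the value of the single nearest donor, the local regression adjusts for the remaining gap between the recipient's x and its neighbors' x values, at the cost of needing enough donors near each recipient for a stable fit.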

Development of Noninformative Priors in the Burr Model

  • Cho, Jang-Sik;Kang, Sang-Gil;Baek, Sung-Uk
    • Journal of the Korean Data and Information Science Society / v.14 no.1 / pp.83-92 / 2003
  • In this paper, we derive noninformative priors for the ratio of parameters in the Burr model. We obtain Jeffreys' prior, the reference prior, and the second order probability matching prior. We also prove that the noninformative prior matches the alternative coverage probabilities and is an HPD matching prior up to the second order. Finally, we provide simulated frequentist coverage probabilities under the derived noninformative priors for small and moderate sample sizes.
