• Title/Abstract/Keywords: binary logistic regression

Search results: 419 items (processing time 0.032 s)

A Study on the Power Comparison between Logistic Regression and Offset Poisson Regression for Binary Data

  • Kim, Dae-Youb;Park, Heung-Sun
    • Communications for Statistical Applications and Methods, Vol. 19, No. 4, pp. 537-546, 2012
  • In this paper, Poisson regression with an offset and logistic regression are compared via simulation with respect to power in the analysis of binary data. The Poisson distribution can be used as an approximation to the binomial distribution when n is large and p is small; we investigate whether the same conditions hold for the power of significance tests in logistic regression and offset Poisson regression. The results show that, when the offset size is large, offset Poisson regression has power similar to that of logistic regression for rare events, and it retains acceptable power even at a moderate prevalence rate. However, with a small offset size (< 10), offset Poisson regression should be used with caution for either rare or common events. These results can serve as guidelines for users who want to apply offset Poisson regression models to binary data.
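The approximation condition mentioned in the abstract (n large, p small) can be illustrated numerically. The sketch below is not the paper's power simulation; it only compares the Binomial(n, p) and Poisson(np) probability mass functions through their total variation distance, and the values n = 200, p = 0.01 and p = 0.30 are assumptions chosen for illustration.

```python
import math

def binom_pmf(k, n, p):
    """Binomial(n, p) probability of k successes, computed on the log scale."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def poisson_pmf(k, lam):
    """Poisson(lam) probability of k events, computed on the log scale."""
    log_pmf = -lam + k * math.log(lam) - math.lgamma(k + 1)
    return math.exp(log_pmf)

def total_variation(n, p):
    """Total variation distance between Binomial(n, p) and Poisson(n * p)."""
    lam = n * p
    # Absolute pmf differences on the binomial support 0..n.
    inside = sum(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam))
                 for k in range(n + 1))
    # The Poisson mass above n has no binomial counterpart.
    poisson_tail = max(0.0, 1.0 - sum(poisson_pmf(k, lam) for k in range(n + 1)))
    return 0.5 * (inside + poisson_tail)

# Illustrative (assumed) settings: a rare event and a common event.
tv_rare = total_variation(200, 0.01)
tv_common = total_variation(200, 0.30)
print(f"rare event   (p = 0.01): TV distance = {tv_rare:.4f}")
print(f"common event (p = 0.30): TV distance = {tv_common:.4f}")
```

The distance is much smaller for the rare event, which is the regime where the Poisson approximation is usually considered safe.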

Analyzing Survival Data as Binary Outcomes with Logistic Regression

  • Lim, Jo-Han;Lee, Kyeong-Eun;Hahn, Kyu-S.;Park, Kun-Woo
    • Communications for Statistical Applications and Methods, Vol. 17, No. 1, pp. 117-126, 2010
  • Clinical researchers often analyze survival data as binary outcomes using logistic regression. This paper examines the information loss that results from analyzing survival time as a binary outcome. First, we demonstrate that, under the proportional hazards assumption, this binary discretization results in a significant loss of information. Second, when fitting a logistic model to survival time data, researchers inadvertently use the maximal statistic; we carry out a numerical study to examine the properties of the reference distribution for this statistic. Finally, we show that logistic regression can still be a useful tool for analyzing survival data, particularly when the proportional hazards assumption is questionable.

Binary Forecast of Heavy Snow Using Statistical Models

  • Sohn, Keon-Tae
    • Communications for Statistical Applications and Methods, Vol. 13, No. 2, pp. 369-378, 2006
  • This study focuses on the binary forecast of the occurrence of heavy snow in the Honam area based on the MOS (model output statistics) method. The daily amount of snow cover at 17 stations during the cold season (November to March) from 2001 to 2005 and the corresponding 45 RDAPS outputs are used. A logistic regression model and neural networks are applied to predict the probability of occurrence of heavy snow. Based on the distribution of the estimated probabilities, optimal thresholds are determined via the true skill score. Based on the comparison results, the logistic regression model is recommended.
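Choosing a probability threshold by a skill score can be sketched as follows. This is a minimal illustration, not the study's procedure: it assumes the score is the true skill score defined as hit rate minus false alarm rate, and the observations, probabilities, and candidate thresholds are made-up toy values.

```python
def true_skill_score(y_true, probs, threshold):
    """Hit rate minus false alarm rate for forecasts 'prob >= threshold'."""
    hits = misses = false_alarms = correct_negatives = 0
    for y, p in zip(y_true, probs):
        forecast = p >= threshold
        if y == 1 and forecast:
            hits += 1
        elif y == 1:
            misses += 1
        elif forecast:
            false_alarms += 1
        else:
            correct_negatives += 1
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
    return hit_rate - false_alarm_rate

def best_threshold(y_true, probs, candidates):
    """Return the candidate threshold with the largest true skill score."""
    return max(candidates, key=lambda t: true_skill_score(y_true, probs, t))

# Toy data (hypothetical): 1 = heavy snow observed, 0 = not observed.
observed = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
estimated = [0.05, 0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.45, 0.9]
candidates = [i / 10 for i in range(1, 10)]

chosen = best_threshold(observed, estimated, candidates)
print(chosen, true_skill_score(observed, estimated, chosen))
```

When several thresholds tie, `max` returns the first one in the candidate list, so the ordering of `candidates` acts as the tie-breaking rule.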

An Educational Tool for Binary Logistic Regression Model Using Excel VBA

  • 박철용;최현석
    • Journal of the Korean Data and Information Science Society, Vol. 25, No. 2, pp. 403-410, 2014
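  • Binary logistic regression is a statistical technique that explains a binary response variable using quantitative or qualitative explanatory variables. In this model, the probability that the response variable equals 1 is explained by a transformation (or function) of a linear combination of the explanatory variables. Understanding this concept is one of the major hurdles that non-statisticians must overcome in order to understand the binary logistic regression model. In this study, we develop an educational tool that uses Excel VBA to explain why the binary logistic regression model is needed. The tool shows the problems that arise when the probability that the response equals 1 is modelled as a linear function of the explanatory variables, and how these problems are resolved by transforming the linear combination.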
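The problem with modelling the probability of a 1 as a linear function, and the way a transformation of the linear combination removes it, can be sketched in a few lines. The paper's tool is written in Excel VBA; the Python below is only an illustration. The toy data and the logistic coefficients `b0 = -5.5`, `b1 = 1.0` are hypothetical and are not fitted values.

```python
import math

# Toy data (hypothetical): one explanatory variable and a binary response.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

def fit_least_squares(x, y):
    """Ordinary least squares for y = a + b * x (simple regression)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def logistic(z):
    """Logistic transformation, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Linear probability model: fitted "probabilities" can leave [0, 1].
a, b = fit_least_squares(x, y)
linear_fit = [a + b * xi for xi in x]

# Logistic model with assumed (not fitted) coefficients: always in (0, 1).
b0, b1 = -5.5, 1.0
logistic_fit = [logistic(b0 + b1 * xi) for xi in x]

print("linear  :", [round(p, 3) for p in linear_fit])
print("logistic:", [round(p, 3) for p in logistic_fit])
```

With these data the straight line gives a negative value at the smallest x and a value above 1 at the largest x, while the transformed linear combination stays strictly between 0 and 1.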

Supervised Learning-Based Collaborative Filtering Using Market Basket Data for the Cold-Start Problem

  • Hwang, Wook-Yeon;Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems, Vol. 13, No. 4, pp. 421-431, 2014
  • The market basket data in the form of a binary user-item matrix or a binary item-user matrix can be modelled as a binary classification problem. The binary logistic regression approach tackles the binary classification problem, where principal components are predictor variables. If users or items are sparse in the training data, the binary classification problem can be considered as a cold-start problem. The binary logistic regression approach may not function appropriately if the principal components are inefficient for the cold-start problem. Assuming that the market basket data can also be considered as a special regression problem whose response is either 0 or 1, we propose three supervised learning approaches: random forest regression, random forest classification, and elastic net to tackle the cold-start problem, comparing the performance in a variety of experimental settings. The experimental results show that the proposed supervised learning approaches outperform the conventional approaches.

A Logistic Regression Analysis of Two-Way Binary Attribute Data

  • 안해일
    • 산업경영시스템학회지, Vol. 35, No. 3, pp. 118-128, 2012
  • This paper addresses the problem of analyzing two-way binary attribute data using the logistic regression model, with the aim of finding a sound statistical methodology. It is demonstrated that the analysis of variance (ANOVA) may not be adequate, especially when the proportion is very low or very high. The logistic transformation of proportion data can help, but it is not sound in the statistical sense. Meanwhile, adopting the generalized least squares (GLS) method requires considerable effort to estimate the variance-covariance matrix. In contrast, the logistic regression methodology provides sound statistical means for estimating the related confidence intervals and testing the significance of the model parameters. The efficiency of the estimates is checked on simulated data in order to demonstrate the usefulness of the methodology.

Empirical Analysis on the Relationship between R&D Inputs and Performance Using Successive Binary Logistic Regression Models

  • 박성민
    • 대한산업공학회지, Vol. 40, No. 3, pp. 342-357, 2014
  • The present study analyzes the relationship between research and development (R&D) inputs and the performance of a national technology innovation R&D program using successive binary logistic regression models based on a typical R&D logic model. In particular, this study focuses on answering the following three main questions: (1) "To what extent do the R&D inputs have an effect on the performance creation?"; (2) "Is an obvious relationship verified between the immediate predecessor and its successor performance?"; and (3) "Is there a difference in the performance creation between R&D government subsidy recipient types and between R&D collaboration types?" Methodologically, binary logistic regression models are established successively, considering the "Success-Failure" binary nature of the data on performance creation. An empirical analysis is presented for a sample of n = 2,178 completed R&D projects. The major findings are as follows. First, the R&D inputs have a statistically significant relationship only with the short-term technical output, "Patent Registration." Second, strong dependencies are identified between the immediate predecessor and its successor performance. Third, the success probability of the performance creation differs statistically significantly between the aforementioned R&D types. Specifically, compared with "Large Company", "Small and Medium-Sized Enterprise (SME)" shows a greater success probability for "Sales" and "New Employment." Meanwhile, "R&D Collaboration" achieves a larger success probability for "Patent Registration" and "Sales."

Collapsibility and Suppression for Cumulative Logistic Model

  • Hong, Chong-Sun;Kim, Kil-Tae
    • Communications for Statistical Applications and Methods, Vol. 12, No. 2, pp. 313-322, 2005
  • In this paper, we discuss suppression for the logistic regression model. Suppression for the linear regression model was defined in terms of the relationship among the sums of squares for regression as well as the correlation coefficients of the variables. Since it is not common to obtain a simple correlation coefficient for the binary response variable of a logistic model, we consider cumulative logistic models with multinomial and ordinal response variables rather than the usual logistic model. As the categories of the response variable in the cumulative logistic model are collapsed into a binary variable, the suppression for these logistic models is found to change. These suppression results for cumulative logistic models are discussed and compared with those of the linear model.

Blur Detection through Multinomial Logistic Regression based Adaptive Threshold

  • Mahmood, Muhammad Tariq;Siddiqui, Shahbaz Ahmed;Choi, Young Kyu
    • 반도체디스플레이기술학회지, Vol. 18, No. 4, pp. 110-115, 2019
  • Blur detection and segmentation play a vital role in many computer vision applications. Among various methods, local binary pattern based methods provide reasonable blur detection results. However, in conventional local binary pattern based methods, the blur map is computed using a fixed threshold irrespective of the type and level of blur. This may not be suitable for images with variations in imaging conditions and blur. In this paper, we propose an effective method for blur detection based on the local binary pattern with an adaptive threshold. The adaptive threshold is computed from a model learned through multinomial logistic regression. The performance of the proposed method is evaluated using different datasets. The comparative analysis not only demonstrates the effectiveness of the proposed method but also shows its superiority over the existing methods.

A Bayesian Method for Narrowing the Scope of Variable Selection in Binary Response Logistic Regression

  • Kim, Hea-Jung;Lee, Ae-Kyung
    • 품질경영학회지, Vol. 26, No. 1, pp. 143-160, 1998
  • This article is concerned with the selection of subsets of predictor variables to be included in building the binary response logistic regression model. It is based on a Bayesian approach and proposes and develops a procedure that uses probabilistic considerations for selecting promising subsets. This procedure reformulates the logistic regression setup as a hierarchical normal mixture model by introducing a set of hyperparameters that are used to identify subset choices. It makes use of the fact that the cdf of the logistic distribution is approximately equivalent to that of the $t_{(8)}/0.634$ distribution. The appropriate posterior probability of each subset of predictor variables is obtained by the Gibbs sampler, which samples indirectly from the multinomial posterior distribution on the set of possible subset choices. Thus, in this procedure, the most promising subset of predictors can be identified as the one with the highest posterior probability. To highlight the merit of this procedure, a couple of illustrative numerical examples are given.
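The approximation that the abstract relies on can be checked numerically. The sketch below is not taken from the article. It reads the notation as an assumption: if T follows a t distribution with 8 degrees of freedom, then T / 0.634 has roughly the standard logistic distribution. The t cdf is computed with Simpson's rule so that only the standard library is needed, and the grid and step count are arbitrary choices.

```python
import math

NU = 8         # degrees of freedom of the t distribution
SCALE = 0.634  # scaling constant quoted in the abstract

def t_pdf(x, nu=NU):
    """Density of the Student t distribution with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def t_cdf(x, nu=NU, steps=2000):
    """t cdf via Simpson's rule on [0, |x|], using symmetry about zero."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def logistic_cdf(x):
    """Cdf of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-x))

def scaled_t_cdf(x):
    """Cdf of T / 0.634 with T ~ t_8: P(T / 0.634 <= x) = P(T <= 0.634 * x)."""
    return t_cdf(SCALE * x)

# Largest cdf discrepancy on an (arbitrary) grid from -8 to 8.
grid = [i / 10 for i in range(-80, 81)]
max_gap = max(abs(logistic_cdf(x) - scaled_t_cdf(x)) for x in grid)
print(f"maximum cdf difference on the grid: {max_gap:.4f}")
```

A small maximum difference supports replacing the logistic link with a scaled t distribution, which can in turn be written as a scale mixture of normals, the kind of representation a hierarchical normal mixture model needs.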
