Fine-Grain Weighted Logistic Regression Model (가중치 세분화 기반의 로지스틱 회귀분석 모델)

  • Lee, Chang-Hwan
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.9 / pp.77-81 / 2016
  • Logistic regression (LR) has been widely used for predicting the relationships among variables in various fields. We propose a new logistic regression model with a fine-grained weighting method, called value weighted logistic regression, which assigns a different weight to each feature value. A gradient approach is used to obtain the optimal weights of the feature values. We conduct experiments on several data sets, and the results show that the proposed method meaningfully improves prediction accuracy.
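The abstract does not give the value-weighting scheme itself, so the following is only a minimal sketch of a related, standard idea: one-hot encoding categorical features so that ordinary logistic regression learns one weight per feature value, fitted by batch gradient descent. The function names (`one_hot`, `fit_logistic`, `predict`), the learning rate, and the toy data are all assumptions made for illustration, not the paper's method.

```python
import numpy as np

def one_hot(X_cat):
    """One-hot encode integer-coded categorical columns.

    Each (feature, value) pair becomes its own indicator column, so a
    linear model gets one weight per feature value.
    """
    columns = []
    for j in range(X_cat.shape[1]):
        values = np.unique(X_cat[:, j])
        columns.append((X_cat[:, [j]] == values[None, :]).astype(float))
    return np.hstack(columns)

def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Fit weights and bias by batch gradient descent on the mean logistic loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient w.r.t. the weights
        b -= lr * np.mean(p - y)                # gradient w.r.t. the bias
    return w, b

def predict(X, w, b):
    """Predict class 1 when the linear score is positive (probability > 0.5)."""
    return (X @ w + b > 0).astype(float)

# Hypothetical toy data: two categorical features with three values each;
# the label depends only on whether the first feature takes the value 2.
rng = np.random.default_rng(0)
X_cat = rng.integers(0, 3, size=(200, 2))
y = (X_cat[:, 0] == 2).astype(float)

X = one_hot(X_cat)
w, b = fit_logistic(X, y)
accuracy = np.mean(predict(X, w, b) == y)
```

Here `w` holds one learned weight per (feature, value) pair, which is the granularity the abstract describes; how the proposed model combines such weights with the regression coefficients is not stated in the source.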

Information Theoretic Standardized Logistic Regression Coefficients with Various Coefficients of Determination

  • Hong Chong-Sun;Ryu Hyeon-Sang
    • Communications for Statistical Applications and Methods / v.13 no.1 / pp.49-60 / 2006
  • There are six approaches to constructing standardized coefficients for logistic regression. The standardized coefficient based on Kruskal's information theory is known to be the best from a conceptual standpoint. To calculate this standardized coefficient, the coefficient of determination based on entropy loss is usually chosen from among the many coefficients of determination for logistic regression. In this paper, this standardized coefficient is obtained using four coefficients of determination that have the most intuitively reasonable interpretation as a proportional reduction in error measure for logistic regression. These four versions of the sixth standardized coefficient are compared with the other kinds of standardized coefficients.

Performance Comparison of Mahalanobis-Taguchi System and Logistic Regression : A Case Study (마할라노비스-다구치 시스템과 로지스틱 회귀의 성능비교 : 사례연구)

  • Lee, Seung-Hoon;Lim, Geun
    • Journal of Korean Institute of Industrial Engineers / v.39 no.5 / pp.393-402 / 2013
  • The Mahalanobis-Taguchi System (MTS) is a diagnostic and predictive method for multivariate data. In the MTS, the Mahalanobis space (MS) of the reference group is obtained using the standardized variables of normal data. The Mahalanobis space can be used for multi-class classification. Once the MS is established, the useful set of variables is identified using orthogonal arrays and signal-to-noise ratios to assist in model analysis or diagnosis. Several other techniques have also been used for classification, such as linear discriminant analysis, logistic regression, decision trees, and neural networks. The goal of this case study is to compare the performance of the Mahalanobis-Taguchi System and logistic regression on a data set.
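As a rough illustration of how a Mahalanobis space can be built from standardized normal data, the sketch below standardizes a reference group, inverts its correlation matrix, and computes the Mahalanobis distance divided by the number of variables, a scaling commonly used in MTS. The function names and the synthetic data are assumptions for illustration; the case study's data set and variable-selection step (orthogonal arrays, signal-to-noise ratios) are not reproduced.

```python
import numpy as np

def fit_mahalanobis_space(normal):
    """Estimate the reference-group mean, standard deviation, and the
    inverse correlation matrix from rows of 'normal' observations."""
    mean = normal.mean(axis=0)
    std = normal.std(axis=0, ddof=1)
    z = (normal - mean) / std
    corr = np.corrcoef(z, rowvar=False)
    return mean, std, np.linalg.inv(corr)

def scaled_md(x, mean, std, corr_inv):
    """Mahalanobis distance of each row of x from the reference group,
    divided by the number of variables k."""
    z = (np.atleast_2d(x) - mean) / std
    k = z.shape[1]
    return np.einsum("ij,jk,ik->i", z, corr_inv, z) / k

# Hypothetical data: 100 'normal' observations of 3 variables,
# plus one observation far from the reference group.
rng = np.random.default_rng(0)
normal = rng.normal(size=(100, 3))
abnormal = np.array([5.0, -5.0, 5.0])

mean, std, corr_inv = fit_mahalanobis_space(normal)
md_normal = scaled_md(normal, mean, std, corr_inv)
md_abnormal = scaled_md(abnormal, mean, std, corr_inv)[0]
```

With this scaling the distances of the reference group average (n - 1)/n, which is close to 1, so an observation with a distance well above 1 is a candidate abnormal case. Any decision threshold would have to be chosen from data and is not specified in the abstract.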

Application of Logistic Regression Model and Its Validation for Landslide Susceptibility Mapping Using GIS and Remote Sensing Data at Penang, Malaysia

  • Lee, Saro
    • Proceedings of the KSRS Conference / 2004.10a / pp.310-313 / 2004
  • The aim of this study is to evaluate the hazard of landslides at Penang, Malaysia, using a Geographic Information System (GIS) and remote sensing. Landslide locations were identified in the study area from interpretation of aerial photographs and from field surveys. Topographical and geological data and satellite images were collected, processed, and constructed into a spatial database using GIS and image processing. The factors chosen that influence landslide occurrence were: topographic slope, topographic aspect, topographic curvature and distance from drainage, all from the topographic database; lithology and distance from lineament, taken from the geologic database; land use from TM satellite images; and the vegetation index value from SPOT satellite images. Landslide hazard areas were analysed and mapped from the landslide-occurrence factors using a logistic regression model. The results of the analysis were verified using the landslide location data and compared with a probabilistic model. The validation results showed that the logistic regression model has better prediction accuracy than the probabilistic model.

Analyzing Survival Data as Binary Outcomes with Logistic Regression

  • Lim, Jo-Han;Lee, Kyeong-Eun;Hahn, Kyu-S.;Park, Kun-Woo
    • Communications for Statistical Applications and Methods / v.17 no.1 / pp.117-126 / 2010
  • Clinical researchers often analyze survival data as binary outcomes using the logistic regression method. This paper examines the information loss resulting from analyzing survival time as binary outcomes. We first demonstrate that, under the proportional hazard assumption, this binary discretization does result in a significant information loss. Second, when fitting a logistic model to survival time data, researchers inadvertently use the maximal statistic. We implement a numerical study to examine the properties of the reference distribution for this statistic. Finally, we show that the logistic regression method can still be a useful tool for analyzing survival data, in particular when the proportional hazard assumption is questionable.

Comparison Study for Data Fusion and Clustering Classification Performances (다구찌 디자인을 이용한 데이터 퓨전 및 군집분석 분류 성능 비교)

  • 신형원;손소영
    • Proceedings of the Korean Operations and Management Science Society Conference / 2000.04a / pp.601-604 / 2000
  • In this paper, we compare the classification performance of data fusion and clustering algorithms (Data Bagging, Variable Selection Bagging, Parameter Combining, Clustering) to that of logistic regression, taking into account various characteristics of the input data. The four factors used to simulate the logistic model are (1) correlation among input variables, (2) variance of observations, (3) training data size, and (4) input-output function. Since the relationship between input and output is not typically known, we use a Taguchi design and treat it as a noise factor to make the study results more practical. The experimental results indicate the following. Clustering-based logistic regression provides the highest classification accuracy when input variables are weakly correlated and the variance of the data is high. When there is high correlation among input variables, variable bagging performs better than logistic regression. When there is strong correlation among input variables and high variance between observations, bagging appears to be marginally better than logistic regression, but the difference was not significant.

Application of Crossover Analysis-logistic Regression in the Assessment of Gene-environmental Interactions for Colorectal Cancer

  • Wu, Ya-Zhou;Yang, Huan;Zhang, Ling;Zhang, Yan-Qi;Liu, Ling;Yi, Dong;Cao, Jia
    • Asian Pacific Journal of Cancer Prevention / v.13 no.5 / pp.2031-2037 / 2012
  • Background: Analysis of gene-gene and gene-environment interactions for complex multifactorial human disease faces challenges regarding statistical methodology. One major difficulty is partly due to the limitations of parametric statistical methods for detecting gene effects that depend solely or partially on interactions with other genes or environmental exposures. In our previous case-control study in Chongqing, China, we found an increased risk of colorectal cancer in individuals carrying a novel homozygous TT at locus rs1329149 and a known homozygous AA at locus rs671. Methods: In this study, we propose a statistical method, crossover analysis in combination with a logistic regression model, to further analyze our data and assess gene-environmental interactions for colorectal cancer. Results: The crossover analysis showed possible multiplicative interactions of loci rs671 and rs1329149 with alcohol consumption. Multifactorial logistic regression analysis also confirmed that loci rs671 and rs1329149 both exhibited a multiplicative interaction with alcohol consumption. Moreover, the crossover analysis revealed additive interactions between every pair of the four risk factors (gene loci rs671 and rs1329149, age, and alcohol consumption), which were not evident in the logistic regression. Conclusions: The method based on crossover analysis-logistic regression is successful in assessing additive and multiplicative gene-environment interactions, and in revealing synergistic effects of gene loci rs671 and rs1329149 with alcohol consumption in the pathogenesis and development of colorectal cancer.
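To make the distinction between multiplicative and additive interaction concrete, the sketch below computes two commonly used measures from three odds ratios, each taken relative to the doubly unexposed group: the ratio of the joint odds ratio to the product of the single-factor odds ratios, and the relative excess risk due to interaction (RERI). The abstract does not say which indices the authors used, and the odds ratios below are hypothetical values chosen only for illustration.

```python
def interaction_measures(or_gene_only, or_exposure_only, or_both):
    """Return (multiplicative ratio, RERI) from odds ratios that share the
    same reference group (neither risk genotype nor exposure).

    multiplicative ratio > 1: joint effect exceeds the product of the
        separate effects (positive multiplicative interaction).
    RERI > 0: joint effect exceeds the sum of the separate excess effects
        (positive additive interaction).
    """
    multiplicative = or_both / (or_gene_only * or_exposure_only)
    reri = or_both - or_gene_only - or_exposure_only + 1.0
    return multiplicative, reri

# Hypothetical odds ratios, not taken from the paper.
multiplicative, reri = interaction_measures(
    or_gene_only=1.5, or_exposure_only=2.0, or_both=6.0
)
```

With these made-up inputs the multiplicative ratio is 2.0 and the RERI is 3.5, so both scales would indicate a positive interaction. The two scales need not agree in general, which is consistent with the abstract's report of additive interactions that were not evident in the logistic regression.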

An Analysis of Factors Relating to Agricultural Machinery Farm-Work Accidents Using Logistic Regression

  • Kim, Byounggap;Yum, Sunghyun;Kim, Yu-Yong;Yun, Namkyu;Shin, Seung-Yeoub;You, Seokcheol
    • Journal of Biosystems Engineering / v.39 no.3 / pp.151-157 / 2014
  • Purpose: To develop strategies for preventing farm-work accidents involving agricultural machinery, this paper examines the influential factors and quantifies their effects using logistic regression. Methods: Based on the results of a survey on farm-work accidents conducted by the National Academy of Agricultural Science, 21 tentative independent variables were selected. Before these variables were used in the regression, the presence of multicollinearity was examined by comparing correlation coefficients, checking the statistical significance of the coefficients in a simple linear regression model, and calculating the variance inflation factor. A logistic regression model and a method for determining its goodness of fit were defined. Results: Among the 21 independent variables, 13 were not collinear with each other. A logistic regression analysis using these variables showed that the model was significant and acceptable, with a deviance of 714.053. Parameter estimation showed that four variables (age, power tiller ownership, cognizance of the government's safety policy, and consciousness of safety) were significant. The logistic regression model predicted that the former two increased accident odds by 1.027 and 8.506 times, respectively, while the latter two decreased the odds by 0.243 and 0.545 times, respectively. Conclusions: Prevention strategies against factors causing an accident, such as the age of farmers and the use of a power tiller, are necessary. In addition, more efficient training to raise farmers' consciousness of safety must be provided.
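Assuming the four factors reported in the abstract are odds ratios, that is, exponentials of the fitted logistic regression coefficients, the sketch below shows how they relate to coefficients and how an odds ratio compounds over several units of a variable. The helper names and the ten-unit example are illustrative assumptions; the abstract does not state the unit of the age variable.

```python
import math

def coefficient_from_odds_ratio(odds_ratio):
    """A logistic regression coefficient is the natural log of its odds ratio."""
    return math.log(odds_ratio)

def scaled_odds(baseline_odds, odds_ratio, units=1):
    """Odds after increasing a variable by `units`, given its per-unit odds ratio."""
    return baseline_odds * odds_ratio ** units

# Odds ratios quoted in the abstract.
odds_ratios = {
    "age": 1.027,
    "power tiller ownership": 8.506,
    "cognizance of safety policy": 0.243,
    "consciousness of safety": 0.545,
}

# Positive coefficients raise the accident odds; negative ones lower them.
coefficients = {
    name: coefficient_from_odds_ratio(value) for name, value in odds_ratios.items()
}

# Compounding: a ten-unit increase in age multiplies the odds by 1.027 ** 10,
# roughly 1.31, assuming the effect is linear on the log-odds scale.
ten_unit_age_effect = scaled_odds(1.0, odds_ratios["age"], units=10)
```

An odds ratio below 1 corresponds to a negative coefficient, which is one way to read the two factors that the abstract reports as decreasing the odds.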

Logistic regression model for major separation rate

  • Choi, Jae-Sung
    • Journal of the Korean Data and Information Science Society / v.13 no.2 / pp.129-138 / 2002
  • This paper deals with logistic regression models for analysing separation rates from majors. The model-building procedure shows how to incorporate the effects of factors arising from a three-way nested sampling scheme and discusses which characteristics should be considered as independent variables that directly affect the rates.