• Title/Summary/Keyword: logistic regression

Graphical Diagnostics for Logistic Regression

  • Lee, Hak-Bae
    • Proceedings of the Korean Statistical Society Conference / 2003.05a / pp.213-217 / 2003
  • In this paper we discuss graphical and diagnostic methods for logistic regression, in which the response is the number of successes in a fixed number of trials.

Performance Comparison of Logistic Regression Algorithms on RHadoop

  • Jung, Byung Ho;Lim, Dong Hoon
    • Journal of the Korea Society of Computer and Information / v.22 no.4 / pp.9-16 / 2017
  • Machine learning has found widespread applications in many domains of everyday life. Logistic regression is a classification method in machine learning and is widely used in fields including medicine, economics, marketing, and the social sciences. In this paper, we present MapReduce implementations of three existing algorithms for logistic regression, namely the Gradient Descent, Cost Minimization, and Newton-Raphson algorithms, on RHadoop, which integrates R with the Hadoop environment and is applicable to large-scale data. We compare the performance of these algorithms in estimating logistic regression coefficients on real and simulated data sets. We also compare the performance of the RHadoop and RHIPE platforms. The experiments showed that the Newton-Raphson algorithm performed better than the Gradient Descent and Cost Minimization algorithms on all data tested, and that RHadoop outperformed RHIPE on real data, while the opposite held on simulated data.
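
The contrast between Gradient Descent and Newton-Raphson described in the abstract above can be illustrated on a single machine. The following is a minimal sketch, not the paper's MapReduce code: the toy data, the learning rate `lr`, the tolerance `tol`, and the function names are all assumptions made for illustration. The Cost Minimization algorithm is omitted because the abstract does not define it.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_and_hessian(b0, b1, xs, ys):
    """Gradient and Hessian of the mean negative log-likelihood
    for a model with intercept b0 and a single slope b1."""
    n = len(xs)
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        w = p * (1.0 - p)          # curvature weight of this observation
        g0 += (p - y) / n
        g1 += (p - y) * x / n
        h00 += w / n
        h01 += w * x / n
        h11 += w * x * x / n
    return (g0, g1), (h00, h01, h11)

def fit(xs, ys, method, lr=0.1, tol=1e-8, max_iter=100_000):
    """Return ((b0, b1), iterations) using 'newton' or 'gd' updates."""
    b0 = b1 = 0.0
    for iteration in range(max_iter):
        (g0, g1), (h00, h01, h11) = grad_and_hessian(b0, b1, xs, ys)
        if max(abs(g0), abs(g1)) < tol:
            return (b0, b1), iteration
        if method == "newton":
            # Solve the 2x2 system H d = g explicitly, then step by -d.
            det = h00 * h11 - h01 * h01
            d0 = (h11 * g0 - h01 * g1) / det
            d1 = (h00 * g1 - h01 * g0) / det
            b0, b1 = b0 - d0, b1 - d1
        else:
            # Plain gradient descent with a fixed step size.
            b0, b1 = b0 - lr * g0, b1 - lr * g1
    raise RuntimeError("did not converge")

# Toy, non-separable data (assumed for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

newton_beta, newton_iters = fit(xs, ys, "newton")
gd_beta, gd_iters = fit(xs, ys, "gd")
print(newton_beta, newton_iters)
print(gd_beta, gd_iters)
```

Both methods minimize the same objective, so they should reach the same coefficients. Newton-Raphson uses curvature information and typically needs far fewer iterations, at the cost of forming and solving a Hessian system at each step.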

Bayesian Logistic Regression for Human Detection (Human Detection 을 위한 Bayesian Logistic Regression)

  • Aurrahman, Dhi;Setiawan, Nurul Arif;Lee, Chil-Woo
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회 학술대회논문집) / 2008.02a / pp.569-572 / 2008
  • The possibility of extending human detection solutions as a plug-in for the vision-based Human Computer Interaction domain is very attractive, given the success of combining machine learning theory with computer vision. Bayesian logistic regression is a powerful classifier that offers sparseness and high accuracy. The difficulty of finding people in an image is addressed by implementing this Bayesian model as a classifier. A comparison with other established classifiers, e.g. SVM and RVM, supports the acceptance of this method for the human detection problem. Our experimental results show the good performance of Bayesian logistic regression in the human detection problem, both in trade-off curves (ROC, DET) and in a real implementation, compared to SVM and RVM.

Sparse Multinomial Kernel Logistic Regression

  • Shim, Joo-Yong;Bae, Jong-Sig;Hwang, Chang-Ha
    • Communications for Statistical Applications and Methods / v.15 no.1 / pp.43-50 / 2008
  • Multinomial logistic regression is a well-known multiclass classification method in the field of statistical learning. More recently, sparse multinomial logistic regression models have found application in microarray classification, where explicit identification of the most informative observations is of value. In this paper, we propose a sparse multinomial kernel logistic regression model, in which the sparsity arises from the use of a Laplacian prior, and a fast exact algorithm is derived by employing a bound optimization approach. Experimental results are then presented to indicate the performance of the proposed procedure.

Study on Accident Prediction Models in Urban Railway Casualty Accidents Using Logistic Regression Analysis Model (로지스틱회귀분석 모델을 활용한 도시철도 사상사고 사고예측모형 개발에 대한 연구)

  • Jin, Soo-Bong;Lee, Jong-Woo
    • Journal of the Korean Society for Railway / v.20 no.4 / pp.482-490 / 2017
  • This study is a statistical investigation of railway accidents aimed at predicting and classifying accident severity. Linear regression models have some difficulties in classifying accident severity, but a logistic regression model can be used to overcome these weaknesses. The logistic regression model is applied to escalator (E/S) accidents in all stations on Lines 5 to 8 of the Seoul Metro. The predictor variables considered for E/S accidents in urban railway stations include passenger age, drinking, overall situation, behavior, and handrail grip. In the overall accuracy analysis, the logistic regression model achieved an accuracy of 76.7%. According to the results of this analysis, the accuracy and the level of significance of the logistic regression analysis make it a useful data mining technique for establishing an accident severity prediction model for urban railway casualty accidents.

Prediction of Galloping Accidents in Power Transmission Line Using Logistic Regression Analysis

  • Lee, Junghoon;Jung, Ho-Yeon;Koo, J.R.;Yoon, Yoonjin;Jung, Hyung-Jo
    • Journal of Electrical Engineering and Technology / v.12 no.2 / pp.969-980 / 2017
  • Galloping is one of the most serious vibration problems in transmission lines. Power lines can be extensively damaged owing to aerodynamic instabilities caused by ice accretion. In this study, the probability of accidents induced by the galloping phenomenon was analyzed using logistic regression analysis. As former studies have generally concluded, the main factors considered were local weather factors and physical factors of power delivery systems. Since transmission towers outnumber weather observatories, the weather factors were interpolated, specifically by Kriging, prior to forming the galloping accident estimation model. Physical factors were provided by Korea Electric Power Corporation; however, because of the large number of explanatory variables, variable selection was conducted, leaving a total of 11 variables. Before forming the estimation model, with 84 provided galloping cases, 840 non-galloped cases were chosen out of 13 billion cases. The prediction model for galloping accidents was formed with a logistic regression model and validated with a 4-fold validation method, and the corresponding AUC value of the ROC curve was used to assess the discrimination level of the estimation models. As a result, logistic regression analysis effectively discriminated the power lines that experienced galloping accidents from those that did not.
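
The validation scheme mentioned in the abstract above (k-fold splitting with AUC as the discrimination measure) can be sketched in a few lines. This is an illustrative sketch only: the helper names `roc_auc` and `k_fold_indices`, the interleaved fold assignment, and the example scores are assumptions, not details taken from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def k_fold_indices(n, k):
    """Assign indices 0..n-1 to k folds in an interleaved fashion."""
    return [list(range(i, n, k)) for i in range(k)]

# Hypothetical labels (1 = galloping, 0 = no galloping) and model scores.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)

# With k = 4, each fold is held out once while the model is fit on the rest.
folds = k_fold_indices(8, 4)
print(auc, folds)
```

In a full k-fold run, the model would be refit on the three training folds each time and `roc_auc` evaluated on the held-out fold. With a strongly imbalanced sample such as 84 events against 840 non-events, stratifying the folds by class would be a reasonable refinement.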

Penalized logistic regression models for determining the discharge of dyspnea patients (호흡곤란 환자 퇴원 결정을 위한 벌점 로지스틱 회귀모형)

  • Park, Cheolyong;Kye, Myo Jin
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.125-133 / 2013
  • In this paper, penalized binary logistic regression models are employed as statistical models for determining the discharge of 668 patients with a chief complaint of dyspnea, based on 11 blood test results. Specifically, the ridge model based on the $L^2$ penalty and the Lasso model based on the $L^1$ penalty are considered. In the comparison of prediction accuracy, our models are compared with the logistic regression model using all 11 explanatory variables and the model using the variables selected by a variable selection method. The results show that the prediction accuracy of the ridge logistic regression model is the best among the 4 models based on 10-fold cross-validation.
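
For reference, the two penalized objectives named in the abstract above can be written out as follows. This uses one common formulation; the scaling of the likelihood and the convention of leaving the intercept $\beta_0$ unpenalized are assumptions, since the abstract does not state them. Here $n = 668$ patients, $p = 11$ blood test variables, and $\lambda \ge 0$ is a tuning parameter, typically chosen by cross-validation.

$$
\ell(\beta_0, \beta) = \sum_{i=1}^{n} \left[ y_i \eta_i - \log\left(1 + e^{\eta_i}\right) \right], \qquad \eta_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}
$$

$$
\text{ridge: } \min_{\beta_0, \beta} \; -\ell(\beta_0, \beta) + \lambda \sum_{j=1}^{p} \beta_j^2
\qquad
\text{Lasso: } \min_{\beta_0, \beta} \; -\ell(\beta_0, \beta) + \lambda \sum_{j=1}^{p} |\beta_j|
$$

The $L^2$ penalty shrinks all coefficients toward zero without eliminating any, whereas the $L^1$ penalty can set some coefficients exactly to zero and so also performs variable selection.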

Steal Success Model for 2007 Korean Professional Baseball Games (2007년 한국프로야구에서 도루성공모형)

  • Hong, Chong-Sun;Choi, Jeong-Min
    • The Korean Journal of Applied Statistics / v.21 no.3 / pp.455-468 / 2008
  • Extensive baseball game records show that the steal plays an important role in the outcome of games. To study the success or failure of the steal in baseball games, logistic regression models are developed based on the 2007 Korean professional baseball games. The results of the logistic regression models are compared with those of discriminant models. It is found that the logistic regression analysis performs better than the discriminant analysis. We also consider an alternative logistic regression model based on categorical data transformed from continuous data that are not easily obtained.

Evaluation of the Probability of Detection Surface for ODSCC in Steam Generator Tubes Using Multivariate Logistic Regression (다변량 로지스틱 회귀분석을 이용한 증기발생기 전열관 ODSCC의 POD곡면 분석)

  • Lee, Jae-Bong;Park, Jai-Hak;Kim, Hong-Deok;Chung, Han-Sub
    • Proceedings of the KSME Conference / 2007.05a / pp.250-255 / 2007
  • Steam generator tubes play an important role in safety because they constitute one of the primary barriers between the radioactive and non-radioactive sides of the nuclear power plant. For this reason, the integrity of the tubes is essential in minimizing the possibility of leakage of radioactive water. The integrity of the tubes is evaluated based on NDE (non-destructive evaluation) inspection results. In particular, the ECT (eddy current test) method is usually used to detect flaws in steam generator tubes. However, the detection capacity of NDE is not perfect, and not all of the "real flaws" that actually exist in steam generator tubes are revealed by NDE results. Therefore, the reliability of the NDE system is one of the essential parts in assessing the integrity of steam generators. In this study, the POD (probability of detection) of the ECT system for ODSCC in steam generator tubes is evaluated using multivariate logistic regression. The cracked tube specimens are made from withdrawn steam generator tubes, so the cracks are real rather than artificial. Using the multivariate logistic regression method, continuous POD surfaces are evaluated from hit (detection) and miss (no detection) binary data obtained from destructive and non-destructive evaluation of the cracked tubes. The length and depth of cracks are considered in the multivariate logistic regression, and their effects on detection capacity are evaluated.
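
A POD surface of the kind described in the abstract above can be sketched as a logistic function of crack length and depth. The sketch below is illustrative only: the linear form of the predictor, the coefficient values in `COEF`, and the units are hypothetical and are not taken from the paper, which would estimate its coefficients from the hit/miss data.

```python
import math

def pod(length, depth, b0, b_length, b_depth):
    """Probability of detection for a crack with the given length and depth,
    modeled as a logistic function of a linear predictor."""
    eta = b0 + b_length * length + b_depth * depth
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients, chosen only for illustration (not from the paper).
COEF = dict(b0=-3.0, b_length=0.2, b_depth=0.05)

# Evaluate the surface on a small grid.
# Assumed units: length in mm, depth in percent of wall thickness.
lengths = [0.0, 5.0, 10.0, 15.0]
depths = [0.0, 20.0, 40.0, 60.0, 80.0]
surface = [[pod(l, d, **COEF) for d in depths] for l in lengths]

for l, row in zip(lengths, surface):
    print(l, [round(v, 3) for v in row])
```

With positive coefficients for both variables, the surface rises monotonically in length and depth, which matches the intuition that larger cracks are easier to detect.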

Comparing Risk-adjusted In-hospital Mortality for Craniotomies : Logistic Regression versus Multilevel Analysis (로지스틱 회귀분석과 다수준 분석을 이용한 Craniotomy 환자의 사망률 평가결과의 일치도 분석)

  • Kim, Sun-Hee;Lee, Kwang-Soo
    • The Korean Journal of Health Service Management / v.9 no.2 / pp.81-88 / 2015
  • The purpose of this study was to compare the risk-adjusted in-hospital mortality for craniotomies between logistic regression and multilevel analysis. By using patient sample data from the Health Insurance Review & Assessment Service, in-patients with a craniotomy were selected as the survey target. The sample data were collected from a total of 2,335 patients from 90 hospitals and were analyzed with SAS 9.3. Comparing the results of the existing logistic regression analysis and the multilevel analysis, the multilevel analysis provided a better model than logistic regression. The intra-class correlation (ICC) was 18.0%. It was found that risk-adjusted in-hospital mortality for craniotomies may vary between hospitals. The agreement between the two methods, measured by the kappa coefficient, was good for the risk-adjusted in-hospital mortality for craniotomies, but the factors influencing the outcome were different.