• Title/Summary/Keyword: multiple logistic regression analysis (다중 로지스틱 회귀분석)

Multivariate Analysis for Clinicians (임상의를 위한 다변량 분석의 실제)

  • Oh, Joo Han;Chung, Seok Won
    • Clinics in Shoulder and Elbow / v.16 no.1 / pp.63-72 / 2013
  • In medical research, multivariate analysis, especially multiple regression analysis, is used to analyze the influence of multiple variables on a result. Because many variables are involved, multiple regression analysis must address the selection of variables to include in the model and the problem of multicollinearity, in addition to the basic assumptions of regression analysis. The fit of a multiple regression model is expressed by the coefficient of determination, $R^2$, and the influence of each independent variable on the result by its regression coefficient, $\beta$. Multiple regression analysis can be divided into multiple linear regression analysis, multiple logistic regression analysis, and Cox regression analysis according to the type of dependent variable (continuous variable, categorical variable (binary logit), and state variable, respectively), and the influence of variables on the result is evaluated by the regression coefficient $\beta$, odds ratio, and hazard ratio, respectively. Knowledge of multivariate analysis enables clinicians to analyze results accurately and to design further research efficiently.
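
The link between a logistic regression coefficient and the odds ratio mentioned in this abstract can be sketched as follows. This is a minimal illustration and not code from the paper: the function name `odds_ratio_with_ci`, the coefficient, and the standard error are all hypothetical, and the interval is the usual Wald approximation.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient to an odds ratio.

    Returns (odds ratio, lower bound, upper bound), where the bounds are a
    Wald-type interval exp(beta -/+ z * se). z=1.96 gives roughly 95% coverage.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error, for illustration only.
or_, lo, hi = odds_ratio_with_ci(beta=0.693, se=0.25)
print(f"OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

An odds ratio above 1 indicates that a one-unit increase in the predictor raises the odds of the outcome; an interval that excludes 1 corresponds to a significant Wald test at the matching level.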

Principal Components Regression in Logistic Model (로지스틱모형에서의 주성분회귀)

  • Kim, Bu-Yong;Kahng, Myung-Wook
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.571-580 / 2008
  • Logistic regression analysis is widely used in customer relationship management and credit risk management. It is well known that maximum likelihood estimation is not appropriate when multicollinearity exists among the regressors. We therefore propose logistic principal components regression to deal with the multicollinearity problem. In particular, a new method is suggested for selecting the proper principal components. The selection method is based on the condition index instead of the eigenvalue. When a condition index is larger than the upper limit of the cutoff value, the principal component corresponding to that index is removed from the estimation. When a condition index lies between the upper and lower limits, a hypothesis test is applied sequentially to decide whether to eliminate the principal component. The limits are obtained from a linear model constructed on the basis of conjoint analysis. The proposed method is evaluated by means of the variance of the estimates and the correct classification rate. The results indicate that the proposed method is superior to the existing method in terms of efficiency and goodness of fit.

Development of model for prediction of land sliding at steep slopes (급경사지 붕괴 예측을 위한 모형 개발)

  • Park, Ki-Byung;Joo, Yong-Sung;Park, Dug-Keun
    • Journal of the Korean Data and Information Science Society / v.22 no.4 / pp.691-699 / 2011
  • Landslides are a well-known type of natural disaster. As part of the effort to reduce landslide damage, many researchers have worked on improving predictive ability. However, because previous studies were conducted mostly by non-statisticians, the models proposed so far were hardly statistically justifiable. In this paper, we predict the probability of a landslide using a logistic regression model. Since most of the explanatory variables under consideration were correlated, we propose the final model after a backward elimination process.

Analysis of Predictors of Phonological Variation Realization (음운 변동 실현 오류의 예측 인자 분석)

  • An, Sung-min
    • Annual Conference on Human and Language Technology / 2021.10a / pp.498-500 / 2021
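  • The aim of this study is to identify which variables influence errors in phonological variation, in order to provide basic data for research and education on phonological variation. To this end, liquidization (유음화) pronunciation data were used, with gender, direction of liquidization, part of speech, word frequency, number of syllables in the word, and whether the liquidization was pronounced correctly set as variables. To find independent variables that may affect the rate of correct liquidization, the chi-square test and the variance inflation factor for multicollinearity were checked first. Significant predictors were then examined through multiple logistic regression analysis and odds ratios. As a result, among the five independent variables, gender, direction of liquidization, and part of speech were confirmed to be the main factors affecting errors.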
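
The variance inflation factor check mentioned in this abstract can be illustrated with a small sketch. It covers only the special case of two predictors, where the VIF reduces to 1 / (1 - r^2) with r the Pearson correlation between them. The helper names and the data are hypothetical and are not taken from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical predictor values, for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
vif = vif_two_predictors(x1, x2)
print(f"VIF = {vif:.2f}")
```

With more than two predictors, each VIF instead comes from regressing one predictor on all the others. Cutoffs such as 5 or 10 are common rules of thumb rather than fixed thresholds.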

Logistic Regressions with Sensory Evaluation Data about Hanwoo Steer Beef (한우 거세우 고기 관능평가 데이터의 로지스틱 회귀분석)

  • Lee, Hye-Jung;Kim, Jae-Hee
    • The Korean Journal of Applied Statistics / v.23 no.5 / pp.857-870 / 2010
  • This study was conducted to investigate the relationship between socio-demographic factors and Korean consumers' palatability evaluation grades, using Hanwoo sensory evaluation data collected from 2006 to 2008 by the National Institute of Animal Science. The dichotomous logistic regression model and the multinomial logistic regression model are fitted with independent variables such as consumer living location, age, gender, occupation, monthly income, and beef cut, with the palatability grade as the categorical dependent variable and tenderness, flavor, and juiciness as the continuous dependent variables. A stepwise variable selection procedure is incorporated to find the final model, and odds ratios are calculated to find the associations between categories.

Comparison of Multinomial Logit and Logistic Regression on Disability Pensioners' Characteristic (다범주 자료의 다항로짓 모형과 로지스틱 회귀모형 비교;장애연금 특성분석 중심으로)

  • Kim, Mi-Jung
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.589-602 / 2008
  • This article studies the characteristics of disability pensioners using multinomial logit and logistic regression models. Seven factors are examined to determine whether each is reflected in the degree of disability under the disability pension. By incorporating the multinomial logit and logistic regression models, the effect and characteristics of the seven factors on the degree of disability are investigated. The results show that all seven factors are significant for the degree of disability; five of them (age, sex, type of coverage, type of category, and insured duration) show a trend in the degree of disability, while the other two (cause of disability and class of standard monthly income) have no effect on the trend. The results of these analyses may be useful for disability pension management.
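
As a rough illustration of how a multinomial logit model turns linear predictors into category probabilities, the sketch below uses the common baseline-category parameterization, in which the reference category has a linear predictor of zero. The function name and the input values are hypothetical and are not the article's fitted model.

```python
import math

def multinomial_logit_probs(linear_predictors):
    """Category probabilities under a baseline-category multinomial logit.

    linear_predictors holds eta_k for each non-reference category; the
    reference category (returned first) has eta = 0 by construction.
    """
    etas = [0.0] + list(linear_predictors)
    shift = max(etas)  # subtract the maximum for numerical stability
    exps = [math.exp(e - shift) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical linear predictors for two non-reference categories.
probs = multinomial_logit_probs([0.5, -1.0])
print([round(p, 3) for p in probs])
```

Each linear predictor equals the log-odds of its category against the reference, so `math.log(probs[1] / probs[0])` recovers the first input value. With a single non-reference category the model reduces to ordinary binary logistic regression.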

Principal Components Logistic Regression based on Robust Estimation (로버스트추정에 바탕을 둔 주성분로지스틱회귀)

  • Kim, Bu-Yong;Kahng, Myung-Wook;Jang, Hea-Won
    • The Korean Journal of Applied Statistics / v.22 no.3 / pp.531-539 / 2009
  • Logistic regression is widely used as a data mining technique for customer relationship management. The maximum likelihood estimator has a highly inflated variance when multicollinearity exists among the regressors, and it is not robust against outliers. We therefore propose robust principal components logistic regression to deal with both the multicollinearity and outlier problems. A procedure based on the condition index is suggested for the selection of principal components. When a condition index is larger than the cutoff value obtained from a model constructed on the basis of conjoint analysis, the corresponding principal component is removed from the logistic model. In addition, we employ an algorithm for robust estimation, which strives to dampen the effect of outliers by applying appropriate weights and factors to the leverage points and vertical outliers identified by the V-mask type criterion. The Monte Carlo simulation results indicate that the proposed procedure yields a higher rate of correct classification than the existing method.

A Comparative Experiment of Software Defect Prediction Models using Object Oriented Metrics (객체지향 메트릭을 이용한 결함 예측 모형의 실험적 비교)

  • Kim, Yun-Kyu;Kim, Tae-Yeon;Chae, Heung-Seok
    • Journal of KIISE:Computing Practices and Letters / v.15 no.8 / pp.596-600 / 2009
  • To support efficient management of software verification and validation activities, many defect prediction models based on object-oriented metrics have been proposed. They usually adopt logistic regression analysis and report a prediction correctness of about 60${\sim}$70%. We performed a similar experiment with Eclipse 3.3 to check their prediction effectiveness. However, the result shows a correctness of about 40%, which is much lower than the original results. We also found that univariate logistic regression analysis produces better results than multivariate logistic regression analysis.

Machine-Learning Evaluation of Factors Influencing Landslides (머신러닝기법을 이용한 산사태 발생인자의 영향도 분석)

  • Park, Seong-Yong;Moon, Seong-Woo;Choi, Jaewan;Seo, Yong-Seok
    • The Journal of Engineering Geology / v.31 no.4 / pp.701-718 / 2021
  • Geological field surveys and a series of laboratory tests were conducted to obtain data related to landslides in Sancheok-myeon, Chungju-si, Chungcheongbuk-do, South Korea, where many landslides occurred in the summer of 2020. The magnitudes of various factors' influence on landslide occurrence were evaluated using logistic regression analysis and an artificial neural network. Undisturbed specimens were sampled according to landslide occurrence, and dynamic cone penetration testing measured the depth of the soil layer during geological field surveys. Laboratory tests were performed following the standards of ASTM International. To solve the problem of multicollinearity, the variance inflation factor was calculated for all factors related to landslides, and then nine factors (shear strength, lithology, saturated water content, specific gravity, hydraulic conductivity, USCS, slope angle, and elevation) were determined as influential factors for consideration by machine learning techniques. Minimum-maximum normalization allowed the factors to be compared directly with each other. Logistic regression analysis identified soil depth, slope angle, saturated water content, and shear strength as having the greatest influence (in that order) on the occurrence of landslides. Artificial neural network analysis ranked factors by greatest influence in the order of slope angle, soil depth, saturated water content, and shear strength. Arithmetically averaging the effectiveness of both analyses found slope angle, soil depth, saturated water content, and shear strength as the top four factors. The sum of their effectiveness was ~70%.
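
The minimum-maximum normalization step mentioned in this abstract can be sketched as below. It rescales each factor to the range [0, 1] so that factors measured in different units become directly comparable. The function name and the sample values are hypothetical and are not data from the study.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant factor carries no contrast; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]

# Hypothetical slope angles in degrees, for illustration only.
slope_angle_deg = [12.0, 25.0, 31.0, 40.0]
scaled = min_max_normalize(slope_angle_deg)
print(scaled)
```

Because the transformation is linear, it preserves the ordering and relative spacing of the observations within each factor.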

Major Factors Influencing Landslide Occurrence along a Forest Road Determined Using Structural Equation Model Analysis and Logistic Regression Analysis (구조방정식과 로지스틱 회귀분석을 이용한 임도비탈면 산사태의 주요 영향인자 선정)

  • Kim, Hyeong-Sin;Moon, Seong-Woo;Seo, Yong-Seok
    • The Journal of Engineering Geology / v.32 no.4 / pp.585-596 / 2022
  • This study determined major factors influencing landslide occurrence along a forest road near Sangsan village, Sancheok-myeon, Chungju-si, Chungcheongbuk-do, South Korea. Within a 2 km radius of the study area, landslides occurred intensively during a period of heavy rainfall (August 2020). This makes study of the area advantageous, as it allows examination of the influence of only geological and topographic factors while excluding the effects of rainfall and vegetation. Data for 82 locations (37 experiencing landslides and 45 not) were obtained from geological surveys, laboratory tests, and geo-spatial analysis. After some data preprocessing (e.g., error filtering, minimum-maximum normalization, and multicollinearity checks), structural equation model (SEM) and logistic regression (LR) analyses were conducted. These showed regolith thickness, porosity, and saturated unit weight to be the factors with the most influence on landslide risk in the study area. The sums of the influence magnitudes of these factors are 71% in SEM and 83% in LR.