• Title/Summary/Keyword: Bayesian regression analysis (베이지안 회귀분석)


An Integrated Algorithm for Corporate Bankruptcy Prediction (기업부도예측을 위한 통합알고리즘)

  • Bae Jae-Gwon;Kim Jin-Hwa
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2006.06a
    • /
    • pp.195-202
    • /
    • 2006
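  • To predict corporate bankruptcy more effectively, this study proposes an integrated model that combines statistical and artificial intelligence methods. Specifically, we propose the Voting with Performance & Weights from ANN (WP-ANN) integrated model, which combines five methodologies: multivariate discriminant analysis and logistic regression, the most widely used statistical models, together with artificial neural networks, rule induction, and Bayesian networks, which have recently come into wide use as artificial intelligence methods. In experiments, the proposed WP-ANN integrated model showed the best prediction accuracy when compared with the single models (multivariate discriminant analysis, logistic regression, artificial neural network, rule induction, and Bayesian network). This study therefore shows that, for corporate bankruptcy prediction, the WP-ANN integrated model achieves better prediction accuracy than existing models.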


Bayesian logit models with auxiliary mixture sampling for analyzing diabetes diagnosis data (보조 혼합 샘플링을 이용한 베이지안 로지스틱 회귀모형 : 당뇨병 자료에 적용 및 분류에서의 성능 비교)

  • Rhee, Eun Hee;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics
    • /
    • v.35 no.1
    • /
    • pp.131-146
    • /
    • 2022
  • Logit models are commonly used to predict and classify categorical response variables. Most Bayesian approaches to logit models are implemented with the Metropolis-Hastings algorithm. However, the algorithm suffers from slow convergence and from the difficulty of choosing an adequate proposal distribution. Therefore, we use the auxiliary mixture sampler proposed by Frühwirth-Schnatter and Frühwirth (2007) to estimate logit models. This method introduces two sequences of auxiliary latent variables so that the logit model satisfies normality and linearity. As a result, the logit model can be easily implemented with Gibbs sampling. We applied the proposed method to diabetes data from the Community Health Survey (2020) of the Korea Disease Control and Prevention Agency and compared its performance with that of the Metropolis-Hastings algorithm. In addition, we showed that the logit model using auxiliary mixture sampling achieves classification performance comparable to that of machine learning models.

Development of Bayesian Multiple Quantile Regression model and Estimation of Future Design Rainfall with Increased Temperature (베이지안 다중분위회귀분석모형 개발 및 온도상승에 따른 미래 확률강수량 전망)

  • Uranchimeg, Sumiya;Kim, Jin-Guk;Kwon, Hyun-Han
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2019.05a
    • /
    • pp.22-22
    • /
    • 2019
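  • Owing to the worldwide intensification of climate change, a variety of risk factors have recently been exposed, such as extreme floods caused by increased rainfall and insufficient dam freeboard. Such unexpected extreme floods can threaten local residents, and river overflow may lead to secondary and tertiary damage. Various hydraulic structures therefore exist to prevent and reduce loss of life and property from natural disasters, and different rainfall quantities are used depending on the purpose of water resources management planning. In particular, annual maximum rainfall and design rainfall now need to be estimated with the effects of climate change from global warming taken into account. According to the Clausius-Clapeyron relation, which gives vapor pressure as a function of temperature, atmospheric moisture increases by 6~7% for every $1^{\circ}C$ rise in air temperature, so the potential for extreme rainfall is expected to grow as mean temperature rises. In this study, we estimated the change in extreme rainfall with rising temperature using a Bayesian multiple quantile regression model and projected future extreme rainfall based on CORDEX temperature data. The results confirmed that rainfall with a return period of 100 years or more tends to increase sharply with rising temperature, and that the maximum extreme rainfall, considering temperature rise through 2100, may exceed 1500mm.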
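
The 6~7% per $1^{\circ}C$ figure quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below is not taken from the paper: it assumes the August-Roche-Magnus approximation for saturation vapor pressure over water (coefficients 6.1094 hPa, 17.625, and 243.04 °C), and the function names are hypothetical.

```python
import math


def saturation_vapor_pressure_hpa(temp_c: float) -> float:
    """Saturation vapor pressure over water in hPa.

    Assumption: August-Roche-Magnus approximation, not the paper's own formula.
    """
    return 6.1094 * math.exp(17.625 * temp_c / (temp_c + 243.04))


def moisture_increase_per_degree(temp_c: float) -> float:
    """Fractional rise in saturation vapor pressure for a 1 °C warming from temp_c."""
    warmer = saturation_vapor_pressure_hpa(temp_c + 1.0)
    base = saturation_vapor_pressure_hpa(temp_c)
    return warmer / base - 1.0


# Print the percentage increase per degree at two illustrative air temperatures.
for t in (15.0, 20.0):
    print(f"{t:.0f} °C: {100 * moisture_increase_per_degree(t):.1f}% per °C")
```

Under this approximation the increase falls between 6% and 7% at typical mid-latitude surface temperatures, and it is somewhat larger near freezing.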


A study on SOH estimation of Lithium-ion battery based on Bayesian Regression. (베이지안 회귀분석을 이용한 리튬이온 배터리의 SOH 추정 방법 연구)

  • Park, Seongyun;Kim, Jonghoon;Park, Sungbeak;Kim, Youngmi
    • Proceedings of the KIPE Conference
    • /
    • 2019.07a
    • /
    • pp.53-55
    • /
    • 2019
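  • As lithium-ion batteries have been commercialized in small mobile devices, electric vehicles, energy storage systems, and similar applications, estimation of their state of charge (SOC) and prediction of the state of health (SOH) of cells and modules are used as management indicators for battery-powered devices. A lithium-ion battery ages through repeated discharging, so it must be assessed with an indicator of whether it can still supply the load the device requires. Accurate SOH estimation requires periodic discharge-capacity experiments on the lithium-ion battery, which make offline SOH estimation possible. In this paper, we estimated discharge capacity using Bayesian regression in order to perform offline SOH estimation. Using the high-power 18650 25R cell, the discharge-capacity estimates showed error rates of 1% at a discharge current of 1 C-rate and 2% at 2 C-rate.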


Bayesian ordinal probit semiparametric regression models: KNHANES 2016 data analysis of the relationship between smoking behavior and coffee intake (베이지안 순서형 프로빗 준모수 회귀 모형 : 국민건강영양조사 2016 자료를 통한 흡연양태와 커피섭취 간의 관계 분석)

  • Lee, Dasom;Lee, Eunji;Jo, Seogil;Choi, Taeryeon
    • The Korean Journal of Applied Statistics
    • /
    • v.33 no.1
    • /
    • pp.25-46
    • /
    • 2020
  • This paper presents ordinal probit semiparametric regression models using the Bayesian Spectral Analysis Regression (BSAR) method. Ordinal probit regression models ordinal responses, usually with more than two categories, by linking the probability of falling into each category to a combination of available covariates through a probit link (the inverse of the normal cumulative distribution function). The Bayesian probit model facilitates posterior sampling by introducing a latent variable that follows a normal distribution; the responses are then categorized by cut-off points according to the value of the latent variable. In this paper, we extend the latent variable approach to a semiparametric model for Bayesian ordinal probit regression with nonparametric functions, using the BSAR method based on a spectral representation of Gaussian processes. The latent variable is decomposed into a parametric component and a nonparametric component, with or without a shape constraint, to model ordinal responses and predict outcomes more flexibly. We illustrate the proposed methods with simulation studies in comparison with existing methods and with a real data analysis of the Korean National Health and Nutrition Examination Survey (KNHANES) 2016, investigating the nonparametric relationship between smoking behavior and coffee intake.
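
The latent-variable construction described in the abstract above can be sketched in a few lines. This is only the standard ordinal probit category probability, $P(y = k) = \Phi(c_k - \eta) - \Phi(c_{k-1} - \eta)$, and does not reproduce the paper's semiparametric BSAR component. The function names, the linear predictor value `eta`, and the cut-off values are hypothetical, chosen for illustration.

```python
import math
from typing import List, Sequence


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ordinal_probit_probs(eta: float, cutpoints: Sequence[float]) -> List[float]:
    """Category probabilities for a latent z ~ N(eta, 1) split by ordered cut-offs.

    With K - 1 cut-offs there are K categories; the outer bounds are -inf and +inf.
    """
    bounds = [-math.inf, *cutpoints, math.inf]
    cdf = [normal_cdf(b - eta) for b in bounds]
    # Probability of each category is the CDF mass between adjacent cut-offs.
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


# Hypothetical example: four ordered categories and a linear predictor of 0.3.
probs = ordinal_probit_probs(0.3, [-1.0, 0.0, 1.2])
print([round(p, 3) for p in probs])
```

Raising `eta` shifts probability mass toward the higher categories, which is how covariates influence the ordinal response in this kind of model.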

Nomogram comparison conducted by logistic regression and naïve Bayesian classifier using type 2 diabetes mellitus (T2D) (제 2형 당뇨병을 이용한 로지스틱과 베이지안 노모그램 구축 및 비교)

  • Park, Jae-Cheol;Kim, Min-Ho;Lee, Jea-Young
    • The Korean Journal of Applied Statistics
    • /
    • v.31 no.5
    • /
    • pp.573-585
    • /
    • 2018
  • In this study, we fit a logistic regression model and a naïve Bayesian classifier model using 11 risk factors to predict the incidence probability of type 2 diabetes mellitus. We then show how to construct a nomogram that helps people understand the result visually. We use data from the 2013-2015 Korean National Health and Nutrition Examination Survey (KNHANES). We include 3 interaction terms in the logistic regression model to improve the quality of the analysis and facilitate the application of the left-aligned method to the Bayesian nomogram. Finally, we compare the two nomograms, examine their utility, and verify them using the ROC curve.

Comparison of nomogram construction methods using chronic obstructive pulmonary disease (만성 폐쇄성 폐질환을 이용한 노모그램 구축과 비교)

  • Seo, Ju-Hyun;Lee, Jea-Young
    • The Korean Journal of Applied Statistics
    • /
    • v.31 no.3
    • /
    • pp.329-342
    • /
    • 2018
  • A nomogram is a statistical tool that visualizes the risk factors of a disease and helps untrained people understand them. This study used risk factors of chronic obstructive pulmonary disease (COPD) to compare the logistic regression model and the naïve Bayesian classifier model. Data from the 6th Korean National Health and Nutrition Examination Survey (2013-2015) were analyzed. First, we selected 6 risk factors for COPD. We then constructed nomograms using the logistic regression model and the naïve Bayesian classifier model, and compared the nomograms from the two methods to find out which method is more appropriate. The receiver operating characteristic curve and the calibration plot were used to verify each nomogram.

Development of Pedestrian Fatality Model using Bayesian-Based Neural Network (베이지안 신경망을 이용한 보행자 사망확률모형 개발)

  • O, Cheol;Gang, Yeon-Su;Kim, Beom-Il
    • Journal of Korean Society of Transportation
    • /
    • v.24 no.2 s.88
    • /
    • pp.139-145
    • /
    • 2006
  • This paper develops pedestrian fatality models capable of producing the probability of pedestrian fatality in collisions between vehicles and pedestrians. A probabilistic neural network (PNN) and binary logistic regression (BLR) are employed in modeling pedestrian fatality. Pedestrian age, vehicle type, and collision speed, obtained by reconstructing collected accidents, are used as independent variables in the fatality models. One notable feature of this study is that an iterative sampling technique is used to construct various training and test datasets for better performance comparison. A statistical comparison that considers the variation in model performance is conducted. The results show that the PNN-based fatality model outperforms the BLR-based model. The models developed in this study, which allow us to predict pedestrian fatality, would be useful tools for supporting the derivation of various safety policies and technologies to enhance pedestrian safety.

Bayesian analysis of latent factor regression model (내재된 인자회귀모형의 베이지안 분석법)

  • Kyung, Minjung
    • The Korean Journal of Applied Statistics
    • /
    • v.33 no.4
    • /
    • pp.365-377
    • /
    • 2020
  • We discuss latent factor regression, which constructs a common structure inherent among explanatory variables to resolve multicollinearity and uses the resulting factors as regressors in a linear model of a response variable. Bayesian estimation with a LASSO prior with a large penalty parameter is used to construct a significant factor loading matrix of intrinsic interest among infinite latent structures. The estimated factor loading matrix, together with the other estimated parameters, can be inversely transformed into linear parameters of each explanatory variable and used as a prediction model for new observations. We apply the proposed method to the Product Service Management data of HBAT and observe that it constructs the same factors as general common factor analysis for a fixed number of factors. The calculated MSE of the predicted values of the Bayesian latent factor regression model is also smaller than that of the common factor regression model.

Small Area Estimation Using Bayesian Auto Poisson Model with Spatial Statistics (공간통계량을 활용한 베이지안 자기 포아송 모형을 이용한 소지역 통계)

  • Lee, Sang-Eun
    • The Korean Journal of Applied Statistics
    • /
    • v.19 no.3
    • /
    • pp.421-430
    • /
    • 2006
  • In sample surveys, sample designs are carried out for geographically based domains such as countries, states, and metropolitan areas. However, the statistics of interest are mostly for domains smaller than the designed domains, so sample sizes are typically small or even zero within the domain of interest. Shin and Lee (2003) discussed the Spatial Autoregressive (SAR) model in model-based small area estimation and showed its effectiveness by MSE. In this study, the Bayesian Auto-Poisson Model is applied to model-based small area estimation, and the results are compared with the SAR model using MSE, ME, and a bias check diagnosis based on a regression line. Survey of Disability, Aging and Cares (SDAC) data are used for the simulation studies.