• Title/Summary/Keyword: Bayesian regression analysis (베이지안 회귀분석)

Search Result 73

Imputation for Binary or Ordered Categorical Traits Based on the Bayesian Threshold Model (베이지안 분계점 모형에 의한 순서 범주형 변수의 대체)

  • Lee Seung-Chun
    • The Korean Journal of Applied Statistics / v.18 no.3 / pp.597-606 / 2005
  • Nonresponse in sample surveys causes problems when analyzing public-use data files, where the user has only complete-data methods available and limited information about the reasons for nonresponse. Imputation has recently become a standard approach for handling nonresponse, and various imputation methods have been devised. However, most imputation methods are concerned with continuous traits, whereas many interesting features in sample surveys are measured on binary or ordered categorical scales. In this note, an imputation method for ignorable nonresponse in binary or ordered categorical traits is considered.
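
The threshold idea named in the title can be sketched as follows. This is a minimal illustration, not the paper's procedure: it assumes a latent score z ~ N(mu, 1) that is cut by fixed thresholds into ordered categories, and it treats the latent means and thresholds as known. A full Bayesian treatment would draw them from a posterior. The names `category_probs` and `impute_category` and all numbers are hypothetical.

```python
import random
from statistics import NormalDist

def category_probs(mu, thresholds):
    """P(category k) when a latent z ~ N(mu, 1) is cut at increasing thresholds."""
    cdf = [NormalDist(mu, 1.0).cdf(t) for t in thresholds]
    edges = [0.0] + cdf + [1.0]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]

def impute_category(mu, thresholds, rng):
    """Draw a latent score and return the index of the interval it falls in."""
    z = rng.gauss(mu, 1.0)
    return sum(z > t for t in thresholds)

rng = random.Random(0)
thresholds = [-0.5, 0.8]        # hypothetical cutpoints giving 3 ordered categories
missing_mu = [-1.2, 0.1, 1.5]   # hypothetical latent means for three nonrespondents
imputed = [impute_category(m, thresholds, rng) for m in missing_mu]
probs = category_probs(0.1, thresholds)
```

With a single threshold the same code covers the binary case.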

Bayesian Network Analysis for the Dynamic Prediction of Financial Performance Using Corporate Social Responsibility Activities (베이지안 네트워크를 이용한 기업의 사회적 책임활동과 재무성과)

  • Sun, Eun-Jung
    • Management & Information Systems Review / v.34 no.5 / pp.71-92 / 2015
  • This study analyzes the impact of Corporate Social Responsibility (CSR) activities on financial performance using a Bayesian network. The research tries to overcome the uniform assumption of a linear function between financial performance and CSR activities made in the multiple regression analyses widely used in previous studies. It is necessary to infer the causal relationships among the CSR activities that affect financial performance. Identifying these relationships would help firms improve their financial performance by informing decision makers about the different CSR activities that influence it. This research proposes a General Bayesian Network (GBN) and presents the Markov blanket induced from the GBN. All the proposals presented in this study are empirically shown to be statistically significant using the results of research conducted by the Korean Economic Justice Institute (KEJI) under the Citizen's Coalition for Economic Justice (CCEJ), which investigated approximately 200 companies in Korea based on the Korean Economic Justice Institute Index (KEJI index) from 2005 to 2011. The Bayesian network effectively infers the properties affecting financial performance through probabilistic causal relationships. Moreover, I found causal relationships among the CSR activity variables: Environment protection is related to Customer protection, Employee satisfaction, and firm size; Soundness is related to Total CSR Evaluation Score and Debt-Assets Ratio. Through the what-if analysis, I identify the sensitive factors among the explanatory variables.

A Fast Bayesian Detection of Change Points Long-Memory Processes (장기억 과정에서 빠른 베이지안 변화점검출)

  • Kim, Joo-Won;Cho, Sin-Sup;Yeo, In-Kwon
    • The Korean Journal of Applied Statistics / v.22 no.4 / pp.735-744 / 2009
  • In this paper, we introduce a fast approach for Bayesian detection of change points in long-memory processes. Since heavy computation is needed to evaluate the likelihood function of long-memory processes, a method for simplifying the computation is required to implement Bayesian inference efficiently. Instead of estimating the parameter, we select an element from a set of candidate parameter values obtained by partitioning the parameter space. This approach simplifies the detection algorithm and reduces the computational time needed to detect change points. Since the parameter space is (0, 0.5), there is no big difference between the results of parameter estimation and selection under a proper partition of the parameter space. An analysis of the Nile river data validated the proposed method.
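
The idea of selecting the long-memory parameter from a partition of (0, 0.5) rather than estimating it can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: it assumes an ARFIMA(0, d, 0) process with unit innovation variance, a uniform prior over a hypothetical five-point grid, and an exact Gaussian likelihood computed with the Durbin-Levinson recursion. The change-point search itself is omitted, and the series is simulated rather than taken from the Nile data.

```python
import math
import random

def arfima_acov(d, n):
    """Autocovariances gamma(0..n-1) of ARFIMA(0, d, 0), unit innovation variance."""
    g = [math.gamma(1 - 2 * d) / math.gamma(1 - d) ** 2]
    for k in range(1, n):
        g.append(g[-1] * (k - 1 + d) / (k - d))
    return g

def gaussian_loglik(x, acov):
    """Exact zero-mean Gaussian log-likelihood via the Durbin-Levinson recursion."""
    v = acov[0]                      # one-step prediction variance
    phi = []                         # current prediction coefficients
    ll = -0.5 * (math.log(2 * math.pi * v) + x[0] ** 2 / v)
    for t in range(1, len(x)):
        k = (acov[t] - sum(phi[j] * acov[t - 1 - j] for j in range(t - 1))) / v
        phi = [phi[j] - k * phi[t - 2 - j] for j in range(t - 1)] + [k]
        v *= 1 - k * k
        pred = sum(phi[j] * x[t - 1 - j] for j in range(t))
        ll += -0.5 * (math.log(2 * math.pi * v) + (x[t] - pred) ** 2 / v)
    return ll

def select_d(x, grid):
    """Posterior over the grid under a uniform prior, and the most probable d."""
    ll = [gaussian_loglik(x, arfima_acov(d, len(x))) for d in grid]
    m = max(ll)
    w = [math.exp(v - m) for v in ll]
    post = [wi / sum(w) for wi in w]
    return post, grid[post.index(max(post))]

def simulate(d, n, rng, trunc=200):
    """Approximate fractional noise from a truncated MA(infinity) expansion."""
    psi = [1.0]
    for k in range(1, trunc):
        psi.append(psi[-1] * (k - 1 + d) / k)
    eps = [rng.gauss(0, 1) for _ in range(n + trunc)]
    return [sum(psi[k] * eps[t + trunc - k] for k in range(trunc)) for t in range(n)]

grid = [0.05, 0.15, 0.25, 0.35, 0.45]   # hypothetical partition of (0, 0.5)
series = simulate(0.3, 150, random.Random(1))
posterior, d_hat = select_d(series, grid)
```

Because the candidate set is fixed, the autocovariances for each grid value can be computed once and reused across candidate segments, which is where the saving over repeated numerical estimation would come from.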

A comparison and prediction of total fertility rate using parametric, non-parametric, and Bayesian model (모수, 비모수, 베이지안 출산율 모형을 활용한 합계출산율 예측과 비교)

  • Oh, Jinho
    • The Korean Journal of Applied Statistics / v.31 no.6 / pp.677-692 / 2018
  • The total fertility rate (TFR) of Korea was 1.05 in 2017, a return to the level of 1.08 seen in 2005. A TFR of 1.05 is a very low fertility level, far from replacement-level fertility or the safety zone of 1.5, and may indicate a low fertility trap. It is therefore more important than ever to predict fertility. Age-specific fertility rates and the total fertility rate have so far been predicted by various statistical methods. When the data trend is disconnected or fluctuating, a nonparametric method applying smoothness and weights has been used. In addition, a Bayesian method has been applied that uses the prior distribution of fertility rates in advanced countries with reference to the three-stage transition phenomenon. This paper examines which method is reasonable in terms of precision and feasibility by estimating, forecasting, and comparing the recent variability of the Korean fertility rate with parametric, nonparametric, and Bayesian methods. The results showed that the predicted total fertility rates ranked in the order of KOSTAT's total fertility rate, then the Bayesian, parametric, and nonparametric outcomes. Given the TFR level of 1.05 in 2017, the predicted total fertility rates derived from the parametric and nonparametric models are the most reasonable. In addition, if the fertility rate data are highly complete and of good quality, the parametric model approach is superior to the other methods in terms of parameter estimation, calculation efficiency, and goodness-of-fit.

A distributed IDS design on global network (베이지안 네트워크를 이용한 분산 IDS 설계)

  • Kim, Do-Jin;Lee, Jung-Hyun;Hwang, Suk-Hee;Hwang, Jun-Won;Lee, Chang-Hun
    • Proceedings of the Korea Information Processing Society Conference / 2002.04b / pp.967-970 / 2002
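  • Intrusion detection on a wide-area network requires a system capable of processing gigabit traffic on the backbone. However, heavy loss and the load placed on the system seriously affect the system itself. To address these drawbacks, this paper presents a design method for a system that installs distributed agents in each local network on the backbone and sends the multiple-regression coefficients of the data generated there to the main system for processing, thereby providing functions for balancing and monitoring the whole network and each local network.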

Extraction of Hazardous Freeway Sections Using GPS-Based Probe Vehicle Speed Data (GPS 프로브 차량 속도자료를 이용한 고속도로 사고 위험구간 추출기법)

  • Park, Jae-Hong;Oh, Cheol;Kim, Tae-Hyung;Joo, Shin-Hye
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.9 no.3 / pp.73-84 / 2010
  • This study presents a novel method to identify hazardous freeway segments using global positioning system (GPS)-based probe vehicle data. A variety of candidate contributing factors leading to a higher potential for accident occurrence were extracted from the probe vehicle dataset. The research problem was defined as a classification problem, and a well-known classifier, the Bayesian neural network, was adopted to solve it. A binary logistic regression technique was also used to select salient input variables. Test results showed that the proposed method is promising for extracting hazardous freeway sections. The outcome of this study can be used effectively for evaluating the safety of freeway sections and deriving countermeasures to prevent accidents.

A Study of Prediction on Company's Growth with R and Analysis Algorithm (R과 분석 알고리즘을 활용한 기업의 성장성 예측에 관한 연구)

  • Kang, Hui-Seok;Kim, Kyung-Su;Ryu, Ji-Seung;Lee, Ga-Yeon;Lee, Min-Jung
    • Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.428-431 / 2017
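  • This study analyzes corporate growth potential and stock value with a variety of algorithms, based on structured data such as sales, cost of sales, and operating profit margin and on unstructured data such as economic and business news, and verifies the significance of the results. The results obtained from principal component regression, an artificial neural network, a naive Bayes classifier, and a positive/negative lexicon analysis model are reviewed to assess the performance of each model, and the models and data that can be used to predict corporate growth are presented.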

Identification of Uncertainty in Fitting Rating Curve with Bayesian Regression (베이지안 회귀분석을 이용한 수위-유량 관계곡선의 불확실성 분석)

  • Kim, Sang-Ug;Lee, Kil-Seong
    • Journal of Korea Water Resources Association / v.41 no.9 / pp.943-958 / 2008
  • This study employs Bayesian regression analysis for fitting discharge rating curves. The parameter estimates from the Bayesian regression analysis were compared with those from the ordinary least squares method using the t-distribution. In these comparisons, the mean values from the t-distribution and the Bayesian regression are not significantly different. However, the difference between the upper and lower limits is remarkably reduced with the Bayesian regression. Therefore, from the point of view of uncertainty analysis, the Bayesian regression is more attractive than the conventional method based on a t-distribution, because the data size at the site of interest is typically insufficient to estimate the parameters of the rating curve. The merits and demerits of the two estimation methods are analyzed through a statistical simulation that considers heteroscedasticity. The Bayesian regression is also validated using real stage-discharge data observed at five gauges in the Anyangcheon basin. Because the true parameters at the five gauges are unknown, the quantitative accuracy of the Bayesian regression cannot be assessed. However, the results suggest that the uncertainty in the rating curves at the five gauges can be reduced by Bayesian regression.
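
A Bayesian fit of a rating curve can be sketched as follows. The abstract does not give the model, so everything here is an assumption for illustration: a power-law curve Q = a (h - h0)^b with a known offset h0, a log transform that makes it linear, a known noise variance, and an independent Gaussian prior on the intercept and slope. The stage and discharge values are synthetic, and the function name `posterior_line` is hypothetical.

```python
import math

def posterior_line(x, y, prior_mean, prior_prec, noise_var):
    """Posterior mean and covariance of (intercept, slope) for y = c0 + c1 * x,
    with a diagonal Gaussian prior and a known noise variance."""
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(u * v for u, v in zip(x, y))
    # Posterior precision A = diag(prior_prec) + X'X / noise_var
    a11 = prior_prec[0] + n / noise_var
    a12 = sx / noise_var
    a22 = prior_prec[1] + sxx / noise_var
    # Right-hand side b = prior_prec * prior_mean + X'y / noise_var
    b1 = prior_prec[0] * prior_mean[0] + sy / noise_var
    b2 = prior_prec[1] * prior_mean[1] + sxy / noise_var
    det = a11 * a22 - a12 * a12
    cov = [[a22 / det, -a12 / det], [-a12 / det, a11 / det]]
    mean = (cov[0][0] * b1 + cov[0][1] * b2, cov[1][0] * b1 + cov[1][1] * b2)
    return mean, cov

h0 = 0.2                                        # assumed known stage offset (m)
stage = [0.5, 0.8, 1.2, 1.7, 2.3, 3.0]          # synthetic stages (m)
discharge = [5.0 * (h - h0) ** 1.8 for h in stage]  # synthetic, noise-free discharges

x = [math.log(h - h0) for h in stage]
y = [math.log(q) for q in discharge]
mean, cov = posterior_line(x, y, prior_mean=(0.0, 0.0),
                           prior_prec=(1e-6, 1e-6), noise_var=0.01)

a_hat, b_hat = math.exp(mean[0]), mean[1]
half_width = 1.96 * math.sqrt(cov[1][1])        # 95% credible half-width for b
```

Tightening `prior_prec` narrows the interval for b, which is the mechanism by which prior information can reduce the spread between upper and lower limits when site data are scarce.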

Performance of a Bayesian Design Compared to Some Optimal Designs for Linear Calibration (선형 캘리브레이션에서 베이지안 실험계획과 기존의 최적실험계획과의 효과비교)

  • 김성철
    • The Korean Journal of Applied Statistics / v.10 no.1 / pp.69-84 / 1997
  • We consider a linear calibration problem, $y_i = \alpha + \beta (x_i - x_0) + \epsilon_i$, $i = 1, 2, \ldots, n$, and $y_f = \alpha + \beta (x_f - x_0) + \epsilon$, where we observe the $(x_i, y_i)$'s from the controlled calibration experiments and later make inference about $x_f$ from a new observation $y_f$. The objective of the calibration design problem is to find the optimal design $x = (x_1, \cdots, x_n)$ that gives the best estimates for $x_f$. We compare Kim (1989)'s Bayesian design, which minimizes the expected value of the posterior variance of $x_f$, with some optimal designs from the literature. Kim suggested the Bayesian optimal design based on the analysis of the characteristics of the expected loss function and numerical must be equal to the prior mean and that the sum of squares be as large as possible. The designs to be compared are (1) Buonaccorsi (1986)'s AV optimal design, which minimizes the average asymptotic variance of the classical estimators, (2) the D-optimal and A-optimal designs for the linear regression model, which optimize some functions of $M(x) = \sum x_i x_i'$, and (3) Hunter & Lamboy (1981)'s reference design from their paper. In order to compare the designs, which are optimal in some sense, we consider two criteria. First, we compare them by the expected posterior variance criterion, and second, we perform Monte Carlo simulation to obtain the HPD intervals and compare their lengths. If the prior mean of $x_f$ is at the center of the finite design interval, then the Bayesian, AV optimal, D-optimal, and A-optimal designs are identical, and they are the equally weighted end-point design. However, if the prior mean is not at the center, then they are not expected to be identical. In this case, we demonstrate that the almost Bayesian-optimal design was slightly better than the approximate AV optimal design. We also investigate the effects of the prior variance of the parameters and the solution for the case when the number of experiments is odd.
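
For orientation, the standard quantities behind the designs being compared can be written out as below. This is a sketch in assumed notation, not taken from the paper: $\hat{\alpha}$ and $\hat{\beta}$ denote least-squares estimates from the calibration data, and $z_i = (1, x_i - x_0)'$ is a regressor vector introduced here so that the information matrix is explicit.

```latex
% Classical estimator of x_f from the fitted calibration line
\hat{x}_f = x_0 + \frac{y_f - \hat{\alpha}}{\hat{\beta}}

% Information matrix with the assumed regressor vector z_i = (1, x_i - x_0)'
M(x) = \sum_{i=1}^{n} z_i z_i'
     = \begin{pmatrix}
         n & \sum_i (x_i - x_0) \\
         \sum_i (x_i - x_0) & \sum_i (x_i - x_0)^2
       \end{pmatrix}

% D-optimality maximizes \det M(x); A-optimality minimizes \operatorname{tr} M(x)^{-1}
\det M(x) = n \sum_i (x_i - x_0)^2 - \Bigl( \sum_i (x_i - x_0) \Bigr)^2
```

The determinant grows as the design points spread out, which is consistent with end-point designs appearing among the optimal solutions.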

The probabilistic estimation of inundation region using a multiple logistic regression analysis (다중 Logistic 회귀분석을 통한 침수지역의 확률적 도출)

  • Jung, Minkyu;Kim, Jin-Guk;Uranchimeg, Sumiya;Kwon, Hyun-Han
    • Journal of Korea Water Resources Association / v.53 no.2 / pp.121-129 / 2020
  • The increase in impervious surfaces and riverside development due to urbanization not only increases the number of associated flood risk factors but also exacerbates flood damage, making flood management more difficult. Flood control measures should be prioritized based on various geographical information in urban areas. In this study, a probabilistic flood hazard assessment was applied to flood-prone areas near an urban river. Flood hazard maps were alternatively considered and used to describe the expected inundation areas for a given set of predictors such as elevation, slope, runoff curve number, and distance to the river. This study proposes a Bayesian logistic regression-based flood risk model that aims to provide a probabilistic risk metric such as population-at-risk (PAR). Finally, the logistic regression model is used to produce probabilistic flood hazard maps for the entire area.
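
How a logistic model turns predictors into an inundation probability and a population-at-risk figure can be sketched as follows. This is a hedged illustration only: the coefficient draws, grid cells, and resident counts are hypothetical, the draws stand in for samples from a fitted posterior, and computing PAR as residents weighted by inundation probability is an assumption rather than the paper's stated definition.

```python
import math

def flood_probability(x, beta):
    """Logistic link: beta[0] is the intercept, beta[1:] pair with the predictors x."""
    eta = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical posterior draws: (intercept, elevation, slope, curve number, distance to river)
draws = [
    (1.0, -0.30, -0.10, 0.02, -0.004),
    (1.2, -0.28, -0.12, 0.02, -0.005),
    (0.9, -0.33, -0.09, 0.03, -0.004),
]

# Hypothetical grid cells: (elevation m, slope %, curve number, distance m) and residents
cells = [((3.0, 1.0, 85.0, 50.0), 120), ((12.0, 4.0, 70.0, 400.0), 300)]

# Posterior-mean inundation probability per cell, averaged over the draws
probs = [sum(flood_probability(x, b) for b in draws) / len(draws) for x, _ in cells]

# Expected population at risk: residents weighted by inundation probability (assumed definition)
par = sum(p * pop for p, (_, pop) in zip(probs, cells))
```

Mapping `probs` back onto the grid would give a probabilistic hazard map, and keeping the per-draw probabilities instead of their mean would give an interval for PAR.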