• Title/Summary/Keyword: 공선성 (collinearity)

Corporate Default Prediction Model Using Deep Learning Time Series Algorithm, RNN and LSTM (딥러닝 시계열 알고리즘 적용한 기업부도예측모형 유용성 검증)

  • Cha, Sungjae;Kang, Jungseok
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.1-32 / 2018
  • Corporate defaults affect not only the stakeholders of the bankrupt company, including managers, employees, creditors, and investors, but also ripple through the local and national economy. Before the Asian financial crisis, the Korean government analyzed only SMEs and tried to improve the forecasting power of a single default prediction model rather than developing a variety of corporate default models. As a result, even large corporations, the so-called 'chaebol enterprises', went bankrupt. Even afterwards, analyses of past corporate defaults focused on specific variables, and when the government carried out restructuring immediately after the global financial crisis, it focused only on a few main variables such as the 'debt ratio'. A multifaceted study of corporate default prediction models is essential to protect diverse interests and to avoid a sudden total collapse such as the 'Lehman Brothers Case' of the global financial crisis. The key variables that drive corporate defaults vary over time. Deakin's (1972) study, set against the analyses of Beaver (1967, 1968) and Altman (1968), shows that the major factors affecting corporate failure have changed. Grice's (2001) study likewise found changes in the importance of predictive variables in the models of Zmijewski (1984) and Ohlson (1980). However, most past studies use static models and do not consider changes that occur over time. Therefore, to construct consistent prediction models, the time-dependent bias must be compensated for with a time series analysis algorithm that reflects dynamic change. Centering on the global financial crisis, which had a significant impact on Korea, this study uses 10 years of annual corporate data from 2000 to 2009. The data are split into training, validation, and test sets covering 7, 2, and 1 years, respectively.
To construct a bankruptcy model that stays consistent as time passes, we first train a deep learning time series model on the data from before the financial crisis (2000-2006). The parameters of the existing models and of the deep learning time series algorithm are then tuned on validation data that include the financial crisis period (2007-2008). The resulting model shows a pattern similar to the results on the training data and excellent predictive power. After that, each bankruptcy prediction model is rebuilt on the combined training and validation data (2000-2008), applying the optimal parameters found in the validation step. Finally, each corporate default prediction model, trained on these nine years, is evaluated and compared on the test data (2009). This demonstrates the usefulness of the corporate default prediction model based on the deep learning time series algorithm. In addition, by adding Lasso regression to the existing variable selection methods (multiple discriminant analysis and the logit model), the study shows that the deep learning time series model built on the three resulting sets of variables is useful for robust corporate default prediction. The definition of bankruptcy is the same as that of Lee (2015). The independent variables include financial information such as the financial ratios used in previous studies. Multivariate discriminant analysis, the logit model, and the Lasso regression model are used to select the optimal variable group. The multivariate discriminant analysis model proposed by Altman (1968), the logit model proposed by Ohlson (1980), non-time-series machine learning algorithms, and deep learning time series algorithms are then compared. Corporate data suffer from three limitations: 'nonlinear variables', 'multi-collinearity' among variables, and 'lack of data'.
The logit model handles nonlinearity, the Lasso regression model addresses the multi-collinearity problem, and the deep learning time series algorithm, combined with the variable data generation method, compensates for the lack of data. Big data technology, a leading technology of the future, is moving from simple human analysis to automated AI analysis and, eventually, towards intertwined AI applications. Although research on corporate default prediction models using time series algorithms is still in its early stages, the deep learning algorithm is much faster than regression analysis at corporate default prediction modeling, and it also has greater predictive power. With the Fourth Industrial Revolution, the Korean government and governments overseas are working hard to integrate such systems into the everyday life of their nations and societies. Yet deep learning time series research for the financial industry remains scarce. This is an initial study on deep learning time series analysis of corporate defaults. We therefore hope it will serve as comparative reference material for non-specialists who begin studies combining financial data and deep learning time series algorithms.
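
The chronological split described in this abstract (training on 2000-2006, validation on 2007-2008, testing on 2009) can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout (`firm`, `year`, `default`) and the function name `split_by_year` are hypothetical, and only the year boundaries come from the abstract.

```python
from collections import defaultdict

# Year boundaries taken from the abstract; the record layout is hypothetical.
TRAIN_YEARS = range(2000, 2007)   # 2000-2006, before the financial crisis
VALID_YEARS = range(2007, 2009)   # 2007-2008, financial crisis period
TEST_YEARS = range(2009, 2010)    # 2009

def split_by_year(records):
    """Partition firm-year records chronologically, never at random."""
    splits = defaultdict(list)
    for rec in records:
        year = rec["year"]
        if year in TRAIN_YEARS:
            splits["train"].append(rec)
        elif year in VALID_YEARS:
            splits["valid"].append(rec)
        elif year in TEST_YEARS:
            splits["test"].append(rec)
    return splits

# Toy data: one hypothetical firm observed once per year.
records = [{"firm": "A", "year": y, "default": 0} for y in range(2000, 2010)]
splits = split_by_year(records)
sizes = {name: len(rows) for name, rows in splits.items()}
print(sizes)  # {'train': 7, 'valid': 2, 'test': 1}
```

Splitting by year rather than at random keeps later observations out of the training set, which is what lets the 2009 test year act as a genuine out-of-time check.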

Predictions of VO2max Using Metabolical Responses in Submaximal Exercise and 1,200 m Running for Male, and the Validity of These Prediction Models (성인 남성의 최대하 운동시 대사반응 및 1,200 m 달리기 기록을 이용한 최대산소섭취량 추정식 개발 및 타당도)

  • Im, J.H.;Jeon, Y.J.;Jang, H.K.;Kim, H.J.;Kim, K.H.;Lee, B.K.
    • Exercise Science / v.21 no.2 / pp.231-242 / 2012
  • The purpose of this study was to develop prediction models of VO2max using submaximal metabolic responses from the Bruce protocol, HR responses at several stages, and the 1,200 m running record, and to compare and analyze the validity of these models. The subjects were 255 men (133 of them for the 1,200 m run). They underwent maximal exercise testing with the Bruce protocol, and metabolic responses were measured at the end of the first stage (3 minutes) and the second stage (6 minutes), together with the 1,200 m running record. The measured items were VO2 (㎖/kg/min), VCO2 (㎖/kg/min), VE (L/min), and HR (bpm) at 3 and 6 minutes, the time to reach HR 150 bpm and 170 bpm, the HR difference between minutes 6 and 3 of the Bruce protocol, and the 1,200 m running record. When all variables were entered using the enter method, the multiple R of the full model was 0.642 (p<.01), SEE was 4.38 ㎖/kg/min, and CV was 10.8%, but multicollinearity appeared. For the 3-minute models 1 and 2, the multiple R was 0.341 and 0.461, SEE was 6.05 and 5.72 ㎖/kg/min, and CV was 14.9% and 14.1%, with no multicollinearity. For the 6-minute models 1 and 2, the multiple R was 0.350 and 0.456, SEE was 6.03 and 5.74 ㎖/kg/min, and CV was 14.9% and 14.2%, again with no multicollinearity. For the HR 150 and HR 170 models, R was 0.151 and 0.154, SEE was 6.36~6.37 ㎖/kg/min, and CV was 15.7%. For the 1,200 m running model, R was 0.444, SEE was 4.82 ㎖/kg/min, and CV was 11.9%. In conclusion, weighing usefulness and convenience against validity, the 6-minute and 3-minute models are recommended for predicting VO2max, while the validity of the HR models and the 1,200 m running model was moderately low.
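
For readers unfamiliar with the statistics reported above, the sketch below computes R, SEE, and CV for a single-predictor model such as the 1,200 m running model. It rests on assumptions: SEE is taken as sqrt(SSE / (n - k - 1)) with k = 1, and CV as SEE divided by the mean measured VO2max. That CV definition is not stated in the abstract, although it matches the reported SEE/CV pairs. The data values are invented for illustration.

```python
import math

def regression_summary(x, y):
    """Simple linear regression y = a + b*x with R, SEE, and CV (%)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r = math.sqrt(1.0 - sse / syy)   # multiple R (equals |Pearson r| for one predictor)
    see = math.sqrt(sse / (n - 2))   # n - k - 1 degrees of freedom with k = 1
    cv = 100.0 * see / mean_y        # assumed definition: SEE relative to mean VO2max
    return {"slope": slope, "intercept": intercept, "r": r, "see": see, "cv": cv}

# Invented toy data: 1,200 m running time (s) and measured VO2max (mL/kg/min).
run_time_s = [270, 290, 310, 330, 350, 370]
vo2max = [50, 47, 45, 41, 40, 36]
summary = regression_summary(run_time_s, vo2max)
print(summary)
```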

A New Algorithm for the Interpretation of Joint Orientation Using Multistage Convergent Photographing Technique (수렴다중촬영기법을 이용한 새로운 절리방향 해석방법)

  • 김재동;김종훈
    • Tunnel and Underground Space / v.13 no.6 / pp.486-494 / 2003
  • When joint orientations are measured on a rock exposure, the surveyor often cannot easily reach the target joints or set up scanlines on the slope. To overcome these limitations, this study developed a new algorithm that interprets joint orientation by analyzing images of the rock slope. To arrange the multiple images of a rock slope, the multistage convergent photographing system was introduced. It overcomes the restriction on photographing direction imposed by existing methods such as the parallel stereophotogrammetric system, and it maximizes the range of image measurement, that is, the overlapping area between the image pair. To determine the camera parameters in the perspective projection equation, which are the main elements of the analysis, a new method was developed that uses three ground control points and a single ground guide point. This method is very simple compared with existing methods that require many ground control points and a complicated analysis process. With it, the global coordinates of a specific point on a rock slope can be determined. The orientation of a joint can then be calculated from the normal vector of the joint surface, which is derived from the global coordinates of several points on the joint surface obtained from the images.
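
The last step of the abstract, computing a joint's orientation from the normal vector of its surface, can be illustrated with a short sketch. It is not the authors' algorithm: it assumes exactly three points on the joint surface and a coordinate convention (x = east, y = north, z = up) that the abstract does not specify, and the function name `joint_orientation` is hypothetical.

```python
import math

def joint_orientation(p1, p2, p3):
    """Return (dip_direction_deg, dip_deg) of the plane through three points.

    Assumed axes (not stated in the abstract): x = east, y = north, z = up.
    """
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # Normal vector = cross product of two in-plane edge vectors.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("points are collinear; the plane is undefined")
    if nz < 0:  # work with the upward-pointing normal
        nx, ny, nz = -nx, -ny, -nz
    dip = math.degrees(math.acos(nz / length))
    # The horizontal projection of the upward normal points down-dip;
    # atan2(east, north) gives its azimuth clockwise from north.
    dip_direction = math.degrees(math.atan2(nx, ny)) % 360.0
    return dip_direction, dip

# A plane that drops 1 m for every 1 m travelled east: dips 45 degrees toward east.
dip_direction, dip = joint_orientation((0, 0, 0), (0, 1, 0), (1, 0, -1))
print(round(dip_direction, 1), round(dip, 1))  # 90.0 45.0
```

With more than three measured points, a least-squares plane fit would be the natural extension, but that goes beyond what the abstract describes.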

Impact analysis of Industrial-University cooperation adherency degree and cooperation degree configuration variable on satisfaction (산학협력 밀착도, 협력도 구성변수가 만족도에 미치는 영향 분석)

  • Kim, Young-Bu
    • Journal of the Korea Academia-Industrial cooperation Society / v.17 no.9 / pp.359-368 / 2016
  • In the 21st century, the Korean university education system is focused on innovation and change, including cooperation between industry and universities. Fostering an industry-university ecosystem through interactions between universities and industry should be a goal. It is therefore important to measure these relationships and to find sound ways of measuring the final results of industry-university cooperation. This paper defines cooperation outcomes and the satisfaction of participating enterprises, and measures how the degree of adherence and the degree of cooperation relate to satisfaction with industry-university cooperation. To that end, regression analysis is used to examine how relational factors between industry and universities influence satisfaction with industry-university cooperation. Before running the multiple regression, we checked for multicollinearity and found it to be of little concern. In particular, satisfaction, the dependent variable, was constructed as a higher-order dependent variable composed of five individual variables. We then analyzed how the factors making up the adherence and cooperation constructs influence each individual satisfaction variable and the composite. As a result, the degree of realization of locally customized programs was the most significant variable. The biggest factor influencing satisfaction with industry-university cooperation proved to be the degree to which programs suited to local conditions, such as education, research, and technical guidance, are realized.

Apartment Price Prediction Using Deep Learning and Machine Learning (딥러닝과 머신러닝을 이용한 아파트 실거래가 예측)

  • Hakhyun Kim;Hwankyu Yoo;Hayoung Oh
    • KIPS Transactions on Software and Data Engineering / v.12 no.2 / pp.59-76 / 2023
  • Since the COVID-19 era, the rise in apartment prices has been unconventional. In such an uncertain real estate market, price prediction research is very important. In this paper, we build a large data set of 870,000 records from 2015 to 2020 by collecting and crawling data from various real estate sites, gathering as many variables as possible, and then create a model to predict the actual transaction prices of apartments. The study first addressed the multicollinearity problem by removing and combining variables. Next, five variable selection algorithms were used to extract meaningful independent variables: Forward Selection, Backward Elimination, Stepwise Selection, L1 regularization, and Principal Component Analysis (PCA). Four machine learning and deep learning algorithms, namely deep neural network (DNN), XGBoost, CatBoost, and linear regression, were then trained after hyperparameter optimization, and their predictive power was compared. In an additional experiment, the number of nodes and layers of the DNN was varied to find the most appropriate configuration. Finally, the best-performing model was used to predict actual apartment transaction prices for 2021, and the predictions were compared with the actual 2021 data. We are confident that machine learning and deep learning can help investors make the right decisions when purchasing homes in various economic situations.
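
Of the variable selection methods listed above, L1 regularization (the Lasso) selects variables by shrinking some coefficients exactly to zero. The sketch below shows the idea with plain coordinate descent. It is an illustration under stated assumptions, not the paper's implementation: the objective is (1/2n)·||y − Xb||² + λ·||b||₁ with no intercept, predictors and response are assumed to be centered, no column is constant, and the function names and toy data are invented.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; return 0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 without an intercept.

    X is a list of rows; its columns and y are assumed to be centered already.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0   # correlation of column j with the partial residual
            norm = 0.0  # squared length of column j
            for i in range(n):
                # Partial residual: remove every predictor's effect except j's.
                partial = y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * partial
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n)
    return beta

# Invented toy data: y depends strongly on column 0 and only weakly on column 1.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [3.1, 2.9, -2.9, -3.1]
beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(beta, selected)
```

In this toy case the weak predictor's coefficient is driven to exactly zero, so only column 0 is selected. Larger values of `lam` drop more variables.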

A Way of Securing the Access By Using PCA (주성분분석(PCA)을 이용한 출입인원관리에 대한 보안성 확보 방안)

  • Kim, Min-Su;Lee, Dong-Hwi
    • Convergence Security Journal / v.12 no.3 / pp.3-10 / 2012
  • This study aimed to devise a way of securing access by using PCA. We obtained our results by applying box plots and PCA to the access data for the areas of security levels A~E at the K(IPS) center. Before performing PCA, we confirmed that the extracted communality values pose no problem, because every VIF is below 2.902. Based on this result, we classified people into a Green-list, Blue-list, Red-list, and Black-list according to security level, using 1.453 as the eigenvalue of the first principal component, 1.283 as the eigenvalue of the second, and 1.142 as the eigenvalue of the third.
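
The VIF check mentioned in this abstract can be sketched as follows. The sketch relies on a standard identity: the VIF of predictor j equals the j-th diagonal entry of the inverse of the predictors' correlation matrix. The function names and toy columns are invented, and nothing here reproduces the paper's access data. A VIF above 10 is a commonly cited rule of thumb for problematic multicollinearity, so values below 2.902 would usually be considered unproblematic.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (norms[i] * norms[j])
             for j, cj in enumerate(centered)]
            for i, ci in enumerate(centered)]

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("perfectly collinear predictors; VIF is infinite")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF_j is the j-th diagonal entry of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Invented toy predictors: uncorrelated columns versus nearly collinear ones.
independent = vif([[1, -1, 1, -1], [1, 1, -1, -1]])
collinear = vif([[1, 2, 3, 4], [1, 2, 3, 5]])
print(independent, collinear)
```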

Spatial Hedonic Modeling using Geographically Weighted LASSO Model (GWL을 적용한 공간 헤도닉 모델링)

  • Jin, Chanwoo;Lee, Gunhak
    • Journal of the Korean Geographical Society / v.49 no.6 / pp.917-934 / 2014
  • The geographically weighted regression (GWR) model has been widely used to estimate spatially heterogeneous real estate prices. The GWR model, however, has limitations: it cannot select different price determinants over space, and the number of observations available for local estimation is restricted. As an alternative, the geographically weighted LASSO (GWL) model has recently been introduced and has received growing interest. In this paper, we use the GWL to explore various local price determinants of real estate and examine its applicability to forecasting real estate prices. To do this, we developed three hedonic models, OLS, GWR, and GWL, for the sales prices of apartments in Seoul and compared them in terms of model fit, prediction, and multicollinearity. As a result, the local models generally performed better than the global OLS model, and the GWL in particular showed greater explanatory and predictive power than the other models. Moreover, the GWL provided spatially varying sets of price determinants among which no multicollinearity exists. The GWL helps select significant sets of independent variables from a high-dimensional data set, and hence will be a useful technique for large and complex spatial big data.

Development of Ship Valuation Model by Neural Network (신경망기법을 활용한 선박 가치평가 모델 개발)

  • Kim, Donggyun;Choi, Jung-Suk
    • Journal of the Korean Society of Marine Environment & Safety / v.27 no.1 / pp.13-21 / 2021
  • The purpose of this study is to develop a ship valuation model using a neural network. The target of the valuation was secondhand VLCCs. Based on prior research, the variables were set as the major factors that drive changes in ship value, and the corresponding data were collected monthly from January 2000 to August 2020. To check the stability of the variables, a multi-collinearity test was carried out, and the research structure was finalized with six independent variables and one dependent variable. Based on this structure, a total of nine simulation models were designed using linear regression, neural network regression, and the random forest algorithm. In addition, the accuracy of the evaluation results was improved through comparative verification between the models. Comparison with actual VLCC values showed that the neural network regression model with two hidden layers was the most accurate. This study has two possible implications. First, it is novel in applying a neural network model to ship valuation, departing from the existing formalized evaluation techniques. Second, it enhances the objectivity of the research results from a dynamic perspective by analyzing and predicting the factors that drive changes in the shipping market.

Factors Affecting the Business Performance of Young Entrepreneurs (청년창업자의 경영성과에 영향을 미치는 요인)

  • Park, Mi-Ryeo;Yang, Yeong-Seok;Kim, Myeong-Suk
    • Proceedings of the Korean Society of Business Venturing Conference (한국벤처창업학회:학술대회논문집) / 2017.04a / pp.44-44 / 2017
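  • This study examines the relationship between the competencies of young entrepreneurs and their business performance. The data come from the ninth wave of the Youth Panel survey (2015) of the Korea Labor Institute. From the non-wage workers in the survey, the final sample comprised 182 young people with at least a junior-college degree who had started a business, excluding those who had inherited a family business. Frequencies, percentages, means, and standard deviations were computed to describe the general characteristics of the respondents, and correlation analysis was performed to check for multicollinearity among the variables. Hierarchical regression analysis was then performed to identify the factors affecting the business performance of young entrepreneurs. The data were analyzed using IBM SPSS Statistic 22.0. To analyze these factors, the study derived start-up preparation competency, entrepreneurial competency, and management competency as determinants of young entrepreneurs' competency, set hypotheses on the relationships between these factors and business performance, and tested them. The results are as follows. Business performance was higher when the level of work was lower relative to the level of education, when the work did not match the major, when job satisfaction was higher, and when total start-up capital was larger.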

Tributary Flood Forecasting Using Statistical Analysis Method (통계적 모형을 이용한 지천 홍수예측)

  • Sung, Ji-Youn;Heo, Jun-Haeng
    • Proceedings of the Korea Water Resources Association Conference / 2009.05a / pp.1524-1527 / 2009
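  • This study aims to improve forecast accuracy by refining the statistical model applied to flood forecasting for major tributaries. The major tributaries of the Han River system, such as the Jungnangcheon, Tancheon, and Wangsukcheon, are flood forecasting points with small basin areas and short travel times, so the hydrological flood forecasting models used for existing large-river flood forecasts are of limited applicability. To address this problem, a method based on a statistical model, the multiple linear regression model, was proposed and used for flood forecasting on the major tributaries. In this study, the independent variables were adjusted to resolve the multicollinearity problem of the multiple linear regression model previously applied to tributary flood forecasting, and the parameters were re-estimated to obtain forecasts from observations at 10-minute intervals. As a result, the statistical model, which uses fewer independent variables than the existing model together with the re-estimated parameters, reduced the error in the predicted water level.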
