• Title/Summary/Keyword: tree-based models

Cluster Based Fuzzy Model Tree Using Node Information (상호 노드 정보를 이용한 클러스터 기반 퍼지 모델트리)

  • Park, Jin-Il;Lee, Dae-Jong;Kim, Yong-Sam;Cho, Young-Im;Chun, Myung-Geun
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.1 / pp.41-47 / 2008
  • Cluster-based fuzzy model trees suffer degraded performance on test data when they over-fit the training data. To reduce this sensitivity to over-fitting, we propose a modified cluster-based fuzzy model tree that uses node information. To construct the model tree, cluster centers are first calculated by a fuzzy clustering method using all input and output attributes. Linear models are then constructed at the internal nodes using the fuzzy membership values between the centers and the input attributes. In the prediction step, membership values are calculated from the fuzzy distance between the input attributes and all centers on the path from the root to the leaf nodes. Finally, the prediction is obtained as a weighted average of the linear models, weighted by the fuzzy membership values. To show the effectiveness of the proposed method, we applied it to various datasets. Across these experiments, the proposed method performs better than the conventional cluster-based fuzzy model tree.
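The weighted-average prediction step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the standard fuzzy c-means membership formula with fuzzifier `m`, Euclidean distance, and a flat list of centers instead of the root-to-leaf path. The names `fcm_memberships` and `predict`, and the toy centers and linear models, are hypothetical.

```python
import math

def fcm_memberships(x, centers, m=2.0):
    """Fuzzy c-means style memberships of point x in each cluster center (assumed formula)."""
    dists = [math.dist(x, c) for c in centers]
    # If x coincides with one or more centers, share full membership among them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** power for dj in dists) for di in dists]

def predict(x, centers, models, m=2.0):
    """Membership-weighted average of per-cluster linear models.

    models: list of (weights, bias) pairs, one linear model per center.
    """
    u = fcm_memberships(x, centers, m)
    outputs = [bias + sum(w * xi for w, xi in zip(weights, x))
               for weights, bias in models]
    return sum(ui * yi for ui, yi in zip(u, outputs)) / sum(u)

# Hypothetical toy setup: two centers on a line, one linear model each.
centers = [(-1.0,), (1.0,)]
models = [((1.0,), 0.0), ((0.0,), 2.0)]
print(predict((0.0,), centers, models))
```

A point halfway between the two centers receives equal membership in both, so its prediction is the plain average of the two linear model outputs.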

Comparison of Classification Models for Sequential Flight Test Results (단계별 비행훈련 성패 예측 모형의 성능 비교 연구)

  • Sohn, So-Young;Cho, Yong-Kwan;Choi, Sung-Ok;Kim, Young-Joun
    • Journal of the Ergonomics Society of Korea / v.21 no.1 / pp.1-14 / 2002
  • The main purpose of this paper is to present selection criteria for ROK Air Force pilot training candidates in order to save the costs involved in sequential pilot training. We use classification models such as decision tree, logistic regression, and neural network, based on the aptitude test results of 288 ROK Air Force applicants in 1994-1996. The models are compared in terms of classification accuracy, ROC, and lift value. The neural network is evaluated as the best model for each sequential flight test result, while the logistic regression model outperforms the rest in discriminating the last flight test result. Therefore, we suggest a pilot selection criterion based on this logistic regression. Overall, we find that factors such as Attention Sharing, Speed Tracking, Machine Comprehension, and Instrument Reading Ability have significant effects on the flight results. We expect that the use of our criteria can increase the effectiveness of flight resources.

Applied linear and nonlinear statistical models for evaluating strength of Geopolymer concrete

  • Prem, Prabhat Ranjan;Thirumalaiselvi, A.;Verma, Mohit
    • Computers and Concrete / v.24 no.1 / pp.7-17 / 2019
  • The complex phenomenon of bond formation in geopolymer is not well understood and is therefore difficult to model. This paper presents applied statistical models for evaluating the compressive strength of geopolymer. The models studied fall into three categories: linear regression [least absolute shrinkage and selection operator (LASSO) and elastic net], tree regression [decision and bagging tree], and kernel methods [support vector regression (SVR), kernel ridge regression (KRR), Gaussian process regression (GPR), and relevance vector machine (RVM)]. The performance of the methods is compared in terms of error indices, computational effort, convergence, and residuals. Based on the present study, kernel-based methods (GPR and KRR) are recommended for evaluating the compressive strength of geopolymer concrete.

Comparison of tree-based ensemble models for regression

  • Park, Sangho;Kim, Chanmin
    • Communications for Statistical Applications and Methods / v.29 no.5 / pp.561-589 / 2022
  • When multiple classification and regression trees are combined, tree-based ensemble models, such as random forest (RF) and Bayesian additive regression trees (BART), are produced. In this study, we compare the model structures and performances of various ensemble models in regression settings. RF learns from bootstrapped samples and selects a splitting variable from the predictors gathered at each node. The BART model is specified as a sum of trees and is fitted using the Bayesian backfitting algorithm. Through extensive simulation studies, the strengths and drawbacks of the two methods in the presence of missing data, high-dimensional data, or highly correlated data are investigated. In the presence of missing data, BART performs well in general, whereas RF provides adequate coverage. BART performs better on high-dimensional, highly correlated data. However, in all of the scenarios considered, RF has a shorter computation time. The performance of the two methods is also compared using two real data sets that represent the aforementioned situations, and the same conclusion is reached.
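The RF mechanism mentioned in this abstract (bootstrapped samples plus a restricted set of candidate splitting variables) can be illustrated with a deliberately small sketch. It assumes depth-one regression trees (stumps) rather than full trees, so the per-node feature subset is drawn once per tree, and it does not attempt BART. The function names, the `mtry` parameter name, and the toy data are illustrative assumptions, not taken from the paper.

```python
import random
from statistics import mean

def fit_stump(rows, targets, features):
    """Best single split by squared error, searching only the given features."""
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            ml, mr = mean(left), mean(right)
            sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, t, ml, mr)
    if best is None:  # every candidate feature was constant in this sample
        m = mean(targets)
        return (0, float("inf"), m, m)
    return best[1:]

def fit_forest(rows, targets, n_trees=50, mtry=1, seed=0):
    """Bagged stumps: each tree sees a bootstrap sample and mtry random features."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        feats = rng.sample(range(p), mtry)           # random feature subset
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feats))
    return forest

def predict_forest(forest, x):
    """Average the stump predictions."""
    return mean(ml if x[f] <= t else mr for f, t, ml, mr in forest)

# Toy data: the target depends on feature 0 only; feature 1 is noise.
rows = [(i / 20, (i * 7) % 5) for i in range(20)]
targets = [0.0 if r[0] < 0.5 else 10.0 for r in rows]
forest = fit_forest(rows, targets)
print(predict_forest(forest, (0.1, 2)), predict_forest(forest, (0.9, 2)))
```

Averaging over many bootstrapped trees is what distinguishes this from a single decision tree. BART replaces the independent averaging with a sum of trees fitted jointly by Bayesian backfitting, which this sketch does not cover.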

Real Option Decision Tree Models for R&D Project Investment (R&D 프로젝트 투자 의사결정을 위한 실물옵션 의사결정나무 모델)

  • Choi, Gyung-Hyun;Cho, Dae-Myeong;Joung, Young-Ki
    • IE interfaces / v.24 no.4 / pp.408-419 / 2011
  • R&D is a foundation for new business opportunities and productivity improvement, but it involves enormous expense and a long-term, multi-step process. During the R&D process, decision-makers struggle with the various future uncertainties that influence the economic and technical success of R&D projects. For these reasons, several decision-making models for R&D project investment have been suggested; they are based on traditional methods such as Discounted Cash Flow (DCF), Decision Tree Analysis (DTA), and Real Option Analysis (ROA), or on some fusion of these methods. However, almost all of the models are constrained in practical use by limits on application, procedural complexity, and incomplete reflection of the uncertainties. In this study, to minimize these constraints, we propose a new model named the Real Option Decision Tree Model, which is a conceptual combination of ROA and DTA. With this model, decision-makers can simulate the project value by applying the uncertainties to the decision-making nodes.
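The general idea of combining a decision tree with real-option flexibility can be illustrated with a generic two-stage example. This is not the model proposed in the paper: it assumes a single abandonment option at one decision node, ignores discounting, and uses hypothetical costs, probabilities, and payoffs.

```python
def expected(branches):
    """Expected value over (probability, value) branches of a chance node."""
    return sum(p * v for p, v in branches)

def project_value(stage1_cost, stage2_cost, scenarios, flexible):
    """Roll back a two-stage R&D tree.

    scenarios: (probability, payoff) pairs revealed after stage 1.
    flexible:  if True, stage 2 is funded only when it pays off
               (abandonment option); if False, stage 2 is always funded.
    """
    outcomes = []
    for p, payoff in scenarios:
        commit = payoff - stage2_cost
        value = max(commit, 0.0) if flexible else commit
        outcomes.append((p, value))
    return -stage1_cost + expected(outcomes)

# Hypothetical numbers for illustration only.
scenarios = [(0.5, 200.0), (0.5, 40.0)]
static = project_value(10.0, 80.0, scenarios, flexible=False)
with_option = project_value(10.0, 80.0, scenarios, flexible=True)
print(static, with_option, with_option - static)
```

The difference between the two values is the worth of the flexibility to abandon, which a plain expected-value tree with fixed commitments does not capture.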

Real Time Current Prediction with Recurrent Neural Networks and Model Tree

  • Cini, S.;Deo, Makarand Chintamani
    • International Journal of Ocean System Engineering / v.3 no.3 / pp.116-130 / 2013
  • The prediction of ocean currents in real time, with warning times of a few hours or days, is required in planning many operation-related activities in the ocean. Traditionally this is done through numerical models, which are targeted toward producing spatially distributed information. This paper discusses a complementary method for cases where site-specific predictions are desired. It is based on the use of a recurrent neural network as well as the statistical tool of the model tree. Measurements made at a site in the Indian Ocean over a period of 4 years were used. Predictions were made up to 72 time steps in advance. The models developed were found to be fairly accurate in terms of the selected error statistics. Of the two modeling techniques, the model tree performed better, showing the necessity of using distributed models for different sub-domains of the data rather than a single model over the entire input domain. Typically, such predictions were associated with average errors of less than 2.0 cm/s. Although the prediction accuracy declined over longer intervals, it was still very satisfactory in terms of the selected error criteria. Similarly, the accuracy of extreme-value predictions matched that of the remaining predictions. Unlike in past studies, both the east-west and north-south current components were predicted fairly well.

A New Quantification Method for Multi-Unit Probabilistic Safety Assessment (다수기 PSA 수행을 위한 새로운 정량화 방법)

  • Park, Seong Kyu;Jung, Woo Sik
    • Journal of the Korean Society of Safety / v.35 no.1 / pp.97-106 / 2020
  • The objective of this paper is to suggest a new quantification method for multi-unit probabilistic safety assessment (PSA) that removes the overestimation error caused by the existing quantification method based on the delete-term approximation (DTA). So far, in quantifying actual plant PSA models, fault trees with negates have been solved by the DTA method. It is well known that the DTA method overestimates the core damage frequency (CDF) of a nuclear power plant (NPP). If a PSA fault tree has negates and non-rare events, the overestimation in CDF increases drastically. Since a multi-unit seismic PSA model has plant-level negates and many non-rare events in the fault tree, it should be quantified very carefully in order to avoid CDF overestimation. A multi-unit PSA fault tree has normal gates and negates that represent the status of each NPP, that is, the core damage or non-core damage state of the individual NPPs. The non-core damage state of an NPP is modeled in the fault tree by a negate (a NOT gate). The authors reviewed and compared (1) quantification methods that generate exact or approximate Boolean solutions from a fault tree, (2) the DTA method, which generates an approximate Boolean solution by solving negates in a fault tree, and (3) methods for calculating probability from the Boolean solutions generated by exact quantification methods or the DTA method. Based on this review and comparison, a new intersection removal by probability (IRBP) method is suggested for multi-unit PSA. If the IRBP method is adopted, a multi-unit PSA fault tree can be quantified without the overestimation error caused by direct application of the DTA method. That is, the extremely overestimated CDF can be avoided and an accurate CDF can be calculated by using the IRBP method. The accuracy of the IRBP method was validated with simple multi-unit PSA models, and its necessity was demonstrated with actual plant multi-unit seismic PSA models.
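The overestimation that this abstract attributes to the DTA method can be seen on a tiny two-unit example. The sketch below is a simplified reading of DTA (delete the cut sets of the damaged unit that contain a cut set of the negated unit, then treat the negate as true) and is not the paper's IRBP method. The events `s`, `a`, `b` and their probabilities are hypothetical, with `s` standing for a shared non-rare event.

```python
from itertools import product

# Hypothetical independent basic events; "s" is shared and non-rare.
probs = {"s": 0.3, "a": 0.1, "b": 0.2}
cuts_a = [{"s"}, {"a"}]  # minimal cut sets: unit A core damage
cuts_b = [{"s"}, {"b"}]  # minimal cut sets: unit B core damage

def occurs(cut_sets, state):
    """True if any cut set has all of its events occurring."""
    return any(all(state[e] for e in cs) for cs in cut_sets)

def exact_probability():
    """P(A damaged AND B not damaged) by enumerating every event state."""
    events = sorted(probs)
    total = 0.0
    for values in product([False, True], repeat=len(events)):
        state = dict(zip(events, values))
        if occurs(cuts_a, state) and not occurs(cuts_b, state):
            weight = 1.0
            for e in events:
                weight *= probs[e] if state[e] else 1.0 - probs[e]
            total += weight
    return total

def dta_probability():
    """Simplified delete-term approximation of the same top event."""
    # Delete cut sets of A that contain a cut set of B; NOT B is then taken as true.
    kept = [cs for cs in cuts_a if not any(b <= cs for b in cuts_b)]
    p_none = 1.0
    for cs in kept:
        p_cut = 1.0
        for e in cs:
            p_cut *= probs[e]
        p_none *= 1.0 - p_cut
    return 1.0 - p_none

print(exact_probability(), dta_probability())
```

Here the approximation keeps only the cut set `{a}` and drops the factors for "`s` does not occur" and "`b` does not occur", so it grows less accurate as those events become less rare.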

Comparison of the Prediction Model of Adolescents' Suicide Attempt Using Logistic Regression and Decision Tree: Secondary Data Analysis of the 2019 Youth Health Risk Behavior Web-Based Survey (로지스틱 회귀모형과 의사결정 나무모형을 활용한 청소년 자살 시도 예측모형 비교: 2019 청소년 건강행태 온라인조사를 이용한 2차 자료분석)

  • Lee, Yoonju;Kim, Heejin;Lee, Yesul;Jeong, Hyesun
    • Journal of Korean Academy of Nursing / v.51 no.1 / pp.40-53 / 2021
  • Purpose: The purpose of this study was to develop and compare prediction models for suicide attempts by Korean adolescents using logistic regression and decision tree analysis. Methods: This study utilized secondary data drawn from the 2019 Youth Health Risk Behavior web-based survey. A total of 20 items were selected as the explanatory variables (5 sociodemographic characteristics, 10 health-related behaviors, and 5 psychosocial characteristics). For data analysis, descriptive statistics, logistic regression with complex samples, and decision tree analysis were performed using IBM SPSS ver. 25.0 and Stata ver. 16.0. Results: A total of 1,731 participants (3.0%) out of 57,303 responded that they had attempted suicide. The most significant predictors of suicide attempts in the logistic regression model were experience of sadness and hopelessness, substance abuse, and violent victimization. In the decision tree model, girls with experience of sadness and hopelessness and experience of substance abuse were identified as the group most vulnerable to suicide attempts. Conclusion: Experiences of sadness and hopelessness, substance abuse, and violent victimization are the common major predictors of suicide attempts in both the logistic regression and decision tree models, and the prediction rates of the two models were similar. We suggest providing programs that consider combinations of high-risk predictors for adolescents in order to prevent suicide attempts.

Risk analysis of offshore terminals in the Caspian Sea

  • Mokhtari, Kambiz;Amanee, Jamshid
    • Ocean Systems Engineering / v.9 no.3 / pp.261-285 / 2019
  • Nowadays, the offshore industry faces emerging hazards with vague properties, such as acts of terrorism, acts of war, and unforeseen natural disasters such as tsunamis. Industry professionals such as offshore energy insurers, safety engineers, and risk managers therefore need an appropriate method to determine failure rates and frequencies for potential hazards for which no data are available. Furthermore, in conventional risk-based analysis models, such as fault tree analysis, hazards with vague properties are normally waived and ignored. In other words, previously only a traditional probability-based fault tree analysis could be implemented. To overcome this shortcoming, fuzzy set theory is applied to fault tree analysis to combine the known and unknown data, with the pre-combined result determined under a fuzzy environment. This is achieved by integrating a generic bow-tie based risk analysis model into the risk assessment phase of the Risk Management (RM) cycle as the backbone of that phase. For this reason, Fault Tree Analysis (FTA) and Event Tree Analysis (ETA) are used to analyse one of the significant risk factors associated with offshore terminals. This process will eventually help insurers and risk managers in the marine and offshore industries to investigate potential hazards in more detail where there is vagueness. For this purpose, a case study of an offshore terminal, consistent with the nature of the Caspian Sea, was examined.
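One common way to apply fuzzy set theory to a fault tree, which may or may not match the formulation used in this paper, is to represent each basic-event probability as a triangular fuzzy number (lower, modal, upper) and push it through the AND and OR gates component by component. The sketch below assumes independent events, and the event values are hypothetical.

```python
from math import prod

def fuzzy_and(*events):
    """AND gate on triangular fuzzy probabilities (l, m, u), componentwise product."""
    return tuple(prod(e[i] for e in events) for i in range(3))

def fuzzy_or(*events):
    """OR gate on triangular fuzzy probabilities, componentwise 1 - prod(1 - p)."""
    return tuple(1.0 - prod(1.0 - e[i] for e in events) for i in range(3))

def defuzzify(tfn):
    """Centroid of a triangular fuzzy number."""
    return sum(tfn) / 3.0

# Hypothetical basic events: expert-judged ranges instead of point estimates.
a = (0.10, 0.20, 0.30)
b = (0.20, 0.30, 0.40)
c = (0.01, 0.02, 0.03)

top = fuzzy_or(fuzzy_and(a, b), c)  # top event = (a AND b) OR c
print(top, defuzzify(top))
```

Because both gate formulas increase with each input probability, evaluating them at the lower, modal, and upper values gives the bounds and peak of the top-event estimate. Treating the result as triangular is itself an approximation.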

Development of Coil Breakage Prediction Model In Cold Rolling Mill

  • Park, Yeong-Bok;Hwang, Hwa-Won
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2005.06a / pp.1343-1346 / 2005
  • In a cold rolling mill, coil breakage during the rolling process causes various problems, such as degraded productivity and equipment damage. Recent research has relied on mechanical analysis, such as the analysis of roll chattering or strip inclining, and on breakage prevention that detects cracks in the coil. However, these approaches cover only some kinds of breakage. Predicting coil breakage is very complicated, and breakage occurs rarely. We propose building effective prediction models for coil breakage in the rolling process based on data mining. We propose three prediction models for coil breakage: (1) a decision tree based model, (2) a regression based model, and (3) a neural network based model. To reduce the number of model parameters, we selected important variables related to the occurrence of coil breakage from the coil setup attributes, using methods such as decision trees, variable selection, and the judgment of domain experts. We developed these prediction models and chose the best among them using the SEMMA process proposed in the SAS E-miner environment. We estimated model accuracy by scoring the prediction model with the posterior probability. We have also developed a software tool that analyzes the data and generates the proposed prediction models either automatically or in a user-driven manner. It also has an effective visualization feature based on PCA (Principal Component Analysis).