• Title/Summary/Keyword: Predictive decision tree

A Predictive Model of Turnover among Nurses in a Tertiary Hospital: Decision Tree Analysis (의사결정나무 분석기법을 이용한 상급종합병원 간호사의 이직 예측모형 구축)

  • Kang, Kyung Ok;Han, Nara;Jeong, Jeong A;Choi, Young Eun;Park, Jin Kyung;Jeong, Seok Hee
    • Journal of East-West Nursing Research / v.29 no.1 / pp.68-77 / 2023
  • Purpose: This study aimed to develop and evaluate a predictive model of turnover among hospital nurses. Methods: Participants were 1,565 nurses from a tertiary hospital in South Korea. Descriptive statistics and a decision tree analysis were performed using SPSS WIN 23.0. Results: The decision tree analysis yielded eleven turnover pathways. Three were high-risk groups with a turnover rate above the average, and eight were low-risk groups with a turnover rate below it; two of the low-risk groups had a 0% turnover rate. The groups were classified according to general characteristics such as position, period of temporary position, clinical career at last working unit, total clinical career, and period of leave of absence. The accuracy of the model was 83.2%, sensitivity 63.7%, and specificity 98.1%. Conclusion: This predictive model may be used to screen for turnover risk groups and contribute to decreasing turnover among hospital nurses in South Korea.
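
The accuracy, sensitivity, and specificity reported above are all derived from a confusion matrix. A minimal sketch of that arithmetic follows; the function name and the counts are hypothetical illustrations, not the study's data.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp: leavers predicted as leavers      fn: leavers predicted as stayers
    tn: stayers predicted as stayers      fp: stayers predicted as leavers
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total      # share of all cases classified correctly
    sensitivity = tp / (tp + fn)      # share of actual leavers that were caught
    specificity = tn / (tn + fp)      # share of actual stayers that were cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts chosen only for illustration (NOT taken from the paper).
acc, sens, spec = confusion_metrics(tp=60, fn=40, tn=190, fp=10)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

A model with high specificity but moderate sensitivity, as in the abstract, rarely mislabels stayers but misses a sizeable share of actual leavers.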

Performance Comparison Analysis of Artificial Intelligence Models for Estimating Remaining Capacity of Lithium-Ion Batteries

  • Kyu-Ha Kim;Byeong-Soo Jung;Sang-Hyun Lee
    • International Journal of Advanced Culture Technology / v.11 no.3 / pp.310-314 / 2023
  • The purpose of this study is to predict the remaining capacity of lithium-ion batteries and to compare the performance of five artificial intelligence models: linear regression, decision tree, random forest, neural network, and an ensemble model. Measured Excel data from the CS2 lithium-ion battery were used, and prediction accuracy was assessed with mean square error, mean absolute error, coefficient of determination, and root mean square error (RMSE). The RMSE was 0.045 for the linear regression model, 0.038 for the decision tree, 0.034 for the random forest, 0.032 for the neural network, and 0.030 for the ensemble model. The ensemble model had the best prediction performance, followed by the neural network. The decision tree and random forest models also performed well, while the linear regression model performed relatively poorly. The study therefore concludes that ensemble and neural network models should be prioritized when predicting remaining capacity, in order to improve the efficiency of battery management and energy systems.
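
The four evaluation indicators named above can be computed directly from paired true and predicted values. The sketch below assumes plain Python lists; the function name and the capacity values are hypothetical and are not taken from the CS2 dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE and R^2 for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n          # mean square error
    mae = sum(abs(e) for e in errors) / n         # mean absolute error
    rmse = math.sqrt(mse)                         # root mean square error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1 - ss_res / ss_tot                      # coefficient of determination
    return {"mse": mse, "mae": mae, "rmse": rmse, "r2": r2}


# Hypothetical remaining-capacity values (illustrative only).
y_true = [1.0, 0.9, 0.8, 0.7]
y_pred = [0.98, 0.93, 0.78, 0.71]
metrics = regression_metrics(y_true, y_pred)
print({name: round(value, 4) for name, value in metrics.items()})
```

Lower MSE, MAE, and RMSE indicate a better fit, while a coefficient of determination closer to 1 indicates that more of the variance is explained.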

Interpretability Comparison of Popular Decision Tree Algorithms (대표적인 의사결정나무 알고리즘의 해석력 비교)

  • Hong, Jung-Sik;Hwang, Geun-Seong
    • Journal of Korean Society of Industrial and Systems Engineering / v.44 no.2 / pp.15-23 / 2021
  • Most open-source decision tree algorithms are based on three splitting criteria (Entropy, Gini Index, and Gain Ratio), so the advantages and disadvantages of these three popular algorithms need to be studied more thoroughly. Previous comparisons of the three algorithms focused mainly on predictive performance. In this work, we conducted a comparative experiment on the three splitting criteria, focusing on interpretability. Depth, homogeneity, coverage, lift, and stability were used as indicators of interpretability. To measure the stability of decision trees, we present measures of the stability of the root node and of the dominating rules, based on a measure of tree similarity. Using 10 datasets collected from UCI and Kaggle, we compare the interpretability of DT (Decision Tree) algorithms based on the three splitting criteria. The results show that the GR (Gain Ratio) based DT algorithm performs well in terms of lift and homogeneity, while the GINI (Gini Index) and ENT (Entropy) based DT algorithms perform well in terms of coverage. With respect to stability, whether measured by the similarity of the dominating rule or by the similarity of the root node, the DT algorithm using the ENT splitting criterion shows the best results.
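
For readers unfamiliar with the three splitting criteria, the sketch below computes entropy, the Gini index, and the gain ratio for one candidate split using their standard textbook definitions. The function names and the toy labels are hypothetical and are not drawn from the paper's experiments.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index (impurity) of a list of class labels."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())


def gain_ratio(parent, children):
    """Information gain of a split divided by its split information."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    info_gain = entropy(parent) - weighted
    split_info = -sum(
        (len(ch) / n) * math.log2(len(ch) / n) for ch in children if ch
    )
    return info_gain / split_info if split_info > 0 else 0.0


# Toy node with 4 "yes" and 4 "no" labels, split into two child nodes.
parent = ["yes"] * 4 + ["no"] * 4
children = [["yes", "yes", "yes", "no"], ["yes", "no", "no", "no"]]
print(entropy(parent), gini(parent), round(gain_ratio(parent, children), 4))
```

Entropy and the Gini index score the impurity of a single node, whereas the gain ratio normalizes information gain by how finely the split partitions the data, which penalizes splits into many small branches.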

Mapping Biodiversity through optimized selection of input variables in decision tree models (의사결정나무 변수 선정 방법을 적용한 대축적 생물다양성 지도 구축)

  • Kim, Do Yeon;Heo, Joon;Kim, Chang Jae
    • Journal of Environmental Impact Assessment / v.20 no.5 / pp.663-673 / 2011
  • In the face of accelerating biodiversity loss and its significance for our coexistence with nature, biodiversity is becoming more crucial from a sustainable development perspective. To estimate future biodiversity, which provides valuable information for decision making, especially at the national level, a quantitative assessment of the present status must first be established as a baseline. In this study, we developed a large-scale map of Plant Species Richness (PSR, a typical indicator of biodiversity) for Young-dong and Pyung-chang provinces. Given the accessibility of appropriate data and advances in modelling techniques, we applied a genetic algorithm to reduce the number of variables without deteriorating predictive power. In addition, the number of Correctly Classified Instances (CCI) under 10-fold cross-validation, which indicates predictive power, was used for evaluation. This study, as a fundamental baseline, will be beneficial for future land work as well as ecosystem restoration projects and other relevant decision-making agendas.
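
The evaluation idea above, counting Correctly Classified Instances (CCI) over 10 held-out folds, can be sketched as follows. This is only an illustration under loud assumptions: a majority-class predictor stands in for the decision tree, the folds are contiguous and unshuffled, and the function names and labels are hypothetical.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def cci_majority_baseline(labels, k=10):
    """Count correctly classified instances across k folds.

    Assumption: a majority-class predictor replaces the real model, so this
    only demonstrates how CCI is accumulated over the held-out folds.
    """
    correct = 0
    for train, test in k_fold_indices(len(labels), k):
        majority = Counter(labels[i] for i in train).most_common(1)[0][0]
        correct += sum(labels[i] == majority for i in test)
    return correct


# Hypothetical richness classes (illustrative only).
labels = ["high"] * 14 + ["low"] * 6
cci = cci_majority_baseline(labels, k=10)
print(f"CCI = {cci} of {len(labels)}")
```

In practice the data would normally be shuffled or stratified before folding, and the classifier under study would be trained on each training split in place of the majority-class stand-in.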

Data Mining Approach to Clinical Decision Support System for Hypertension Management (고혈압관리를 위한 의사지원결정시스템의 데이터마이닝 접근)

  • 김태수;채영문;조승연;윤진희;김도마
    • Proceedings of the Korea Intelligent Information System Society Conference / 2002.11a / pp.203-212 / 2002
  • This study examined the predictive power of data mining algorithms by comparing the performance of logistic regression with that of a decision tree algorithm called CHAID (Chi-squared Automatic Interaction Detection). Contrary to previous studies, the decision tree performed better than logistic regression. We also developed a CDSS (Clinical Decision Support System) with three modules (doctor, nurse, and patient) based on a data warehouse architecture. The data warehouse collects and integrates relevant information from the various databases of the hospital information system (HIS). This system can help improve the decision-making capability of doctors and the accessibility of educational material for patients.

Tree size determination for classification ensemble

  • Choi, Sung Hoon;Kim, Hyunjoong
    • Journal of the Korean Data and Information Science Society / v.27 no.1 / pp.255-264 / 2016
  • Classification is predictive modeling for a categorical target variable. Classification ensemble methods, which predict with better accuracy by combining multiple classifiers, have become a powerful machine learning and data mining paradigm. Well-known classification ensemble methodologies are boosting, bagging, and random forest. In this article, we assume that decision trees are used as the classifiers in the ensemble, and we hypothesize that tree size affects classification accuracy. To study how tree size influences accuracy, we performed experiments using twenty-eight data sets. We then compare the performance of the ensemble algorithms (bagging, double-bagging, boosting, and random forest) with different tree sizes.

A Development of a Tailored Follow up Management Model Using the Data Mining Technique on Hypertension (데이터마이닝 기법을 활용한 맞춤형 고혈압 사후관리 모형 개발)

  • Park, Il-Su;Yong, Wang-Sik;Kim, Yu-Mi;Kang, Sung-Hong;Han, Jun-Tae
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.639-647 / 2008
  • This study used knowledge discovery and data mining algorithms to develop a tailored hypertension follow-up management model, consisting of a hypertension care predictive model and a hypertension care compliance segmentation model, using the Korea National Health Insurance Corporation database (the insureds' screening and health care benefit data). The study validated the predictive power of data mining algorithms by comparing the performance of logistic regression, decision tree, and ensemble techniques. On the basis of internal and external validation, logistic regression performed best among the three techniques for the hypertension care predictive model, while the hypertension care compliance segmentation model was developed by decision tree analysis. The study identified several factors affecting the onset of hypertension from the screening data. By providing representative results on the onset and care of hypertension, it is expected to contribute to the nation's building of a hypertension follow-up management system in the near future.

Predictive Analytics Model for Death Accidents in Building Projects by Trade - Based on Decision Tree- (PA기법을 이용한 건축공사 공종별 사망사고 예측모델 개발에 관한 연구 - 의사결정나무를 중심으로 -)

  • Choi, Jeong Won;Kim, Han Soo
    • Korean Journal of Construction Engineering and Management / v.22 no.5 / pp.55-65 / 2021
  • Compared with other industries, the construction industry shows a higher rate of death accidents, and companies' legal responsibilities are increasingly being enforced. This trend causes serious concern for construction firms and increases the importance of forecasting and proactively managing death accidents on construction sites. The objective of the study is to develop a predictive analytics model, based on a decision tree technique, that forecasts the probability of death accidents in building projects by trade. The model helps to reduce the risk of legal penalties and supports the safe execution of building projects by forecasting and proactively managing death accidents.

Using Mechanical Learning Analysis of Determinants of Housing Sales and Establishment of Forecasting Model (기계학습을 활용한 주택매도 결정요인 분석 및 예측모델 구축)

  • Kim, Eun-mi;Kim, Sang-Bong;Cho, Eun-seo
    • Journal of Cadastre & Land InformatiX / v.50 no.1 / pp.181-200 / 2020
  • This study used an OLS model to estimate the determinants affecting the tenure of a home and then compared its predictive power with that of SVM, Decision Tree, Random Forest, Gradient Boosting, XGBoost, and LightGBM. It differs from preceding studies in that the Stacking model, one of the ensemble models, can be used as a base model to establish a more predictive model for identifying the volume of transactions in the housing market. The OLS analysis showed that sales profits, housing prices, the number of household members, and the type of residential housing (detached housing, apartments) affected the period of housing ownership. When predictability was compared using RMSE, the machine learning models showed higher predictability than OLS. The predictive power of each machine learning model was then compared after rebuilding the data with the influencing variables, and Random Forest showed the best predictive power. In addition, the most predictive models (Random Forest, Decision Tree, Gradient Boosting, and XGBoost) were applied as individual models, and the Stacking model was constructed using Linear, Ridge, and Lasso models as meta models. The RMSE was lowest with the Ridge meta model, at 0.5181, which yielded the model with the highest predictive power.

A Predictive Model using Decision Tree Method on Demand for Alternative Feeding Education by Nurses (의사결정나무분석법을 이용한 간호사의 대체수유교육요구 예측모형)

  • Oh, Jin-A;Yoon, Chae-Min;Kim, Byung-Su
    • Child Health Nursing Research / v.16 no.1 / pp.84-92 / 2010
  • Purpose: One of the main reasons why mothers quit breast feeding is that the volume of breast milk is inadequate due to insufficient suckling. We believe suckling experience may be a factor affecting nipple confusion, so an alternative feeding method, namely cup, spoon, finger, or nasogastric tube feeding, may be needed to prevent it. The purpose of this study was to construct a predictive model of nurses' demand for alternative feeding education. Methods: A descriptive design with structured self-report questionnaires was used. Data from 175 nurses working in hospitals in Busan were collected between April 1 and 15, 2009. Data were analyzed by the decision tree method, one of the data mining techniques, using SAS 9.1 and Enterprise Miner 4.3. Results: Of the nurses, 81.1% demanded alternative feeding education, and 5 predictive factors were identified, including intention to pay, desire to know about alternative feeding, age, and learning experience. From these results, the derived model is considered appropriate for explaining and predicting demand for alternative feeding education. Conclusion: This confirms that knowledge of and compliance with alternative feeding for newborn babies should be correct, and that any inaccurate or insufficient information should be supplemented.