• Title/Summary/Keyword: tree based learning


Predicting 30-day mortality in severely injured elderly patients with trauma in Korea using machine learning algorithms: a retrospective study

  • Jonghee Han;Su Young Yoon;Junepill Seok;Jin Young Lee;Jin Suk Lee;Jin Bong Ye;Younghoon Sul;Se Heon Kim;Hong Rye Kim
    • Journal of Trauma and Injury / v.37 no.3 / pp.201-208 / 2024
  • Purpose: The number of elderly patients with trauma is increasing; therefore, precise models are necessary to estimate the mortality risk of elderly patients with trauma for informed clinical decision-making. This study aimed to develop machine learning-based models that predict 30-day mortality in severely injured elderly patients with trauma and to compare the predictive performance of various machine learning models. Methods: This study targeted patients aged ≥65 years with an Injury Severity Score of ≥15 who visited the regional trauma center at Chungbuk National University Hospital between 2016 and 2022. Four machine learning models (logistic regression, decision tree, random forest, and eXtreme Gradient Boosting [XGBoost]) were developed to predict 30-day mortality. The models' performance was compared using metrics such as area under the receiver operating characteristic curve (AUC), accuracy, precision, recall, specificity, and F1 score, as well as Shapley Additive Explanations (SHAP) values and learning curves. Results: The AUC values for logistic regression, decision tree, random forest, and XGBoost in predicting mortality in severely injured elderly patients with trauma were 0.938, 0.863, 0.919, and 0.934, respectively. Among the four models, XGBoost demonstrated superior accuracy, precision, recall, specificity, and F1 score of 0.91, 0.72, 0.86, 0.92, and 0.78, respectively. SHAP analysis of the important XGBoost features revealed associations such as a high Glasgow Coma Scale score lowering the predicted mortality probability, while higher counts of transfused red blood cells were positively correlated with mortality probability. The learning curves indicated increased generalization and robustness as the number of training examples increased.
Conclusions: We showed that machine learning models, especially XGBoost, can be used to predict 30-day mortality in severely injured elderly patients with trauma. Prognostic tools utilizing these models are helpful for physicians to evaluate the risk of mortality in elderly patients with severe trauma.
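
The threshold-based metrics named in this abstract (accuracy, precision, recall, specificity, F1 score) all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary metrics from confusion-matrix counts.

    tp/fp/tn/fn are true/false positives/negatives, where "positive"
    would mean death within 30 days in a mortality model.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not data from the paper).
example = classification_metrics(tp=8, fp=2, tn=85, fn=5)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced outcomes such as mortality, accuracy can stay high while precision and recall diverge, which is presumably why the abstract reports all five values alongside AUC.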

Default Prediction of Automobile Credit Based on Support Vector Machine

  • Chen, Ying;Zhang, Ruirui
    • Journal of Information Processing Systems / v.17 no.1 / pp.75-88 / 2021
  • The automobile credit business has developed rapidly in recent years, and defaults occur correspondingly often. Credit default brings great losses to automobile financial institutions, so successfully predicting automobile credit default is of great significance. First, missing values are deleted; then random forest is used for feature selection; and then the sample data are randomly grouped. Finally, six prediction models are constructed: support vector machine (SVM), random forest, k-nearest neighbor (KNN), logistic regression, decision tree, and artificial neural network (ANN). The results show that all six machine learning models can be used to predict automobile credit default. Among them, the decision tree has the highest accuracy, 0.79, but SVM has the best overall performance. Random grouping can also improve the efficiency of model operation to a certain extent, especially for SVM.

A Study on the Analysis of RocksDB Parameters Based on Machine Learning to Improve Database Performance (데이터베이스 성능 향상을 위한 기계학습 기반의 RocksDB 파라미터 분석 연구)

  • Jin, Huijun;Choi, Won Gi;Choi, Jonghwan;Sung, Hanseung;Park, Sanghyun
    • Annual Conference of KIPS / 2020.11a / pp.69-72 / 2020
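  • RocksDB, which achieves fast data write performance by using a Log-Structured Merge-Tree (LSM-Tree) structure, suffers from write amplification and space amplification. Write amplification triggers excessive write operations, which degrades data processing performance and shortens the lifespan of flash-memory-based devices, while space amplification causes storage shortages because of the storage space it occupies. In this paper, to mitigate write amplification and space amplification, we extract the main parameters that affect RocksDB performance and use random forest, a machine learning technique, to analyze the effect of the extracted parameters on write amplification and space amplification. In the experiments, we identified the main factors that most strongly affect write amplification and space amplification and found that they produced a 61.7% larger performance gap than the other parameters.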

Anomaly Sewing Pattern Detection for AIoT System using Deep Learning and Decision Tree

  • Nguyen Quoc Toan;Seongwon Cho
    • Smart Media Journal / v.13 no.2 / pp.85-94 / 2024
  • Artificial Intelligence of Things (AIoT), which combines AI and the Internet of Things (IoT), has recently gained popularity. Deep neural networks (DNNs) have achieved great success in many applications. Deploying complex AI models on embedded boards, nevertheless, may be challenging due to computational limitations or model complexity. This paper focuses on an AIoT-based system for smart sewing automation using edge devices. Our technique included developing a detection model and a decision tree for a sufficient testing scenario. YOLOv5 served as the basis for our defective sewing stitch detection model, which detects anomalies and classifies the sewing patterns. In experimental testing, the proposed approach achieved an accuracy and F1 score of 1.0, a False Positive Rate (FPR) and False Negative Rate (FNR) of 0, and a speed of 0.07 seconds with a file size of 2.43 MB.

Multi-Cattle Tracking Algorithm with Enhanced Trajectory Estimation in Precision Livestock Farms

  • Shujie Han;Alvaro Fuentes;Sook Yoon;Jongbin Park;Dong Sun Park
    • Smart Media Journal / v.13 no.2 / pp.23-31 / 2024
  • In precision cattle farming, reliably tracking the identity of each animal is necessary. Effective tracking of cattle within farm environments presents a unique challenge, particularly the need to minimize the occurrence of excessive tracking trajectories. To address this, we introduce a trajectory playback decision tree algorithm that reevaluates and cleans tracking results based on spatio-temporal relationships among trajectories. This approach treats trajectories as metadata, resulting in more realistic and accurate tracking outcomes. Extensive comparisons with popular tracking models show the algorithm's robustness and capability: it consistently improves performance across evaluation metrics, with HOTA, AssA, and IDF1 reaching 68.81%, 79.31%, and 84.81%, respectively.

Using Bayesian tree-based model integrated with genetic algorithm for streamflow forecasting in an urban basin

  • Nguyen, Duc Hai;Bae, Deg-Hyo
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.140-140 / 2021
  • Urban flood management is a crucial and challenging task, particularly in developed cities. Accurate prediction of urban flooding under heavy precipitation is therefore critically important. In recent years, machine learning techniques have received considerable attention for their strong learning ability and suitability for modeling complex and nonlinear hydrological processes. Moreover, a survey of the published literature finds that hybrid computational intelligence methods using nature-inspired algorithms have been increasingly employed to predict or simulate streamflow with high reliability. The present study aims to propose a novel approach: an ensemble tree model, Bayesian Additive Regression Trees (BART), combined with a nature-inspired algorithm to predict hourly multi-step-ahead streamflow. To this end, a hybrid intelligent model named GA-BART was developed, which integrates the BART model with a genetic algorithm (GA). The Jungrang urban basin in Seoul, South Korea, was selected as the case study. A database was established from 39 heavy rainfall events between 2003 and 2020, collected from the rain gauges and monitoring stations in the basin. Models for different lead times will be developed with these methods, covering 1-hour, 2-hour, 3-hour, 4-hour, 5-hour, and 6-hour-ahead streamflow predictions. In addition, the hybrid BART model is compared with a baseline model, support vector regression. The hybrid BART model is expected to perform robustly and to be a viable option for streamflow forecasting in urban basins.
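
Multi-step-ahead forecasting of the kind described here is commonly set up by turning an hourly series into supervised samples, with one dataset per lead time. The sketch below illustrates that framing only; it is not the GA-BART method, and the function name `make_multistep_samples`, the lag count, and the flow values are hypothetical.

```python
def make_multistep_samples(series, n_lags, horizon):
    """Build (features, target) pairs for h-step-ahead forecasting.

    features: the n_lags most recent observations up to time t-1
    target:   the observation `horizon` steps after the last feature
    """
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be positive")
    samples = []
    # t is the index of the first value that lies in the "future".
    for t in range(n_lags, len(series) - horizon + 1):
        features = list(series[t - n_lags:t])
        target = series[t + horizon - 1]
        samples.append((features, target))
    return samples


# Hypothetical hourly streamflow values, for illustration only.
flow = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

# One training set per lead time, mirroring the 1- to 6-hour models.
datasets = {h: make_multistep_samples(flow, n_lags=3, horizon=h)
            for h in range(1, 7)}
for h, rows in datasets.items():
    print(f"{h}-hour ahead: {len(rows)} samples")
```

Training a separate regressor on each of these datasets is one common design (the "direct" strategy); whether the study uses this or a recursive strategy is not stated in the abstract.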

Self Introduction Essay Classification Using Doc2Vec for Efficient Job Matching (Doc2Vec 모형에 기반한 자기소개서 분류 모형 구축 및 실험)

  • Kim, Young Soo;Moon, Hyun Sil;Kim, Jae Kyeong
    • Journal of Information Technology Services / v.19 no.1 / pp.103-112 / 2020
  • Job seekers make various efforts to find a good company, and companies attempt to recruit good people. Job searching through self-introduction essays is now one of the most active processes. Companies spend time and money reviewing the numerous self-introduction essays of job seekers, and job seekers worry about whether companies will accept their essays. This research builds a classification model and conducts experiments to classify self-introduction essays as pass or fail using deep learning and decision tree techniques. Real-world data were classified using stratified sampling to alleviate the imbalance between passed and failed essays. Documents were embedded using the Doc2Vec method, which was developed from Word2Vec, and classified using logistic regression analysis. The decision tree model was chosen as the benchmark model, and K-fold cross-validation was conducted for performance evaluation. Across several experiments, the area under the curve (AUC) of PV-DM was better than that of the other Doc2Vec models, i.e., PV-DBOW and Concatenate. Furthermore, PV-DM classifies both passed and failed essays, while PV-DBOW cannot classify passed essays even though it classifies failed essays well. In addition, the logistic regression model with PV-DM embeddings outperforms the decision tree-based classification model. The implication of the experimental results is that companies can reduce the cost of recruiting good job seekers. In addition, our suggested model can help job candidates pre-evaluate their self-introduction essays.
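
The abstract mentions stratified sampling for class imbalance and K-fold cross-validation for evaluation. One common way to combine the two is stratified K-fold splitting, sketched below under that assumption; the function name `stratified_k_fold` and the label counts are hypothetical, and the paper does not say that it splits folds this way.

```python
from collections import defaultdict


def stratified_k_fold(labels, k):
    """Yield (train_idx, test_idx) so each fold keeps the class ratio.

    Indices of every class are dealt round-robin into k folds, so each
    fold holds roughly the same share of pass and fail essays.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(idx for j, fold in enumerate(folds)
                           if j != i for idx in fold)
        yield train_idx, test_idx


# Hypothetical imbalanced labels: 4 passed essays, 8 failed essays.
labels = ["pass"] * 4 + ["fail"] * 8
for train_idx, test_idx in stratified_k_fold(labels, k=4):
    held_out = [labels[i] for i in test_idx]
    print(held_out.count("pass"), "pass /", held_out.count("fail"), "fail")
```

In practice the indices would usually be shuffled within each class before dealing; that step is left out here to keep the sketch deterministic.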

Prediction in Dissolved Oxygen Concentration and Occurrence of Hypoxia Water Mass in Jinhae Bay Based on Machine Learning Model (기계학습 모형 기반 진해만 용존산소농도 및 빈산소수괴 발생 예측)

  • Park, Seongsik;Kim, Byeong Kuk;Kim, Kyunghoi
    • Journal of Korean Society of Coastal and Ocean Engineers / v.34 no.3 / pp.47-57 / 2022
  • We carried out studies on predicting the concentration of dissolved oxygen (DO) with an LSTM model and predicting the occurrence of hypoxia water mass (HWM) with a decision tree. In the DO concentration study, a large number of hidden nodes increased model complexity and required a sufficient number of epochs. Accuracy was higher with a long sequence length as the prediction time step increased. In the HWM occurrence study, the accuracy for the non-HWM case was 66.1% in the 30-day prediction, higher than the 37.5% for the HWM case. The reason is that the decision tree might overestimate DO concentration.

A Study on a Prototype Learning Model (프로토타입 학습 모델에 관한 연구)

  • 송두헌
    • Journal of the Korea Computer Industry Society / v.2 no.2 / pp.151-156 / 2001
  • We describe a new representation for learning concepts that differs from traditional decision tree and rule induction algorithms. Our algorithm, PROLEARN, learns one or more prototypes per class and performs instance-based classification with them. A prototype here differs from the psychological term in that there can be more than one prototype per concept, and it differs from other instance-based algorithms in that the prototype is a "fictitious ideal example". We show that PROLEARN is as good as traditional machine learning algorithms but much more stable than them in an environment with noise or a changing training set, a property we call 'stability'.

Drone Image Classification based on Convolutional Neural Networks (컨볼루션 신경망을 기반으로 한 드론 영상 분류)

  • Joo, Young-Do
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.17 no.5 / pp.97-102 / 2017
  • Recently, deep learning techniques such as convolutional neural networks (CNNs) have been introduced to classify high-resolution remote sensing data. In this paper, we investigated the possibility of applying a CNN to crop classification of farmland images captured by drones. The farming area was divided into seven classes: rice field, sweet potato, red pepper, corn, sesame leaf, fruit tree, and vinyl greenhouse. We performed image pre-processing and normalization to apply the CNN, and the image classification accuracy was more than 98%. The results of this study are expected to speed up the transition from existing image classification methods to deep learning-based methods and to confirm the possibility of their success.