• 제목/요약/키워드: categorical gradient boosting

검색결과 5건 처리시간 0.017초

A robust approach in prediction of RCFST columns using machine learning algorithm

  • Van-Thanh Pham;Seung-Eock Kim
    • Steel and Composite Structures
    • /
    • 제46권2호
    • /
    • pp.153-173
    • /
    • 2023
  • Rectangular concrete-filled steel tubular (RCFST) column, a type of concrete-filled steel tubular (CFST), is widely used in compression members of structures because of its advantages. This paper proposes a robust machine learning-based framework for predicting the ultimate compressive strength of RCFST columns under both concentric and eccentric loading. The gradient boosting neural network (GBNN), an efficient and up-to-date ML algorithm, is utilized for developing a predictive model in the proposed framework. A total of 890 experimental data of RCFST columns, which is categorized into two datasets of concentric and eccentric compression, is carefully collected to serve as training and testing purposes. The accuracy of the proposed model is demonstrated by comparing its performance with seven state-of-the-art machine learning methods including decision tree (DT), random forest (RF), support vector machines (SVM), deep learning (DL), adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), and categorical gradient boosting (CatBoost). Four available design codes, including the European (EC4), American concrete institute (ACI), American institute of steel construction (AISC), and Australian/New Zealand (AS/NZS) are refereed in another comparison. The results demonstrate that the proposed GBNN method is a robust and powerful approach to obtain the ultimate strength of RCFST columns.

Hybrid machine learning with moth-flame optimization methods for strength prediction of CFDST columns under compression

  • Quang-Viet Vu;Dai-Nhan Le;Thai-Hoan Pham;Wei Gao;Sawekchai Tangaramvong
    • Steel and Composite Structures
    • /
    • 제51권6호
    • /
    • pp.679-695
    • /
    • 2024
  • This paper presents a novel technique that combines machine learning (ML) with moth-flame optimization (MFO) methods to predict the axial compressive strength (ACS) of concrete filled double skin steel tubes (CFDST) columns. The proposed model is trained and tested with a dataset containing 125 tests of the CFDST column subjected to compressive loading. Five ML models, including extreme gradient boosting (XGBoost), gradient tree boosting (GBT), categorical gradient boosting (CAT), support vector machines (SVM), and decision tree (DT) algorithms, are utilized in this work. The MFO algorithm is applied to find optimal hyperparameters of these ML models and to determine the most effective model in predicting the ACS of CFDST columns. Predictive results given by some performance metrics reveal that the MFO-CAT model provides superior accuracy compared to other considered models. The accuracy of the MFO-CAT model is validated by comparing its predictive results with existing design codes and formulae. Moreover, the significance and contribution of each feature in the dataset are examined by employing the SHapley Additive exPlanations (SHAP) method. A comprehensive uncertainty quantification on probabilistic characteristics of the ACS of CFDST columns is conducted for the first time to examine the models' responses to variations of input variables in the stochastic environments. Finally, a web-based application is developed to predict ACS of the CFDST column, enabling rapid practical utilization without requesting any programing or machine learning expertise.

Hybrid machine learning with HHO method for estimating ultimate shear strength of both rectangular and circular RC columns

  • Quang-Viet Vu;Van-Thanh Pham;Dai-Nhan Le;Zhengyi Kong;George Papazafeiropoulos;Viet-Ngoc Pham
    • Steel and Composite Structures
    • /
    • 제52권2호
    • /
    • pp.145-163
    • /
    • 2024
  • This paper presents six novel hybrid machine learning (ML) models that combine support vector machines (SVM), Decision Tree (DT), Random Forest (RF), Gradient Boosting (GB), extreme gradient boosting (XGB), and categorical gradient boosting (CGB) with the Harris Hawks Optimization (HHO) algorithm. These models, namely HHO-SVM, HHO-DT, HHO-RF, HHO-GB, HHO-XGB, and HHO-CGB, are designed to predict the ultimate strength of both rectangular and circular reinforced concrete (RC) columns. The prediction models are established using a comprehensive database consisting of 325 experimental data for rectangular columns and 172 experimental data for circular columns. The ML model hyperparameters are optimized through a combination of cross-validation technique and the HHO. The performance of the hybrid ML models is evaluated and compared using various metrics, ultimately identifying the HHO-CGB model as the top-performing model for predicting the ultimate shear strength of both rectangular and circular RC columns. The mean R-value and mean a20-index are relatively high, reaching 0.991 and 0.959, respectively, while the mean absolute error and root mean square error are low (10.302 kN and 27.954 kN, respectively). Another comparison is conducted with four existing formulas to further validate the efficiency of the proposed HHO-CGB model. The Shapely Additive Explanations method is applied to analyze the contribution of each variable to the output within the HHO-CGB model, providing insights into the local and global influence of variables. The analysis reveals that the depth of the column, length of the column, and axial loading exert the most significant influence on the ultimate shear strength of RC columns. A user-friendly graphical interface tool is then developed based on the HHO-CGB to facilitate practical and cost-effective usage.

Incorporating BERT-based NLP and Transformer for An Ensemble Model and its Application to Personal Credit Prediction

  • Sophot Ky;Ju-Hong Lee;Kwangtek Na
    • 스마트미디어저널
    • /
    • 제13권4호
    • /
    • pp.9-15
    • /
    • 2024
  • Tree-based algorithms have been the dominant methods used build a prediction model for tabular data. This also includes personal credit data. However, they are limited to compatibility with categorical and numerical data only, and also do not capture information of the relationship between other features. In this work, we proposed an ensemble model using the Transformer architecture that includes text features and harness the self-attention mechanism to tackle the feature relationships limitation. We describe a text formatter module, that converts the original tabular data into sentence data that is fed into FinBERT along with other text features. Furthermore, we employed FT-Transformer that train with the original tabular data. We evaluate this multi-modal approach with two popular tree-based algorithms known as, Random Forest and Extreme Gradient Boosting, XGBoost and TabTransformer. Our proposed method shows superior Default Recall, F1 score and AUC results across two public data sets. Our results are significant for financial institutions to reduce the risk of financial loss regarding defaulters.

대사증후군의 인지와 신체활동 실천에 영향을 미치는 요인: 데이터 마이닝 접근 (Factors influencing metabolic syndrome perception and exercising behaviors in Korean adults: Data mining approach)

  • 이수경;문미경
    • 한국산학기술학회논문지
    • /
    • 제18권12호
    • /
    • pp.581-588
    • /
    • 2017
  • 본 연구는 기계 학습법 중 하나인 XGBoost를 이용하여 대사증후군을 인지하고 신체활동을 수행하는 집단을 예측하고자 2014년 7월부터 2015년 12월까지 시도되었다. 이에 2009-2013년 지역사회건강조사를 연구자료로 사용하였고 370,430명의 성인을 분석에 포함하였다. 본 연구의 종속변수는 대사증후군의 인지 및 신체활동 실천정도에 따른 단계로 3단계로 구분하였다:Stage 1(무인지, 무 신체활동), Stage 2(인지, 무 신체활동), and Stage 3(인지, 신체활동). 예측변수로는 5년간의 지역사회건강조사 중 공통으로 수집된 문항으로부터 161개의 특성을 선택하였다. 자료 분석을 위해 R program을 이용하여 XGBoost 알고리즘을 적용하였다. 분석 결과 정확도는 0.735 이었으며, 가장 영향을 미치는 10개의 특성은 나이, 교육수준, 체중조절시도 경험, EQ-5D 운동능력, 영양표시 확인, 개인 건강보험가입 유무, EQ-5D 일상활동, 금연광고경험 여부, 통증유무, 당뇨에 대한 보건기관의 교육 경험 순으로 확인되었다. 본 연구결과는 XGBoost가 보건의료빅데이터를 이용한 질병의 예방과 관리에 영향을 주는 요인을 확인하는데 유용한 도구임을 보여주었다. 또한, 본 연구를 통해 대사증후군에 취약한 계층을 확인하고 이를 위한 교육프로그램 개발에 도움을 줄 수 있을 것으로 보인다.