• Title/Summary/Keyword: Gradient Boosting Algorithm

Search Result 68, Processing Time 0.028 seconds

Machine learning-based probabilistic predictions of shear resistance of welded studs in deck slab ribs transverse to beams

  • Vitaliy V. Degtyarev;Stephen J. Hicks
    • Steel and Composite Structures
    • /
    • v.49 no.1
    • /
    • pp.109-123
    • /
    • 2023
  • Headed studs welded to steel beams and embedded within the concrete of deck slabs are vital components of modern composite floor systems, where safety and economy depend on the accurate predictions of the stud shear resistance. The multitude of existing deck profiles and the complex behavior of studs in deck slab ribs makes developing accurate and reliable mechanical or empirical design models challenging. The paper addresses this issue by presenting a machine learning (ML) model developed from the natural gradient boosting (NGBoost) algorithm capable of producing probabilistic predictions and a database of 464 push-out tests, which is considerably larger than the databases used for developing existing design models. The proposed model outperforms models based on other ML algorithms and existing descriptive equations, including those in EC4 and AISC 360, while offering probabilistic predictions unavailable from other models and producing higher shear resistances for many cases. The present study also showed that the stud shear resistance is insensitive to the concrete elastic modulus, stud welding type, location of slab reinforcement, and other parameters considered important by existing models. The NGBoost model was interpreted by evaluating the feature importance and dependence determined with the SHapley Additive exPlanations (SHAP) method. The model was calibrated via reliability analyses in accordance with the Eurocodes to ensure that its predictions meet the required reliability level and facilitate its use in design. An interactive open-source web application was created and deployed to the cloud to allow for convenient and rapid stud shear resistance predictions with the developed model.

Prediction of compressive strength of sustainable concrete using machine learning tools

  • Lokesh Choudhary;Vaishali Sahu;Archanaa Dongre;Aman Garg
    • Computers and Concrete
    • /
    • v.33 no.2
    • /
    • pp.137-145
    • /
    • 2024
  • The technique of experimentally determining concrete's compressive strength for a given mix design is time-consuming and difficult. The goal of the current work is to propose a best working predictive model based on different machine learning algorithms such as Gradient Boosting Machine (GBM), Stacked Ensemble (SE), Distributed Random Forest (DRF), Extremely Randomized Trees (XRT), Generalized Linear Model (GLM), and Deep Learning (DL) that can forecast the compressive strength of ternary geopolymer concrete mix without carrying out any experimental procedure. A geopolymer mix uses supplementary cementitious materials obtained as industrial by-products instead of cement. The input variables used for assessing the best machine learning algorithm not only include individual ingredient quantities, but molarity of the alkali activator and age of testing as well. Myriad statistical parameters used to measure the effectiveness of the models in forecasting the compressive strength of ternary geopolymer concrete mix, it has been found that GBM performs better than all other algorithms. A sensitivity analysis carried out towards the end of the study suggests that GBM model predicts results close to the experimental conditions with an accuracy between 95.6 % to 98.2 % for testing and training datasets.

Development of Machine Learning Based Seismic Response Prediction Model for Shear Wall Structure considering Aging Deteriorations (경년열화를 고려한 전단벽 구조물의 기계학습 기반 지진응답 예측모델 개발)

  • Kim, Hyun-Su;Kim, Yukyung;Lee, So Yeon;Jang, Jun Su
    • Journal of Korean Association for Spatial Structures
    • /
    • v.24 no.2
    • /
    • pp.83-90
    • /
    • 2024
  • Machine learning is widely applied to various engineering fields. In structural engineering area, machine learning is generally used to predict structural responses of building structures. The aging deterioration of reinforced concrete structure affects its structural behavior. Therefore, the aging deterioration of R.C. structure should be consider to exactly predict seismic responses of the structure. In this study, the machine learning based seismic response prediction model was developed. To this end, four machine learning algorithms were employed and prediction performance of each algorithm was compared. A 3-story coupled shear wall structure was selected as an example structure for numerical simulation. Artificial ground motions were generated based on domestic site characteristics. Elastic modulus, damping ratio and density were changed to considering concrete degradation due to chloride penetration and carbonation, etc. Various intensity measures were used input parameters of the training database. Performance evaluation was performed using metrics like root mean square error, mean square error, mean absolute error, and coefficient of determination. The optimization of hyperparameters was achieved through k-fold cross-validation and grid search techniques. The analysis results show that neural networks and extreme gradient boosting algorithms present good prediction performance.

Inhalation Configuration Detection for COVID-19 Patient Secluded Observing using Wearable IoTs Platform

  • Sulaiman Sulmi Almutairi;Rehmat Ullah;Qazi Zia Ullah;Habib Shah
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.6
    • /
    • pp.1478-1499
    • /
    • 2024
  • Coronavirus disease (COVID-19) is an infectious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus. COVID-19 become an active epidemic disease due to its spread around the globe. The main causes of the spread are through interaction and transmission of the droplets through coughing and sneezing. The spread can be minimized by isolating the susceptible patients. However, it necessitates remote monitoring to check the breathing issues of the patient remotely to minimize the interactions for spread minimization. Thus, in this article, we offer a wearable-IoTs-centered framework for remote monitoring and recognition of the breathing pattern and abnormal breath detection for timely providing the proper oxygen level required. We propose wearable sensors accelerometer and gyroscope-based breathing time-series data acquisition, temporal features extraction, and machine learning algorithms for pattern detection and abnormality identification. The sensors provide the data through Bluetooth and receive it at the server for further processing and recognition. We collect the six breathing patterns from the twenty subjects and each pattern is recorded for about five minutes. We match prediction accuracies of all machine learning models under study (i.e. Random forest, Gradient boosting tree, Decision tree, and K-nearest neighbor. Our results show that normal breathing and Bradypnea are the most correctly recognized breathing patterns. However, in some cases, algorithm recognizes kussmaul well also. Collectively, the classification outcomes of Random Forest and Gradient Boost Trees are better than the other two algorithms.

Real-time prediction on the slurry concentration of cutter suction dredgers using an ensemble learning algorithm

  • Han, Shuai;Li, Mingchao;Li, Heng;Tian, Huijing;Qin, Liang;Li, Jinfeng
    • International conference on construction engineering and project management
    • /
    • 2020.12a
    • /
    • pp.463-481
    • /
    • 2020
  • Cutter suction dredgers (CSDs) are widely used in various dredging constructions such as channel excavation, wharf construction, and reef construction. During a CSD construction, the main operation is to control the swing speed of cutter to keep the slurry concentration in a proper range. However, the slurry concentration cannot be monitored in real-time, i.e., there is a "time-lag effect" in the log of slurry concentration, making it difficult for operators to make the optimal decision on controlling. Concerning this issue, a solution scheme that using real-time monitored indicators to predict current slurry concentration is proposed in this research. The characteristics of the CSD monitoring data are first studied, and a set of preprocessing methods are presented. Then we put forward the concept of "index class" to select the important indices. Finally, an ensemble learning algorithm is set up to fit the relationship between the slurry concentration and the indices of the index classes. In the experiment, log data over seven days of a practical dredging construction is collected. For comparison, the Deep Neural Network (DNN), Long Short Time Memory (LSTM), Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and the Bayesian Ridge algorithm are tried. The results show that our method has the best performance with an R2 of 0.886 and a mean square error (MSE) of 5.538. This research provides an effective way for real-time predicting the slurry concentration of CSDs and can help to improve the stationarity and production efficiency of dredging construction.

  • PDF

Factors influencing metabolic syndrome perception and exercising behaviors in Korean adults: Data mining approach (대사증후군의 인지와 신체활동 실천에 영향을 미치는 요인: 데이터 마이닝 접근)

  • Lee, Soo-Kyoung;Moon, Mikyung
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.18 no.12
    • /
    • pp.581-588
    • /
    • 2017
  • This study was conducted to determine which factors would predict metabolic syndrome (MetS) perception and exercise by applying a machine learning classifier, or Extreme Gradient Boosting algorithm (XGBoost) from July 2014 to December 2015. Data were obtained from the Korean Community Health Survey (KCHS), representing different community-dwelling Korean adults 19 years and older, from 2009 to 2013. The dataset includes 370,430 adults. Outcomes were categorized as follows based on the perception of MetS and physical activity (PA): Stage 1 (no perception, no PA), Stage 2 (perception, no PA), and Stage 3 (perception, PA). Features common to all questionnaires for the last 5 years were selected for modeling. Overall, there were 161 features, categorical except for age and the visual analogue scale (EQ-VAS). We used the Extreme Boosting algorithm in R programming for a model to predict factors and achieved prediction accuracy in 0.735 submissions. The top 10 predictive factors in Stage 3 were: age, education level, attempt to control weight, EQ mobility, nutrition label checks, private health insurance, EQ-5D usual activities, anti-smoking advertising, EQ-VAS, education in health centers for diabetes, and dental care. In conclusion, the results showed that XGBoost can be used to identify factors influencing disease prevention and management using healthcare bigdata.

Comparative characteristic of ensemble machine learning and deep learning models for turbidity prediction in a river (딥러닝과 앙상블 머신러닝 모형의 하천 탁도 예측 특성 비교 연구)

  • Park, Jungsu
    • Journal of Korean Society of Water and Wastewater
    • /
    • v.35 no.1
    • /
    • pp.83-91
    • /
    • 2021
  • The increased turbidity in rivers during flood events has various effects on water environmental management, including drinking water supply systems. Thus, prediction of turbid water is essential for water environmental management. Recently, various advanced machine learning algorithms have been increasingly used in water environmental management. Ensemble machine learning algorithms such as random forest (RF) and gradient boosting decision tree (GBDT) are some of the most popular machine learning algorithms used for water environmental management, along with deep learning algorithms such as recurrent neural networks. In this study GBDT, an ensemble machine learning algorithm, and gated recurrent unit (GRU), a recurrent neural networks algorithm, are used for model development to predict turbidity in a river. The observation frequencies of input data used for the model were 2, 4, 8, 24, 48, 120 and 168 h. The root-mean-square error-observations standard deviation ratio (RSR) of GRU and GBDT ranges between 0.182~0.766 and 0.400~0.683, respectively. Both models show similar prediction accuracy with RSR of 0.682 for GRU and 0.683 for GBDT. The GRU shows better prediction accuracy when the observation frequency is relatively short (i.e., 2, 4, and 8 h) where GBDT shows better prediction accuracy when the observation frequency is relatively long (i.e. 48, 120, 160 h). The results suggest that the characteristics of input data should be considered to develop an appropriate model to predict turbidity.

Development of Flash Boiling Spray Prediction Model of Multi-hole GDI Injector Using Machine Learning (머신러닝을 이용한 다공형 GDI 인젝터의 플래시 보일링 분무 예측 모델 개발)

  • Chang, Mengzhao;Shin, Dalho;Pham, Quangkhai;Park, Suhan
    • Journal of ILASS-Korea
    • /
    • v.27 no.2
    • /
    • pp.57-65
    • /
    • 2022
  • The purpose of this study is to use machine learning to build a model capable of predicting the flash boiling spray characteristics. In this study, the flash boiling spray was visualized using Shadowgraph visualization technology, and then the spray image was processed with MATLAB to obtain quantitative data of spray characteristics. The experimental conditions were used as input, and the spray characteristics were used as output to train the machine learning model. For the machine learning model, the XGB (extreme gradient boosting) algorithm was used. Finally, the performance of machine learning model was evaluated using R2 and RMSE (root mean square error). In order to have enough data to train the machine learning model, this study used 12 injectors with different design parameters, and set various fuel temperatures and ambient pressures, resulting in about 12,000 data. By comparing the performance of the model with different amounts of training data, it was found that the number of training data must reach at least 7,000 before the model can show optimal performance. The model showed different prediction performances for different spray characteristics. Compared with the upstream spray angle and the downstream spray angle, the model had the best prediction performance for the spray tip penetration. In addition, the prediction performance of the model showed a relatively poor trend in the initial stage of injection and the final stage of injection. The model performance is expired to be further enhanced by optimizing the hyper-parameters input into the model.

Application of Machine Learning Techniques for Problematic Smartphone Use (스마트폰 과의존 판별을 위한 기계 학습 기법의 응용)

  • Kim, Woo-sung;Han, Jun-hee
    • Asia-Pacific Journal of Business
    • /
    • v.13 no.3
    • /
    • pp.293-309
    • /
    • 2022
  • Purpose - The purpose of this study is to explore the possibility of predicting the degree of smartphone overdependence based on mobile phone usage patterns. Design/methodology/approach - In this study, a survey conducted by Korea Internet and Security Agency(KISA) called "problematic smartphone use survey" was analyzed. The survey consists of 180 questions, and data were collected from 29,712 participants. Based on the data on the smartphone usage pattern obtained through the questionnaire, the smartphone addiction level was predicted using machine learning techniques. k-NN, gradient boosting, XGBoost, CatBoost, AdaBoost and random forest algorithms were employed. Findings - First, while various factors together influence the smartphone overdependence level, the results show that all machine learning techniques perform well to predict the smartphone overdependence level. Especially, we focus on the features which can be obtained from the smartphone log data (without psychological factors). It means that our results can be a basis for diagnostic programs to detect problematic smartphone use. Second, the results show that information on users' age, marriage and smartphone usage patterns can be used as predictors to determine whether users are addicted to smartphones. Other demographic characteristics such as sex or region did not appear to significantly affect smartphone overdependence levels. Research implications or Originality - While there are some studies that predict smartphone overdependence level using machine learning techniques, but the studies only present algorithm performance based on survey data. In this study, based on the information gain measure, questions that have more influence on the smartphone overdependence level are presented, and the performance of algorithms according to the questions is compared. Through the results of this study, it is shown that smartphone overdependence level can be predicted with less information if questions about smartphone use are given appropriately.

Income prediction of apple and pear farmers in Chungnam area by automatic machine learning with H2O.AI

  • Hyundong, Jang;Sounghun, Kim
    • Korean Journal of Agricultural Science
    • /
    • v.49 no.3
    • /
    • pp.619-627
    • /
    • 2022
  • In Korea, apples and pears are among the most important agricultural products to farmers who seek to earn money as income. Generally, farmers make decisions at various stages to maximize their income but they do not always know exactly which option will be the best one. Many previous studies were conducted to solve this problem by predicting farmers' income structure, but researchers are still exploring better approaches. Currently, machine learning technology is gaining attention as one of the new approaches for farmers' income prediction. The machine learning technique is a methodology using an algorithm that can learn independently through data. As the level of computer science develops, the performance of machine learning techniques is also improving. The purpose of this study is to predict the income structure of apples and pears using the automatic machine learning solution H2O.AI and to present some implications for apple and pear farmers. The automatic machine learning solution H2O.AI can save time and effort compared to the conventional machine learning techniques such as scikit-learn, because it works automatically to find the best solution. As a result of this research, the following findings are obtained. First, apple farmers should increase their gross income to maximize their income, instead of reducing the cost of growing apples. In particular, apple farmers mainly have to increase production in order to obtain more gross income. As a second-best option, apple farmers should decrease labor and other costs. Second, pear farmers also should increase their gross income to maximize their income but they have to increase the price of pears rather than increasing the production of pears. As a second-best option, pear farmers can decrease labor and other costs.