• Title/Summary/Keyword: Five-fold cross validation

Machine Learning Methods for Trust-based Selection of Web Services

  • Hasnain, Muhammad;Ghani, Imran;Pasha, Muhammad F.;Jeong, Seung R.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.38-59 / 2022
  • Web service instances can be classified by users into two categories: trusted and untrusted. A web service whose instances show high throughput (TP) and low response time (RT) values is a trusted web service. Web services become untrustworthy when the guaranteed instance values do not match the actual values achieved by users. To select web services from the TP and RT values that users attain, we need to verify that trusted and untrusted instances of invoked web services are predicted correctly. This accurate prediction of web service instances is then used to perform the selection. We propose constructing fuzzy rules to label web service instances correctly. This paper presents web services selection using a well-known machine learning algorithm, REPTree, for the correct prediction of trusted and untrusted instances. The performance of REPTree is compared with five machine learning models on web services datasets. We performed experiments on the datasets using 10-fold cross-validation. To evaluate the performance of the REPTree classifier, we used accuracy metrics (sensitivity and specificity). Experimental results showed that web service WS1 gained the top selection score with 47.0588% trusted instances, and web service WS2 was selected the least with 25.00% trusted instances. Evaluation of the proposed web services selection approach gave an asymptotic significance of 0.019, demonstrating a relationship between the final selection and the recommended trust score of web services.
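The sensitivity and specificity metrics named in the abstract above can be illustrated with a minimal sketch. It assumes "trusted" is the positive class; the function name `sensitivity_specificity` and the labels are hypothetical and are not taken from the paper.

```python
def sensitivity_specificity(y_true, y_pred, positive="trusted"):
    """Return (sensitivity, specificity) for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # trusted instance predicted as trusted
            else:
                fn += 1  # trusted instance missed
        else:
            if pred == positive:
                fp += 1  # untrusted instance predicted as trusted
            else:
                tn += 1  # untrusted instance predicted as untrusted
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels for illustration only (not data from the paper).
y_true = ["trusted", "trusted", "untrusted", "untrusted", "trusted", "untrusted"]
y_pred = ["trusted", "untrusted", "untrusted", "trusted", "trusted", "untrusted"]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In a cross-validated setting, these two values would typically be computed on each held-out fold and then averaged.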

JAYA-GBRT model for predicting the shear strength of RC slender beams without stirrups

  • Tran, Viet-Linh;Kim, Jin-Kook
    • Steel and Composite Structures / v.44 no.5 / pp.691-705 / 2022
  • Shear failure in reinforced concrete (RC) structures is very hazardous. This failure is rarely predicted and may occur without any prior signs. Accurate shear strength prediction of the RC members is challenging, and traditional methods have difficulty solving it. This study develops a JAYA-GBRT model based on the JAYA algorithm and the gradient boosting regression tree (GBRT) to predict the shear strength of RC slender beams without stirrups. First, 484 tests are carefully collected and divided into training and test sets. Then, the hyperparameters of the GBRT model are determined using the JAYA algorithm and 10-fold cross-validation. The performance of the JAYA-GBRT model is compared with five well-known empirical models. The comparative results show that the JAYA-GBRT model (R² = 0.982, RMSE = 9.466 kN, MAE = 6.299 kN, µ = 1.018, and Cov = 0.116) outperforms the other models. Moreover, the predictions of the JAYA-GBRT model are globally and locally explained using the Shapley Additive exPlanation (SHAP) method. The SHAP method identifies the effective depth as the most crucial parameter influencing the shear strength. Finally, a graphical user interface (GUI) tool and a web application (WA) are developed to apply the JAYA-GBRT model for rapidly predicting the shear strength of RC slender beams without stirrups.

Accuracy Evaluation of Machine Learning Model for Concrete Aging Prediction due to Thermal Effect and Carbonation (콘크리트 탄산화 및 열효과에 의한 경년열화 예측을 위한 기계학습 모델의 정확성 검토)

  • Kim, Hyun-Su
    • Journal of Korean Association for Spatial Structures / v.23 no.4 / pp.81-88 / 2023
  • Numerous factors contribute to the deterioration of reinforced concrete structures. Elevated temperatures significantly alter the composition of the concrete ingredients, consequently diminishing the concrete's strength properties. With the escalation of global CO2 levels, the carbonation of concrete structures has emerged as a critical challenge, substantially affecting concrete durability research. Assessing and predicting concrete degradation due to thermal effects and carbonation are crucial yet intricate tasks. To address this, multiple prediction models for concrete carbonation and compressive strength under thermal impact have been developed. This study employs seven machine learning algorithms (multiple linear regression, decision trees, random forest, support vector machines, k-nearest neighbors, artificial neural networks, and extreme gradient boosting) to formulate predictive models for concrete carbonation and thermal impact. Two distinct datasets, derived from reported experimental studies, were used to train these predictive models. Performance was evaluated with root mean square error, mean square error, mean absolute error, and coefficient of determination. Hyperparameters were optimized through k-fold cross-validation and grid search. The results demonstrate that neural networks and extreme gradient boosting outperform the remaining five machine learning approaches, showing outstanding predictive performance for concrete carbonation and thermal effect modeling.

Prediction models of rock quality designation during TBM tunnel construction using machine learning algorithms

  • Byeonghyun Hwang;Hangseok Choi;Kibeom Kwon;Young Jin Shin;Minkyu Kang
    • Geomechanics and Engineering / v.38 no.5 / pp.507-515 / 2024
  • An accurate estimation of the geotechnical parameters in front of tunnel faces is crucial for the safe construction of underground infrastructure using tunnel boring machines (TBMs). This study was aimed at developing a data-driven model for predicting the rock quality designation (RQD) of the ground formation ahead of tunnel faces. The dataset used for the machine learning (ML) model comprises seven geological and mechanical features and 564 RQD values, obtained from an earth pressure balance (EPB) shield TBM tunneling project beneath the Han River in the Republic of Korea. Four ML algorithms were employed in developing the RQD prediction model: k-nearest neighbor (KNN), support vector regression (SVR), random forest (RF), and extreme gradient boosting (XGB). The grid search and five-fold cross-validation techniques were applied to optimize the prediction performance of the developed model by identifying the optimal hyperparameter combinations. The prediction results revealed that the RF algorithm-based model exhibited superior performance, achieving a root mean square error of 7.38% and coefficient of determination of 0.81. In addition, the Shapley additive explanations (SHAP) approach was adopted to determine the most relevant features, thereby enhancing the interpretability and reliability of the developed model with the RF algorithm. It was concluded that the developed model can successfully predict the RQD of the ground formation ahead of tunnel faces, contributing to safe and efficient tunnel excavation.
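The combination of grid search and five-fold cross-validation described in the abstract above can be sketched as follows. This is a standard-library illustration under stated assumptions, not the authors' pipeline: the one-dimensional k-nearest-neighbor regressor, the synthetic data, the candidate grid, and all function names (`k_fold_indices`, `knn_predict`, `cv_rmse`) are hypothetical.

```python
import random


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle indices once and deal them into n_folds nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k nearest training points (1-D inputs)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def cv_rmse(xs, ys, k, folds):
    """Mean RMSE over the folds, holding each fold out once for validation."""
    scores = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        sq_err = [(knn_predict(train_x, train_y, xs[i], k) - ys[i]) ** 2
                  for i in held_out]
        scores.append((sum(sq_err) / len(sq_err)) ** 0.5)
    return sum(scores) / len(scores)


# Synthetic data for illustration only (not the TBM dataset).
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + rng.gauss(0, 1.0) for x in xs]

folds = k_fold_indices(len(xs), n_folds=5)
grid = [1, 3, 5, 9, 15]  # hypothetical candidate values for k
results = {k: cv_rmse(xs, ys, k, folds) for k in grid}
best_k = min(results, key=results.get)
print(f"best k = {best_k}, CV RMSE = {results[best_k]:.3f}")
```

The same folds are reused for every candidate so that the scores are comparable; a final model would then normally be refit on all training data with the selected hyperparameter.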

Development of a deep neural network model to estimate solar radiation using temperature and precipitation (온도와 강수를 이용하여 일별 일사량을 추정하기 위한 심층 신경망 모델 개발)

  • Kang, DaeGyoon;Hyun, Shinwoo;Kim, Kwang Soo
    • Korean Journal of Agricultural and Forest Meteorology / v.21 no.2 / pp.85-96 / 2019
  • Solar radiation is an important variable for estimating the energy balance and water cycle in natural and agricultural ecosystems. A deep neural network (DNN) model was developed to estimate daily global solar radiation. Temperature and precipitation, which are more widely available from weather stations than other variables such as sunshine duration, were used as inputs to the DNN model. Five-fold cross-validation was applied to train and test the DNN models. Long-term meteorological data (e.g., > 30 years) were collected at 15 weather stations in Korea. The DNN model obtained from the cross-validation had a relatively small RMSE ($3.75\;\mathrm{MJ}\;\mathrm{m}^{-2}\;\mathrm{d}^{-1}$) for estimates of daily solar radiation at the weather station in Suwon. The DNN model explained about 68% of the variation in observed solar radiation at the Suwon weather station. The measurements of solar radiation in 1985 and 1998 were found to be considerably low for a short period of time compared with sunshine duration. This suggests that the quality of the solar radiation observation data should be assessed in further studies. When data for those years were excluded from the analysis, the agreement statistics of the DNN model improved slightly. For example, the values of $R^2$ and RMSE were 0.72 and $3.55\;\mathrm{MJ}\;\mathrm{m}^{-2}\;\mathrm{d}^{-1}$, respectively. Our results indicate that a DNN would be useful for developing a solar radiation estimation model using temperature and precipitation, which are usually available in downscaled scenario data for future climate conditions. Thus, such a DNN model would be useful for assessing the impact of climate change on crop production where solar radiation is a required input variable to a crop model.
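The two agreement statistics reported in the abstract above, RMSE and $R^2$, can be computed as in the sketch below. It assumes the coefficient-of-determination form $R^2 = 1 - SS_{res}/SS_{tot}$; the paper may instead have used the squared correlation coefficient. The values are hypothetical and are not observations from the study.

```python
import math


def rmse(observed, estimated):
    """Root mean square error between observed and estimated values."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def r_squared(observed, estimated):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily solar radiation values in MJ m^-2 d^-1 (illustration only).
observed = [10.0, 15.0, 20.0, 25.0]
estimated = [12.0, 14.0, 19.0, 27.0]

print(f"RMSE = {rmse(observed, estimated):.3f}")
print(f"R^2  = {r_squared(observed, estimated):.3f}")
```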

Activity and Safety Recognition using Smart Work Shoes for Construction Worksite

  • Wang, Changwon;Kim, Young;Lee, Seung Hyun;Sung, Nak-Jun;Min, Se Dong;Choi, Min-Hyung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.2 / pp.654-670 / 2020
  • Workers at construction sites are easily exposed to many dangers and accidents involving falls, tripping, and missteps on stairs. However, research on construction site monitoring systems to prevent work-related injuries is still insufficient. The purpose of this study was to develop a wearable textile pressure insole sensor and examine its effectiveness in managing the real-time safety of construction workers. The sensor was designed based on the principles of parallel capacitance measurement using conductive textile, and the monitoring system was developed in C#. Three separate experiments were carried out to evaluate the performance of the proposed sensor: (1) varying the distance between two capacitance plates to examine changes in capacitance charges, (2) repeatedly applying 1 N of pressure 5,000 times to evaluate consistency, and (3) gradually increasing the force by 1 N (from 1 N to 46 N) to test the linearity of the sensor value. Five subjects participated in our pilot test, which examined whether ascending and descending stairs can be distinguished by our sensor and the Weka assessment tool using the k-NN algorithm. The 10-fold cross-validation method was used for analysis, and the accuracy in identifying stair ascending and descending was 87.2% and 90.9%, respectively. By applying our sensor, the type of activity, weight-shifting patterns for balance control, and plantar pressure distribution for postural changes of construction workers can be detected. The results of this study can serve as the basis for future studies on sensor-based monitoring device development and fall prediction for construction workers.

Identification of Caenorhabditis elegans MicroRNA Targets Using a Kernel Method

  • Lee, Wha-Jin;Nam, Jin-Wu;Kim, Sung-Kyu;Zhang, Byoung-Tak
    • Genomics & Informatics / v.3 no.1 / pp.15-23 / 2005
  • Background: MicroRNAs (miRNAs) are a class of noncoding RNAs found in various organisms such as plants and mammals. However, most of the mRNAs regulated by miRNAs are unknown. Furthermore, miRNA targets in genomes cannot be identified by standard sequence comparison since their complementarity to the target sequence is generally imperfect. In this paper, we propose a kernel-based method for the efficient prediction of miRNA targets. To help distinguish false positives from potentially valid targets, we elucidate the features common to experimentally confirmed targets. Results: The performance of our prediction method was evaluated by five-fold cross-validation. Our method achieved a sensitivity of 0.64 and a specificity of 0.98. Also, the proposed method reduced the number of false positives by half compared with TargetScan. We investigated the effect of feature sets on the classification of miRNA targets. Finally, we predicted miRNA targets for several miRNAs in the Caenorhabditis elegans (C. elegans) 3' untranslated region (3' UTR) database. Conclusions: The targets predicted by the suggested method will help in validating more miRNA targets and ultimately in revealing the role of small RNAs in the regulation of genomes. Our algorithm for miRNA target site detection can be improved with additional experimental knowledge. Also, an increase in the number of confirmed targets is expected to reveal general structural features that can be used to improve their detection.

Prediction of Stunting Among Under-5 Children in Rwanda Using Machine Learning Techniques

  • Similien Ndagijimana;Ignace Habimana Kabano;Emmanuel Masabo;Jean Marie Ntaganda
    • Journal of Preventive Medicine and Public Health / v.56 no.1 / pp.41-49 / 2023
  • Objectives: Rwanda reported a stunting rate of 33% in 2020, decreasing from 38% in 2015; however, stunting remains an issue. Globally, child deaths from malnutrition stand at 45%. The best options for the early detection and treatment of stunting should be made a community policy priority, and health services remain an issue. Hence, this research aimed to develop a model for predicting stunting in Rwandan children. Methods: The Rwanda Demographic and Health Survey 2019-2020 was used as secondary data. Stratified 10-fold cross-validation was used, and different machine learning classifiers were trained to predict stunting status. The prediction models were compared using different metrics, and the best model was chosen. Results: The best model was developed with the gradient boosting classifier algorithm, with a training accuracy of 80.49% based on the performance indicators of several models. Based on a confusion matrix, the test accuracy, sensitivity, specificity, and F1 were calculated, yielding the model's ability to classify stunting cases correctly at 79.33%, identify stunted children accurately at 72.51%, and categorize non-stunted children correctly at 94.49%, with an area under the curve of 0.89. The model found that the mother's height, television, the child's age, province, mother's education, birth weight, and childbirth size were the most important predictors of stunting status. Conclusions: Therefore, machine learning techniques may be used in Rwanda to construct an accurate model that can detect the early stages of stunting and offer the best predictive attributes to help prevent and control stunting in Rwandan children under five.
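The stratified 10-fold cross-validation mentioned in the Methods above keeps the class ratio roughly constant across folds. A minimal standard-library sketch follows; the function name `stratified_k_fold`, the labels, and the 30/70 class split are hypothetical and are not taken from the survey data.

```python
import random
from collections import Counter, defaultdict


def stratified_k_fold(labels, n_folds=10, seed=0):
    """Assign sample indices to folds so each fold roughly keeps the class ratio."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal each class round-robin so its members spread evenly over folds.
        for position, idx in enumerate(members):
            folds[position % n_folds].append(idx)
    return folds


# Hypothetical labels for illustration only: 30 stunted, 70 not stunted.
labels = ["stunted"] * 30 + ["not_stunted"] * 70
folds = stratified_k_fold(labels, n_folds=10)
fold_counts = [Counter(labels[i] for i in fold) for fold in folds]
for number, counts in enumerate(fold_counts, start=1):
    print(number, dict(counts))
```

Each fold serves once as the validation set while the other nine are used for training. When class sizes are not multiples of the fold count, this simple round-robin dealing gives the earlier folds slightly more samples.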

A ResNet based multiscale feature extraction for classifying multi-variate medical time series

  • Zhu, Junke;Sun, Le;Wang, Yilin;Subramani, Sudha;Peng, Dandan;Nicolas, Shangwe Charmant
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.5 / pp.1431-1445 / 2022
  • We construct a deep neural network model named ECGResNet. This model can diagnose eight common cardiovascular diseases with high accuracy based on 12-lead ECG data. We chose the 16 blocks of ResNet50 as the main body of the model and added the Squeeze-and-Excitation (SE) module to adaptively learn the information between channels. As our feature extraction method, we modified the first convolutional layer of ResNet50, which has a convolutional kernel of 7, into a superposition of convolutional kernels of 8 and 16. This allows the model to focus on the overall trend of the ECG signal while also noticing subtle changes. The model further improves the accuracy of cardiovascular and cerebrovascular disease classification by using a fully connected layer that integrates factors such as gender and age. The ECGResNet model adds Dropout layers to both the residual block and the SE module of ResNet50 to further reduce overfitting. The model was finally trained using five-fold cross-validation and the Flooding training method, achieving an accuracy of 95% on the test set and an F1-score of 0.841. We design a new deep neural network, introduce a multi-scale feature extraction method, and apply the SE module to extract features of ECG data.

A Deep Learning Approach for Covid-19 Detection in Chest X-Rays

  • Sk. Shalauddin Kabir;Syed Galib;Hazrat Ali;Fee Faysal Ahmed;Mohammad Farhad Bulbul
    • International Journal of Computer Science & Network Security / v.24 no.3 / pp.125-134 / 2024
  • The novel coronavirus disease 2019 (COVID-19) has spread swiftly worldwide. Early diagnosis is important to control its rapid spread. Medical imaging techniques, such as chest computed tomography (CT) and chest X-ray, play a vital role in the identification and testing of COVID-19 in the present epidemic. Chest X-ray is a cost-effective method for COVID-19 detection; however, the manual process of X-ray analysis is time-consuming given that the number of infected individuals keeps growing rapidly. For this reason, it is very important to develop an automated COVID-19 detection process to control this pandemic. In this study, we address the task of automatic detection of COVID-19 using a popular deep learning model, VGG19. We used 1300 healthy and 1300 confirmed COVID-19 chest X-ray images in this experiment. We performed three experiments by freezing different blocks and layers of VGG19, and finally we used a machine learning classifier, SVM, to detect COVID-19. In every experiment, we used five-fold cross-validation to train and validate the model, and finally achieved 98.1% overall classification accuracy. Experimental results show that our proposed method using the deep learning-based VGG19 model can be used as a tool to aid radiologists and can play a crucial role in the timely diagnosis of COVID-19.