• Title/Summary/Keyword: Support Vector Machine-Regression


Modelling the deflection of reinforced concrete beams using the improved artificial neural network by imperialist competitive optimization

  • Li, Ning;Asteris, Panagiotis G.;Tran, Trung-Tin;Pradhan, Biswajeet;Nguyen, Hoang
    • Steel and Composite Structures
    • /
    • v.42 no.6
    • /
    • pp.733-745
    • /
    • 2022
  • This study proposed a robust artificial intelligence (AI) model, abbreviated as the ICA-ANN model, that combines the imperialist competitive algorithm (ICA), inspired by social behaviour, with an artificial neural network (ANN) for modelling the deflection of reinforced concrete beams. Accordingly, the ICA was used to adjust and optimize the parameters of an ANN model (i.e., weights and biases), aiming to improve the accuracy of the ANN in modelling the deflection of reinforced concrete beams. A total of 120 experimental datasets of reinforced concrete beams were employed for this aim. Therein, the applied load, tensile reinforcement strength, and reinforcement percentage were used to simulate the deflection of reinforced concrete beams. Besides, five other AI models, namely ANN, SVM (support vector machine), GLMNET (lasso and elastic-net regularized generalized linear models), CART (classification and regression tree), and KNN (k-nearest neighbours), were also used for a comprehensive assessment of the proposed ICA-ANN model. The comparison of the derived results with the experimental findings demonstrates that, among the developed models, the ICA-ANN model is the one that can approximate the deflection of reinforced concrete beams in a more reliable and robust manner.
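
For readers who want to try the general idea, here is a minimal Python sketch of metaheuristic weight tuning for a small regression network. It is not the authors' ICA-ANN: a generic population-based search stands in for the imperialist competitive algorithm, and synthetic inputs stand in for the 120 beam tests.

```python
# Hedged sketch: a generic population-based search (a stand-in for ICA) tunes the
# weights of a tiny one-hidden-layer regressor on synthetic stand-in data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(size=(120, 3))            # stand-ins for load, steel strength, reinforcement %
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 0.05, 120)

H = 8                                      # hidden units
n_w = 3 * H + H + H + 1                    # W1, b1, W2, b2 flattened

def predict(w, X):
    W1 = w[:3 * H].reshape(3, H); b1 = w[3 * H:3 * H + H]
    W2 = w[3 * H + H:3 * H + 2 * H]; b2 = w[-1]
    hidden = np.tanh(X @ W1 + b1)
    return hidden @ W2 + b2

def mse(w):
    return np.mean((predict(w, X) - y) ** 2)

# Simple evolutionary loop standing in for the imperialist competitive algorithm.
pop = rng.normal(0, 1, size=(60, n_w))
for gen in range(300):
    fitness = np.array([mse(w) for w in pop])
    elite = pop[np.argsort(fitness)[:20]]                 # keep the best "imperialists"
    children = elite[rng.integers(0, 20, 60)] + rng.normal(0, 0.1, (60, n_w))
    pop = np.vstack([elite, children[:40]])

best = min(pop, key=mse)
print("training MSE of best weight vector:", round(mse(best), 4))
```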

THREE-STAGED RISK EVALUATION MODEL FOR BIDDING ON INTERNATIONAL CONSTRUCTION PROJECTS

  • Wooyong Jung;Seung Heon Han
    • International conference on construction engineering and project management
    • /
    • 2011.02a
    • /
    • pp.534-541
    • /
    • 2011
  • Risk evaluation approaches for bidding on international construction projects are typically partitioned into three stages: country selection, project classification, and bid-cost evaluation. However, previous studies are frequently criticized for several crucial limitations: 1) a dearth of studies on country selection risk tailored to the overseas construction market at the corporate level; 2) no consideration of uncertainties in the input variables themselves; 3) limited use of probabilistic approaches in estimating the range of cost variance; and 4) limited inclusion of covariance impacts. This study thus suggests a three-staged risk evaluation model to resolve these inherent problems. In the first stage, a country portfolio model that maximizes the expected construction market growth rate and profit rate while decreasing market uncertainty is formulated using multi-objective genetic analysis. Following this, probabilistic approaches for screening bad projects are suggested by applying various data mining methods such as discriminant logistic regression, neural networks, C5.0, and support vector machines. For the last stage, the cost overrun prediction model is simulated to determine a reasonable bid cost, while considering non-parametric distributions, the effects of systematic risks, and the firm's specific capability accrued in a given country. Through the three consecutive models, this study verifies that international construction risk can be allocated, reduced, and projected to some degree, thereby contributing to sustaining stable profits and revenues from both short-term and long-term perspectives.
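
The second stage (screening bad projects with several classifiers) is the easiest to illustrate. Below is a hedged Python sketch comparing scikit-learn analogues of the listed methods on synthetic data; a plain decision tree stands in for C5.0, and none of the authors' bid data or tuning is reproduced.

```python
# Hedged sketch of the project-screening stage: compare several classifiers by
# cross-validated ROC-AUC on synthetic, imbalanced "project" data.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=10, weights=[0.7, 0.3], random_state=1)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "neural net": make_pipeline(StandardScaler(), MLPClassifier(max_iter=2000, random_state=1)),
    "tree (C5.0 stand-in)": DecisionTreeClassifier(random_state=1),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name:22s} mean ROC-AUC: {auc:.3f}")
```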


Estimation of the mechanical properties of oil palm shell aggregate concrete by novel AO-XGB model

  • Yipeng Feng;Jiang Jie;Amir Toulabi
    • Steel and Composite Structures
    • /
    • v.49 no.6
    • /
    • pp.645-666
    • /
    • 2023
  • Due to the steadily declining supply of natural coarse aggregates, the concrete industry has shifted to substituting coarse aggregates generated from byproducts and industrial waste. Oil palm shell (OPS) is a substantial waste product created during the production of palm oil. When considering the use of oil palm shell concrete (OPSC), building engineers must consider its uniaxial compressive strength (UCS). Obtaining UCS is expensive and time-consuming, so machine learning may help. This research established five innovative hybrid AI algorithms to predict UCS. The Aquila optimizer (AO) is combined with each method to discover the optimum model parameters. The considered models are an artificial neural network (AO-ANN), an adaptive neuro-fuzzy inference system (AO-ANFIS), support vector regression (AO-SVR), random forest (AO-RF), and extreme gradient boosting (AO-XGB). To achieve this goal, a dataset of OPS-produced concrete specimens was compiled. The results show that all five developed models have justifiable accuracy in the UCS estimation process, with a remarkable correlation between measured and estimated UCS that demonstrates the models' usefulness. Overall, the findings show that the proposed AO-XGB model performed better than the others in predicting the UCS of OPSC (with R2, RMSE, MAE, VAF, and A15-index of 0.9678, 1.4595, 1.1527, 97.6469, and 0.9077, respectively). The proposed model could be utilized in construction engineering to ensure sufficient mechanical workability of lightweight concrete and permit its safe usage for construction aims.
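
As an illustration of the tuning idea only, the sketch below searches XGBoost hyperparameters with a plain randomized search, since the Aquila optimizer is not available in common Python libraries; the data are synthetic, not the OPS concrete specimens.

```python
# Hedged sketch: randomized search stands in for the Aquila optimizer (AO) when
# tuning XGBoost for a regression task. Assumes the xgboost package is installed.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBRegressor

X, y = make_regression(n_samples=300, n_features=8, noise=5.0, random_state=0)

search = RandomizedSearchCV(
    XGBRegressor(objective="reg:squarederror", random_state=0),
    param_distributions={
        "n_estimators": [100, 200, 400],
        "max_depth": [2, 3, 4, 6],
        "learning_rate": np.linspace(0.01, 0.3, 10),
        "subsample": [0.6, 0.8, 1.0],
    },
    n_iter=20, cv=5, scoring="neg_root_mean_squared_error", random_state=0,
)
search.fit(X, y)
print("best params:", search.best_params_)
print("best CV RMSE:", -search.best_score_)
```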

Writer verification using feature selection based on genetic algorithm: A case study on handwritten Bangla dataset

  • Jaya Paul;Kalpita Dutta;Anasua Sarkar;Kaushik Roy;Nibaran Das
    • ETRI Journal
    • /
    • v.46 no.4
    • /
    • pp.648-659
    • /
    • 2024
  • Author verification is challenging because of the diversity in writing styles. We propose an enhanced handwriting verification method that combines handcrafted and automatically extracted features. The method uses a genetic algorithm to reduce the dimensionality of the feature set. We consider offline Bangla handwriting content and evaluate the proposed method using handcrafted features with a simple logistic regression, radial basis function network, and sequential minimal optimization as well as automatically extracted features using a convolutional neural network. The handcrafted features outperform the automatically extracted ones, achieving an average verification accuracy of 94.54% for 100 writers. The handcrafted features include Radon transform, histogram of oriented gradients, local phase quantization, and local binary patterns from interwriter and intrawriter content. The genetic algorithm reduces the feature dimensionality and selects salient features using a support vector machine. The top five experimental results are obtained from the optimal feature set selected using a consensus strategy. Comparisons with other methods and features confirm the satisfactory results.
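
A minimal sketch of the genetic-algorithm feature-selection loop with an SVM fitness function is shown below; it uses a generic public dataset and does not reproduce the handcrafted handwriting features (Radon, HOG, LPQ, LBP) or the paper's consensus strategy.

```python
# Hedged sketch of GA-based feature selection scored by SVM cross-validation accuracy.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)
n_feat, pop_size = X.shape[1], 20

def fitness(mask):
    if mask.sum() == 0:
        return 0.0
    clf = make_pipeline(StandardScaler(), SVC())
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

pop = rng.integers(0, 2, size=(pop_size, n_feat))         # binary feature masks
for gen in range(10):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-pop_size // 2:]]     # keep the fitter half
    cut = rng.integers(1, n_feat, pop_size // 2)
    children = np.array([np.concatenate([parents[i % len(parents)][:c],
                                         parents[(i + 1) % len(parents)][c:]])
                         for i, c in enumerate(cut)])       # one-point crossover
    flip = rng.random(children.shape) < 0.05                # mutation
    children = np.where(flip, 1 - children, children)
    pop = np.vstack([parents, children])

best = max(pop, key=fitness)
print("selected features:", int(best.sum()), "of", n_feat, "accuracy:", round(fitness(best), 3))
```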

Prediction and analysis of acute fish toxicity of pesticides to the rainbow trout using 2D-QSAR (2D-QSAR방법을 이용한 농약류의 무지개 송어 급성 어독성 분석 및 예측)

  • Song, In-Sik;Cha, Ji-Young;Lee, Sung-Kwang
    • Analytical Science and Technology
    • /
    • v.24 no.6
    • /
    • pp.544-555
    • /
    • 2011
  • The acute toxicity in the rainbow trout (Oncorhynchus mykiss) was analyzed and predicted using quantitative structure-activity relationships (QSAR). The aquatic toxicity, the 96-h LC50 (median lethal concentration) of 275 organic pesticides, was obtained from the EU-funded project DEMETRA. Prediction models were derived from 558 2D molecular descriptors calculated in PreADMET. The linear (multiple linear regression) and nonlinear (support vector machine and artificial neural network) learning methods were optimized by taking into account the statistical parameters between the experimental and predicted pLC50 values. After preprocessing, population-based forward selection was used to select the best subsets of descriptors for each learning method, within a 5-fold cross-validation procedure. The support vector machine model was selected as the best model (R2CV = 0.677, RMSECV = 0.887, MSECV = 0.674) and also correctly classified 87% of the training set according to EU regulation criteria. The MLR model could describe the structural characteristics of the toxic chemicals and their interaction with the lipid membrane of fish. All the developed models were validated by 5-fold cross-validation and a Y-scrambling test.
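
The sketch below shows the general shape of such a workflow: a support vector regression model evaluated with 5-fold cross-validation. The descriptors are synthetic stand-ins, not the PreADMET descriptors or the DEMETRA toxicity data.

```python
# Hedged sketch: SVR with 5-fold cross-validation, reporting R2 and RMSECV.
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score, KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

X, y = make_regression(n_samples=275, n_features=30, n_informative=10, noise=10.0, random_state=0)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
cv = KFold(n_splits=5, shuffle=True, random_state=0)
r2 = cross_val_score(model, X, y, cv=cv, scoring="r2")
rmse = -cross_val_score(model, X, y, cv=cv, scoring="neg_root_mean_squared_error")
print(f"5-fold CV R2 = {r2.mean():.3f}, RMSECV = {rmse.mean():.3f}")
```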

Comparison of resampling methods for dealing with imbalanced data in binary classification problem (이분형 자료의 분류문제에서 불균형을 다루기 위한 표본재추출 방법 비교)

  • Park, Geun U;Jung, Inkyung
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.3
    • /
    • pp.349-374
    • /
    • 2019
  • A class imbalance problem arises when one class outnumbers the other class by a large proportion in binary data. Studies on transforming the learning data have been conducted to solve this imbalance problem. In this study, we compared resampling methods for dealing with imbalance in the classification problem, seeking a way to more effectively detect the minority class in the data. Through simulation, a total of 20 methods of over-sampling, under-sampling, and combined over- and under-sampling were compared. The logistic regression, support vector machine, and random forest models, which are commonly used in classification problems, were used as classifiers. The simulation results showed that the random under-sampling (RUS) method had the highest sensitivity with an accuracy over 0.5. The next most sensitive method was the adaptive synthetic sampling (ADASYN) over-sampling approach. This revealed that the RUS method was suitable for finding minority class values. The results of applying the methods to several real data sets were similar to those of the simulation.
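
A hedged sketch of the comparison idea, limited to random under-sampling versus no resampling with a single classifier (using the imbalanced-learn package), is shown below; the paper's full grid of 20 methods and its simulation design are not reproduced.

```python
# Hedged sketch: sensitivity with and without random under-sampling (RUS) on
# synthetic imbalanced data. Assumes the imbalanced-learn package is installed.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score, accuracy_score
from sklearn.ensemble import RandomForestClassifier
from imblearn.under_sampling import RandomUnderSampler

X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

for name, (Xt, yt) in {
    "no resampling": (X_tr, y_tr),
    "RUS": RandomUnderSampler(random_state=0).fit_resample(X_tr, y_tr),
}.items():
    clf = RandomForestClassifier(random_state=0).fit(Xt, yt)
    pred = clf.predict(X_te)
    print(f"{name:14s} sensitivity={recall_score(y_te, pred):.3f} "
          f"accuracy={accuracy_score(y_te, pred):.3f}")
```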

EEG Feature Engineering for Machine Learning-Based CPAP Titration Optimization in Obstructive Sleep Apnea

  • Juhyeong Kang;Yeojin Kim;Jiseon Yang;Seungwon Chung;Sungeun Hwang;Uran Oh;Hyang Woon Lee
    • International journal of advanced smart convergence
    • /
    • v.12 no.3
    • /
    • pp.89-103
    • /
    • 2023
  • Obstructive sleep apnea (OSA) is one of the most prevalent sleep disorders and can lead to serious consequences, including hypertension and/or cardiovascular diseases, if not treated promptly. Continuous positive airway pressure (CPAP) is widely recognized as the most effective treatment for OSA, but it requires proper titration of airway pressure to achieve the most effective treatment results. However, the process of CPAP titration can be time-consuming and cumbersome, so predicting personalized CPAP pressure before treatment is of growing importance. The primary objective of this study was to optimize the CPAP titration process for obstructive sleep apnea patients through EEG feature engineering with machine learning techniques. We aimed to identify and utilize the most critical EEG features to forecast key OSA predictive indicators, ultimately facilitating more precise and personalized CPAP treatment strategies. Here, we analyzed the polysomnography (PSG) datasets of 126 OSA patients before and after CPAP treatment. We extracted 29 EEG features and applied the Shapley Additive exPlanations (SHAP) method to identify the features with high importance for the OSA prediction indices, AHI and SpO2. Among the extracted EEG features, we confirmed six that had high importance in predicting AHI and SpO2 using XGBoost, support vector machine regression, and random forest regression. By utilizing the predictive capabilities of EEG-derived features for AHI and SpO2, we can better understand and evaluate the condition of patients undergoing CPAP treatment. The ability to predict these key indicators accurately provides more immediate insight into the patient's sleep quality and potential disturbances. This not only ensures the efficiency of the diagnostic process but also enables a more tailored and effective treatment approach. Consequently, the integration of EEG analysis into the sleep study protocol has the potential to revolutionize sleep diagnostics, offering a time-saving and ultimately more effective evaluation for patients with sleep-related disorders.
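
The sketch below illustrates the SHAP-based ranking step only: a tree regressor is fitted and features are ranked by mean absolute SHAP value. The feature names and data are synthetic placeholders, not the study's 29 EEG features or its PSG-derived AHI/SpO2 values.

```python
# Hedged sketch: rank features of a tree regressor by mean |SHAP| value.
# Assumes the shap and xgboost packages are installed; feature names are hypothetical.
import numpy as np
import shap
from sklearn.datasets import make_regression
from xgboost import XGBRegressor

X, y = make_regression(n_samples=126, n_features=29, n_informative=6, random_state=0)
feature_names = [f"eeg_feat_{i}" for i in range(29)]      # hypothetical names

model = XGBRegressor(n_estimators=200, max_depth=3, random_state=0).fit(X, y)
shap_values = shap.TreeExplainer(model).shap_values(X)    # shape: (n_samples, n_features)
importance = np.abs(shap_values).mean(axis=0)

top6 = np.argsort(importance)[::-1][:6]
for i in top6:
    print(f"{feature_names[i]:12s} mean |SHAP| = {importance[i]:.3f}")
```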

An Intelligent Intrusion Detection Model Based on Support Vector Machines and the Classification Threshold Optimization for Considering the Asymmetric Error Cost (비대칭 오류비용을 고려한 분류기준값 최적화와 SVM에 기반한 지능형 침입탐지모형)

  • Lee, Hyeon-Uk;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.4
    • /
    • pp.157-173
    • /
    • 2011
  • As Internet use has exploded in recent years, malicious attacks and hacking against networked systems occur frequently. This means that fatal damage can be caused by these intrusions in government agencies, public offices, and companies operating various systems. For such reasons, there is growing interest in and demand for intrusion detection systems (IDS), the security systems for detecting, identifying, and responding appropriately to unauthorized or abnormal activities. The intrusion detection models applied in conventional IDS are generally designed by modeling experts' implicit knowledge of network intrusions or hackers' abnormal behaviors. These kinds of intrusion detection models perform well under normal situations. However, they show poor performance when they meet a new or unknown pattern of network attack. For this reason, several recent studies have tried to adopt various artificial intelligence techniques, which can proactively respond to unknown threats. In particular, artificial neural networks (ANNs) have been widely applied in prior studies because of their superior prediction accuracy. However, ANNs have some intrinsic limitations, such as the risk of overfitting, the requirement of a large sample size, and the opacity of the prediction process (i.e., the black-box problem). As a result, the most recent studies on IDS have started to adopt the support vector machine (SVM), a classification technique that is more stable and powerful compared to ANNs and is known for its relatively high predictive power and generalization capability. Under this background, this study proposes a novel intelligent intrusion detection model that uses SVM as the classification model in order to improve the predictive ability of IDS. Our model is also designed to consider the asymmetric error cost by optimizing the classification threshold. Generally, there are two common forms of errors in intrusion detection. The first is the false-positive error (FPE), in which normal activity is misjudged as an intrusion; this may result in unnecessary countermeasures. The second is the false-negative error (FNE), which misjudges a malicious program as normal. Compared to FPE, FNE is more fatal. Thus, when considering the total cost of misclassification in IDS, it is more reasonable to assign a heavier weight to FNE than to FPE. Therefore, we designed our proposed intrusion detection model to optimize the classification threshold in order to minimize the total misclassification cost. In this case, conventional SVM cannot be applied because it is designed to generate discrete outputs (i.e., a class). To resolve this problem, we used the revised SVM technique proposed by Platt (2000), which is able to generate probability estimates. To validate the practical applicability of our model, we applied it to a real-world dataset for network intrusion detection. The experimental dataset was collected from the IDS sensor of an official institution in Korea from January to June 2010. We collected 15,000 log records in total and selected 1,000 samples from them using random sampling. In addition, the SVM model was compared with logistic regression (LOGIT), decision trees (DT), and ANN to confirm the superiority of the proposed model. LOGIT and DT were tested using PASW Statistics v18.0, and ANN was tested using Neuroshell 4.0. For SVM, LIBSVM v2.90, a freeware tool for training SVM classifiers, was used. Empirical results showed that our proposed SVM-based model outperformed all the other comparative models in detecting network intrusions in terms of accuracy. They also showed that our model reduced the total misclassification cost compared to the ANN-based intrusion detection model. As a result, it is expected that the intrusion detection model proposed in this paper would not only enhance the performance of IDS but also lead to better management of FNE.
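
The core mechanism described here, Platt-scaled SVM probabilities combined with a cost-minimizing classification threshold, can be sketched in Python as follows. The cost ratio and the synthetic data are assumptions for illustration; they are not the paper's IDS logs or its estimated costs.

```python
# Hedged sketch: SVM probability estimates (Platt scaling via probability=True) plus
# a threshold chosen to minimize an asymmetric misclassification cost (FN weighted heavier).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

COST_FN, COST_FP = 5.0, 1.0          # assumed cost ratio; the paper only motivates FN >> FP

X, y = make_classification(n_samples=1000, weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

svm = make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)).fit(X_tr, y_tr)
proba = svm.predict_proba(X_te)[:, 1]                     # estimated P(intrusion)

def total_cost(threshold):
    pred = (proba >= threshold).astype(int)
    fp = np.sum((pred == 1) & (y_te == 0))
    fn = np.sum((pred == 0) & (y_te == 1))
    return COST_FP * fp + COST_FN * fn

thresholds = np.linspace(0.05, 0.95, 19)
best = min(thresholds, key=total_cost)
print(f"cost-minimizing threshold: {best:.2f}, total cost: {total_cost(best):.0f}")
print(f"default 0.50 threshold cost: {total_cost(0.5):.0f}")
```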

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review
    • /
    • v.16 no.3
    • /
    • pp.161-177
    • /
    • 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural networks (ANN), and multiclass support vector machines (MSVM), have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance accuracy. Our model, named GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine), is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results of studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest the genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate the phenomenon of biological evolution. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results. In particular, the mutation operator prevents GA from falling into local optima, so the globally optimal or a near-optimal solution can be found. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques, including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry and their credit ratings. Using various statistical methods, including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e., credit rating, was labeled as four classes: 1 (A1), 2 (A2), 3 (A3), and 4 (B and C). Eighty percent of the data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross-validation to our dataset. To examine the competitiveness of the proposed model, we also experimented with several comparative models, including MDA, MLOGIT, CBR, ANN, and MSVM. In the case of MSVM, we adopted the one-against-one (OAO) and directed acyclic graph SVM (DAGSVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source library, and Evolver 5.5, a commercial software package that enables GA. The other comparative models were tested using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed GAMSVM model outperformed all the competitive models. In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, namely X7 (total debt), X9 (sales per employee), X13 (years since founding), X15 (accumulated earnings to total assets), and X39 (an index related to cash flows from operating activities), were found to be the most important factors in predicting the corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same across the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
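
A compact sketch of the joint encoding idea, one chromosome carrying RBF-kernel parameters plus a binary feature mask, scored by cross-validated accuracy of a one-vs-one multiclass SVM, is given below. It uses a public dataset and selection plus Gaussian mutation only (no crossover), so it illustrates the concept rather than reproducing the authors' GAMSVM.

```python
# Hedged sketch: evolve (log2 C, log2 gamma, feature mask) jointly for a multiclass SVM.
# The wine dataset stands in for the 14 financial ratios and four rating classes.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
rng = np.random.default_rng(0)
n_feat = X.shape[1]

def decode(chrom):
    C, gamma = 2.0 ** chrom[0], 2.0 ** chrom[1]
    mask = chrom[2:] > 0.5
    return C, gamma, mask

def fitness(chrom):
    C, gamma, mask = decode(chrom)
    if mask.sum() == 0:
        return 0.0
    clf = make_pipeline(StandardScaler(), SVC(C=C, gamma=gamma))   # one-vs-one by default
    return cross_val_score(clf, X[:, mask], y, cv=5).mean()

pop = np.hstack([rng.uniform(-5, 10, (30, 2)), rng.random((30, n_feat))])
for gen in range(15):
    scores = np.array([fitness(c) for c in pop])
    parents = pop[np.argsort(scores)[-10:]]                         # elitist selection
    children = parents[rng.integers(0, 10, 20)] + rng.normal(0, 0.3, (20, 2 + n_feat))
    pop = np.vstack([parents, children])

best = max(pop, key=fitness)
C, gamma, mask = decode(best)
print(f"best CV accuracy {fitness(best):.3f} with {int(mask.sum())} features, "
      f"C={C:.2f}, gamma={gamma:.4f}")
```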

Discriminant analysis of grain flours for rice paper using fluorescence hyperspectral imaging system and chemometric methods

  • Seo, Youngwook;Lee, Ahyeong;Kim, Bal-Geum;Lim, Jongguk
    • Korean Journal of Agricultural Science
    • /
    • v.47 no.3
    • /
    • pp.633-644
    • /
    • 2020
  • Rice paper is an element of Vietnamese cuisine that can be used to wrap vegetables and meat. Rice and starch are the main ingredients of rice paper, and their mixing ratio is important for quality control. In a commercial factory, assessment of food safety and quantitative supply is a challenging issue. A rapid and non-destructive monitoring system is therefore necessary in commercial production systems to ensure the food safety of rice and starch flour for the rice paper wrap. In this study, fluorescence hyperspectral imaging technology was applied to classify grain flours. Using the 3D hypercube of fluorescence hyperspectral imaging (fHSI, 420 - 730 nm), spectral and spatial data together with chemometric methods were applied to detect and classify the flours. Eight flours (rice: 4, starch: 4) were prepared, and hyperspectral images were acquired in a 5 (L) × 5 (W) × 1.5 (H) cm container. Linear discriminant analysis (LDA), partial least squares discriminant analysis (PLSDA), support vector machine (SVM), classification and regression tree (CART), and random forest (RF) models with several preprocessing methods (multivariate scatter correction [MSC], 1st and 2nd derivatives, and moving average) were applied to classify the grain flours, and the accuracy was compared using a confusion matrix (accuracy and kappa coefficient). LDA with moving average showed the highest accuracy at A = 0.9362 (K = 0.9270). A 1D convolutional neural network (CNN) demonstrated a classification result of A = 0.94 and showed improved classification results for mimyeon flour 1 (MF1) and MF2 of 0.72 and 0.87, respectively. In this study, the potential of non-destructive detection and classification of grain flours using fHSI technology and machine learning methods was demonstrated.
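
The best-performing combination reported here (moving-average smoothing followed by LDA) can be sketched as below on synthetic spectra; the fHSI cubes, the other preprocessing variants, and the CNN are not reproduced.

```python
# Hedged sketch: moving-average smoothing of spectra followed by LDA classification,
# on synthetic Gaussian-peak spectra standing in for the 420-730 nm fHSI data.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_bands, n_per_class = 150, 60                      # stand-ins for the spectral bands
classes = []
for c in range(4):                                   # four hypothetical flour classes
    center = rng.uniform(30, 120)
    base = np.exp(-0.002 * (np.arange(n_bands) - center) ** 2)
    classes.append(base + rng.normal(0, 0.05, (n_per_class, n_bands)))
X = np.vstack(classes)
y = np.repeat(np.arange(4), n_per_class)

def moving_average(spectra, window=5):
    kernel = np.ones(window) / window
    return np.apply_along_axis(lambda s: np.convolve(s, kernel, mode="same"), 1, spectra)

acc = cross_val_score(LinearDiscriminantAnalysis(), moving_average(X), y, cv=5).mean()
print(f"5-fold CV accuracy with moving-average + LDA: {acc:.3f}")
```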