• Title/Summary/Keyword: support vector regression.

Search Result 554

Prediction of concrete compressive strength using non-destructive test results

  • Erdal, Hamit;Erdal, Mursel;Simsek, Osman;Erdal, Halil Ibrahim
    • Computers and Concrete / v.21 no.4 / pp.407-417 / 2018
  • Concrete, a composite material, is one of the most important construction materials, and compressive strength is a commonly used parameter for assessing its quality. Accurate prediction of concrete compressive strength is therefore an important issue. In this study, we used an experimental procedure to assess concrete quality. First, the concrete mix was prepared as C 20 type concrete, with a fresh-concrete slump of about 20 cm. After the fresh concrete was placed in the formwork, it was compacted using a vibrating screed. After 28 days, a total of 100 core samples of 75 mm diameter were extracted. Pulse velocity tests and compressive strength tests were performed on the core samples, and Windsor probe penetration tests and Schmidt hammer tests were also performed. After the data set was assembled, twelve artificial intelligence (AI) models were compared for predicting the concrete compressive strength. These models fall into three categories: (i) Functions (i.e., Linear Regression, Simple Linear Regression, Multilayer Perceptron, Support Vector Regression), (ii) Lazy-Learning Algorithms (i.e., IBk Linear NN Search, KStar, Locally Weighted Learning), and (iii) Tree-Based Learning Algorithms (i.e., Decision Stump, Model Trees Regression, Random Forest, Random Tree, Reduced Error Pruning Tree). Four evaluation processes and four validation schemes (i.e., 10-fold cross validation, 5-fold cross validation, 10% split sample validation, and 20% split sample validation) were used to examine the performance of the predictive models. This study shows that machine learning regression techniques are promising tools for predicting the compressive strength of concrete.
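
The k-fold cross validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic placeholders, a one-feature least-squares line stands in for the twelve models, and the names `k_fold_indices`, `fit_simple_linear`, and `cross_validated_rmse` are hypothetical helpers introduced only for this example.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_validated_rmse(xs, ys, k=10, seed=0):
    """Fit on k-1 folds, predict the held-out fold, pool the squared errors."""
    squared_error = 0.0
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_simple_linear(
            [xs[i] for i in train], [ys[i] for i in train]
        )
        squared_error += sum(
            (ys[i] - (slope * xs[i] + intercept)) ** 2 for i in fold
        )
    return math.sqrt(squared_error / len(xs))


# Synthetic placeholder data (NOT the paper's measurements): 100 samples of a
# made-up pulse velocity and a noisy linear "strength" response.
rng = random.Random(1)
pulse_velocity = [3.5 + 0.01 * i for i in range(100)]
strength = [10.0 * v - 15.0 + rng.gauss(0, 1.0) for v in pulse_velocity]

print(cross_validated_rmse(pulse_velocity, strength, k=10))
print(cross_validated_rmse(pulse_velocity, strength, k=5))
```

Because every sample is held out exactly once, the pooled RMSE uses all 100 predictions. Changing `k` from 10 to 5 changes only how much data each fit sees.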

Experimental Retrieval of Soil Moisture for Cropland in South Korea Using Sentinel-1 SAR Data (Sentinel-1 SAR 데이터를 이용한 우리나라 농지의 토양수분 산출 실험)

  • Lee, Soo-Jin;Hong, Sungwook;Cho, Jaeil;Lee, Yang-Won
    • Korean Journal of Remote Sensing / v.33 no.6_1 / pp.947-960 / 2017
  • Soil moisture plays an important role in the Earth's radiative energy balance and water cycle. In general, satellite observations are useful for estimating soil moisture content. Passive microwave satellites have the advantage of direct sensitivity to surface soil moisture, but their coarse spatial resolutions (10-36 km) are not suitable for regional-scale hydrological applications. Meanwhile, in-situ ground observations of point-based soil moisture content have the disadvantage of providing spatially discontinuous information. This paper presents an experimental soil moisture retrieval using Sentinel-1 SAR (Synthetic Aperture Radar) data with 10 m spatial resolution for cropland in South Korea. We developed a soil moisture retrieval algorithm based on linear regression and SVR (support vector regression), using ground observations at five in-situ sites and Sentinel-1 SAR data from April to October of 2015-2017. Our results showed that the sensitivity of the backscattered signal to soil conditions depended on polarization, but the accuracies did not. The absence of particular seasonal characteristics in the soil moisture retrieval implies that soil moisture is generally more affected by hydro-meteorology and land surface characteristics than by phenological factors. For narrower ranges of incidence angles, the relationship between the backscattered signal and soil moisture content was more distinct, because the decreasing surface interference increased the retrieval accuracies under conditions of evenly distributed soil moisture (during rainy periods or on paddy fields). The overall error estimate was an RMSE (root mean square error) of approximately 6.5%. Our soil moisture retrieval algorithm can be improved if the effects of surface roughness, geomorphology, and soil properties are considered in future work.

Outside Temperature Prediction Based on Artificial Neural Network for Estimating the Heating Load in Greenhouse (인공신경망 기반 온실 외부 온도 예측을 통한 난방부하 추정)

  • Kim, Sang Yeob;Park, Kyoung Sub;Ryu, Keun Ho
    • KIPS Transactions on Software and Data Engineering / v.7 no.4 / pp.129-134 / 2018
  • The artificial neural network (ANN) has recently emerged as a promising technique for prediction, numerical control, robot control, and pattern recognition. We predicted the outside temperature of a greenhouse using an ANN and used the model in greenhouse control. The performance of the ANN model was evaluated and compared with a multiple regression model (MRM) and a support vector machine (SVM) model, using 10-fold cross validation as the evaluation method. To improve prediction performance, data reduction was performed by correlation analysis, and new factors were extracted from the measured data to improve the reliability of the training data. The ANN was constructed with the backpropagation algorithm, the multiple regression model with the M5 method, and the SVM model with the epsilon-SVM method. The RMSE (Root Mean Squared Error) values of ANN, MRM, and SVM were 0.9256, 1.8503, and 7.5521, respectively. In addition, applying the prediction model to greenhouse heating load calculation can increase income by reducing the energy cost of the greenhouse. The heating load of the experimental greenhouse was 3326.4 kcal/h, and the fuel consumption was estimated to be 453.8 L for a total heating time of $10000^{\circ}C/h$. Therefore, the data mining technology of ANN can be applied to various agricultural fields such as precise greenhouse control, cultivation techniques, and harvest prediction, thereby contributing to the development of smart agriculture.
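
The "epsilon" in epsilon-SVM regression refers to the epsilon-insensitive loss, which ignores residuals smaller than a tolerance and penalizes larger ones linearly. The sketch below shows only that loss function, not a trained model. The function name `epsilon_insensitive_loss` and the sample numbers are illustrative assumptions and do not come from the paper.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample loss max(0, |y - f(x)| - epsilon).

    Residuals inside the +/- epsilon tube cost nothing; outside the tube the
    cost grows linearly with the distance from the tube boundary.
    """
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Illustrative values only (hypothetical temperatures, not the paper's data).
observed = [1.0, 2.0, 3.0]
predicted = [1.25, 3.0, 1.0]
print(epsilon_insensitive_loss(observed, predicted, epsilon=0.5))
# → [0.0, 0.5, 1.5]
```

The first prediction misses by 0.25, which lies inside the 0.5 tube and therefore contributes no loss. This tolerance is what distinguishes the SVR objective from a squared-error fit.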

Evaluation of Applicability of RGB Image Using Support Vector Machine Regression for Estimation of Leaf Chlorophyll Content of Onion and Garlic (양파 마늘의 잎 엽록소 함량 추정을 위한 SVM 회귀 활용 RGB 영상 적용성 평가)

  • Lee, Dong-ho;Jeong, Chan-hee;Go, Seung-hwan;Park, Jong-hwa
    • Korean Journal of Remote Sensing / v.37 no.6_1 / pp.1669-1683 / 2021
  • AI-based intelligent agriculture and digital agriculture are important for agricultural science. Leaf chlorophyll content (LCC) is one of the most important indicators of the growth status of vegetable crops. In this study, a support vector machine (SVM) regression model was built for onions and garlic using an unmanned aerial vehicle-based RGB camera and a multispectral (MSP) sensor, and the applicability of the RGB camera to LCC estimation was reviewed by comparing it with the MSP sensor. The RGB-based LCC model showed lower results than the MSP-based LCC model, with an average R² of 0.09, RMSE of 18.66, and nRMSE of 3.46%. However, the difference in accuracy between the two sensors was not large, and the accuracy did not drop significantly compared with previous studies using various sensors and algorithms. In addition, the RGB-based LCC model reflects the field LCC trend well when compared with the measured values, but it tends to underestimate at high chlorophyll concentrations. Considering the economic feasibility and versatility of the RGB camera, the applicability of LCC estimation with RGB was confirmed. The results of this study are expected to be useful in digital agriculture as an AI-based intelligent agriculture technology that applies converged artificial intelligence and big data technology.

Ensemble Learning-Based Prediction of Good Sellers in Overseas Sales of Domestic Books and Keyword Analysis of Reviews of the Good Sellers (앙상블 학습 기반 국내 도서의 해외 판매 굿셀러 예측 및 굿셀러 리뷰 키워드 분석)

  • Do Young Kim;Na Yeon Kim;Hyon Hee Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.4 / pp.173-178 / 2023
  • As Korean literature spreads around the world, its position in the overseas publishing market has become important. As demand in the overseas publishing market continues to grow, it is essential to predict future book sales and analyze the characteristics of books that have been highly favored by overseas readers in the past. In this study, we propose an ensemble learning-based prediction model and analyze the characteristics of good sellers, defined as books published overseas over the past 5 years with cumulative sales of more than 5,000 copies. We applied five ensemble learning models, i.e., XGBoost, Gradient Boosting, Adaboost, LightGBM, and Random Forest, and compared them with other machine learning algorithms, i.e., Support Vector Machine, Logistic Regression, and Deep Learning. Our experimental results showed that the ensemble algorithms outperform the other approaches in handling imbalanced data. In particular, the LightGBM model obtained an AUC value of 99.86%, which is the best prediction performance. Among the features used for prediction, the most important is the number of the author's overseas publications, and the second most important is publication in countries with the largest publication market size. The number of evaluation participants is also an important feature. In addition, text mining was performed on reviews of the four best-selling books among the good sellers. Many reviews focused on stories, characters, and writers, and support for translation appears to be needed, as the keyword "translation" appears frequently in low-rated reviews.

A Predictive Model of the Generator Output Based on the Learning of Performance Data in Power Plant (발전플랜트 성능데이터 학습에 의한 발전기 출력 추정 모델)

  • Yang, HacJin;Kim, Seong Kun
    • Journal of the Korea Academia-Industrial cooperation Society / v.16 no.12 / pp.8753-8759 / 2015
  • Analysis procedures and validated performance measurements for generator output are required to maintain stable management of generator output in the turbine power generation cycle. We developed a turbine expansion model and a measurement validation model for calculating generator performance from turbine output, based on the ASME (American Society of Mechanical Engineers) PTC (Performance Test Code). We also developed a verification model for uncertain measurement data related to the turbine and generator output. Whereas the models in previous research were developed using artificial neural networks and kernel regression, the verification model in this paper is based on a Support Vector Machine (SVM) model to overcome the problem of unmeasured data. Procedures for selecting the related variables and the data window for verification learning were also developed. The model proved suitable for the estimation process, as the learning error was in the range of about 1%. The learning model can provide validated estimations for corrective performance analysis of turbine cycle output by predicting lost measurement data.

A Study of the Feature Classification and the Predictive Model of Main Feed-Water Flow for Turbine Cycle (주급수 유량의 형상 분류 및 추정 모델에 대한 연구)

  • Yang, Hac Jin;Kim, Seong Kun;Choi, Kwang Hee
    • Journal of Energy Engineering / v.23 no.4 / pp.263-271 / 2014
  • Corrective thermal performance analysis is required for thermal power plants to determine the performance status of the turbine cycle. We developed a classification method for main feed-water flow to make precise corrections for performance analysis based on the ASME (American Society of Mechanical Engineers) PTC (Performance Test Code). The classification is based on feature identification of the status of the main feed-water flow. We also developed predictive algorithms for the corrected main feed-water flow using a Support Vector Machine (SVM) model for each classified feature area. The results were compared with estimations using a Neural Network (NN) and Kernel Regression (KR). The feature classification and predictive model of main feed-water flow provide more practical methods for corrective thermal performance analysis of the turbine cycle.

Movie Popularity Classification Based on Support Vector Machine Combined with Social Network Analysis

  • Dorjmaa, Tserendulam;Shin, Taeksoo
    • Journal of Information Technology Services / v.16 no.3 / pp.167-183 / 2017
  • The rapid growth of information technology and mobile service platforms, e.g., the internet, Google, and Facebook, has led to an abundance of data. As a result, the world is now facing a revolution in the way data is searched, collected, stored, and shared. The abundance of data creates opportunities for knowledge discovery and data mining techniques. In recent years, data mining methods for discovering and extracting knowledge from databases have become more popular in e-commerce service fields, in particular movie recommendation. However, most classification approaches for predicting movie popularity have used only a few types of movie information, such as actor, director, rating score, language, and country. In this study, we propose a classification-based support vector machine (SVM) model for predicting movie popularity based on movie genre data and social network data. Social network analysis (SNA) is used to improve the classification accuracy. This study builds a movie network (a one-mode network) from the initial data, which form a two-mode user-to-movie network. For the proposed method, we computed degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality as centrality measures in the movie network. These four centrality values and the movie genre data were used to classify movie popularity. Logistic regression, a neural network, a naïve Bayes classifier, and a decision tree were also used as benchmark models for comparison with the performance of our proposed model. To assess classifier accuracy, this study used MovieLens data as an open database. Our empirical results indicate that our proposed model with movie genre and centrality data has approximately 0% higher accuracy than other classification models with only movie genre data. Our results imply that the proposed model can be used to improve movie popularity classification accuracy.
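
The step from a two-mode user-to-movie network to a one-mode movie network, followed by a centrality computation, can be sketched as follows. This is a minimal illustration under stated assumptions: two movies are linked when at least one user appears with both (the abstract does not state the paper's linking rule), only degree centrality is shown, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def project_to_movie_network(user_movies):
    """Build an undirected movie-movie adjacency map from user -> movies data.

    Assumption: two movies are connected if any single user is associated
    with both of them.
    """
    adjacency = {}
    for movies in user_movies.values():
        unique = sorted(set(movies))
        for movie in unique:
            adjacency.setdefault(movie, set())
        for a, b in combinations(unique, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


# Toy two-mode data (hypothetical users and movie labels, not MovieLens).
ratings = {
    "u1": ["A", "B"],
    "u2": ["B", "C"],
    "u3": ["B", "D"],
}
network = project_to_movie_network(ratings)
centrality = degree_centrality(network)
print(sorted(centrality.items()))
```

In this toy network movie B is linked to every other movie, so its degree centrality is 1.0, while A, C, and D each score 1/3. Values like these, alongside genre indicators, would form the feature vector handed to a classifier.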

Ensemble Machine Learning Model Based YouTube Spam Comment Detection (앙상블 머신러닝 모델 기반 유튜브 스팸 댓글 탐지)

  • Jeong, Min Chul;Lee, Jihyeon;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.5 / pp.576-583 / 2020
  • This paper proposes a technique for identifying spam comments on YouTube, which have recently seen tremendous growth. Because it is possible to monetize through advertising, spammers on YouTube promote their channels or videos in the comments of popular videos or leave comments unrelated to the video. YouTube operates its own spam blocking system, but it still fails to block such comments properly and efficiently. Therefore, we examined related studies on YouTube spam comment screening and conducted classification experiments with six different machine learning techniques (Decision tree, Logistic regression, Bernoulli Naive Bayes, Random Forest, Support vector machine with linear kernel, Support vector machine with Gaussian kernel) and an ensemble model combining these techniques, using comment data from popular music videos by Psy, Katy Perry, LMFAO, Eminem, and Shakira.

The prediction of the stock price movement after IPO using machine learning and text analysis based on TF-IDF (증권신고서의 TF-IDF 텍스트 분석과 기계학습을 이용한 공모주의 상장 이후 주가 등락 예측)

  • Yang, Suyeon;Lee, Chaerok;Won, Jonggwan;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.237-262 / 2022
  • There has been a growing interest in IPOs (Initial Public Offerings) due to the profitable returns that IPO stocks can offer to investors. However, IPOs can be speculative investments that may involve substantial risk as well because shares tend to be volatile, and the supply of IPO shares is often highly limited. Therefore, it is crucially important that IPO investors are well informed of the issuing firms and the market before deciding whether to invest or not. Unlike institutional investors, individual investors are at a disadvantage since there are few opportunities for individuals to obtain information on the IPOs. In this regard, the purpose of this study is to provide individual investors with the information they may consider when making an IPO investment decision. This study presents a model that uses machine learning and text analysis to predict whether an IPO stock price would move up or down after the first 5 trading days. Our sample includes 691 Korean IPOs from June 2009 to December 2020. The input variables for the prediction are three tone variables created from IPO prospectuses and quantitative variables that are either firm-specific, issue-specific, or market-specific. The three prospectus tone variables indicate the percentage of positive, neutral, and negative sentences in a prospectus, respectively. We considered only the sentences in the Risk Factors section of a prospectus for the tone analysis in this study. All sentences were classified into 'positive', 'neutral', and 'negative' via text analysis using TF-IDF (Term Frequency - Inverse Document Frequency). Measuring the tone of each sentence was conducted by machine learning instead of a lexicon-based approach due to the lack of sentiment dictionaries suitable for Korean text analysis in the context of finance. 
For this reason, the training set was created by randomly selecting 10% of the sentences from each prospectus, and the sentences in the training set were labeled manually after reading each one. Then, based on the training set, a Support Vector Machine model was used to predict the tone of the sentences in the test set. Finally, the machine learning model calculated the percentages of positive, neutral, and negative sentences in each prospectus. To predict the price movement of an IPO stock, four different machine learning techniques were applied: Logistic Regression, Random Forest, Support Vector Machine, and Artificial Neural Network. According to the results, models that use quantitative variables using technical analysis together with prospectus tone variables show higher accuracy than models that use only quantitative variables. More specifically, the prediction accuracy improved by 1.45 percentage points in the Random Forest model, 4.34 percentage points in the Artificial Neural Network model, and 5.07 percentage points in the Support Vector Machine model. Among the machine learning techniques tested, the Artificial Neural Network model using both quantitative variables and prospectus tone variables had the highest prediction accuracy, at 61.59%. The results indicate that the tone of a prospectus is a significant factor in predicting the price movement of an IPO stock. In addition, the McNemar test was used to verify whether the difference between the models was statistically significant. The model using only quantitative variables was compared with the model using both the quantitative variables and the prospectus tone variables, and the predictive performance was confirmed to improve significantly at the 1% significance level.
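
The TF-IDF weighting named in this abstract can be sketched in a few lines. The version below uses one common textbook variant (term frequency as a share of the sentence length, inverse document frequency as log(N / df)). The abstract does not say which variant or tokenizer the authors used, so the whitespace tokenizer, the English toy sentences, and the function name `tf_idf` are assumptions made only for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = count of the term in the document / number of tokens in it
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs at least once.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical risk-factor style sentences (not taken from any prospectus).
sentences = [
    "demand may decline",
    "demand may grow",
    "litigation risk may increase",
]
for vector in tf_idf(sentences):
    print(vector)
```

A term such as "may", which occurs in every sentence, gets weight zero, while terms confined to one sentence get the largest weights. Vectors of this kind are what a sentence-level tone classifier such as an SVM would take as input.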