• Title/Summary/Keyword: support vector regression.

Evaluation of Classification Algorithm Performance of Sentiment Analysis Using Entropy Score (엔트로피 점수를 이용한 감성분석 분류알고리즘의 수행도 평가)

  • Park, Man-Hee
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.9 / pp.1153-1158 / 2018
  • Among the many available information sources, online customer evaluations and social media information are critical for businesses because they influence customers' decision making. Surveys intended to identify customers' varied needs and complaints are limited by the time and money they require. Customer review data from online shopping malls are an ideal source for analyzing customer sentiment about products. In this study, we collected Amazon product reviews of Samsung and Apple smartphones. We applied five classification algorithms that previous studies have used as representative sentiment analysis techniques: support vector machines (SVMs), bagging, random forest, classification and regression trees, and maximum entropy. We also proposed an entropy score that comprehensively evaluates the performance of a classification algorithm. When the five algorithms were evaluated with the entropy score, the SVM algorithm ranked highest.

Fault Detection & SPC of Batch Process using Multi-way Regression Method (다축-다변량회귀분석 기법을 이용한 회분식 공정의 이상감지 및 통계적 제어 방법)

  • Woo, Kyoung Sup;Lee, Chang Jun;Han, Kyoung Hoon;Ko, Jae Wook;Yoon, En Sup
    • Korean Chemical Engineering Research / v.45 no.1 / pp.32-38 / 2007
  • A batch process has a multi-way data structure with batch, time, and variable axes, so statistical modeling of a batch process is a difficult and challenging problem for process engineers. In this study, we applied a statistical process control (SPC) technique to general batch process data and implemented a fault-detection and SPC system that can detect, identify, and diagnose faults. Data from a semiconductor etch process and a semi-batch styrene-butadiene rubber process are used as case studies. Before modeling, we pre-processed the data with the multi-way unfolding technique to decompose the data structure. Multivariate regression techniques such as support vector regression and partial least squares were used to identify the relation between the process variables and the process condition. Finally, we constructed a root mean squared error chart and a variable contribution chart to diagnose the faults.
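
The unfolding step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which unfolding direction was used, so the batch-wise variant below (each batch becomes one row) is an assumption, and the function name and toy data are hypothetical.

```python
from typing import List

Batch = List[List[float]]  # one batch: rows are time steps, columns are variables


def unfold_batchwise(data: List[Batch]) -> List[List[float]]:
    """Unfold a (batch x time x variable) array into a (batch x time*variable) matrix."""
    n_time = len(data[0])
    n_var = len(data[0][0])
    unfolded = []
    for batch in data:
        # This simple unfolding requires every batch to have the same shape.
        if len(batch) != n_time or any(len(row) != n_var for row in batch):
            raise ValueError("all batches need the same number of time steps and variables")
        # Concatenate the time slices so that each batch becomes a single row.
        unfolded.append([value for row in batch for value in row])
    return unfolded


# Hypothetical toy data: 2 batches, 3 time steps, 2 variables.
toy = [
    [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]],
    [[4.0, 40.0], [5.0, 50.0], [6.0, 60.0]],
]
matrix = unfold_batchwise(toy)  # 2 rows, 6 columns
```

After unfolding, each row can be treated as one observation by an ordinary multivariate regression method.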

Development of suspended solid concentration measurement technique based on multi-spectral satellite imagery in Nakdong River using machine learning model (기계학습모형을 이용한 다분광 위성 영상 기반 낙동강 부유 물질 농도 계측 기법 개발)

  • Kwon, Siyoon;Seo, Il Won;Beak, Donghae
    • Journal of Korea Water Resources Association / v.54 no.2 / pp.121-133 / 2021
  • Suspended Solids (SS) in rivers are mainly introduced from non-point pollution sources or occur naturally in the water body, and they are an important water quality factor because their deposition may cause long-term water pollution. However, the conventional method of measuring the concentration of suspended solids is labor-intensive, and it is difficult to obtain a large amount of data via point measurement. Therefore, in this study, a remote sensing-based model for measuring the concentration of suspended solids in the Nakdong River was developed using Sentinel-2 data, which provide high-resolution multi-spectral satellite images. The proposed model considers the spectral bands and band ratios of various wavelength bands using a machine learning model, Support Vector Regression (SVR), to overcome the limitations of the existing remote sensing-based regression equations. The optimal combination of variables was derived using Recursive Feature Elimination (RFE) and the SVR weight coefficients of each variable. The results show that the 705 nm band, which belongs to the red-edge wavelength band, was identified as the most important spectral band, and that the proposed SVR model produced more accurate measurements than the previous regression equations. By using RFE, the SVR model developed in this study depends less on individual variables than the existing regression equations based on a single spectral band or band ratio, and it predicts the spatial distribution of suspended solids concentration more accurately.
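
The idea behind Recursive Feature Elimination can be sketched as follows. This is not the authors' pipeline: a plain least-squares linear model fitted by gradient descent is swapped in for SVR to keep the example dependency-free, and the feature names and toy data are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def fit_linear(X: List[List[float]], y: List[float],
               steps: int = 500, lr: float = 0.05) -> List[float]:
    """Fit y ~ X w (no intercept) by gradient descent on the mean squared error."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        residuals = [sum(wj * xj for wj, xj in zip(w, row)) - yi
                     for row, yi in zip(X, y)]
        grad = [2.0 / n * sum(r * row[j] for r, row in zip(residuals, X))
                for j in range(d)]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def rfe(X: List[List[float]], y: List[float], names: Sequence[str],
        n_keep: int) -> Tuple[List[str], List[str]]:
    """Repeatedly refit and drop the feature with the smallest absolute weight."""
    names = list(names)
    X = [list(row) for row in X]
    eliminated = []
    while len(names) > n_keep:
        w = fit_linear(X, y)
        worst = min(range(len(w)), key=lambda j: abs(w[j]))
        eliminated.append(names.pop(worst))
        X = [row[:worst] + row[worst + 1:] for row in X]
    return names, eliminated


# Hypothetical toy data: three features on the same scale, with true weights 3, 0.1 and 1.
X = [list(p) for p in product([-1.0, 1.0], repeat=3)]
y = [3.0 * a + 0.1 * b + 1.0 * c for a, b, c in X]
kept, eliminated = rfe(X, y, ["band_a", "band_b", "band_c"], n_keep=1)
# kept == ['band_a'], eliminated == ['band_b', 'band_c']
```

Ranking by weight magnitude is only meaningful when the features share a common scale, which is why the toy features all take values in {-1, 1}.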

Android Malware Detection Using Permission-Based Machine Learning Approach (머신러닝을 이용한 권한 기반 안드로이드 악성코드 탐지)

  • Kang, Seongeun;Long, Nguyen Vu;Jung, Souhwan
    • Journal of the Korea Institute of Information Security & Cryptology / v.28 no.3 / pp.617-623 / 2018
  • This study focuses on detecting malicious code using AndroidManifest permission features extracted through Android static analysis. Building the features from the permissions declared in AndroidManifest saves analysis time and resources. The malicious app detection model consists of SVM (support vector machine), NB (Naive Bayes), GBC (Gradient Boosting Classifier), and Logistic Regression models, which were trained on 1,500 normal apps and 500 malicious apps and achieved a 98% detection rate. In addition, malicious app family identification is implemented by a multi-classifier model using the SVM, GPC (Gaussian Process Classifier), and GBC algorithms. The trained family identification model identified 92% of malicious app families.
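
A permission-based feature can be sketched as a binary vector over a fixed permission vocabulary. The example below assumes the manifest has already been decoded to plain XML (inside an APK it is stored in a binary format); the vocabulary, package name, and function names are hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET
from typing import List

ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical vocabulary; a real one would be collected from the training apps.
VOCABULARY = [
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.CAMERA",
]


def extract_permissions(manifest_xml: str) -> List[str]:
    """Return the names declared in <uses-permission> elements."""
    root = ET.fromstring(manifest_xml)
    names = (el.get(f"{{{ANDROID_NS}}}name") for el in root.iter("uses-permission"))
    return [name for name in names if name is not None]


def permission_vector(permissions: List[str], vocabulary: List[str] = VOCABULARY) -> List[int]:
    """Encode the requested permissions as a 0/1 vector over the vocabulary."""
    requested = set(permissions)
    return [1 if p in requested else 0 for p in vocabulary]


# Hypothetical decoded manifest for illustration only.
manifest = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.app">
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.SEND_SMS"/>
</manifest>"""

features = permission_vector(extract_permissions(manifest))  # [1, 1, 0, 0]
```

Vectors of this kind can then be passed to any classifier that accepts fixed-length numeric input.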

Development of the Wind Turbine Power Prediction System Using Support Vector Regression (Support Vector Regression을 이용한 풍력발전량 예측 시스템 개발)

  • Shin, Hye-Gyeong;Lee, Moon-Hwan;Lee, Jin-Ho
    • Proceedings of the KIEE Conference / 2011.07a / pp.696-697 / 2011
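  • The need to integrate renewable energy into the power grid has been growing because of the climate change convention and the depletion of fossil fuels, but its lack of economic viability has placed many constraints on adoption. Recently, however, wind turbines have been becoming economically viable, and some countries, mainly in Europe, operate them connected to the power grid. Spain in particular has developed a system that forecasts wind turbine output and operates the grid in coordination with other generation facilities, which compensates for the intermittent output of wind power and improves its utilization; by calculating reserves that take wind power generation into account, it maintains an economical and stable power system. In addition, a growing number of systems are being built for coordinated operation with energy storage devices to compensate for the intermittent output of wind turbines, and Smart Renewable in Korea's Jeju Smart Grid demonstration project is one such case. This paper describes the development of a wind power forecasting system using SVR, a machine learning technique, and forecasts wind power generation using historical generation data from Haengwon Unit 14.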
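
For readers unfamiliar with SVR, its standard formulation penalizes a prediction only when it misses the target by more than a tolerance ε. The sketch below shows that ε-insensitive loss; the power values and the ε setting are hypothetical and are not taken from the paper.

```python
def epsilon_insensitive_loss(y_true: float, y_pred: float, epsilon: float = 0.1) -> float:
    """Zero inside the epsilon tube around the target, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical wind power values in MW.
inside_tube = epsilon_insensitive_loss(1.00, 1.05, epsilon=0.1)   # error 0.05 is ignored
outside_tube = epsilon_insensitive_loss(1.00, 1.50, epsilon=0.1)  # error 0.5 costs about 0.4
```

A wider tube tolerates more noise in the measured output at the cost of a coarser fit.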

A new method to detect cracks in plate-like structures with though-thickness cracks

  • Xiang, Jiawei;Nackenhorst, Udo;Wang, Yanxue;Jiang, Yongying;Gao, Haifeng;He, Yumin
    • Smart Structures and Systems / v.14 no.3 / pp.397-418 / 2014
  • In this paper, a simple two-step method for vibration-based structural health monitoring of beam-like structures is extended to plate-like structures with through-thickness cracks. Crack locations and severities in plate-like structures are detected using a hybrid approach. The interval wavelet transform is employed to extract crack singularity locations from mode shapes, and support vector regression (SVR) is applied to predict crack severities from a crack severity detection database (the relationship between natural frequencies and crack severities), using several natural frequencies as inputs. Of particular interest is the estimation of natural frequencies for cracked plate-like structures using the Rayleigh quotient. Only the natural frequencies and mode shapes of the intact structures are needed to calculate the natural frequencies of cracked plate-like structures with a simple formula, so the crack severity detection database can be obtained easily. The hybrid method is investigated by numerical simulation, and the validity of using the interval wavelet transform and SVR is addressed.
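
The general Rayleigh quotient, which estimates a squared natural frequency from stiffness and mass matrices and a trial mode shape, can be sketched as below. This is the generic textbook form, not the paper's specific formula for cracked plates, and the two-degree-of-freedom system is a hypothetical example.

```python
import math
from typing import List

Matrix = List[List[float]]


def rayleigh_quotient(K: Matrix, M: Matrix, phi: List[float]) -> float:
    """Return (phi^T K phi) / (phi^T M phi), an estimate of omega squared."""
    n = len(phi)
    num = sum(phi[i] * K[i][j] * phi[j] for i in range(n) for j in range(n))
    den = sum(phi[i] * M[i][j] * phi[j] for i in range(n) for j in range(n))
    return num / den


# Hypothetical 2-DOF spring-mass chain with unit stiffness and unit masses.
K = [[2.0, -1.0], [-1.0, 1.0]]
M = [[1.0, 0.0], [0.0, 1.0]]

exact_shape = [1.0, (1.0 + math.sqrt(5.0)) / 2.0]  # exact first mode shape
trial_shape = [1.0, 1.5]                           # rough guess at the mode shape

omega_sq_exact = rayleigh_quotient(K, M, exact_shape)  # (3 - sqrt(5)) / 2
omega_sq_trial = rayleigh_quotient(K, M, trial_shape)  # 1.25 / 3.25
```

For the lowest mode the trial value never falls below the exact one, so even a rough mode shape gives a usable upper estimate of the fundamental frequency.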

A Study on Fog Forecasting Method through Data Mining Techniques in Jeju (데이터마이닝 기법들을 통한 제주 안개 예측 방안 연구)

  • Lee, Young-Mi;Bae, Joo-Hyun;Park, Da-Bin
    • Journal of Environmental Science International / v.25 no.4 / pp.603-613 / 2016
  • Fog may have a significant impact on road conditions. In an attempt to improve fog predictability in Jeju, we applied machine learning with various data mining techniques: tree models, conditional inference trees, random forest, multinomial logistic regression, neural networks, and support vector machines. To validate the machine learning models, the simulation results were compared with fog data observed at Jeju (ASOS site 184) and Gosan (ASOS site 185). The prediction rates of the six data mining methods are all above 92% at both sites. Additionally, we validated the performance of the machine learning models with meteorological outputs from the WRF (Weather Research and Forecasting) model and found that it is still not good enough for operational fog forecasting. According to the model assessment based on confusion matrix metrics, fog prediction using the neural network is the most effective method.
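
Confusion-matrix metrics for a binary fog/no-fog forecast can be sketched as follows. The abstract does not list which metrics were used, so the set below (accuracy, probability of detection, false alarm ratio, critical success index) is an assumption, and the observation and prediction series are hypothetical.

```python
from typing import Dict, List


def confusion_metrics(observed: List[int], predicted: List[int]) -> Dict[str, float]:
    """Compute common verification scores from 0/1 observations and forecasts."""
    tp = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)  # hits
    fn = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 0)  # misses
    fp = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 1)  # false alarms
    tn = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 0)  # correct negatives

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "pod": ratio(tp, tp + fn),       # probability of detection (recall)
        "far": ratio(fp, tp + fp),       # false alarm ratio
        "csi": ratio(tp, tp + fp + fn),  # critical success index
    }


# Hypothetical series: 1 = fog, 0 = no fog.
observed = [1, 1, 0, 0, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
scores = confusion_metrics(observed, predicted)
```

Because fog is a rare event, accuracy alone can look high even when fog cases are missed, which is why scores that ignore the correct negatives are often reported alongside it.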

Traffic Flow Estimation System using a Hybrid Approach

  • Aung, Swe Sw;Nagayama, Itaru;Tamaki, Shiro
    • IEIE Transactions on Smart Processing and Computing / v.6 no.4 / pp.281-291 / 2017
  • Traffic jams are an everyday problem in both developed and developing countries, so systems that monitor, predict, and detect traffic conditions play an important role in research. Researchers have tried to solve these problems by applying many kinds of technologies, especially roadside sensors. These sensors still have some issues, however, and no single method by itself has produced sufficient traffic prediction results, so it may not be best to use them as stand-alone methods in a traffic prediction system. This paper therefore focuses on predicting traffic conditions with a hybrid prediction approach based on an accuracy comparison of three prediction models: multinomial logistic regression, decision trees, and support vector machine (SVM) classifiers. The aim is to select the most suitable approach by integrating the strengths of these approaches. Test cases and simulations confirmed experimentally that the hybrid method is more effective than the individual methods.

A comparative assessment of bagging ensemble models for modeling concrete slump flow

  • Aydogmus, Hacer Yumurtaci;Erdal, Halil Ibrahim;Karakurt, Onur;Namli, Ersin;Turkan, Yusuf S.;Erdal, Hamit
    • Computers and Concrete / v.16 no.5 / pp.741-757 / 2015
  • In the last decade, several modeling approaches have been proposed and applied to estimate high-performance concrete (HPC) slump flow. Because HPC is a highly complex material, modeling its behavior is very difficult, and selecting and applying proper modeling methods remains a crucial task. Like many other applications, HPC slump flow prediction suffers from noise, which degrades prediction accuracy and increases variance. In recent years, ensemble learning methods have been introduced to improve prediction accuracy and reduce prediction error. This study investigates the potential of bagging (Bag), one of the most popular ensemble learning methods, for building ensemble models. Four well-known artificial intelligence models (classification and regression trees, CART; support vector machines, SVM; multilayer perceptron, MLP; and radial basis function neural networks, RBF) are deployed as base learners. The bagging ensemble models (Bag-SVM, Bag-RT, Bag-MLP, and Bag-RBF) are found to be superior to their base learners (SVM, CART, MLP, and RBF), and bagging noticeably improves the prediction accuracy and reduces the prediction error of the proposed predictive models.
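
Bagging itself can be sketched in a few lines: train several copies of a base learner on bootstrap resamples and average their predictions. To stay dependency-free, the example swaps in a one-nearest-neighbour regressor for the paper's base learners (CART, SVM, MLP, RBF); the data points, seed, and function names are hypothetical.

```python
import random
from statistics import mean
from typing import Callable, List, Tuple

Point = Tuple[float, float]          # (input, target)
Model = Callable[[float], float]     # maps an input to a prediction


def fit_1nn(sample: List[Point]) -> Model:
    """Base learner: predict the target of the nearest training input."""
    def predict(x: float) -> float:
        return min(sample, key=lambda p: abs(p[0] - x))[1]
    return predict


def bagging_fit(data: List[Point], fit: Callable[[List[Point]], Model],
                n_models: int = 25, seed: int = 0) -> List[Model]:
    """Train one base learner per bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(data) for _ in data]
        models.append(fit(bootstrap))
    return models


def bagging_predict(models: List[Model], x: float) -> float:
    """Aggregate by averaging the base learners' predictions."""
    return mean(model(x) for model in models)


# Hypothetical noisy samples of a roughly linear relationship.
data = [(0.0, 1.0), (1.0, 2.1), (2.0, 2.9), (3.0, 4.2), (4.0, 5.1)]
ensemble = bagging_fit(data, fit_1nn)
prediction = bagging_predict(ensemble, 2.5)
```

Averaging over resamples smooths out the base learner's sensitivity to individual noisy points, which is the variance reduction the abstract refers to.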