• Title/Summary/Keyword: Support Vector Model

A Study on Statistical Forecasting Models of PM10 in Pohang Region by the Variable Transformation (변수변환을 통한 포항지역 미세먼지의 통계적 예보모형에 관한 연구)

  • Lee, Yung-Seop;Kim, Hyun-Goo;Park, Jong-Seok;Kim, Hee-Kyung
    • Journal of Korean Society for Atmospheric Environment / v.22 no.5 / pp.614-626 / 2006
  • Using data from three environmental monitoring sites in the Pohang area (KME112, KME113, and KME114), statistical forecasting models of the daily maximum and mean PM10 values were developed. Because the distributions of the daily maximum and mean PM10 values are skewed, resembling the Weibull distribution, the values were log-transformed to approximate a normal distribution and thereby improve prediction accuracy. Three statistical forecasting models, namely regression, neural networks (NN), and support vector regression (SVR), were built using the log-transformed response variables, i.e., log(max(PM10)) or log(mean(PM10)). The models were validated and compared using RMSE, CORR, and IOA. For daily maximum PM10 prediction, the log-transformation improved IOA by 12.7% for regression, 22.5% for NN, and 42.7% for SVR. For daily mean PM10 prediction, IOA improved by 5.1% for regression, 6.5% for NN, and 6.3% for SVR. In conclusion, SVR performed better than the other methods in terms of model accuracy and goodness of fit.
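
The three validation measures named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes IOA means Willmott's index of agreement and that log-scale predictions are exponentiated back before scoring, neither of which the abstract states. All values are hypothetical.

```python
import math

def rmse(obs, pred):
    """Root mean square error between observations and predictions."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def corr(obs, pred):
    """Pearson correlation coefficient between observations and predictions."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (sd_o * sd_p)

def ioa(obs, pred):
    """Index of agreement (assumed: Willmott's definition):
    1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2)."""
    mean_o = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den

# Hypothetical daily maximum PM10 values, not data from the paper.
observed = [35.0, 48.0, 120.0, 60.0, 210.0]
# Stand-in for the output of a model fitted to log(max(PM10)).
log_predicted = [math.log(v) for v in [38.0, 45.0, 100.0, 66.0, 180.0]]
# Back-transform to the original scale before computing the measures.
predicted = [math.exp(v) for v in log_predicted]

print(rmse(observed, predicted), corr(observed, predicted), ioa(observed, predicted))
```

IOA equals 1 for a perfect forecast and decreases as the predictions drift from the observations, which is why an improvement in IOA is read as an improvement in fit.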

Classification of ratings in online reviews (온라인 리뷰에서 평점의 분류)

  • Choi, Dongjun;Choi, Hosik;Park, Changyi
    • Journal of the Korean Data and Information Science Society / v.27 no.4 / pp.845-854 / 2016
  • Sentiment analysis, or opinion mining, is a text mining technique used to identify subjective information or individual opinions in documents such as blogs, reviews, articles, or social network posts. The literature has mostly considered only the binary classification of ratings based on online review texts. However, because reviews can be neutral as well as positive or negative, multi-class classification is more appropriate than binary classification. To this end, we consider the multi-class classification of ratings based on review texts. In the preprocessing stage, we extract words related to ratings using the chi-square statistic. The extracted words are then used as input variables to multi-class classifiers, such as support vector machines and the proportional odds model, to compare their predictive performance.
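
The chi-square word extraction step described above might look like the sketch below. It computes the Pearson chi-square statistic on a contingency table of word presence against rating class and ranks words by that score. The words, counts, and three-class layout are hypothetical, and the abstract does not say how the authors thresholded the statistic.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows.
    Assumes every row total and column total is non-zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical review counts.
# Rows: word present, word absent. Columns: negative, neutral, positive.
counts = {
    "excellent": [[2, 5, 40], [98, 95, 60]],
    "the": [[90, 91, 89], [10, 9, 11]],
}

scores = {word: chi_square(table) for word, table in counts.items()}
# Words most strongly associated with the rating come first.
selected = sorted(scores, key=scores.get, reverse=True)
print(selected, scores)
```

A word that appears at nearly the same rate in every rating class scores close to zero and would be dropped, while a word concentrated in one class scores high and is kept as an input variable.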

Development of game indicators and winning forecasting models with game data (게임 데이터를 이용한 지표 개발과 승패예측모형 설계)

  • Ku, Jimin;Kim, Jaehee
    • Journal of the Korean Data and Information Science Society / v.28 no.2 / pp.237-250 / 2017
  • E-sports, a new field, is gaining great popularity in Korea and abroad. AOS (Aeon of Strife) genre games are quickly gaining popularity with gamers all over the world, and game companies hold game competitions. E-sports broadcasting teams and webzines use a variety of statistical indicators. In this paper, data from League of Legends, an AOS genre game, are analyzed statistically using such indicators to predict match outcomes. We develop new indicators using factor analysis to improve on the existing ones. We also consider a discriminant function, a neural network model, and an SVM (support vector machine) to build win-prediction models. As a result, the new position indicators reflect the nature of each role in the game, and the win-prediction models achieve more than 95 percent accuracy.

Prediction of Protein-Protein Interaction Sites Based on 3D Surface Patches Using SVM (SVM 모델을 이용한 3차원 패치 기반 단백질 상호작용 사이트 예측기법)

  • Park, Sung-Hee;Hansen, Bjorn
    • The KIPS Transactions:PartD / v.19D no.1 / pp.21-28 / 2012
  • Prediction of protein interaction sites for monomer structures can reduce the search space for protein docking and has been regarded as very significant for predicting unknown functions of proteins from their interacting proteins whose functions are known. On the other hand, prediction of interaction sites has been limited by the difficulty of crystallizing weakly interacting complexes, which are transient and do not form complexes stable enough to yield experimental structures by crystallization or even NMR for the most important protein-protein interactions. This work reports the calculation of 3D surface patches of complex structures and their properties, and a machine learning approach that uses a support vector machine to build a predictive model for 3D surface patches in interaction and non-interaction sites. To overcome the classification problems caused by class-imbalanced data, we employed an under-sampling technique. Nine properties of the patches were calculated from amino acid compositions and secondary structure elements. With 10-fold cross-validation, the SVM-based predictive model achieved an accuracy of 92.7% in classifying 3D patches in interaction and non-interaction sites from 147 complexes.
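
The under-sampling step mentioned above can be illustrated with random under-sampling, which discards samples from the larger class until the classes are the same size. The abstract does not say which under-sampling variant was used, so this is only one plausible reading. The function name, patch identifiers, and class sizes are hypothetical.

```python
import random

def undersample(samples, labels, seed=0):
    """Randomly keep only as many samples per class as the smallest class has."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    n_min = min(len(xs) for xs in by_class.values())
    balanced = []
    for y, xs in by_class.items():
        # rng.sample draws n_min distinct items without replacement.
        for x in rng.sample(xs, n_min):
            balanced.append((x, y))
    rng.shuffle(balanced)
    return balanced

# Hypothetical imbalanced data: few interaction patches, many non-interaction patches.
patches = [f"patch_{i}" for i in range(100)]
labels = ["interaction"] * 10 + ["non-interaction"] * 90

balanced = undersample(patches, labels)
print(len(balanced))
```

Balancing the classes this way keeps a classifier from reaching high accuracy simply by predicting the majority class, at the cost of discarding some majority-class data.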

Endpoint Detection Using Hybrid Algorithm of PLS and SVM (PLS와 SVM복합 알고리즘을 이용한 식각 종료점 검출)

  • Lee, Yun-Keun;Han, Yi-Seul;Hong, Sang-Jeen;Han, Seung-Soo
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.24 no.9 / pp.701-709 / 2011
  • In semiconductor wafer fabrication, etching, by which a material layer is selectively removed, is one of the most critical processes. Because mistakes caused by over-etching are difficult to correct, it is critical that etching be performed correctly. This paper proposes a new approach to etch endpoint detection for small open area wafers. The traditional endpoint detection technique uses a few manually selected wavelengths, which are adequate for large open areas. As integrated circuit devices continue to shrink in geometry and increase in density, detecting the endpoint for small open areas presents a serious challenge to process engineers. In this work, a high-resolution optical emission spectroscopy (OES) sensor is used to provide the sensitivity needed to detect subtle endpoint signals. The partial least squares (PLS) method is used to analyze the OES data, reducing the dimension of the data and increasing the gap between classes. A support vector machine (SVM) is then applied to the PLS-transformed data to detect the endpoint by classifying the normal etching state and the after-endpoint state. Two OES data sets are used to train the PLS and SVM models, and the other data sets are used to test model performance. The results show that the trained PLS and SVM hybrid model detects the endpoint accurately.

Prediction of Photovoltaic Power Generation Based on Machine Learning Considering the Influence of Particulate Matter (미세먼지의 영향을 고려한 머신러닝 기반 태양광 발전량 예측)

  • Sung, Sangkyung;Cho, Youngsang
    • Environmental and Resource Economics Review / v.28 no.4 / pp.467-495 / 2019
  • The uncertainty of renewable energy sources such as photovoltaic (PV) power is detrimental to the flexibility of the power system. Precise prediction of PV power generation is therefore important for keeping the power system stable. The purpose of this study is to forecast PV power generation using meteorological data that include particulate matter (PM). PV power generation is predicted with a support vector machine using an RBF kernel function. Comparing forecasting performance with and without the PM variable among the predictors, we find that the forecasting model that considers PM performs better. Forecasting models that include the PM variable reduce the error by 1.43%, 3.60%, and 3.88% when forecasting power generation between 6 am and 8 pm, between 12 pm and 2 pm, and at 1 pm, respectively. In particular, the accuracy of the forecasting model including the PM variable increases in the daytime, when PV power generation is high.
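
For readers unfamiliar with the RBF kernel named above, the sketch below shows its standard form, k(x, y) = exp(-gamma * ||x - y||^2). The feature vectors (irradiance, temperature, PM concentration, already scaled) and the value of gamma are hypothetical and are not taken from the study.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Hypothetical scaled predictors: [irradiance, temperature, PM concentration].
clear_day = [0.9, 0.6, 0.1]
hazy_day = [0.7, 0.6, 0.8]
similar_clear_day = [0.85, 0.62, 0.12]

# Kernel (Gram) matrix over the three days, as a kernel SVM would use internally.
days = [clear_day, hazy_day, similar_clear_day]
gram = [[rbf_kernel(a, b) for b in days] for a in days]
for row in gram:
    print([round(v, 3) for v in row])
```

The kernel value is 1 for identical inputs and falls toward 0 as inputs move apart, so adding a PM feature changes which past days the model treats as similar to the day being forecast.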

Prediction of replacement period of shield TBM disc cutter using SVM (SVM 기법을 이용한 쉴드 TBM 디스크 커터 교환 주기 예측)

  • La, You-Sung;Kim, Myung-In;Kim, Bumjoo
    • Journal of Korean Tunnelling and Underground Space Association / v.21 no.5 / pp.641-656 / 2019
  • This study proposes a machine learning method for predicting the optimal replacement period of shield TBM (Tunnel Boring Machine) disc cutters. A large dataset of ground conditions, disc cutter replacement records, and TBM excavation-related data, collected from a shield TBM tunnel site in Korea, was built and used to construct a disc cutter replacement period prediction model with a machine learning algorithm, SVM (Support Vector Machine), and to assess the model's performance. The results showed that the RBF (Radial Basis Function) SVM performed best among the three SVM classification functions tested (80% accuracy and 10% error rate on average). Comparing ground types, the more disc cutter replacement data were available, the better the prediction results. From these results, machine learning methods are expected to become widely used in practice in the near future as more data accumulate and the models continue to be fine-tuned.

Ensemble Machine Learning Model Based YouTube Spam Comment Detection (앙상블 머신러닝 모델 기반 유튜브 스팸 댓글 탐지)

  • Jeong, Min Chul;Lee, Jihyeon;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.5 / pp.576-583 / 2020
  • This paper proposes a technique for identifying spam comments on YouTube, which have recently seen tremendous growth. Because advertising makes monetization possible, spammers on YouTube promote their channels or videos in the comments of popular videos or leave comments unrelated to the video. YouTube operates its own spam-blocking system but has still failed to block such comments properly and efficiently. We therefore examined related studies on YouTube spam comment screening and conducted classification experiments with six machine learning techniques (decision tree, logistic regression, Bernoulli naive Bayes, random forest, support vector machine with a linear kernel, and support vector machine with a Gaussian kernel) and an ensemble model combining these techniques, using comment data from popular music videos by Psy, Katy Perry, LMFAO, Eminem, and Shakira.
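
The abstract above does not say how the ensemble combines its base classifiers. One common option, shown here purely as an assumption, is hard majority voting over the predicted labels. The classifier outputs below are hypothetical, and only three of the six techniques are shown to keep the example free of ties.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists (all the same length) by majority vote.
    Ties are resolved in favour of the label that appears first among the votes."""
    combined = []
    for votes in zip(*predictions):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical predictions for five comments (1 = spam, 0 = not spam).
decision_tree = [1, 0, 1, 0, 1]
logistic_regression = [1, 1, 0, 0, 1]
linear_svm = [0, 0, 1, 0, 1]

ensemble = majority_vote([decision_tree, logistic_regression, linear_svm])
print(ensemble)
```

With an even number of voters, as with six base classifiers, a tie-breaking rule or a weighted or probability-based vote would need to be chosen explicitly.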

Prediction of Blast Vibration in Quarry Using Machine Learning Models (머신러닝 모델을 이용한 석산 개발 발파진동 예측)

  • Jung, Dahee;Choi, Yosoon
    • Tunnel and Underground Space / v.31 no.6 / pp.508-519 / 2021
  • In this study, a model was developed to predict the peak particle velocity (PPV), which affects people and the surrounding environment during blasting. Four machine learning models, using the k-nearest neighbors (kNN), classification and regression tree (CART), support vector regression (SVR), and particle swarm optimization (PSO)-SVR algorithms, were developed and compared in predicting the PPV. Mt. Yogmang, located in Changwon-si, Gyeongsangnam-do, was selected as the study area, and 1,048 blasting records were acquired to train the machine learning models. The blasting data consisted of hole length, burden, spacing, maximum charge per delay, powder factor, number of holes, ratio of emulsion, monitoring distance, and PPV. The mean absolute error (MAE), mean square error (MSE), and root mean square error (RMSE) were used to evaluate the performance of the trained models. The PSO-SVR model showed superior performance, with an MAE, MSE, and RMSE of 0.0348, 0.0021, and 0.0458, respectively. Finally, a method was proposed to predict the degree of influence on the surrounding environment using the developed machine learning models.
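
The three error measures used above follow their standard definitions, sketched here with hypothetical PPV values rather than the study's data. The last line checks that the reported MSE and RMSE are mutually consistent, since RMSE is the square root of MSE.

```python
import math

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)

def mse(obs, pred):
    """Mean square error."""
    return sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    """Root mean square error, the square root of the MSE."""
    return math.sqrt(mse(obs, pred))

# Hypothetical measured and predicted PPV values, not data from the paper.
observed = [0.10, 0.25, 0.40, 0.05]
predicted = [0.12, 0.20, 0.43, 0.06]
print(mae(observed, predicted), mse(observed, predicted), rmse(observed, predicted))

# Reported PSO-SVR figures: MSE 0.0021 and RMSE 0.0458.
print(round(math.sqrt(0.0021), 4))  # 0.0458
```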

Classification Analysis for the Prediction of Underground Cultural Assets (매장문화재 예측을 위한 통계적 분류 분석)

  • Yu, Hye-Kyung;Lee, Jin-Young;Na, Jong-Hwa
    • Journal of Korea Society of Industrial Information Systems / v.14 no.3 / pp.106-113 / 2009
  • Various statistical classification methods have been used to build prediction models of underground cultural assets in Korea. Among them, linear discriminant analysis, logistic regression, decision trees, neural networks, and support vector machines are used in this paper. We introduce the basic concepts of these classification methods and apply them to the analysis of real data from I city. As a result, five different prediction models are suggested. The models are compared using the correct classification rates of the fitted models. Simulations are carried out to assess the applicability of the suggested models to new data sets. R packages and programs are used for the real data analyses and simulations, and the detailed procedures in R are provided for other analysts in related areas.