• Title/Summary/Keyword: SVM (Support Vector Machine)

Search Result 1,252, Processing Time 0.026 seconds

A study on the rock mass classification in boreholes for a tunnel design using machine learning algorithms (머신러닝 기법을 활용한 터널 설계 시 시추공 내 암반분류에 관한 연구)

  • Lee, Je-Kyum;Choi, Won-Hyuk;Kim, Yangkyun;Lee, Sean Seungwon
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • v.23 no.6
    • /
    • pp.469-484
    • /
    • 2021
  • Rock mass classification results have a great influence on construction schedule and budget as well as tunnel stability in tunnel design. A total of 3,526 tunnels have been constructed in Korea and the associated techniques in tunnel design and construction have been continuously developed, however, not many studies have been performed on how to assess rock mass quality and grade more accurately. Thus, numerous cases show big differences in the results according to inspectors' experience and judgement. Hence, this study aims to suggest a more reliable rock mass classification (RMR) model using machine learning algorithms, which is surging in availability, through the analyses based on various rock and rock mass information collected from boring investigations. For this, 11 learning parameters (depth, rock type, RQD, electrical resistivity, UCS, Vp, Vs, Young's modulus, unit weight, Poisson's ratio, RMR) from 13 local tunnel cases were selected, 337 learning data sets as well as 60 test data sets were prepared, and 6 machine learning algorithms (DT, SVM, ANN, PCA & ANN, RF, XGBoost) were tested for various hyperparameters for each algorithm. The results show that the mean absolute errors in RMR value from five algorithms except Decision Tree were less than 8 and a Support Vector Machine model is the best model. The applicability of the model, established through this study, was confirmed and this prediction model can be applied for more reliable rock mass classification when additional various data is continuously cumulated.

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.4
    • /
    • pp.147-168
    • /
    • 2017
  • There have been many studies on accurate stock market forecasting in academia for a long time, and now there are also various forecasting models using various techniques. Recently, many attempts have been made to predict the stock index using various machine learning methods including Deep Learning. Although the fundamental analysis and the technical analysis method are used for the analysis of the traditional stock investment transaction, the technical analysis method is more useful for the application of the short-term transaction prediction or statistical and mathematical techniques. Most of the studies that have been conducted using these technical indicators have studied the model of predicting stock prices by binary classification - rising or falling - of stock market fluctuations in the future market (usually next trading day). However, it is also true that this binary classification has many unfavorable aspects in predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we try to predict the stock index by expanding the stock index trend (upward trend, boxed, downward trend) to the multiple classification system in the existing binary index method. In order to solve this multi-classification problem, a technique such as Multinomial Logistic Regression Analysis (MLOGIT), Multiple Discriminant Analysis (MDA) or Artificial Neural Networks (ANN) we propose an optimization model using Genetic Algorithm as a wrapper for improving the performance of this model using Multi-classification Support Vector Machines (MSVM), which has proved to be superior in prediction performance. In particular, the proposed model named GA-MSVM is designed to maximize model performance by optimizing not only the kernel function parameters of MSVM, but also the optimal selection of input variables (feature selection) as well as instance selection. In order to verify the performance of the proposed model, we applied the proposed method to the real data. The results show that the proposed method is more effective than the conventional multivariate SVM, which has been known to show the best prediction performance up to now, as well as existing artificial intelligence / data mining techniques such as MDA, MLOGIT, CBR, and it is confirmed that the prediction performance is better than this. Especially, it has been confirmed that the 'instance selection' plays a very important role in predicting the stock index trend, and it is confirmed that the improvement effect of the model is more important than other factors. To verify the usefulness of GA-MSVM, we applied it to Korea's real KOSPI200 stock index trend forecast. Our research is primarily aimed at predicting trend segments to capture signal acquisition or short-term trend transition points. The experimental data set includes technical indicators such as the price and volatility index (2004 ~ 2017) and macroeconomic data (interest rate, exchange rate, S&P 500, etc.) of KOSPI200 stock index in Korea. Using a variety of statistical methods including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, trend classification, was classified into three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). 70% of the total data for each class was used for training and the remaining 30% was used for verifying. To verify the performance of the proposed model, several comparative model experiments such as MDA, MLOGIT, CBR, ANN and MSVM were conducted. MSVM has adopted the One-Against-One (OAO) approach, which is known as the most accurate approach among the various MSVM approaches. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.

The Prediction of Purchase Amount of Customers Using Support Vector Regression with Separated Learning Method (Support Vector Regression에서 분리학습을 이용한 고객의 구매액 예측모형)

  • Hong, Tae-Ho;Kim, Eun-Mi
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.4
    • /
    • pp.213-225
    • /
    • 2010
  • Data mining has empowered the managers who are charge of the tasks in their company to present personalized and differentiated marketing programs to their customers with the rapid growth of information technology. Most studies on customer' response have focused on predicting whether they would respond or not for their marketing promotion as marketing managers have been eager to identify who would respond to their marketing promotion. So many studies utilizing data mining have tried to resolve the binary decision problems such as bankruptcy prediction, network intrusion detection, and fraud detection in credit card usages. The prediction of customer's response has been studied with similar methods mentioned above because the prediction of customer's response is a kind of dichotomous decision problem. In addition, a number of competitive data mining techniques such as neural networks, SVM(support vector machine), decision trees, logit, and genetic algorithms have been applied to the prediction of customer's response for marketing promotion. The marketing managers also have tried to classify their customers with quantitative measures such as recency, frequency, and monetary acquired from their transaction database. The measures mean that their customers came to purchase in recent or old days, how frequent in a period, and how much they spent once. Using segmented customers we proposed an approach that could enable to differentiate customers in the same rating among the segmented customers. Our approach employed support vector regression to forecast the purchase amount of customers for each customer rating. Our study used the sample that included 41,924 customers extracted from DMEF04 Data Set, who purchased at least once in the last two years. We classified customers from first rating to fifth rating based on the purchase amount after giving a marketing promotion. Here, we divided customers into first rating who has a large amount of purchase and fifth rating who are non-respondents for the promotion. Our proposed model forecasted the purchase amount of the customers in the same rating and the marketing managers could make a differentiated and personalized marketing program for each customer even though they were belong to the same rating. In addition, we proposed more efficient learning method by separating the learning samples. We employed two learning methods to compare the performance of proposed learning method with general learning method for SVRs. LMW (Learning Method using Whole data for purchasing customers) is a general learning method for forecasting the purchase amount of customers. And we proposed a method, LMS (Learning Method using Separated data for classification purchasing customers), that makes four different SVR models for each class of customers. To evaluate the performance of models, we calculated MAE (Mean Absolute Error) and MAPE (Mean Absolute Percent Error) for each model to predict the purchase amount of customers. In LMW, the overall performance was 0.670 MAPE and the best performance showed 0.327 MAPE. Generally, the performances of the proposed LMS model were analyzed as more superior compared to the performance of the LMW model. In LMS, we found that the best performance was 0.275 MAPE. The performance of LMS was higher than LMW in each class of customers. After comparing the performance of our proposed method LMS to LMW, our proposed model had more significant performance for forecasting the purchase amount of customers in each class. In addition, our approach will be useful for marketing managers when they need to customers for their promotion. Even if customers were belonging to same class, marketing managers could offer customers a differentiated and personalized marketing promotion.

Wildfire Severity Mapping Using Sentinel Satellite Data Based on Machine Learning Approaches (Sentinel 위성영상과 기계학습을 이용한 국내산불 피해강도 탐지)

  • Sim, Seongmun;Kim, Woohyeok;Lee, Jaese;Kang, Yoojin;Im, Jungho;Kwon, Chunguen;Kim, Sungyong
    • Korean Journal of Remote Sensing
    • /
    • v.36 no.5_3
    • /
    • pp.1109-1123
    • /
    • 2020
  • In South Korea with forest as a major land cover class (over 60% of the country), many wildfires occur every year. Wildfires weaken the shear strength of the soil, forming a layer of soil that is vulnerable to landslides. It is important to identify the severity of a wildfire as well as the burned area to sustainably manage the forest. Although satellite remote sensing has been widely used to map wildfire severity, it is often difficult to determine the severity using only the temporal change of satellite-derived indices such as Normalized Difference Vegetation Index (NDVI) and Normalized Burn Ratio (NBR). In this study, we proposed an approach for determining wildfire severity based on machine learning through the synergistic use of Sentinel-1A Synthetic Aperture Radar-C data and Sentinel-2A Multi Spectral Instrument data. Three wildfire cases-Samcheok in May 2017, Gangreung·Donghae in April 2019, and Gosung·Sokcho in April 2019-were used for developing wildfire severity mapping models with three machine learning algorithms (i.e., Random Forest, Logistic Regression, and Support Vector Machine). The results showed that the random forest model yielded the best performance, resulting in an overall accuracy of 82.3%. The cross-site validation to examine the spatiotemporal transferability of the machine learning models showed that the models were highly sensitive to temporal differences between the training and validation sites, especially in the early growing season. This implies that a more robust model with high spatiotemporal transferability can be developed when more wildfire cases with different seasons and areas are added in the future.

Forensic Decision of Median Filtering Image Using a Coefficient of Variation of Fourier Transform (Fourier 변환 변이계수를 이용한 미디언 필터링 영상의 포렌식 판정)

  • RHEE, Kang Hyeon
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.52 no.8
    • /
    • pp.67-73
    • /
    • 2015
  • In a distribution of digital image, there is a serious problem that is the image alteration by a forger. For the problem solution, this paper proposes the forensic decision algorithm of a median filtering (MF) image using the feature vector based on a coefficient of variation (c.v.) of Fourier transform. In the proposed algorithm, we compute Fourier transform (FT) coefficients of row and column line respectively of an image first, then c.v. between neighboring lines is computed. Subsquently, 10 Dim. feature vector is defined for the MF detection. On the experiment of MF detection, the proposed scheme is compared to MFR (Median Filter Residual) and Rhee's MF detection schemes that have the same 10 Dim. feature vector both. As a result, the performance is excellent at Unaltered, JPEG (QF=90), Down scaling (0.9) and Up scaling (1.1) images, and it showed good performance at Gaussian filtering ($3{\times}3$) image. However, in the performance evaluation of all measured items of the proposed scheme, AUC (Area Under ROC (Receiver Operating Characteristic) Curve) by the sensitivity and 1-specificity approached to 1 thus, it is confirmed that the grade of the performance evaluation is rated as 'Excellent (A)'.

Development of Sentiment Analysis Model for the hot topic detection of online stock forums (온라인 주식 포럼의 핫토픽 탐지를 위한 감성분석 모형의 개발)

  • Hong, Taeho;Lee, Taewon;Li, Jingjing
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.187-204
    • /
    • 2016
  • Document classification based on emotional polarity has become a welcomed emerging task owing to the great explosion of data on the Web. In the big data age, there are too many information sources to refer to when making decisions. For example, when considering travel to a city, a person may search reviews from a search engine such as Google or social networking services (SNSs) such as blogs, Twitter, and Facebook. The emotional polarity of positive and negative reviews helps a user decide on whether or not to make a trip. Sentiment analysis of customer reviews has become an important research topic as datamining technology is widely accepted for text mining of the Web. Sentiment analysis has been used to classify documents through machine learning techniques, such as the decision tree, neural networks, and support vector machines (SVMs). is used to determine the attitude, position, and sensibility of people who write articles about various topics that are published on the Web. Regardless of the polarity of customer reviews, emotional reviews are very helpful materials for analyzing the opinions of customers through their reviews. Sentiment analysis helps with understanding what customers really want instantly through the help of automated text mining techniques. Sensitivity analysis utilizes text mining techniques on text on the Web to extract subjective information in the text for text analysis. Sensitivity analysis is utilized to determine the attitudes or positions of the person who wrote the article and presented their opinion about a particular topic. In this study, we developed a model that selects a hot topic from user posts at China's online stock forum by using the k-means algorithm and self-organizing map (SOM). In addition, we developed a detecting model to predict a hot topic by using machine learning techniques such as logit, the decision tree, and SVM. We employed sensitivity analysis to develop our model for the selection and detection of hot topics from China's online stock forum. The sensitivity analysis calculates a sentimental value from a document based on contrast and classification according to the polarity sentimental dictionary (positive or negative). The online stock forum was an attractive site because of its information about stock investment. Users post numerous texts about stock movement by analyzing the market according to government policy announcements, market reports, reports from research institutes on the economy, and even rumors. We divided the online forum's topics into 21 categories to utilize sentiment analysis. One hundred forty-four topics were selected among 21 categories at online forums about stock. The posts were crawled to build a positive and negative text database. We ultimately obtained 21,141 posts on 88 topics by preprocessing the text from March 2013 to February 2015. The interest index was defined to select the hot topics, and the k-means algorithm and SOM presented equivalent results with this data. We developed a decision tree model to detect hot topics with three algorithms: CHAID, CART, and C4.5. The results of CHAID were subpar compared to the others. We also employed SVM to detect the hot topics from negative data. The SVM models were trained with the radial basis function (RBF) kernel function by a grid search to detect the hot topics. The detection of hot topics by using sentiment analysis provides the latest trends and hot topics in the stock forum for investors so that they no longer need to search the vast amounts of information on the Web. Our proposed model is also helpful to rapidly determine customers' signals or attitudes towards government policy and firms' products and services.

A Method for Prediction of Quality Defects in Manufacturing Using Natural Language Processing and Machine Learning (자연어 처리 및 기계학습을 활용한 제조업 현장의 품질 불량 예측 방법론)

  • Roh, Jeong-Min;Kim, Yongsung
    • Journal of Platform Technology
    • /
    • v.9 no.3
    • /
    • pp.52-62
    • /
    • 2021
  • Quality control is critical at manufacturing sites and is key to predicting the risk of quality defect before manufacturing. However, the reliability of manual quality control methods is affected by human and physical limitations because manufacturing processes vary across industries. These limitations become particularly obvious in domain areas with numerous manufacturing processes, such as the manufacture of major nuclear equipment. This study proposed a novel method for predicting the risk of quality defects by using natural language processing and machine learning. In this study, production data collected over 6 years at a factory that manufactures main equipment that is installed in nuclear power plants were used. In the preprocessing stage of text data, a mapping method was applied to the word dictionary so that domain knowledge could be appropriately reflected, and a hybrid algorithm, which combined n-gram, Term Frequency-Inverse Document Frequency, and Singular Value Decomposition, was constructed for sentence vectorization. Next, in the experiment to classify the risky processes resulting in poor quality, k-fold cross-validation was applied to categorize cases from Unigram to cumulative Trigram. Furthermore, for achieving objective experimental results, Naive Bayes and Support Vector Machine were used as classification algorithms and the maximum accuracy and F1-score of 0.7685 and 0.8641, respectively, were achieved. Thus, the proposed method is effective. The performance of the proposed method were compared and with votes of field engineers, and the results revealed that the proposed method outperformed field engineers. Thus, the method can be implemented for quality control at manufacturing sites.

Abnormal Crowd Behavior Detection via H.264 Compression and SVDD in Video Surveillance System (H.264 압축과 SVDD를 이용한 영상 감시 시스템에서의 비정상 집단행동 탐지)

  • Oh, Seung-Geun;Lee, Jong-Uk;Chung, Yongw-Ha;Park, Dai-Hee
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.21 no.6
    • /
    • pp.183-190
    • /
    • 2011
  • In this paper, we propose a prototype system for abnormal sound detection and identification which detects and recognizes the abnormal situations by means of analyzing audio information coming in real time from CCTV cameras under surveillance environment. The proposed system is composed of two layers: The first layer is an one-class support vector machine, i.e., support vector data description (SVDD) that performs rapid detection of abnormal situations and alerts to the manager. The second layer classifies the detected abnormal sound into predefined class such as 'gun', 'scream', 'siren', 'crash', 'bomb' via a sparse representation classifier (SRC) to cope with emergency situations. The proposed system is designed in a hierarchical manner via a mixture of SVDD and SRC, which has desired characteristics as follows: 1) By fast detecting abnormal sound using SVDD trained with only normal sound, it does not perform the unnecessary classification for normal sound. 2) It ensures a reliable system performance via a SRC that has been successfully applied in the field of face recognition. 3) With the intrinsic incremental learning capability of SRC, it can actively adapt itself to the change of a sound database. The experimental results with the qualitative analysis illustrate the efficiency of the proposed method.

Active Sonar Classification Algorithm based on HOG Feature (HOG 특징 기반 능동 소나 식별 기법)

  • Shin, Hyunhak;Park, Jaihyun;Ku, Bonhwa;Seo, Iksu;Kim, Taehwan;Lim, Junseok;Ko, Hanseok;Hong, Wooyoung
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.20 no.1
    • /
    • pp.33-39
    • /
    • 2017
  • In this paper, an effective feature which is capable of classifying targets among the detections obtained from 2D range-bearing maps generated in active sonar environments is proposed. Most conventional approaches for target classification with the 2D maps have considered magnitude of peak and statistical features of the area surrounding the peak. To improve the classification performance, HOG(Histogram of Gradient) feature, which is popular for their robustness in the image textures analysis is applied. In order to classify the target signal, SVM(Support Vector Machine) method with reduced HOG feature by the PCA(Principal Component Analysis) algorithm is incorporated. The various simulations are conducted with the real clutter signal data and the synthesized target signal data. According to the simulated results, the proposed method considering HOG feature is claimed to be effective when classifying the active sonar target compared to the conventional methods.

Three Dimensional Tracking of Road Signs based on Stereo Vision Technique (스테레오 비전 기술을 이용한 도로 표지판의 3차원 추적)

  • Choi, Chang-Won;Choi, Sung-In;Park, Soon-Yong
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.20 no.12
    • /
    • pp.1259-1266
    • /
    • 2014
  • Road signs provide important safety information about road and traffic conditions to drivers. Road signs include not only common traffic signs but also warning information regarding unexpected obstacles and road constructions. Therefore, accurate detection and identification of road signs is one of the most important research topics related to safe driving. In this paper, we propose a 3-D vision technique to automatically detect and track road signs in a video sequence which is acquired from a stereo vision camera mounted on a vehicle. First, color information is used to initially detect the sign candidates. Second, the SVM (Support Vector Machine) is employed to determine true signs from the candidates. Once a road sign is detected in a video frame, it is continuously tracked from the next frame until it is disappeared. The 2-D position of a detected sign in the next frame is predicted by the 3-D motion of the vehicle. Here, the 3-D vehicle motion is acquired by using the 3-D pose information of the detected sign. Finally, the predicted 2-D position is corrected by template-matching of the scaled template of the detected sign within a window area around the predicted position. Experimental results show that the proposed method can detect and track many types of road signs successfully. Tracking comparisons with two different methods are shown.