• Title/Summary/Keyword: support vector regression.

A Comparison of Classification Methods for Credit Card Approval Using R (R의 분류방법을 이용한 신용카드 승인 분석 비교)

  • Song, Jong-Woo
    • Journal of Korean Society for Quality Management / v.36 no.1 / pp.72-79 / 2008
  • Credit card approval policy is based on the applicant's personal and financial information. In this paper, we analyze two credit card approval data sets with several classification methods and identify which variables are important in deciding whether a credit card is approved. Our main tool is R, an open-source statistical programming environment freely available from http://www.r-project.org. It has recently become popular because of its flexibility and the large number of packages (libraries) contributed by R users around the world. For comparison, we use the most widely used methods: LDA/QDA, logistic regression, CART (Classification and Regression Trees), neural networks, and SVM (Support Vector Machines).

Computer-Based Fluency Evaluation of English Speaking Tests for Koreans (한국인을 위한 영어 말하기 시험의 컴퓨터 기반 유창성 평가)

  • Jang, Byeong-Yong;Kwon, Oh-Wook
    • Phonetics and Speech Sciences / v.6 no.2 / pp.9-20 / 2014
  • In this paper, we propose an automatic fluency evaluation algorithm for English speaking tests. In the proposed algorithm, acoustic features are extracted from an input spoken utterance, and a fluency score is then computed using support vector regression (SVR). We estimate the parameters of the feature models and the SVR from the speech signals and the corresponding scores given by human raters. Correlation analysis shows that speech rate, articulation rate, and mean length of runs are the best features for fluency evaluation. Experimental results show that the correlation between the human score and the SVR score is 0.87 for 3 speaking tests, which suggests that the proposed algorithm could serve as a secondary fluency evaluation tool.
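
The agreement figure quoted in this abstract is a correlation between human and machine scores. A minimal sketch of how such a Pearson correlation could be computed follows; the score lists are made-up illustrative values, not data from the paper, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances are left unnormalised; the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five utterances (assumption, not from the paper).
human_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
model_scores = [1.2, 1.9, 3.4, 3.8, 4.9]
r = pearson_r(human_scores, model_scores)
print(f"correlation between human and model scores: {r:.3f}")
```

The function divides by zero if either sequence is constant, so a real evaluation script would need to handle that case.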

A Study on the Sentiment analysis of Google Play Store App Comment Based on WPM(Word Piece Model) (WPM(Word Piece Model)을 활용한 구글 플레이스토어 앱의 댓글 감정 분석 연구)

  • Park, Jae Hoon;Koo, Myong-wan
    • Annual Conference on Human and Language Technology / 2016.10a / pp.291-295 / 2016
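  • In this paper, we perform sentiment analysis of Google Play Store app comments using WPM as the basic unit for Korean. After applying an automatic word-spacing system, we built models using eojeol (space-delimited word) units, a morphological analyzer, and WPM, respectively, and compared comment sentiment (positive and negative) classification using algorithms such as Logistic Regression, Softmax Regression, and Support Vector Machine (SVM). As a result, WPM achieved up to a 25% improvement over eojeol units and the morphological analyzer. In classification, SVM outperformed logistic regression and softmax regression, and applying the optimal parameters ({'kernel': ('linear','rbf', 'sigmoid', 'poly'), 'C':[0.01, 0.1, 1.4.5]}) instead of the default SVM parameters ({'kernel':('linear'), 'c':[4]}) yielded a performance of up to 91%.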

Finding Biomarker Genes for Type 2 Diabetes Mellitus using Chi-2 Feature Selection Method and Logistic Regression Supervised Learning Algorithm

  • Alshamlan, Hala M
    • International Journal of Computer Science & Network Security / v.21 no.2 / pp.9-13 / 2021
  • Type 2 diabetes mellitus (T2D) is a complex form of diabetes that is caused by high blood sugar, insulin resistance, and a relative lack of insulin. Many studies try to predict the variant genes that cause this disease by using a sample disease model. In this paper, we classify diabetic and normal persons using Fisher score feature selection, chi-2 feature selection, and the Logistic Regression supervised learning algorithm, with a best accuracy of 90.23%.
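
Chi-2 feature selection ranks features by how strongly each one is associated with the class label. The sketch below shows one common way to do this, assuming each gene has already been binarized into a 2x2 contingency table against the class; the gene names and counts are hypothetical and are not taken from the paper.

```python
def chi2_statistic(table):
    """Pearson chi-squared statistic for a contingency table (list of rows).

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of feature and class.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical tables: rows = gene high/low, columns = diabetic/normal.
tables = {
    "gene_a": [[30, 10], [10, 30]],  # strongly associated with the class
    "gene_b": [[20, 20], [20, 20]],  # no association
}
ranked = sorted(tables, key=lambda g: chi2_statistic(tables[g]), reverse=True)
print(ranked)
```

The top-ranked features would then be passed to the classifier, here logistic regression.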

Prediction of Delivery Quality Assurance Via Machine Learning in Helical Tomotherapy (방사선치료 시 다양한 기계학습을 이용한 선량품질관리 결과의 예측)

  • Kyung Hwan Chang
    • Journal of radiological science and technology / v.47 no.4 / pp.263-270 / 2024
  • The objective of this study was to evaluate the accuracy and impact of leaf open time (LOT) and pitch using various machine learning models on EBT film-based delivery quality assurance (DQA) performed on 211 helical tomotherapy (HT) patients. We randomly selected passed (n=191) and failed (n=20) DQA measurements to evaluate the accuracy of the k-nearest neighbor (KNN), support vector machine (SVM), naive Bayes (NB), and logistic regression (LR) models using scale-dependent metrics such as the coefficient of determination (R2), mean squared error (MSE), and root MSE (RMSE). We evaluated the performance of the four prediction models in terms of accuracy, precision, sensitivity, and F1-score using a confusion matrix, and found that the NB and LR models achieved the best results. The results of this study are expected to reduce the workload of medical physicists and dosimetrists by predicting DQA results from LOT and pitch in advance.
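
The four scores named in this abstract can all be derived from the cells of a binary confusion matrix. The sketch below assumes "passed" is the positive class; the cell counts are invented for illustration (chosen only so that they sum to the 191 passed and 20 failed cases mentioned in the abstract) and are not results from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, sensitivity (recall) and F1 from confusion-matrix cells."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Hypothetical counts: 191 truly passed (tp + fn), 20 truly failed (fp + tn).
metrics = classification_metrics(tp=185, fp=8, fn=6, tn=12)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With classes this imbalanced, accuracy alone can look high even when most failed cases are missed, which is one reason to report precision, sensitivity, and F1 as well.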

Locally-Weighted Polynomial Neural Network for Daily Short-Term Peak Load Forecasting

  • Yu, Jungwon;Kim, Sungshin
    • International Journal of Fuzzy Logic and Intelligent Systems / v.16 no.3 / pp.163-172 / 2016
  • Electric load forecasting is essential for effective power system planning and operation. Complex and nonlinear relationships exist between electric loads and their exogenous factors. In addition, time-series load data have non-stationary characteristics, such as trend, seasonality, and anomalous day effects, which make future loads difficult to predict. This paper proposes a locally-weighted polynomial neural network (LWPNN), a combination of a polynomial neural network (PNN) and locally-weighted regression (LWR), for daily short-term peak load forecasting. Model over-fitting can be prevented effectively because PNN has an automatic structure identification mechanism for nonlinear system modeling. LWR, which is applied to optimize the regression coefficients of LWPNN, uses only the locally-weighted learning data points in the neighborhood of the current query point instead of all data points. LWPNN is very effective and suitable for predicting an electric load series with nonlinear and non-stationary characteristics. To confirm its effectiveness, the proposed LWPNN, standard PNN, support vector regression, and an artificial neural network are applied to a real-world daily peak load dataset from Korea. The proposed LWPNN shows significantly better prediction accuracy than the other methods.
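
The idea of fitting only to points near the current query can be illustrated with plain locally-weighted linear regression in one dimension. This is a minimal sketch of LWR alone, not of the LWPNN in the paper; the Gaussian kernel, the `bandwidth` parameter, and the function name are assumptions made for the example.

```python
import math


def lwr_predict(xs, ys, query, bandwidth=1.0):
    """Predict y at `query` with a locally weighted straight-line fit.

    Assumes at least some points lie close enough to `query` that their
    weights do not underflow to zero.
    """
    # Gaussian kernel: points near the query dominate the fit.
    weights = [math.exp(-((x - query) ** 2) / (2 * bandwidth ** 2)) for x in xs]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    # Closed-form weighted least squares for slope and intercept.
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y + slope * (query - mean_x)


# Toy nonlinear series (assumption): y = x^2 sampled every 0.5 on [0, 4].
xs = [i * 0.5 for i in range(9)]
ys = [x ** 2 for x in xs]
print(lwr_predict(xs, ys, query=2.0, bandwidth=0.5))
```

A smaller bandwidth makes the fit follow local structure more closely at the cost of higher variance; a separate fit is computed for every query point.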

A gradient boosting regression based approach for energy consumption prediction in buildings

  • Bataineh, Ali S. Al
    • Advances in Energy Research / v.6 no.2 / pp.91-101 / 2019
  • This paper proposes an efficient data-driven approach to building models for predicting energy consumption in buildings. The data used in this research were collected by installing humidity and temperature sensors at different locations in a building. In addition, weather data from a nearby weather station are included in the dataset to study the impact of weather conditions on energy consumption. One of the main emphases of this research is to make feature selection independent of domain knowledge. Therefore, to extract useful features from the data, two different approaches are tested: feature selection through principal component analysis, and relative importance-based feature selection in the original domain. The regression model used in this research is gradient boosting regression, and its optimal parameters are chosen through a two-stage coarse-fine search. To evaluate the performance of the model, evaluation metrics such as r2-score and root mean squared error are used. The results show that the best performance is achieved when relative importance-based feature selection is used with the gradient boosting regressor. The proposed technique also outperformed the support vector machine and neural network-based approaches tested on the same dataset.
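
A two-stage coarse-fine search first scans a wide grid and then refines around the best coarse point. The sketch below shows the idea for a single parameter; the toy objective stands in for a validation error, and the function names, grid sizes, and parameter range are assumptions rather than the paper's settings.

```python
def coarse_to_fine_search(objective, low, high, coarse_steps=5, fine_steps=10):
    """Minimise `objective` on [low, high]: coarse grid first, then a finer
    grid spanning one coarse cell on either side of the best coarse point."""
    coarse_width = (high - low) / coarse_steps
    coarse_grid = [low + i * coarse_width for i in range(coarse_steps + 1)]
    best_coarse = min(coarse_grid, key=objective)

    # Refine inside the neighbourhood of the best coarse point.
    fine_low = max(low, best_coarse - coarse_width)
    fine_high = min(high, best_coarse + coarse_width)
    fine_width = (fine_high - fine_low) / fine_steps
    fine_grid = [fine_low + i * fine_width for i in range(fine_steps + 1)]
    return min(fine_grid, key=objective)


def toy_validation_error(learning_rate):
    """Hypothetical stand-in for a model's validation error."""
    return (learning_rate - 0.13) ** 2


best_rate = coarse_to_fine_search(toy_validation_error, 0.0, 1.0)
print(f"selected learning rate: {best_rate:.2f}")
```

The two stages here evaluate 17 candidates in total, whereas a single grid at the fine resolution over the whole range would need 26; the saving grows quickly when several parameters are tuned together.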

Orthonormal Polynomial based Optimal EEG Feature Extraction for Motor Imagery Brain-Computer Interface

  • Chum, Pharino;Park, Seung-Min;Ko, Kwang-Eun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.6 / pp.793-798 / 2012
  • In this paper, we explore a new method for extracting features from electroencephalography (EEG) signals based on linear regression with orthonormal polynomial bases. First, EEG signals from electrodes around the motor cortex were selected and filtered both spatially and temporally, using a band-pass filter for the alpha and beta rhythmic bands, which are considered to be related to the synchronization and desynchronization of firing neuron populations during motor imagery tasks. Signals from epochs of length 1 s were fitted by linear regression with Legendre polynomial bases, and the regression weights were extracted as the final features. We compared our features with the state-of-the-art power band features in binary classification using a support vector machine (SVM) with 5-fold cross-validation. The results showed that the proposed method improved classification accuracy by 5.44% on average across all subjects over power band features in the individual-subject study, and achieved 84.5% classification accuracy with forward feature selection.

Prediction of duration and construction cost of road tunnels using Gaussian process regression

  • Mahmoodzadeh, Arsalan;Mohammadi, Mokhtar;Abdulhamid, Sazan Nariman;Ibrahim, Hawkar Hashim;Ali, Hunar Farid Hama;Nejati, Hamid Reza;Rashidi, Shima
    • Geomechanics and Engineering / v.28 no.1 / pp.65-75 / 2022
  • Time and cost of construction are key factors in decision-making during a tunnel project's planning and design phase. Estimates of the time and cost of tunnel construction projects are subject to significant uncertainties caused by uncertain geotechnical and geological conditions. This work uses the Gaussian Process Regression (GPR) technique to predict the ground conditions and the construction time and cost of mountain tunnel projects. The GPR model is trained with data from past mountain tunnel projects. The model is applied to a case study in which the time and cost of tunnel construction predicted by the GPR model are compared with the actual construction time and cost, both to validate the model and to reduce uncertainty for future projects. In addition, the results obtained from the GPR model were compared with those of artificial neural network (ANN) and support vector regression (SVR) models, and the GPR model provided more accurate results.

Prediction of California bearing ratio (CBR) for coarse- and fine-grained soils using the GMDH-model

  • Mintae Kim;Seyma Ordu;Ozkan Arslan;Junyoung Ko
    • Geomechanics and Engineering / v.33 no.2 / pp.183-194 / 2023
  • This study presents the prediction of the California bearing ratio (CBR) of coarse- and fine-grained soils using artificial intelligence technology. The group method of data handling (GMDH) algorithm, an artificial neural network-based model, was used to predict the CBR values. In the design of the prediction models, various combinations of independent input variables were used for both coarse- and fine-grained soils. The results obtained from the designed GMDH-type neural networks (GMDH-type NN) were compared with other regression models, such as linear, support vector, and multilayer perceptron regression methods. The performance of the models was evaluated with a regression coefficient (R2), root-mean-square error (RMSE), and mean absolute error (MAE). The results showed that the GMDH-type NN algorithm performed better than the other regression methods in predicting the CBR value for coarse- and fine-grained soils. The GMDH model had an R2 of 0.938, RMSE of 1.87, and MAE of 1.48 for the input variables {G, S, and MDD} in coarse-grained soils. For fine-grained soils, it had an R2 of 0.829, RMSE of 3.02, and MAE of 2.40 when using the input variables {LL, PI, MDD, and OMC}. The performance evaluations revealed that the GMDH-type NN models were effective in predicting CBR values of both coarse- and fine-grained soils.
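
The three scores reported in this abstract are standard regression metrics. The sketch below shows how they are commonly computed, taking R2 to be the coefficient of determination; the measured and predicted values are made up for illustration and are not CBR data from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2 (coefficient of determination), RMSE and MAE."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    # Total sum of squares; assumes y_true is not constant.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "rmse": rmse, "mae": mae}


# Hypothetical measured vs. predicted values (assumption, not from the paper).
measured = [12.0, 18.5, 25.0, 31.0, 40.5]
predicted = [13.1, 17.9, 26.4, 29.8, 39.0]
scores = regression_metrics(measured, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

RMSE and MAE are in the units of the target, so they are only comparable across models evaluated on the same quantity, while R2 is unitless.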