• Title/Summary/Keyword: SVM Model

A Study on Predicting Construction Cost of Educational Building Project at early stage Using Support Vector Machine Technique (서포트벡터머신을 이용한 교육시설 초기 공사비 예측에 관한 연구)

  • Shin, Jae-Min; Kim, Gwang-Hee
    • The Journal of Sustainable Design and Educational Environment Research / v.11 no.3 / pp.46-54 / 2012
  • The accuracy of cost estimation at an early stage of a school building project is one of the critical factors for successful completion. Various techniques have therefore been developed to predict construction cost accurately and expeditiously. Among these techniques, the Support Vector Machine (SVM) has excellent generalization performance. The purpose of this study is therefore to construct a prediction model for the construction cost of educational building projects using the SVM technique, and to verify the accuracy of that model. The data used in this study are the costs of 217 school building projects completed from 2004 to 2007 in Gyeonggi-Do, Korea. The results show an average error rate of 7.48% for the SVM prediction model. Using an SVM model to predict the construction cost of educational building projects is thus a considerably effective approach at the early project stage.
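
An average error rate such as the one reported in this abstract can be illustrated with a minimal sketch. It assumes the error rate is the mean absolute percentage error between actual and predicted costs, which the abstract does not state explicitly. The function name and the cost figures are hypothetical and are not taken from the study.

```python
def average_error_rate(actual, predicted):
    """Mean absolute percentage error in percent (assumed definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Relative error of each project: |actual - predicted| / actual
    errors = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical project costs in arbitrary units, for illustration only
actual_costs = [100.0, 200.0, 400.0]
predicted_costs = [90.0, 210.0, 400.0]
rate = average_error_rate(actual_costs, predicted_costs)
print(f"average error rate: {rate:.2f}%")
```

With these made-up figures the per-project errors are 10%, 5%, and 0%, so the average is 5%.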

Category Factor Based Feature Selection for Document Classification

  • Kang Yun-Hee
    • International Journal of Contents / v.1 no.2 / pp.26-30 / 2005
  • With the fast growth of information on the Internet, it is becoming increasingly difficult to find and organize useful information. To reduce information overload, automatic text classification is needed to handle the enormous number of documents. A Support Vector Machine (SVM) is a model computed as a weighted sum of kernel function outputs. This paper describes a document classifier for web documents in the field of Information Technology and uses an SVM to learn a model constructed from the training sets and their representative terms. The basic idea is to exploit the distribution of the representative terms' meanings in coherent thematic texts of each category using simple statistical methods. A vector-space model is applied to represent documents in the categories, using a feature selection scheme based on TF-IDF. We apply a category factor, which represents the effect of a term within a category, to the feature selection. Experiments show the categorization results and the correlation of vector length.
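
The TF-IDF weighting mentioned in this abstract can be sketched as follows. This is the textbook form (term frequency times the log of inverse document frequency), assumed here for illustration. The paper's category factor is not defined in the abstract and is therefore not implemented. The function name and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


# Toy tokenized documents, for illustration only
docs = [
    ["svm", "kernel", "margin"],
    ["svm", "text", "text"],
    ["svm", "web", "text"],
]
w = tfidf(docs)
print(w[1])
```

A term that occurs in every document, such as "svm" here, receives weight zero, which is why TF-IDF is a natural basis for discarding uninformative features.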

The Prediction of DEA based Efficiency Rating for Venture Business Using Multi-class SVM (다분류 SVM을 이용한 DEA기반 벤처기업 효율성등급 예측모형)

  • Park, Ji-Young; Hong, Tae-Ho
    • Asia Pacific Journal of Information Systems / v.19 no.2 / pp.139-155 / 2009
  • For the last few decades, many studies have tried to explore and unveil venture companies' success factors and unique features in order to identify the sources of such companies' competitive advantages over their rivals. Such venture companies have tended to give investors high returns, generally by making the best use of information technology. For this reason, many venture companies are keen on attracting investors' attention. Investors generally make their investment decisions by carefully examining the evaluation criteria of the alternatives. To them, credit rating information provided by international rating agencies such as Standard and Poor's, Moody's, and Fitch is a crucial source of information on such pivotal concerns as a company's stability, growth, and risk status. However, this type of information is generated only for companies issuing corporate bonds, not for venture companies. Therefore, this study proposes a method for evaluating venture businesses and presents our recent empirical results using financial data of Korean venture companies listed on KOSDAQ in the Korea Exchange. In addition, this paper uses multi-class SVM to predict the DEA-based efficiency rating of venture businesses derived from our proposed method. Our approach sheds light on ways to locate efficient companies generating high levels of profit. Above all, in determining effective ways to evaluate a venture firm's efficiency, it is important to understand the major contributing factors of such efficiency. Therefore, this paper is built on the following two ideas for classifying which venture companies are more efficient: i) making a DEA-based multi-class rating for sample companies and ii) developing a multi-class SVM-based efficiency prediction model for classifying all companies.
First, Data Envelopment Analysis (DEA) is a non-parametric multiple input-output efficiency technique that measures the relative efficiency of decision making units (DMUs) using a linear programming-based model. It is non-parametric because it requires no assumption about the shape or parameters of the underlying production function. DEA has already been widely applied to evaluating the relative efficiency of DMUs. Recently, a number of DEA-based studies have evaluated the efficiency of various types of companies, such as internet companies and venture companies. It has also been applied to corporate credit ratings. In this study we used DEA to sort venture companies by efficiency-based ratings. The Support Vector Machine (SVM), on the other hand, is a popular technique for solving data classification problems. In this paper, we employed SVM to classify the efficiency ratings of IT venture companies according to the results of DEA. The SVM method was first developed by Vapnik (1995). As one of many machine learning techniques, SVM is based on statistical learning theory. Thus far, the method has shown good performance, especially in generalization capacity in classification tasks, resulting in numerous applications in many areas of business. SVM is basically an algorithm that finds the maximum margin hyperplane, that is, the hyperplane giving the maximum separation between classes. The support vectors are the training points closest to the maximum margin hyperplane. If the classes cannot be separated linearly, a kernel function can be used: in the case of nonlinear class boundaries, the inputs are transformed into a high-dimensional feature space, that is, the original input space is mapped into a high-dimensional dot-product space. Many studies have applied SVM to bankruptcy prediction, financial time series forecasting, and credit rating estimation. In this study we employed SVM to develop a data mining-based efficiency prediction model.
We used the Gaussian radial basis function as the kernel function of the SVM. For multi-class SVM, we adopted the one-against-one approach, which is based on binary classification, and two all-together methods, proposed by Weston and Watkins (1999) and Crammer and Singer (2000), respectively. In this research, we used corporate information on 154 companies listed on the KOSDAQ market in the Korea Exchange. We obtained the companies' financial information for 2005 from KIS (Korea Information Service, Inc.). Using these data, we made a multi-class rating based on DEA efficiency and built a data mining-based multi-class prediction model. Among the three multi-classification methods, the Weston and Watkins method achieved the best hit ratio on the test data set. In multi-classification problems such as efficiency ratings of venture businesses, it is very useful for investors to know the class to within a one-class error when it is difficult to determine the exact class in the actual market. We therefore presented accuracy results within 1-class errors, and the Weston and Watkins method showed 85.7% accuracy on our test samples. We conclude that the DEA-based multi-class approach for venture businesses generates more information than the binary classification problem, notwithstanding its efficiency level. We believe this model can help investors in decision making, as it provides a reliable tool for evaluating venture companies in the financial domain. For future research, we perceive the need to improve such areas as the variable selection process, the parameter selection of the kernel function, the generalization, and the sample size for the multi-class problem.
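
The hit ratio and the accuracy within 1-class errors described in this abstract can be illustrated with a minimal sketch. It assumes the efficiency ratings are encoded as consecutive integers, so that a one-class error means the predicted and actual ratings differ by at most 1. The function name and the ratings are hypothetical and do not reproduce the study's data.

```python
def hit_ratio(actual, predicted, tolerance=0):
    """Share of samples whose predicted rating is within `tolerance` classes of the actual one."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


# Hypothetical ordinal efficiency ratings (1 = most efficient), for illustration only
actual = [1, 2, 3, 4, 2, 3, 1]
predicted = [1, 3, 3, 2, 2, 4, 1]

exact = hit_ratio(actual, predicted)                    # exact class matches
within_one = hit_ratio(actual, predicted, tolerance=1)  # allows a one-class error
print(f"hit ratio: {exact:.3f}, within one class: {within_one:.3f}")
```

In this toy data four of seven predictions are exact and six of seven fall within one class, which shows how the relaxed measure can be considerably higher than the strict hit ratio.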

Hydrologic Disaggregation Model using Neural Networks Technique (신경망기법을 이용한 수문학적 분해모형)

  • Kim, Sung-Won
    • Journal of Wetlands Research / v.12 no.3 / pp.79-97 / 2010
  • The purpose of this research is to apply neural network models to the hydrologic disaggregation of yearly pan evaporation (PE) data in the Republic of Korea. The neural network models consist of a multilayer perceptron neural networks model (MLP-NNM) and a support vector machine neural networks model (SVM-NNM). The models are evaluated in terms of training and test performance. Three types of data, namely historic, generated, and mixed data, are used for training, whereas only historic data are used for testing. The applicability of MLP-NNM and SVM-NNM to the hydrologic disaggregation of nonlinear time series data is evaluated from the results of this research. Four statistical indices are suggested for the evaluation: CC, RMSE, E, and AARE. Furthermore, homogeneity tests using ANOVA and the Mann-Whitney U test are carried out for the observed and calculated monthly PE data. Credible monthly PE data can be constructed by hydrologic disaggregation of the yearly PE data, and the resulting data can be offered for the evaluation of irrigation and drainage network systems.
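
The four evaluation indices named in this abstract can be sketched as follows. The abstract does not expand the abbreviations, so this sketch assumes the readings common in hydrology: CC as the Pearson correlation coefficient, RMSE as the root mean square error, E as the Nash-Sutcliffe efficiency, and AARE as the average absolute relative error in percent. The function names and the data are hypothetical.

```python
import math


def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def cc(obs, sim):
    """Pearson correlation coefficient (assumed meaning of CC)."""
    mo, ms = sum(obs) / len(obs), sum(sim) / len(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency (assumed meaning of E)."""
    mo = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - residual / sum((o - mo) ** 2 for o in obs)


def aare(obs, sim):
    """Average absolute relative error in percent (assumed meaning of AARE)."""
    return 100.0 * sum(abs(o - s) / abs(o) for o, s in zip(obs, sim)) / len(obs)


# Hypothetical monthly values, for illustration only: simulated = observed + 1
observed = [2.0, 4.0, 6.0, 8.0]
simulated = [3.0, 5.0, 7.0, 9.0]
print(rmse(observed, simulated), cc(observed, simulated))
print(nash_sutcliffe(observed, simulated), aare(observed, simulated))
```

The constant offset in the toy data leaves the correlation perfect while RMSE, E, and AARE all register the bias, which is one reason several indices are reported together.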

Robust Feature Parameter for Implementation of Speech Recognizer Using Support Vector Machines (SVM음성인식기 구현을 위한 강인한 특징 파라메터)

  • 김창근; 박정원; 허강인
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.3 / pp.195-200 / 2004
  • In this paper we propose an effective speech recognizer through two recognition experiments. In general, an SVM is a classification method that separates two classes by finding an arbitrary nonlinear boundary in vector space, and it achieves high classification performance with a small amount of training data. We compare the recognition performance of HMM and SVM as a function of the amount of training data, and investigate the recognition performance of each feature parameter while transforming the MFCC feature space using Independent Component Analysis (ICA) and Principal Component Analysis (PCA). The experiments show that the SVM outperforms the HMM when little training data is available, and that the ICA-based feature parameter achieves the highest recognition performance because of its superior linear classification ability.

On the Use of Adaptive Weights for the $F_{\infty}$-Norm Support Vector Machine

  • Bang, Sung-Wan; Jhun, Myoung-Shic
    • The Korean Journal of Applied Statistics / v.25 no.5 / pp.829-835 / 2012
  • When the input features are generated by factors in a classification problem, it is more meaningful to identify important factors than individual features. The $F_{\infty}$-norm support vector machine (SVM) has been developed to perform automatic factor selection in classification. However, the $F_{\infty}$-norm SVM may suffer from estimation inefficiency and model selection inconsistency because it applies the same amount of shrinkage to each factor without assessing its relative importance. To overcome this limitation, we propose the adaptive $F_{\infty}$-norm ($AF_{\infty}$-norm) SVM, which penalizes the empirical hinge loss by the sum of adaptively weighted factor-wise $L_{\infty}$-norm penalties. The $AF_{\infty}$-norm SVM computes the weights from the 2-norm SVM estimator and can be formulated as a linear programming (LP) problem similar to that of the $F_{\infty}$-norm SVM. Simulation studies show that the proposed $AF_{\infty}$-norm SVM improves upon the $F_{\infty}$-norm SVM in terms of classification accuracy and factor selection performance.

Sentiment Analysis using Latent Structural SVM (잠재 구조적 SVM을 활용한 감성 분석기)

  • Yang, Seung-Won; Lee, Changki
    • KIISE Transactions on Computing Practices / v.22 no.5 / pp.240-245 / 2016
  • In this study, comments on restaurants, movies, and mobile devices, as well as tweets not tied to a specific domain, were analyzed for sentiment information. We propose a system that extracts objects (or aspects) and opinion words from each sentence and then evaluates them. For the sentiment analysis, we conducted a comparative evaluation of the Structural SVM algorithm and the Latent Structural SVM. The latter showed better performance and was able to extract objects/aspects and opinion words using the VP/NP obtained from the dependency parse tree. Lastly, we also developed and evaluated a sentiment detector model for use in practical services.

Least-Squares Support Vector Machine for Regression Model with Crisp Inputs-Gaussian Fuzzy Output

  • Hwang, Chang-Ha
    • Journal of the Korean Data and Information Science Society / v.15 no.2 / pp.507-513 / 2004
  • The least-squares support vector machine (LS-SVM) has been very successful in pattern recognition and function estimation problems for crisp data. In this paper, we propose an LS-SVM approach to evaluating a fuzzy regression model with multiple crisp inputs and a Gaussian fuzzy output. The proposed algorithm is a model-free method in the sense that we do not need to assume the underlying model function. Experimental results are then presented that indicate the performance of this algorithm.

Heart Sound-Based Cardiac Disorder Classifiers Using an SVM to Combine HMM and Murmur Scores (SVM을 이용하여 HMM과 심잡음 점수를 결합한 심음 기반 심장질환 분류기)

  • Kwak, Chul; Kwon, Oh-Wook
    • The Journal of the Acoustical Society of Korea / v.30 no.3 / pp.149-157 / 2011
  • In this paper, we propose a new cardiac disorder classification method using a support vector machine (SVM) to combine hidden Markov model (HMM) and murmur existence information. Using cepstral features and the HMM Viterbi algorithm, we segment input heart sound signals into HMM states for each cardiac disorder model and compute the log-likelihood (score) for every state in the model. To exploit the temporal position characteristics of murmur signals, we divide the input signals into two subbands, compute the murmur probability of every subband of each frame, and obtain the murmur score for each state by using the state segmentation information obtained from the Viterbi algorithm. With an input vector containing the HMM state scores and the murmur scores for all cardiac disorder models, the SVM finally decides the cardiac disorder category. In cardiac disorder classification experiments, the proposed method shows a relative improvement of 20.4% compared to the HMM-based classifier with conventional cepstral features.