• Title/Summary/Keyword: Vector optimization

Search Result 474, Processing Time 0.03 seconds

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review
    • /
    • v.16 no.3
    • /
    • pp.161-177
    • /
    • 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts should be employed to assess the ratings. As a result, the data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural network (ANN), and multiclass support vector machine (MSVM) have been applied to corporate credit rating.2) Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model, and appy it to corporate credit rating prediction in order to enhance the accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies like Lorena and de Carvalho (2008), and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results from the studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate the biological evolution phenomenon. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results. Especially, mutation operator prevents GA from falling into the local optima, thus we can find the globally optimal or near-optimal solution using it. GA has popularly been applied to search optimal parameters or feature subset selections of AI techniques including MSVM. With these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, and their credit ratings. Using various statistical methods including the one-way ANOVA and the stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e. credit rating, was labeled as four classes: 1(A1); 2(A2); 3(A3); 4(B and C). 80 percent of total data for each class was used for training, and remaining 20 percent was used for validation. And, to overcome small sample size, we applied five-fold cross validation to our dataset. In order to examine the competitiveness of the proposed model, we also experimented several comparative models including MDA, MLOGIT, CBR, ANN and MSVM. In case of MSVM, we adopted One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate approaches among various MSVM approaches. GAMSVM was implemented using LIBSVM-an open-source software, and Evolver 5.5-a commercial software enables GA. Other comparative models were experimented using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model-GAMSVM-outperformed all the competitive models. In addition, the model was found to use less independent variables, but to show higher accuracy. In our experiments, five variables such as X7 (total debt), X9 (sales per employee), X13 (years after founded), X15 (accumulated earning to total asset), and X39 (the index related to the cash flows from operating activity) were found to be the most important factors in predicting the corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost same among the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than those of other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.

Implementation of Unsupervised Nonlinear Classifier with Binary Harmony Search Algorithm (Binary Harmony Search 알고리즘을 이용한 Unsupervised Nonlinear Classifier 구현)

  • Lee, Tae-Ju;Park, Seung-Min;Ko, Kwang-Eun;Sung, Won-Ki;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.23 no.4
    • /
    • pp.354-359
    • /
    • 2013
  • In this paper, we suggested the method for implementation of unsupervised nonlinear classification using Binary Harmony Search (BHS) algorithm, which is known as a optimization algorithm. Various algorithms have been suggested for classification of feature vectors from the process of machine learning for pattern recognition or EEG signal analysis processing. Supervised learning based support vector machine or fuzzy c-mean (FCM) based on unsupervised learning have been used for classification in the field. However, conventional methods were hard to apply nonlinear dataset classification or required prior information for supervised learning. We solved this problems with proposed classification method using heuristic approach which took the minimal Euclidean distance between vectors, then we assumed them as same class and the others were another class. For the comparison, we used FCM, self-organizing map (SOM) based on artificial neural network (ANN). KEEL machine learning datset was used for simulation. We concluded that proposed method was superior than other algorithms.

Development of Classification Model for hERG Ion Channel Inhibitors Using SVM Method (SVM 방법을 이용한 hERG 이온 채널 저해제 예측모델 개발)

  • Gang, Sin-Moon;Kim, Han-Jo;Oh, Won-Seok;Kim, Sun-Young;No, Kyoung-Tai;Nam, Ky-Youb
    • Journal of the Korean Chemical Society
    • /
    • v.53 no.6
    • /
    • pp.653-662
    • /
    • 2009
  • Developing effective tools for predicting absorption, distribution, metabolism, excretion properties and toxicity (ADME/T) of new chemical entities in the early stage of drug design is one of the most important tasks in drug discovery and development today. As one of these attempts, support vector machines (SVM) has recently been exploited for the prediction of ADME/T related properties. However, two problems in SVM modeling, i.e. feature selection and parameters setting, are still far from solved. The two problems have been shown to be crucial to the efficiency and accuracy of SVM classification. In particular, the feature selection and optimal SVM parameters setting influence each other, which indicates that they should be dealt with simultaneously. In this account, we present an integrated practical solution, in which genetic-based algorithm (GA) is used for feature selection and grid search (GS) method for parameters optimization. hERG ion-channel inhibitor classification models of ADME/T related properties has been built for assessing and testing the proposed GA-GS-SVM. We generated 6 different models that are 3 different single models and 3 different ensemble models using training set - 1891 compounds and validated with external test set - 175 compounds. We compared single model with ensemble model to solve data imbalance problems. It was able to improve accuracy of prediction to use ensemble model.

Mesh Simplification for Preservation of Characteristic Features using Surface Orientation (표면의 방향정보를 고려한 메쉬의 특성정보의 보존)

  • 고명철;최윤철
    • Journal of Korea Multimedia Society
    • /
    • v.5 no.4
    • /
    • pp.458-467
    • /
    • 2002
  • There has been proposed many simplification algorithms for effectively decreasing large-volumed polygonal surface data. These algorithms apply their own cost function for collapse to one of fundamental simplification unit, such as vertex, edge and triangle, and minimize the simplification error occurred in each simplification steps. Most of cost functions adopted in existing works use the error estimation method based on distance optimization. Unfortunately, it is hard to define the local characteristics of surface data using distance factor alone, which is basically scalar component. Therefore, the algorithms cannot preserve the characteristic features in surface areas with high curvature and, consequently, loss the detailed shape of original mesh in high simplification ratio. In this paper, we consider the vector component, such as surface orientation, as one of factors for cost function. The surface orientation is independent upon scalar component, distance value. This means that we can reconsider whether or not to preserve them as the amount of vector component, although they are elements with low scalar values. In addition, we develop a simplification algorithm based on half-edge collapse manner, which use the proposed cost function as the criterion for removing elements. In half-edge collapse, using one of endpoints in the edge represents a new vertex after collapse operation. The approach is memory efficient and effectively applicable to the rendering system requiring real-time transmission of large-volumed surface data.

  • PDF

Development of TaqMan Quantitative PCR Assays for Duplex Detection of Dirofilaria immitis COI and Dog GAPDH from Infected Dog Blood (심장사상충에 감염된 개 혈액에서 Dirofilaria immitis의 COI와 개의 GAPDH를 이중 검출하기 위한 정량적 TaqMan PCR 분석법의 개발)

  • Oh, In Young;Kim, Kyung Tae;Gwon, Sun-Yeong;Sung, Ho Joong
    • Korean Journal of Clinical Laboratory Science
    • /
    • v.51 no.1
    • /
    • pp.64-70
    • /
    • 2019
  • Dirofilaria immitis (D. immitis) is a filarial nematode that causes cardiopulmonary dirofilariasis in dogs. In the late stages of infection, infected dogs show one or more symptoms and advanced heart disorder with perivascular inflammation. To detect D. immitis specifically and efficiently in the early stages of infection, a duplex TaqMan qPCR assay was developed based on previous studies using primers and probes specialized to detect D. immitis cytochrome c oxidase subunit I (COI) and dog glyceraldehyde-3-phosphate dehydrogenase (GAPDH). As positive controls, plasmid DNAs were constructed from D. immitis COI or dog GAPDH and a TA-cloning vector. Simplex and duplex TaqMan qPCR assays were performed using the specific primers, probes, and genomic or plasmid DNA. The duplex reaction developed could detect D. immitis COI and dog GAPDH in the same sample simultaneously after optimization of the primer concentrations. The limit of detection was 25 copies for the simplex and duplex assays, and both showed good linearity, high sensitivity, and excellent PCR efficiency. The duplex assays for pathogen detection reduce the costs, labor, and time compared to simplex reactions. Therefore, the duplex TaqMan qPCR assay developed herein will allow efficient D. immitis detection and quantification from a large number of samples simultaneously.

Unsupervised Non-rigid Registration Network for 3D Brain MR images (3차원 뇌 자기공명 영상의 비지도 학습 기반 비강체 정합 네트워크)

  • Oh, Donggeon;Kim, Bohyoung;Lee, Jeongjin;Shin, Yeong-Gil
    • The Journal of Korean Institute of Next Generation Computing
    • /
    • v.15 no.5
    • /
    • pp.64-74
    • /
    • 2019
  • Although a non-rigid registration has high demands in clinical practice, it has a high computational complexity and it is very difficult for ensuring the accuracy and robustness of registration. This study proposes a method of applying a non-rigid registration to 3D magnetic resonance images of brain in an unsupervised learning environment by using a deep-learning network. A feature vector between two images is produced through the network by receiving both images from two different patients as inputs and it transforms the target image to match the source image by creating a displacement vector field. The network is designed based on a U-Net shape so that feature vectors that consider all global and local differences between two images can be constructed when performing the registration. As a regularization term is added to a loss function, a transformation result similar to that of a real brain movement can be obtained after the application of trilinear interpolation. This method enables a non-rigid registration with a single-pass deformation by only receiving two arbitrary images as inputs through an unsupervised learning. Therefore, it can perform faster than other non-learning-based registration methods that require iterative optimization processes. Our experiment was performed with 3D magnetic resonance images of 50 human brains, and the measurement result of the dice similarity coefficient confirmed an approximately 16% similarity improvement by using our method after the registration. It also showed a similar performance compared with the non-learning-based method, with about 10,000 times speed increase. The proposed method can be used for non-rigid registration of various kinds of medical image data.

Comparison of Prediction Accuracy Between Classification and Convolution Algorithm in Fault Diagnosis of Rotatory Machines at Varying Speed (회전수가 변하는 기기의 고장진단에 있어서 특성 기반 분류와 합성곱 기반 알고리즘의 예측 정확도 비교)

  • Moon, Ki-Yeong;Kim, Hyung-Jin;Hwang, Se-Yun;Lee, Jang Hyun
    • Journal of Navigation and Port Research
    • /
    • v.46 no.3
    • /
    • pp.280-288
    • /
    • 2022
  • This study examined the diagnostics of abnormalities and faults of equipment, whose rotational speed changes even during regular operation. The purpose of this study was to suggest a procedure that can properly apply machine learning to the time series data, comprising non-stationary characteristics as the rotational speed changes. Anomaly and fault diagnosis was performed using machine learning: k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), and Random Forest. To compare the diagnostic accuracy, an autoencoder was used for anomaly detection and a convolution based Conv1D was additionally used for fault diagnosis. Feature vectors comprising statistical and frequency attributes were extracted, and normalization & dimensional reduction were applied to the extracted feature vectors. Changes in the diagnostic accuracy of machine learning according to feature selection, normalization, and dimensional reduction are explained. The hyperparameter optimization process and the layered structure are also described for each algorithm. Finally, results show that machine learning can accurately diagnose the failure of a variable-rotation machine under the appropriate feature treatment, although the convolution algorithms have been widely applied to the considered problem.

Optimization of Uneven Margin SVM to Solve Class Imbalance in Bankruptcy Prediction (비대칭 마진 SVM 최적화 모델을 이용한 기업부실 예측모형의 범주 불균형 문제 해결)

  • Sung Yim Jo;Myoung Jong Kim
    • Information Systems Review
    • /
    • v.24 no.4
    • /
    • pp.23-40
    • /
    • 2022
  • Although Support Vector Machine(SVM) has been used in various fields such as bankruptcy prediction model, the hyperplane learned by SVM in class imbalance problem can be severely skewed toward minority class and has a negative impact on performance because the area of majority class is expanded while the area of minority class is invaded. This study proposed optimized uneven margin SVM(OPT-UMSVM) combining threshold moving or post scaling method with UMSVM to cope with the limitation of the traditional even margin SVM(EMSVM) in class imbalance problem. OPT-UMSVM readjusted the skewed hyperplane to the majority class and had better generation ability than EMSVM improving the sensitivity of minority class and calculating the optimized performance. To validate OPT-UMSVM, 10-fold cross validations were performed on five sub-datasets with different imbalance ratio values. Empirical results showed two main findings. First, UMSVM had a weak effect on improving the performance of EMSVM in balanced datasets, but it greatly outperformed EMSVM in severely imbalanced datasets. Second, compared to EMSVM and conventional UMSVM, OPT-UMSVM had better performance in both balanced and imbalanced datasets and showed a significant difference performance especially in severely imbalanced datasets.

Extracting Typical Group Preferences through User-Item Optimization and User Profiles in Collaborative Filtering System (사용자-상품 행렬의 최적화와 협력적 사용자 프로파일을 이용한 그룹의 대표 선호도 추출)

  • Ko Su-Jeong
    • Journal of KIISE:Software and Applications
    • /
    • v.32 no.7
    • /
    • pp.581-591
    • /
    • 2005
  • Collaborative filtering systems have problems involving sparsity and the provision of recommendations by making correlations between only two users' preferences. These systems recommend items based only on the preferences without taking in to account the contents of the items. As a result, the accuracy of recommendations depends on the data from user-rated items. When users rate items, it can be expected that not all users ran do so earnestly. This brings down the accuracy of recommendations. This paper proposes a collaborative recommendation method for extracting typical group preferences using user-item matrix optimization and user profiles in collaborative tittering systems. The method excludes unproven users by using entropy based on data from user-rated items and groups users into clusters after generating user profiles, and then extracts typical group preferences. The proposed method generates collaborative user profiles by using association word mining to reflect contents as well as preferences of items and groups users into clusters based on the profiles by using the vector space model and the K-means algorithm. To compensate for the shortcoming of providing recommendations using correlations between only two user preferences, the proposed method extracts typical preferences of groups using the entropy theory The typical preferences are extracted by combining user entropies with item preferences. The recommender system using typical group preferences solves the problem caused by recommendations based on preferences rated incorrectly by users and reduces time for retrieving the most similar users in groups.

A Study on Optimization of Perovskite Solar Cell Light Absorption Layer Thin Film Based on Machine Learning (머신러닝 기반 페로브스카이트 태양전지 광흡수층 박막 최적화를 위한 연구)

  • Ha, Jae-jun;Lee, Jun-hyuk;Oh, Ju-young;Lee, Dong-geun
    • The Journal of the Korea Contents Association
    • /
    • v.22 no.7
    • /
    • pp.55-62
    • /
    • 2022
  • The perovskite solar cell is an active part of research in renewable energy fields such as solar energy, wind, hydroelectric power, marine energy, bioenergy, and hydrogen energy to replace fossil fuels such as oil, coal, and natural gas, which will gradually disappear as power demand increases due to the increase in use of the Internet of Things and Virtual environments due to the 4th industrial revolution. The perovskite solar cell is a solar cell device using an organic-inorganic hybrid material having a perovskite structure, and has advantages of replacing existing silicon solar cells with high efficiency, low cost solutions, and low temperature processes. In order to optimize the light absorption layer thin film predicted by the existing empirical method, reliability must be verified through device characteristics evaluation. However, since it costs a lot to evaluate the characteristics of the light-absorbing layer thin film device, the number of tests is limited. In order to solve this problem, the development and applicability of a clear and valid model using machine learning or artificial intelligence model as an auxiliary means for optimizing the light absorption layer thin film are considered infinite. In this study, to estimate the light absorption layer thin-film optimization of perovskite solar cells, the regression models of the support vector machine's linear kernel, R.B.F kernel, polynomial kernel, and sigmoid kernel were compared to verify the accuracy difference for each kernel function.