• Title/Summary/Keyword: SVM (Support Vector Machine)


Increasing Splicing Site Prediction by Training Gene Set Based on Species

  • Ahn, Beunguk;Abbas, Elbashir;Park, Jin-Ah;Choi, Ho-Jin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.11 / pp.2784-2799 / 2012
  • Biological data have increased exponentially in recent years, and analyzing these data with data mining tools has become one of the major issues in the bioinformatics research community. This paper focuses on the protein construction process in higher organisms, in which the deoxyribonucleic acid (DNA) sequence is filtered. In this process, "unmeaningful" DNA sub-sequences (called introns) are removed, and their meaningful counterparts (called exons) are retained. Accurate recognition of the boundaries between these two classes of sub-sequences, however, is known to be a difficult problem. Conventional approaches to recognizing these boundaries have sought solely to enhance machine learning techniques, while the inherent nature of the data themselves has been overlooked. In this paper, we present an approach that uses the data attributes inherent to each species to increase the accuracy of boundary recognition. For the experiments, we took data sets for four different species from the University of California Santa Cruz (UCSC) data repository, divided the data sets by species, and then trained neural network (NN)-based and support vector machine (SVM)-based classifiers on a preprocessed version of the data sets. We observed that each species has its own specific features related to the splice sites, which implies that there are related distances among species. We conclude that dividing the training data set by species would increase the accuracy of splice junction prediction and offer new insight for biological research.

Development of 3D Crop Segmentation Model in Open-field Based on Supervised Machine Learning Algorithm (지도학습 알고리즘 기반 3D 노지 작물 구분 모델 개발)

  • Jeong, Young-Joon;Lee, Jong-Hyuk;Lee, Sang-Ik;Oh, Bu-Yeong;Ahmed, Fawzy;Seo, Byung-Hun;Kim, Dong-Su;Seo, Ye-Jin;Choi, Won
    • Journal of The Korean Society of Agricultural Engineers / v.64 no.1 / pp.15-26 / 2022
  • A 3D open-field farm model developed from UAV (Unmanned Aerial Vehicle) data could make crop monitoring easier and could also serve as an important dataset for fields such as remote sensing and precision agriculture. Separating crops from the non-crop area automatically is essential because manual labeling is extremely laborious and unsuitable for continuous monitoring. We therefore built a 3D open-field farm model from UAV images and developed a crop segmentation model using a supervised machine learning algorithm. We compared the performance of models that used different data features, such as color or geographic coordinates, and two supervised learning algorithms: SVM (Support Vector Machine) and KNN (K-Nearest Neighbors). The best approach was trained with the KNN algorithm on 2-dimensional data, ExGR (Excess of Green minus Excess of Red) and the z coordinate value; its accuracy, precision, recall, and F1 score were 97.85, 96.51, 88.54, and 92.35%, respectively. We also compared our model with similar previous work. Our approach showed slightly better accuracy and detected actual crops better than the previous approach, although it also classified actual non-crop points (e.g. weeds) as crops.
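
The ExGR feature and the reported F1 score can be illustrated with a short sketch. It assumes the commonly used definitions ExG = 2g - r - b and ExR = 1.4r - g on chromatic coordinates (r, g, b scaled to sum to 1); the abstract does not state the exact formulas the authors used, and the function names `exgr` and `f1_score` are hypothetical.

```python
def exgr(red, green, blue):
    """Excess Green minus Excess Red for one RGB color.

    Assumed definitions (not confirmed by the abstract):
    ExG = 2g - r - b and ExR = 1.4r - g on chromatic coordinates.
    """
    total = red + green + blue
    if total == 0:
        # Black pixel: chromatic coordinates are undefined; return 0.0 here.
        return 0.0
    r, g, b = red / total, green / total, blue / total
    excess_green = 2 * g - r - b
    excess_red = 1.4 * r - g
    return excess_green - excess_red


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # A pure green pixel gives the largest ExGR value under these definitions.
    print(exgr(0, 255, 0))
    # Precision and recall reported in the abstract, in percent.
    print(round(f1_score(96.51, 88.54), 2))
```

With the precision and recall quoted in the abstract, the harmonic mean rounds to 92.35, which agrees with the reported F1 score.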

QSPR analysis for predicting heat of sublimation of organic compounds (유기화합물의 승화열 예측을 위한 QSPR분석)

  • Park, Yu Sun;Lee, Jong Hyuk;Park, Han Woong;Lee, Sung Kwang
    • Analytical Science and Technology / v.28 no.3 / pp.187-195 / 2015
  • The heat of sublimation (HOS) is an essential parameter for resolving environmental problems related to the transfer of organic contaminants to the atmosphere and for assessing the risk of toxic chemicals. Experimental measurement of the heat of sublimation is time-consuming, expensive, and complicated. In this study, quantitative structure-property relationships (QSPR) were used to develop a simple and predictive model for the heat of sublimation of organic compounds. A population-based forward selection method was applied to select an informative subset of descriptors for learning algorithms such as multiple linear regression (MLR) and the support vector machine (SVM) method. Each individual model and the consensus model were evaluated by internal validation using the bootstrap method and y-randomization. Prediction performance on the external test set improved when the applicability domain was taken into account. Based on the results of the MLR model, we showed that the heat of sublimation is related to dispersion, hydrogen bonding, electrostatic forces, and dipole-dipole interactions between molecules.

Breaking character and natural image based CAPTCHA using feature classification (특징 분리를 통한 자연 배경을 지닌 글자 기반 CAPTCHA 공격)

  • Kim, Jaehwan;Kim, Suah;Kim, Hyoung Joong
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.5 / pp.1011-1019 / 2015
  • CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a test used in computing to determine whether the user is a computer or a human. Many web sites use character-based CAPTCHAs consisting of digits and characters. Recently, with the development of OCR technology, simple character-based CAPTCHAs have become quite easy to break. As an alternative, many web sites add noise to make recognition harder. In this paper, we analyze the most recent CAPTCHA, which adds natural images to obfuscate the characters. We propose an efficient method that uses a support vector machine to separate the characters from the background image and a convolutional neural network to recognize each character. As a result, 368 out of 1000 CAPTCHAs were correctly identified, demonstrating that the current CAPTCHA is not safe.

No-reference Image Quality Assessment With A Gradient-induced Dictionary

  • Li, Leida;Wu, Dong;Wu, Jinjian;Qian, Jiansheng;Chen, Beijing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.1 / pp.288-307 / 2016
  • Image distortions are typically characterized by degradations of structures. Dictionaries learned from natural images can capture the underlying structures in images, which are important for image quality assessment (IQA). This paper presents a general-purpose no-reference image quality metric using a GRadient-Induced Dictionary (GRID). A dictionary is first constructed from the gradients of natural images using K-means clustering. Image features are then extracted with the dictionary using Euclidean-norm coding and max-pooling. A distortion classification model and several distortion-specific quality regression models are trained with the support vector machine (SVM) by combining the image features with distortion types and subjective scores, respectively. To evaluate the quality of a test image, the distortion classification model determines the probabilities that the image belongs to each kind of distortion, while the regression models predict the corresponding distortion-specific quality scores. Finally, an overall quality score is computed by weighting the distortion-specific quality scores with these probabilities. The proposed metric can evaluate image quality accurately and efficiently using a small dictionary. The performance of the proposed method is verified on public image quality databases. Experimental results demonstrate that the proposed metric generates quality scores highly consistent with human perception and that it outperforms state-of-the-art methods.
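
The final combination step of this abstract can be sketched as follows, assuming the overall score is the weighted sum Q = Σ p_i · q_i, where p_i is the classifier's probability for distortion type i and q_i is the score from the matching regression model. The function name, the distortion names, and the numbers are made-up illustrations, not values from the paper.

```python
def overall_quality(probabilities, scores):
    """Combine distortion-specific scores into one probability-weighted score.

    probabilities: dict mapping distortion type -> probability from the classifier
    scores:        dict mapping distortion type -> score from that type's regressor
    """
    if probabilities.keys() != scores.keys():
        raise ValueError("both inputs must cover the same distortion types")
    # Weighted sum: each regressor contributes in proportion to how likely
    # its distortion type is for this image.
    return sum(probabilities[name] * scores[name] for name in probabilities)


if __name__ == "__main__":
    # Hypothetical classifier output and regressor outputs for one test image.
    example_probabilities = {"blur": 0.7, "noise": 0.2, "jpeg": 0.1}
    example_scores = {"blur": 30.0, "noise": 50.0, "jpeg": 80.0}
    print(overall_quality(example_probabilities, example_scores))
```

A soft weighting of this kind avoids committing to a single distortion type when the classifier is uncertain.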

Large-Scale Text Classification with Deep Neural Networks (깊은 신경망 기반 대용량 텍스트 데이터 분류 기술)

  • Jo, Hwiyeol;Kim, Jin-Hwa;Kim, Kyung-Min;Chang, Jeong-Ho;Eom, Jae-Hong;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices / v.23 no.5 / pp.322-327 / 2017
  • The classification problem in the field of Natural Language Processing has been studied for a long time. Continuing our previous research, which classified large-scale text using Convolutional Neural Networks (CNN), we implemented Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU). The experimental results showed that the classification algorithms ranked, in order of increasing performance, as Multinomial Naïve Bayesian Classifier < Support Vector Machine (SVM) < LSTM < CNN < GRU. The results can be interpreted as follows. First, CNN performed better than LSTM, so the text classification problem might be related more to feature extraction than to natural language understanding. Second, judging from the results, GRU showed better performance in feature extraction than LSTM. Finally, the fact that GRU was better than CNN implies that text classification algorithms should consider both feature extraction and sequential information. We present the results of fine-tuning deep neural networks to provide future researchers with some intuition regarding natural language processing.

On-line Signature Verification Using Fusion Model Based on Segment Matching and HMM (구간 분할 및 HMM 기반 융합 모델에 의한 온라인 서명 검증)

  • Yang Dong Hwa;Lee Dae-Jong;Chun Myung-Geun
    • Journal of the Korean Institute of Intelligent Systems / v.15 no.1 / pp.12-17 / 2005
  • The segment matching method outperforms global and point-based methods when comparing a reference signature with an input signature. However, the segment-to-segment matching method suffers from a recognition rate that decreases as the partitioning points vary. This paper proposes a fusion model based on segment matching and HMM to construct a more reliable authentication system. First, a segment matching classifier is designed with a conventional technique to calculate matching values for the dynamic information of signatures. In addition, a novel HMM classifier is constructed using principal component analysis to calculate matching values for the static information of signatures. Finally, an SVM classifier is adopted to effectively combine the two independent classifiers. Across various experiments, we find that the proposed method performs better than the conventional segment matching method.

Pose and Expression Invariant Alignment based Multi-View 3D Face Recognition

  • Ratyal, Naeem;Taj, Imtiaz;Bajwa, Usama;Sajid, Muhammad
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.10 / pp.4903-4929 / 2018
  • In this study, a fully automatic pose and expression invariant 3D face alignment algorithm is proposed to handle frontal and profile face images; it is based on a two-pass coarse-to-fine alignment strategy. The first pass coarsely aligns the face images to an intrinsic coordinate system (ICS) through a single 3D rotation, and the second pass aligns them at a fine level using a minimum nose tip-scanner distance (MNSD) approach. For facial recognition, multi-view faces are synthesized to exploit real 3D information and test the efficacy of the proposed system. Because it finds an optimal separating hyperplane (OSH), the Support Vector Machine (SVM) is employed in the multi-view face verification (FV) task. In addition, a multi-stage unified classifier based face identification (FI) algorithm is employed, which combines results from seven base classifiers, two parallel face recognition algorithms, and an exponential rank combiner, all in a hierarchical manner. The performance figures of the proposed methodology are corroborated by extensive experiments on four benchmark datasets: GavabDB, Bosphorus, UMB-DB and FRGC v2.0. The results show marked improvement in alignment accuracy and recognition rates. Moreover, a computational complexity analysis of the proposed algorithm reveals its superiority in computational efficiency as well.

Classification of meteorological state and spatial correlation analysis of precipitation in Jeonbuk province (전라북도 강수량의 기상특성 분류 및 공간상관성 분석)

  • Lee, Jeong-Ju;Kwon, Hyun-Han;Hong, Min;Lee, Jong-Seok
    • Proceedings of the Korea Water Resources Association Conference / 2011.05a / pp.404-404 / 2011
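  • Meteorological disasters have become frequent in recent years owing to increasing weather variability and the rising frequency of extreme hydrological events. To prevent disasters caused by such weather phenomena, a number of techniques that recognize the risk in advance and predict its magnitude are being developed and applied using weather radar or numerical forecast data. Several problems must be solved in this process: first, numerical forecast data or weather radar data need to be evaluated in connection with the ground-observed precipitation at synoptic weather stations and automatic weather stations, and a technique for predicting the spatial movement of the currently formed rainfall field must be secured. In the Jeonbuk region, orographic precipitation with frequent localized torrential downpours combines with the steep stream slopes of mountainous basins to cause casualties and property damage every year, and because flash floods have occurred there in the past, flood risk is expected to grow as a result of abnormal weather and climate change. As a preliminary analysis for developing a meteorological disaster prediction model for Jeollabuk-do, this study aims to classify the characteristics of precipitation events and to analyze the spatial correlation between stations, using existing large-scale precipitation events observed in the Jeollabuk-do region. Classifying the characteristics of precipitation events can improve the accuracy of predicting the meteorological factors that influence each type of precipitation occurrence, the amount of precipitation, and its movement characteristics; automatic classification using an SVM (support vector machine) is applied as the classification technique. In addition, the conditional probability between the precipitation amounts at each station is used to analyze the spatial correlation between stations. For example, when precipitation occurs at the Buan station, the probability of the Jeonju station precipitation conditioned on the Buan station precipitation can be written as P(Jeonju precipitation | Buan precipitation). The spatial correlation analysis also considers the difference in precipitation onset time that results from the travel time of precipitation between stations. By analyzing past weather observation data, the study identifies the spatial correlation of precipitation occurrence between stations in the Jeollabuk-do region, and the results can serve as basic data for developing a short-term prediction model. We also expect that the results of this study can be used in the regional downscaling of future precipitation under climate change scenarios.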
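
A conditional probability between two stations could be estimated from paired observations by simple counting. The sketch below assumes threshold events (precipitation at or above a chosen amount at each station) and equally long, time-aligned series. The thresholds, the sample values, the `lag_steps` parameter, and the function name are all hypothetical, and the abstract does not describe how the authors actually estimated the probability.

```python
def conditional_probability(buan, jeonju, buan_min, jeonju_min, lag_steps=0):
    """Estimate P(Jeonju >= jeonju_min | Buan >= buan_min) by counting.

    buan, jeonju: precipitation series of equal length, one value per time step.
    lag_steps:    non-negative shift; Buan at step t is paired with Jeonju at
                  step t + lag_steps to allow for travel time between stations.
    Returns None when the conditioning event never occurs.
    """
    if len(buan) != len(jeonju):
        raise ValueError("series must have the same length")
    if lag_steps < 0:
        raise ValueError("lag_steps must be non-negative")
    # zip stops at the shorter sequence, so trailing Buan values without a
    # lagged Jeonju partner are dropped.
    pairs = zip(buan, jeonju[lag_steps:])
    conditioned = [j for b, j in pairs if b >= buan_min]
    if not conditioned:
        return None
    hits = sum(1 for j in conditioned if j >= jeonju_min)
    return hits / len(conditioned)


if __name__ == "__main__":
    # Made-up hourly precipitation values in millimetres.
    buan_series = [0.0, 12.0, 25.0, 3.0, 30.0, 18.0]
    jeonju_series = [0.0, 8.0, 20.0, 1.0, 5.0, 15.0]
    print(conditional_probability(buan_series, jeonju_series, 10.0, 10.0))
    print(conditional_probability(buan_series, jeonju_series, 10.0, 10.0, lag_steps=1))
```

Comparing the estimate across several lag values is one simple way to see whether precipitation tends to arrive at the second station later than at the first.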


Fast On-Road Vehicle Detection Using Reduced Multivariate Polynomial Classifier (축소 다변수 다항식 분류기를 이용한 고속 차량 검출 방법)

  • Kim, Joong-Rock;Yu, Sun-Jin;Toh, Kar-Ann;Kim, Do-Hoon;Lee, Sang-Youn
    • The Journal of Korean Institute of Communications and Information Sciences / v.37 no.8A / pp.639-647 / 2012
  • Vision-based on-road vehicle detection is one of the key techniques in automotive driver assistance systems. However, because of the large within-class variability in vehicle appearance and changes in the environment, developing an accurate and reliable detection system remains a challenging task. In general, a vehicle detection system consists of two steps. Candidate vehicle locations are found in the Hypothesis Generation (HG) step, and the locations detected in the HG step are verified in the Hypothesis Verification (HV) step. Since the final decision is made in the HV step, it is crucial for accurate detection. In this paper, we propose using a reduced multivariate polynomial pattern classifier (RM) for the HV step. Our experimental results show that the RM classifier outperforms the well-known Support Vector Machine (SVM) classifier, particularly in decision speed, which makes it suitable for real-time implementation.