• Title/Summary/Keyword: Support Vector Machines, SVMs

Search Result 91, Processing Time 0.024 seconds

Hybrid CNN-SVM Based Seed Purity Identification and Classification System

  • Suganthi, M;Sathiaseelan, J.G.R.
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.10
    • /
    • pp.271-281
    • /
    • 2022
  • Manual seed classification challenges can be overcome using a reliable and autonomous seed purity identification and classification technique. It is a highly practical and commercially important requirement of the agricultural industry. Researchers can create a new data mining method with improved accuracy using current machine learning and artificial intelligence approaches. Seed classification can help with quality making, seed quality controller, and impurity identification. Seeds have traditionally been classified based on characteristics such as colour, shape, and texture. Generally, this is done by experts by visually examining each model, which is a very time-consuming and tedious task. This approach is simple to automate, making seed sorting far more efficient than manually inspecting them. Computer vision technologies based on machine learning (ML), symmetry, and, more specifically, convolutional neural networks (CNNs) have been widely used in related fields, resulting in greater labour efficiency in many cases. To sort a sample of 3000 seeds, KNN, SVM, CNN and CNN-SVM hybrid classification algorithms were used. A model that uses advanced deep learning techniques to categorise some well-known seeds is included in the proposed hybrid system. In most cases, the CNN-SVM model outperformed the comparable SVM and CNN models, demonstrating the effectiveness of utilising CNN-SVM to evaluate data. The findings of this research revealed that CNN-SVM could be used to analyse data with promising results. Future study should look into more seed kinds to expand the use of CNN-SVMs in data processing.

A Study on the Development of Adaptive Learning System through EEG-based Learning Achievement Prediction

  • Jinwoo, KIM;Hosung, WOO
    • Fourth Industrial Review
    • /
    • v.3 no.1
    • /
    • pp.13-20
    • /
    • 2023
  • Purpose - By designing a PEF(Personalized Education Feedback) system for real-time prediction of learning achievement and motivation through real-time EEG analysis of learners, this system provides some modules of a personalized adaptive learning system. By applying these modules to e-learning and offline learning, they motivate learners and improve the quality of learning progress and effective learning outcomes can be achieved for immersive self-directed learning Research design, data, and methodology - EEG data were collected simultaneously as the English test was given to the experimenters, and the correlation between the correct answer result and the EEG data was learned with a machine learning algorithm and the predictive model was evaluated.. Result - In model performance evaluation, both artificial neural networks(ANNs) and support vector machines(SVMs) showed high accuracy of more than 91%. Conclusion - This research provides some modules of personalized adaptive learning systems that can more efficiently complete by designing a PEF system for real-time learning achievement prediction and learning motivation through an adaptive learning system based on real-time EEG analysis of learners. The implication of this initial research is to verify hypothetical situations for the development of an adaptive learning system through EEG analysis-based learning achievement prediction.

A Study on Evaluation of e-learners' Concentration by using Machine Learning (머신러닝을 이용한 이러닝 학습자 집중도 평가 연구)

  • Jeong, Young-Sang;Joo, Min-Sung;Cho, Nam-Wook
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.18 no.4
    • /
    • pp.67-75
    • /
    • 2022
  • Recently, e-learning has been attracting significant attention due to COVID-19. However, while e-learning has many advantages, it has disadvantages as well. One of the main disadvantages of e-learning is that it is difficult for teachers to continuously and systematically monitor learners. Although services such as personalized e-learning are provided to compensate for the shortcoming, systematic monitoring of learners' concentration is insufficient. This study suggests a method to evaluate the learner's concentration by applying machine learning techniques. In this study, emotion and gaze data were extracted from 184 videos of 92 participants. First, the learners' concentration was labeled by experts. Then, statistical-based status indicators were preprocessed from the data. Random Forests (RF), Support Vector Machines (SVMs), Multilayer Perceptron (MLP), and an ensemble model have been used in the experiment. Long Short-Term Memory (LSTM) has also been used for comparison. As a result, it was possible to predict e-learners' concentration with an accuracy of 90.54%. This study is expected to improve learners' immersion by providing a customized educational curriculum according to the learner's concentration level.

Text Classification Using Parallel Word-level and Character-level Embeddings in Convolutional Neural Networks

  • Geonu Kim;Jungyeon Jang;Juwon Lee;Kitae Kim;Woonyoung Yeo;Jong Woo Kim
    • Asia pacific journal of information systems
    • /
    • v.29 no.4
    • /
    • pp.771-788
    • /
    • 2019
  • Deep learning techniques such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) show superior performance in text classification than traditional approaches such as Support Vector Machines (SVMs) and Naïve Bayesian approaches. When using CNNs for text classification tasks, word embedding or character embedding is a step to transform words or characters to fixed size vectors before feeding them into convolutional layers. In this paper, we propose a parallel word-level and character-level embedding approach in CNNs for text classification. The proposed approach can capture word-level and character-level patterns concurrently in CNNs. To show the usefulness of proposed approach, we perform experiments with two English and three Korean text datasets. The experimental results show that character-level embedding works better in Korean and word-level embedding performs well in English. Also the experimental results reveal that the proposed approach provides better performance than traditional CNNs with word-level embedding or character-level embedding in both Korean and English documents. From more detail investigation, we find that the proposed approach tends to perform better when there is relatively small amount of data comparing to the traditional embedding approaches.

Terms Based Sentiment Classification for Online Review Using Support Vector Machine (Support Vector Machine을 이용한 온라인 리뷰의 용어기반 감성분류모형)

  • Lee, Taewon;Hong, Taeho
    • Information Systems Review
    • /
    • v.17 no.1
    • /
    • pp.49-64
    • /
    • 2015
  • Customer reviews which include subjective opinions for the product or service in online store have been generated rapidly and their influence on customers has become immense due to the widespread usage of SNS. In addition, a number of studies have focused on opinion mining to analyze the positive and negative opinions and get a better solution for customer support and sales. It is very important to select the key terms which reflected the customers' sentiment on the reviews for opinion mining. We proposed a document-level terms-based sentiment classification model by select in the optimal terms with part of speech tag. SVMs (Support vector machines) are utilized to build a predictor for opinion mining and we used the combination of POS tag and four terms extraction methods for the feature selection of SVM. To validate the proposed opinion mining model, we applied it to the customer reviews on Amazon. We eliminated the unmeaning terms known as the stopwords and extracted the useful terms by using part of speech tagging approach after crawling 80,000 reviews. The extracted terms gained from document frequency, TF-IDF, information gain, chi-squared statistic were ranked and 20 ranked terms were used to the feature of SVM model. Our experimental results show that the performance of SVM model with four POS tags is superior to the benchmarked model, which are built by extracting only adjective terms. In addition, the SVM model based on Chi-squared statistic for opinion mining shows the most superior performance among SVM models with 4 different kinds of terms extraction method. Our proposed opinion mining model is expected to improve customer service and gain competitive advantage in online store.

Efficient Implementation of SVM-Based Speech/Music Classifier by Utilizing Temporal Locality (시간적 근접성 향상을 통한 효율적인 SVM 기반 음성/음악 분류기의 구현 방법)

  • Lim, Chung-Soo;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.49 no.2
    • /
    • pp.149-156
    • /
    • 2012
  • Support vector machines (SVMs) are well known for their pattern recognition capability, but proper care should be taken to alleviate their inherent implementation cost resulting from high computational intensity and memory requirement, especially in embedded systems where only limited resources are available. Since the memory requirement determined by the dimensionality and the number of support vectors is generally too high for a cache in embedded systems to accomodate, frequent accesses to the main memory occur inevitably whenever the cache is not able to provide requested data to the processor. These frequent accesses to the main memory result in overall performance degradation and increased energy consumption because a memory access typically takes longer and consumes more energy than a cache access or a register access. In this paper, we propose a technique that reduces the number of main memory accesses by optimizing the data access pattern of the SVM-based classifier in such a way that the temporal locality of the accesses increases, fully utilizing data loaded into the processor chip. With experiments, we confirm the enhancement made by the proposed technique in terms of the number of memory accesses, overall execution time, and energy consumption.

Spatiotemporal Removal of Text in Image Sequences (비디오 영상에서 시공간적 문자영역 제거방법)

  • Lee, Chang-Woo;Kang, Hyun;Jung, Kee-Chul;Kim, Hang-Joon
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.41 no.2
    • /
    • pp.113-130
    • /
    • 2004
  • Most multimedia data contain text to emphasize the meaning of the data, to present additional explanations about the situation, or to translate different languages. But, the left makes it difficult to reuse the images, and distorts not only the original images but also their meanings. Accordingly, this paper proposes a support vector machines (SVMs) and spatiotemporal restoration-based approach for automatic text detection and removal in video sequences. Given two consecutive frames, first, text regions in the current frame are detected by an SVM-based texture classifier Second, two stages are performed for the restoration of the regions occluded by the detected text regions: temporal restoration in consecutive frames and spatial restoration in the current frame. Utilizing text motion and background difference, an input video sequence is classified and a different temporal restoration scheme is applied to the sequence. Such a combination of temporal restoration and spatial restoration shows great potential for automatic detection and removal of objects of interest in various kinds of video sequences, and is applicable to many applications such as translation of captions and replacement of indirect advertisements in videos.

Optimization-based method for structural damage detection with consideration of uncertainties- a comparative study

  • Ghiasi, Ramin;Ghasemi, Mohammad Reza
    • Smart Structures and Systems
    • /
    • v.22 no.5
    • /
    • pp.561-574
    • /
    • 2018
  • In this paper, for efficiently reducing the computational cost of the model updating during the optimization process of damage detection, the structural response is evaluated using properly trained surrogate model. Furthermore, in practice uncertainties in the FE model parameters and modelling errors are inevitable. Hence, an efficient approach based on Monte Carlo simulation is proposed to take into account the effect of uncertainties in developing a surrogate model. The probability of damage existence (PDE) is calculated based on the probability density function of the existence of undamaged and damaged states. The current work builds a framework for Probability Based Damage Detection (PBDD) of structures based on the best combination of metaheuristic optimization algorithm and surrogate models. To reach this goal, three popular metamodeling techniques including Cascade Feed Forward Neural Network (CFNN), Least Square Support Vector Machines (LS-SVMs) and Kriging are constructed, trained and tested in order to inspect features and faults of each algorithm. Furthermore, three wellknown optimization algorithms including Ideal Gas Molecular Movement (IGMM), Particle Swarm Optimization (PSO) and Bat Algorithm (BA) are utilized and the comparative results are presented accordingly. Furthermore, efficient schemes are implemented on these algorithms to improve their performance in handling problems with a large number of variables. By considering various indices for measuring the accuracy and computational time of PBDD process, the results indicate that combination of LS-SVM surrogate model by IGMM optimization algorithm have better performance in predicting the of damage compared with other methods.

Application of groundwater-level prediction models using data-based learning algorithms to National Groundwater Monitoring Network data (자료기반 학습 알고리즘을 이용한 지하수위 변동 예측 모델의 국가지하수관측망 자료 적용에 대한 비교 평가 연구)

  • Yoon, Heesung;Kim, Yongcheol;Ha, Kyoochul;Kim, Gyoo-Bum
    • The Journal of Engineering Geology
    • /
    • v.23 no.2
    • /
    • pp.137-147
    • /
    • 2013
  • For the effective management of groundwater resources, it is necessary to predict groundwater level fluctuations in response to rainfall events. In the present study, time series models using artificial neural networks (ANNs) and support vector machines (SVMs) have been developed and applied to groundwater level data from the Gasan, Shingwang, and Cheongseong stations of the National Groundwater Monitoring Network. We designed four types of model according to input structure and compared their performances. The results show that the rainfall input model is not effective, especially for the prediction of groundwater recession behavior; however, the rainfall-groundwater input model is effective for the entire prediction stage, yielding a high model accuracy. Recursive prediction models were also effective, yielding correlation coefficients of 0.75-0.95 with observed values. The prediction errors were highest for Shingwang station, where the cross-correlation coefficient is lowest among the stations. Overall, the model performance of SVM models was slightly higher than that of ANN models for all cases. Assessment of the model parameter uncertainty of the recursive prediction models, using the ratio of errors in the validation stage to that in the calibration stage, showed that the range of the ratio is much narrower for the SVM models than for the ANN models, which implies that the SVM models are more stable and effective for the present case studies.

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.29-45
    • /
    • 2012
  • Bond rating is regarded as an important event for measuring financial risk of companies and for determining the investment returns of investors. As a result, it has been a popular research topic for researchers to predict companies' credit ratings by applying statistical and machine learning techniques. The statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have been traditionally used in bond rating. However, one major drawback is that it should be based on strict assumptions. Such strict assumptions include linearity, normality, independence among predictor variables and pre-existing functional forms relating the criterion variablesand the predictor variables. Those strict assumptions of traditional statistics have limited their application to the real world. Machine learning techniques also used in bond rating prediction models include decision trees (DT), neural networks (NN), and Support Vector Machine (SVM). Especially, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that can maximize the margin between two categories. SVM is simple enough to be analyzed mathematical, and leads to high performance in practical applications. SVM implements the structuralrisk minimization principle and searches to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum and thus, overfitting is unlikely to occur with SVM. In addition, SVM does not require too many data sample for training since it builds prediction models by only using some representative sample near the boundaries called support vectors. A number of experimental researches have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can be potential causes for degrading SVM's performance. First, SVM is originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification such as One-Against-One, One-Against-All have been proposed, but they do not improve the performance in multi-class classification problem as much as SVM for binary-class classification. Second, approximation algorithms (e.g. decomposition methods, sequential minimal optimization algorithm) could be used for effective multi-class computation to reduce computation time, but it could deteriorate classification performance. Third, the difficulty in multi-class prediction problems is in data imbalance problem that can occur when the number of instances in one class greatly outnumbers the number of instances in the other class. Such data sets often cause a default classifier to be built due to skewed boundary and thus the reduction in the classification accuracy of such a classifier. SVM ensemble learning is one of machine learning methods to cope with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing weight on the misclassified observations through iterations. The observations that are incorrectly predicted by previous classifiers are chosen more often than examples that are correctly predicted. Thus Boosting attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of the misclassified observations of the minority class. This paper proposes a multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve multiclass prediction problem. Since MGM-Boost introduces the notion of geometric mean into AdaBoost, it can perform learning process considering the geometric mean-based accuracy and errors of multiclass. This study applies MGM-Boost to the real-world bond rating case for Korean companies to examine the feasibility of MGM-Boost. 10-fold cross validations for threetimes with different random seeds are performed in order to ensure that the comparison among three different classifiers does not happen by chance. For each of 10-fold cross validation, the entire data set is first partitioned into tenequal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, cross-validated folds have been tested independently of each algorithm. Through these steps, we have obtained the results for classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows the higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%)in terms of geometric mean-based prediction accuracy. T-test is used to examine whether the performance of each classifiers for 30 folds is significantly different. The results indicate that performance of MGM-Boost is significantly different from AdaBoost and SVM classifiers at 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-classproblems such as bond rating.