• Title/Summary/Keyword: SVM Model


The System Of Microarray Data Classification Using Significant Gene Combination Method based on Neural Network. (신경망 기반의 유전자조합을 이용한 마이크로어레이 데이터 분류 시스템)

  • Park, Su-Young;Jung, Chai-Yeoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.7 / pp.1243-1248 / 2008
  • Recent advances in bioinformatics technology make it possible to run micro-level experiments, so the expression pattern of the whole genome can be observed on a chip and the interactions of thousands of genes can be analyzed at the same time. In this paper, we used cDNA microarrays of 3840 genes obtained from a neuronal differentiation experiment on cortical stem cells of white mice with cancer. After normalization, a significant gene list was extracted with the similar-scale combination method proposed in this paper, and a class classification model was constructed from it. We then analyzed and compared the performance of the existing DT, NB, and SVM classifiers with that of a multi-perceptron neural network classifier combined with the similar-scale combination method. For the 200 genes selected with the combination of PC (Pearson correlation coefficient) and ED (Euclidean distance coefficient), the multi-perceptron neural network classifier reached an accuracy of 98.84%, which indicates improved classification performance over the experiments using the other classifiers.
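
The two similarity measures named in this abstract, PC and ED, can be sketched as follows. This is only an illustration of the two standard measures: the abstract does not state how the paper combines them, so no combination rule is shown, and the function names and toy expression values are assumptions.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical expression values of two genes across four samples.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]
pc = pearson_correlation(gene_a, gene_b)
ed = euclidean_distance(gene_a, gene_b)
print(pc, ed)
```

Note that the two measures can disagree: the toy profiles are perfectly correlated yet far apart in Euclidean terms, which is one reason a selection method might use both.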

A Study on Detection of Small Size Malicious Code using Data Mining Method (데이터 마이닝 기법을 이용한 소규모 악성코드 탐지에 관한 연구)

  • Lee, Taek-Hyun;Kook, Kwang-Ho
    • Convergence Security Journal / v.19 no.1 / pp.11-17 / 2019
  • Recently, the abuse of Internet technology has caused economic and psychological harm to society as a whole. In particular, newly created or modified malicious code bypasses existing information protection systems and serves as a basic means for various application hacking attacks and cyber security threats. However, research on small executable files, which account for a large portion of actual malicious code, is rather limited. In this paper, we propose a model that uses data mining techniques to analyze the characteristics of known small executable files and applies them to detect unknown malicious code. Several data mining techniques were applied, including naive Bayes, SVM, decision tree, random forest, and artificial neural network, and their accuracy was compared according to the VirusTotal detection level. As a result, a classification accuracy of more than 80% was verified on 34,646 analyzed files.

The prediction of appearance of jellyfish through Deep Neural Network (심층신경망을 통한 해파리 출현 예측)

  • HWANG, CHEOLHUN;Han, Myung-Mook
    • Journal of Internet Computing and Services / v.20 no.5 / pp.1-8 / 2019
  • This paper presents a study aimed at reducing damage from jellyfish, whose population has increased due to global warming. The appearance of jellyfish at beaches can result in casualties from jellyfish stings and economic losses from closures. Preceding studies confirmed that the pattern of jellyfish appearance is predictable through machine learning. This paper extends "The prediction model of emergence of Busan coastal jellyfish using SVM". We used a deep neural network to go beyond the existing methods, which predict only the presence of jellyfish, and to classify appearance by index. Because only a small amount of data was collected, prediction accuracy was limited to 84.57%; we sought to overcome this limit through data expansion using bootstrapping. The expanded data showed about 7% higher performance than the original data and about 6% better performance than transfer learning. Finally, we used the test data to confirm the prediction performance for jellyfish appearance. The results confirm that binary classification of jellyfish appearance can be predicted with high accuracy, but prediction by index did not produce meaningful results.
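
Data expansion by bootstrapping, as mentioned in this abstract, generally means resampling the collected records with replacement. The sketch below shows that general idea only; the abstract does not describe the paper's exact procedure, and the function name, record layout, and values are hypothetical.

```python
import random


def bootstrap_expand(samples, target_size, seed=0):
    """Draw `target_size` records with replacement from `samples`."""
    if not samples:
        raise ValueError("samples must be non-empty")
    rng = random.Random(seed)  # fixed seed so the resampling is repeatable
    return [rng.choice(samples) for _ in range(target_size)]


# Hypothetical toy records: (water_temperature_celsius, jellyfish_present)
observations = [(18.5, 0), (21.0, 0), (24.3, 1), (26.1, 1)]
expanded = bootstrap_expand(observations, target_size=12)
print(len(expanded))
```

Resampling should be applied to the training split only; drawing bootstrap copies before splitting would leak duplicated records into the test data.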

Machine Learning-based Classification of Hyperspectral Imagery

  • Haq, Mohd Anul;Rehman, Ziaur;Ahmed, Ahsan;Khan, Mohd Abdul Rahim
    • International Journal of Computer Science & Network Security / v.22 no.4 / pp.193-202 / 2022
  • The classification of hyperspectral imagery (HSI) is essential in Earth surface observation. Because of its large number of contiguous bands, HSI data provide rich information about the object of study; however, they suffer from the curse of dimensionality. Dimensionality reduction is therefore an essential aspect of machine learning classification. Algorithms based on feature extraction can overcome the data dimensionality issue, thereby allowing classifiers to use comprehensive models at reduced computational cost. This paper assesses and compares three HSI classification techniques: the first is based on the Joint Spatial-Spectral Stacked Autoencoder (JSSSA) method, the second on a shallow Artificial Neural Network (SNN), and the third on the SVM model. The JSSSA-based method outperforms the SNN technique in both overall accuracy and Kappa coefficient, reaching an overall accuracy of 96.13% and a Kappa coefficient of 0.95. SNN also achieved a good accuracy of 92.40% and a Kappa coefficient of 0.90, while SVM achieved an accuracy of 82.87%. The study suggests that both the JSSSA-based and SNN-based techniques are efficient methods for hyperspectral classification of snow features. This work classified labeled/ground-truth snow datasets into multiple classes. The labeled/ground-truth data can be valuable for applying deep neural networks such as CNN, hybrid CNN, and RNN to glaciology and snow-related hazard applications.
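
Overall accuracy and the Kappa coefficient, the two metrics reported in this abstract, are both derived from a confusion matrix. The following is a minimal sketch of the standard Cohen's kappa computation; the function name and the toy matrix are assumptions and are not data from the paper.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    observed = sum(confusion[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1:
        return observed, float("nan")  # kappa is undefined in this case
    return observed, (observed - expected) / (1 - expected)


# Hypothetical two-class confusion matrix.
toy_matrix = [[45, 5], [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(toy_matrix)
print(accuracy, kappa)
```

Kappa discounts the agreement that would occur by chance, so it is lower than the raw accuracy whenever chance agreement is nonzero.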

Recognition of Indoor and Outdoor Exercising Activities using Smartphone Sensors and Machine Learning (스마트폰 센서와 기계학습을 이용한 실내외 운동 활동의 인식)

  • Kim, Jaekyung;Ju, YeonHo
    • Journal of Creative Information Culture / v.7 no.4 / pp.235-242 / 2021
  • Recently, many studies on human activity recognition (HAR) using smartphone sensor data have been conducted. HAR can be utilized in various fields, such as life pattern analysis, exercise measurement, and dangerous situation detection. However, research has focused on recognizing basic human behaviors or on efficient battery use. In this paper, exercising activities performed indoors and outdoors are defined and recognized. Data collection and pre-processing are performed so that the defined activities can be recognized by SVM, random forest, and gradient boosting models. In addition, the recognition result is determined by a class voting approach for accuracy and stable performance. As a result, the proposed activities were recognized with high accuracy, and in particular, similar types of indoor and outdoor exercising activities were correctly classified.
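
The abstract does not detail its voting scheme, so the sketch below assumes a plain hard majority vote over the class labels predicted by the three models. The activity labels and prediction lists are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    if not labels:
        raise ValueError("at least one label is required")
    return Counter(labels).most_common(1)[0][0]


def vote_per_sample(model_outputs):
    """Combine per-model prediction lists into one label per sample."""
    return [majority_vote(list(votes)) for votes in zip(*model_outputs)]


# Hypothetical predictions from three models for three samples.
svm_pred = ["run_outdoor", "walk_indoor", "cycle_indoor"]
rf_pred = ["run_outdoor", "walk_outdoor", "cycle_indoor"]
gb_pred = ["run_indoor", "walk_outdoor", "cycle_indoor"]
final = vote_per_sample([svm_pred, rf_pred, gb_pred])
print(final)
```

With an odd number of models and two candidate labels per sample a tie cannot occur; with more labels the first-seen tie-break makes the model order matter.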

Vibration Data Denoising and Performance Comparison Using Denoising Auto Encoder Method (Denoising Auto Encoder 기법을 활용한 진동 데이터 전처리 및 성능비교)

  • Jang, Jun-gyo;Noh, Chun-myoung;Kim, Sung-soo;Lee, Soon-sup;Lee, Jae-chul
    • Journal of the Korean Society of Marine Environment & Safety / v.27 no.7 / pp.1088-1097 / 2021
  • Vibration data from mechanical equipment inevitably contain noise, which adversely affects the maintenance of the equipment. Accordingly, the performance of a learning model depends on how effectively the noise is removed from the data. In this study, noise was removed using the Denoising Auto Encoder (DAE) technique, which does not include a feature extraction step when preprocessing time series data. The performance was compared with that of the Wavelet Transform, which is widely used for machine signal processing. The comparison was conducted by calculating the failure detection rate and, for a more accurate comparison, the F-1 Score, a classification performance evaluation criterion. Failure data were detected using the One-Class SVM technique. The comparison revealed that the DAE technique performed better than the Wavelet Transform technique in terms of failure diagnosis and error rate.
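
The F-1 Score used as the evaluation criterion here is the harmonic mean of precision and recall. A minimal sketch of the standard binary definition follows; the function name and the toy labels (1 = failure, 0 = normal) are assumptions, not data from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 score for the `positive` class, from true and predicted labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct detections, so precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 marks a failure, 0 marks normal operation.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
detected = [1, 1, 0, 0, 0, 1, 0, 1]
score = f1_score(actual, detected)
print(score)
```

Because failures are usually rare in vibration data, F1 is more informative than plain accuracy, which a detector could inflate by predicting "normal" for everything.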

A Study on the prediction of SOH estimation of waste lithium-ion batteries based on SVM model (서포트 벡터 머신 기반 폐리튬이온전지의 건전성(SOH)추정 예측에 관한 연구)

  • KIM SANGBUM;KIM KYUHA;LEE SANGHYUN
    • The Journal of the Convergence on Culture Technology / v.9 no.3 / pp.727-730 / 2023
  • Electric vehicle batteries are used in harsh environments. Their energy density decreases as charging and discharging are repeated, and as their state of health deteriorates due to damage to the internal separator, the vehicle's mileage decreases and the charging speed slows down, so batteries that have been used for about 5 to 10 years are classified as waste batteries. Because the risk of battery fire and explosion also increases, it is essential to diagnose batteries and estimate their SOH. Estimating the current SOH of a battery is very important; it evaluates the state of the battery by measuring the time, temperature, and voltage required while repeatedly charging and discharging the battery, which has disadvantages. In this paper, the discharge capacity (C-rate) was measured using a waste battery from a Tesla car in order to predict the SOH of a lithium-ion battery. A Support Vector Machine (SVM), one of the machine learning models, was applied to the data measured from the waste battery.

Development of Type 2 Prediction Prediction Based on Big Data (빅데이터 기반 2형 당뇨 예측 알고리즘 개발)

  • Hyun Sim;HyunWook Kim
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.18 no.5 / pp.999-1008 / 2023
  • Early prediction of chronic diseases such as diabetes is an important issue, and improving the accuracy of diabetes prediction is especially important. Various machine learning and deep learning-based methodologies are being introduced for diabetes prediction, but these techniques need large amounts of data to outperform other methodologies, and their training cost is high because of complex data models. In this study, we aim to verify the claim that a DNN using the Pima dataset and k-fold cross-validation reduces the efficiency of diabetes diagnosis models. Machine learning classification methods such as decision trees, SVM, random forests, logistic regression, KNN, and various ensemble techniques were used to determine which algorithm produces the best prediction results. After training and testing all classification models, the proposed system gave the best results with the XGBoost classifier combined with the ADASYN method, with an accuracy of 81%, an F1 score of 0.81, and an AUC of 0.84. Additionally, a domain adaptation method was implemented to demonstrate the versatility of the proposed system. An explainable AI approach using the LIME and SHAP frameworks was implemented to understand how the model predicts the final outcome.
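
The k-fold cross-validation mentioned in this abstract partitions the samples into k folds and uses each fold once as the test set. The sketch below shows only the index bookkeeping, without shuffling or stratification; the function name and the sample count are assumptions, and the abstract does not state the value of k the authors used.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs for k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # The first (n_samples % k) folds get one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


# Hypothetical example: 10 samples split into 3 folds.
folds = k_fold_indices(10, 3)
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```

When an oversampling method such as ADASYN is used, it should be applied to the training indices of each fold only, so that synthetic samples never appear in the test fold.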

Proposing the Method for Improving the Forecast Accuracy of Loan Underwriting (대출심사의 예측 정확도 향상을 위한 방법 제안)

  • Yang, Yu-Young;Park, Sang-Sung;Shin, Young-Geun;Jang, Dong-Sik
    • Journal of the Korea Academia-Industrial Cooperation Society / v.11 no.4 / pp.1419-1429 / 2010
  • The structure and environment of the domestic banking industry changed with the influx of large foreign banks and advanced financial products when the currency crisis erupted in Korea. In a competitive environment, accurate forecasts of changes and trends are essential for survival and development. Forecasting whether to approve a customer's loan application is important because it is related to the bank's profit generation and risk management. Therefore, this paper proposes a method to improve the forecast accuracy of loan underwriting. The experimental process is as follows. First, we select the predictor variables that significantly affect the result of loan underwriting through correlation analysis and a feature selection technique, and then cluster the customers with the 2-Step clustering technique based on the selected variables. Second, we find the most accurate forecasting model for each cluster by applying LR, NN, and SVM. Finally, we compare the forecasting accuracy of the proposed method with that of the existing approach.

Digital Modulation Types Recognition using HOS and WT in Multipath Fading Environments (다중경로 페이딩 환경에서 HOS와 WT을 이용한 디지털 변조형태 인식)

  • Park, Cheol-Sun
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.5 / pp.102-109 / 2008
  • In this paper, we propose a robust hybrid modulation type classifier that uses both HOS and WT key features and can recognize 10 digitally modulated signals without a priori information in multipath fading channel conditions. The proposed classifier was developed using data taken from field measurements under various propagation models (i.e., rural area, small town, and urban area) for real-world scenarios. Of the 15 channel data sets in total, 9 are used for supervised training and 6 for testing (i.e., a holdout-like method). The proposed classifier is based on HOS key features because they are relatively robust to signal distortion in AWGN and multipath environments, and it combines WT key features for classifying MQAM (M = 16, 64, 256) signals, which are difficult to classify without an equalization scheme such as AMA (Alphabet Matched Algorithm) or MMA (Multi-modulus Algorithm). To investigate the performance of the proposed classifier, the selected key features are applied to an SVM (Support Vector Machine), which is known to have good classification capability because it maps the input space to a hyperspace for margin maximization. The Pcc (probability of correct classification) of the proposed classifier is higher than that of classifiers using only HOS or WT key features in both training and testing channels. In particular, the Pcc values for MQAM are almost perfect at various SNR levels.