• Title/Summary/Keyword: 10-fold cross-validation

PowerShell-based Malware Detection Method Using Command Execution Monitoring and Deep Learning (명령 실행 모니터링과 딥 러닝을 이용한 파워셸 기반 악성코드 탐지 방법)

  • Lee, Seung-Hyeon;Moon, Jong-Sub
    • Journal of the Korea Institute of Information Security & Cryptology / v.28 no.5 / pp.1197-1207 / 2018
  • PowerShell is a command-line shell and scripting language built on the .NET framework. It has several advantages as an attack tool, including built-in support on Windows, easy code concealment and persistence, and various penetration-testing frameworks. Accordingly, PowerShell-based malware is increasing rapidly, but conventional malware detection techniques are limited in coping with it. In this paper, we propose an improved monitoring method to observe commands executed in PowerShell and a deep-learning-based malware classification model that extracts features from commands using a Convolutional Neural Network (CNN) and feeds them to a Recurrent Neural Network (RNN) in order of execution. Tested with 5-fold cross-validation on 1,916 PowerShell-based malware samples collected from a malware sharing site and 38,148 benign scripts released by an obfuscation detection study, the model detects malware with about 97% True Positive Rate (TPR) and 1% False Positive Rate (FPR).
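
The TPR and FPR reported above follow from the confusion-matrix counts of a binary classifier. The sketch below is only an illustration of those two definitions, assuming labels coded as 1 = malicious and 0 = benign; the function name `tpr_fpr` and the toy labels are hypothetical and are not taken from the paper.

```python
def tpr_fpr(y_true, y_pred):
    """Return (TPR, FPR) for binary labels where 1 = malicious, 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malware flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malware missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # TP / all actual positives
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # FP / all actual negatives
    return tpr, fpr


# Toy labels for illustration only (not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
tpr, fpr = tpr_fpr(y_true, y_pred)
print(f"TPR={tpr:.2f} FPR={fpr:.2f}")
```

With a class ratio as skewed as 1,916 to 38,148, reporting TPR and FPR separately is more informative than a single accuracy figure, since each rate is normalized by its own class size.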

A Novel Method for Emotion Recognition based on the EEG Signal using Gradients (EEG 신호 기반 경사도 방법을 통한 감정인식에 대한 연구)

  • Han, EuiHwan;Cha, HyungTai
    • Journal of the Institute of Electronics and Information Engineers / v.54 no.7 / pp.71-78 / 2017
  • There are several algorithms for classifying emotion, such as the support vector machine (SVM) and the Bayesian decision rule. However, many researchers have argued that these methods have minor problems. Therefore, in this paper, we propose a novel method for emotion recognition based on the electroencephalogram (EEG) signal, using the Gradient method proposed by Han. We also use the Database for Emotion Analysis using Physiological Signals (DEAP) to obtain objective data. We acquire four-channel brainwaves, namely Fz ($\alpha$), Fp2 ($\beta$), F3 ($\alpha$), and F4 ($\alpha$), which were selected in a previous study. We use four features, the power spectral densities (PSD) of these channels. In the performance evaluation (4-fold cross-validation), we obtained 85% accuracy on the valence axis and 87.5% on the arousal axis. This is 5-7% higher than existing methods.

Optimal number of dimensions in linear discriminant analysis for sparse data (희박한 데이터에 대한 선형판별분석에서 최적의 차원 수 결정)

  • Shin, Ga In;Kim, Jaejik
    • The Korean Journal of Applied Statistics / v.30 no.6 / pp.867-876 / 2017
  • Datasets with small n and large p are often found in various fields, and their analysis is still a challenge in statistics. Discriminant analysis models for such datasets have recently been developed for classification problems. One approach in those models tries to detect dimensions that distinguish well between groups, and the number of detected dimensions is typically smaller than p. In such models, the number of dimensions is important because it affects the prediction and visualization of data, and it can usually be determined by K-fold cross-validation (CV). However, in sparse data scenarios, CV is not reliable for determining the optimal number of dimensions since there can be only a few observations in each fold. Thus, we propose a method to determine the number of dimensions using a measure based on the standardized distance between the mean values of each group in the reduced dimensions. The proposed method is verified through simulations.
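
The point about K-fold CV leaving only a few observations per fold can be made concrete with a small sketch. The code below is a generic K-fold index splitter, not the authors' procedure; the helper names `k_fold_indices` and `splits` and the sizes (12 observations, 5 folds) are assumptions chosen for illustration.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def splits(n, k, seed=0):
    """Yield (train, test) index lists, one pair per fold."""
    folds = k_fold_indices(n, k, seed)
    for i, test in enumerate(folds):
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test


# Hypothetical small-n setting: 12 observations split into 5 folds.
fold_sizes = [len(test) for _, test in splits(12, 5)]
print(fold_sizes)  # [3, 3, 2, 2, 2]
```

With two groups, each held-out fold here contains roughly one observation per group, so a per-fold error estimate can only take a handful of distinct values, which is the unreliability the abstract describes.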

Hand-held Multimedia Device Identification Based on Audio Source (음원을 이용한 멀티미디어 휴대용 단말장치 판별)

  • Lee, Myung Hwan;Jang, Tae Ung;Moon, Chang Bae;Kim, Byeong Man;Oh, Duk-Hwan
    • Journal of Korea Society of Industrial Information Systems / v.19 no.2 / pp.73-83 / 2014
  • Thanks to the development of diverse audio editing technologies, audio files can be easily modified. As a result, social problems such as forgery may arise. Digital forensic technology is being actively studied to solve these problems. In this paper, we propose a hand-held device identification method, which belongs to the area of digital forensic technology. It uses the noise features of devices, which are caused by the design and integrated circuit of each device but cannot be perceived by listeners. A Wiener filter is used to obtain the noise of each device, acoustic features are extracted from it via MIRtoolbox, and a multi-layer neural network is trained on these features. To evaluate the proposed method, we use 5-fold cross-validation on recorded data collected from 6 mobile devices. The experiments show a performance of 99.9%. We also perform experiments to check whether the noise features of mobile devices are still useful after the data are uploaded to UCC. The experiments show a performance of 99.8% for UCC data.

Transformational Leadership and Depressive Symptoms in Germany: Validation of a Short Transformational Leadership Scale

  • Seegel, Max Leonhard;Herr, Raphael M.;Schneider, Michael;Schmidt, Burkhard;Fischer, Joachim E.
    • Journal of Preventive Medicine and Public Health / v.52 no.3 / pp.161-169 / 2019
  • Objectives: The objective of the present study was to validate a shortened transformational leadership (TL) scale (12 items) comprising core TL behaviour and to test the associations of this shortened TL scale with depressive symptoms. Methods: The study used cross-sectional data from 1632 employees of the overall workforce of a middle-sized German company (51.6% men; mean age, 41.35 years; standard deviation, 9.4 years). TL was assessed with the German version of the Transformational Leadership Inventory and depressive symptoms with the Hospital Anxiety and Depression Scale (HADS). The structural validity of the core TL scale was assessed with confirmatory factor analysis. Associations with depressive symptoms were estimated with structural equation modelling and adjusted logistic regression. Results: Confirmatory factor analysis and structural equation modelling showed better model fit for the core TL than for the full TL score. Logistic regression revealed 3.61-fold (95% confidence interval [CI], 2.20 to 5.93: women) to 4.46-fold (95% CI, 2.86 to 6.95: men) increased odds of reporting depressive symptoms (HADS score >8) for those in the lowest tertile of reported core TL. Conclusions: The shortened core TL seems to be a valid instrument for research and training purposes in the context of TL and depressive symptoms in employees. Of particular note, men reporting poor TL were more likely to report depressive symptoms.

Predictive Models for the Tourism and Accommodation Industry in the Era of Smart Tourism: Focusing on the COVID-19 Pandemic (스마트관광 시대의 관광숙박업 영업 예측 모형: 코로나19 팬더믹을 중심으로)

  • Yu Jin Jo;Cha Mi Kim;Seung Yeon Son;Mi Jin Noh
    • Smart Media Journal / v.12 no.8 / pp.18-25 / 2023
  • The COVID-19 outbreak in 2020 caused continuous damage worldwide, and the smart tourism industry in particular was hit directly by the closure of air routes and restrictions on going out. At a time when overseas and domestic travel have decreased significantly, the number of tourist hotels that have closed due to continued deficits is increasing. Therefore, in this study, licensing data from the Ministry of Public Administration and Security were collected and visualized to understand the operating status of the tourism and lodging industry. A machine learning classification algorithm was applied to implement a business status prediction model for tourist hotels, the performance of the prediction model was optimized using an ensemble algorithm, and the model was evaluated through 5-fold cross-validation. The survival rate of tourist hotels was predicted to decrease somewhat, but the actual survival rate was found to be no different from before COVID-19. The business status predictions for the hotel industry in this paper can serve as a basis for understanding the operability and development trends of the entire tourism and lodging industry.

Development of Type 2 Diabetes Prediction Algorithm Based on Big Data (빅데이터 기반 2형 당뇨 예측 알고리즘 개발)

  • Hyun Sim;HyunWook Kim
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.5 / pp.999-1008 / 2023
  • Early prediction of chronic diseases such as diabetes is an important issue, and improving the accuracy of diabetes prediction is especially important. Various machine learning and deep learning-based methodologies are being introduced for diabetes prediction, but these techniques require large amounts of data to outperform other methodologies, and their training cost is high because of complex models. In this study, we aim to verify the claim that a DNN using the Pima dataset and k-fold cross-validation reduces the efficiency of diabetes diagnosis models. Machine learning classification methods such as decision trees, SVM, random forests, logistic regression, KNN, and various ensemble techniques were used to determine which algorithm produces the best prediction results. After training and testing all classification models, the proposed system provided the best results with the XGBoost classifier combined with the ADASYN method, with an accuracy of 81%, an F1 score of 0.81, and an AUC of 0.84. Additionally, a domain adaptation method was implemented to demonstrate the versatility of the proposed system. An explainable AI approach using the LIME and SHAP frameworks was implemented to understand how the model predicts the final outcome.

Descriptor-Based Profile Analysis of Kinase Inhibitors to Predict Inhibitory Activity and to Grasp Kinase Selectivity

  • Park, Hyejin;Kim, Kyeung Kyu;Kim, ChangHoon;Shin, Jae-Min;No, Kyoung Tai
    • Bulletin of the Korean Chemical Society / v.34 no.9 / pp.2680-2684 / 2013
  • Protein kinases (PKs) are an important source of drug targets, especially in oncology. With 500 or more kinases in the human genome and only a few kinase inhibitors approved, kinase inhibitor discovery is becoming more and more valuable. Because the discovery of kinase inhibitors with increased selectivity is an important therapeutic concept, many researchers have been trying to address this issue with various methodologies. Although many attempts to predict the activity and selectivity of kinase inhibitors have been made, the issue of selectivity has not yet been resolved. Here, we studied kinase selectivity by generating predictive models and analyzing their descriptors using kinase-profiling data. The 5-fold cross-validation accuracies for the 51 models were between 72.4% and 93.7%, and the ROC values for all 51 models were over 0.7. The phylogenetic tree based on the descriptor distance is quite different from that generated on the basis of sequence alignment.
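
The "ROC values" above presumably refer to the area under the ROC curve (AUC); the abstract does not say so explicitly, so this reading is an assumption. Under that assumption, a minimal sketch of AUC as the probability that a randomly chosen active compound scores higher than a randomly chosen inactive one is shown below. The function name `roc_auc` and the toy scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy scores for illustration only (not data from the paper).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 0.889
```

An AUC of 0.5 corresponds to random ranking, which is why a threshold such as 0.7 is often used as a rough floor for a useful model.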

Ship Detection Using Edge-Based Segmentation and Histogram of Oriented Gradient with Ship Size Ratio

  • Eum, Hyukmin;Bae, Jaeyun;Yoon, Changyong;Kim, Euntai
    • International Journal of Fuzzy Logic and Intelligent Systems / v.15 no.4 / pp.251-259 / 2015
  • In this paper, a ship detection method is proposed that uses edge-based segmentation and the histogram of oriented gradients (HOG) with the ship size ratio. The proposed method can help prevent marine collisions by detecting ships at close range. Furthermore, unlike radar, the method can detect ships that are small and absorb radio waves because it uses a vision-based system. The system performs three operations. First, in the edge-based segmentation part, the foreground is separated from the background and candidates are detected using Sobel edge detection and morphological operations. Second, features are extracted from each detected candidate by employing HOG descriptors with the ship size ratio. Finally, a support vector machine (SVM) verifies whether the candidates are ships. The performance of the method is demonstrated by comparing its results with those of other segmentation methods using eight-fold cross-validation.
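
The first stage above relies on Sobel edge detection. The sketch below shows the standard 3x3 Sobel gradient magnitude on a tiny grayscale grid in plain Python; it is a generic illustration, not the authors' pipeline, and the function name `sobel_magnitude` and the test image are assumptions.

```python
def sobel_magnitude(img):
    """Sobel gradient magnitude for interior pixels of a 2D list; borders stay 0."""
    h, w = len(img), len(img[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # vertical gradient kernel
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


# Toy image: dark region on the left, bright region on the right.
img = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
edges = sobel_magnitude(img)
print(edges[1])  # [0.0, 0.0, 4.0, 4.0, 0.0]
```

The response is nonzero only in the two columns straddling the vertical intensity step. Thresholding such a map and cleaning it with morphological operations would give candidate regions, though the exact thresholds and operations used in the paper are not stated in the abstract.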

Non-destructive assessment of the three-point-bending strength of mortar beams using radial basis function neural networks

  • Alexandridis, Alex;Stavrakas, Ilias;Stergiopoulos, Charalampos;Hloupis, George;Ninos, Konstantinos;Triantis, Dimos
    • Computers and Concrete / v.16 no.6 / pp.919-932 / 2015
  • This paper presents a new method for assessing the three-point-bending (3PB) strength of mortar beams in a non-destructive manner, based on neural network (NN) models. The models are based on the radial basis function (RBF) architecture, and the fuzzy means algorithm is employed for training in order to boost the prediction accuracy. Data for training the models were collected from a series of experiments in which cement mortar beams were subjected to various bending mechanical loads and the resulting pressure stimulated currents (PSCs) were recorded. The input variables to the NN models were then calculated by describing the PSC relaxation process through a generalization of Boltzmann-Gibbs statistical physics, known as non-extensive statistical physics (NESP). The NN predictions were evaluated using k-fold cross-validation and new data that were kept independent from training; the results show that the proposed method can successfully form the basis of a non-destructive tool for assessing bending strength. A comparison with a different NN architecture confirms the superiority of the proposed approach.