• Title/Summary/Keyword: Cross - Validation

PowerShell-based Malware Detection Method Using Command Execution Monitoring and Deep Learning (명령 실행 모니터링과 딥 러닝을 이용한 파워셸 기반 악성코드 탐지 방법)

  • Lee, Seung-Hyeon;Moon, Jong-Sub
    • Journal of the Korea Institute of Information Security & Cryptology / v.28 no.5 / pp.1197-1207 / 2018
  • PowerShell is a command-line shell and scripting language built on the .NET framework. It has several advantages as an attack tool, including built-in support in Windows, easy code concealment and persistence, and a variety of penetration-testing frameworks. Accordingly, PowerShell-based malware is increasing rapidly, but conventional malware detection techniques have limited ability to cope with it. In this paper, we propose an improved monitoring method for observing commands executed in PowerShell, together with a deep learning based malware classification model that extracts features from the commands using a Convolutional Neural Network (CNN) and feeds them to a Recurrent Neural Network (RNN) in order of execution. The proposed model was tested with 5-fold cross-validation on 1,916 PowerShell-based malware samples collected from a malware sharing site and 38,148 benign scripts released by an obfuscation detection study. The results show that the model detects malware effectively, with about 97% True Positive Rate (TPR) and 1% False Positive Rate (FPR).
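The evaluation protocol named in this abstract (5-fold cross-validation on an imbalanced data set, scored by TPR and FPR) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `stratified_k_fold` and `tpr_fpr` are hypothetical, and stratifying the folds by class is an assumption, since the abstract does not say how the folds were built.

```python
import random


def stratified_k_fold(labels, k=5, seed=0):
    """Split sample indices into k folds with roughly equal class ratios.

    Stratification is an assumption made here because the data set is
    imbalanced (1,916 malicious vs 38,148 benign samples in the abstract).
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def tpr_fpr(y_true, y_pred, positive=1):
    """Return (true positive rate, false positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr
```

In each round, one fold serves as the test set and the other four as training data; the per-fold TPR and FPR values are then averaged.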

Self Introduction Essay Classification Using Doc2Vec for Efficient Job Matching (Doc2Vec 모형에 기반한 자기소개서 분류 모형 구축 및 실험)

  • Kim, Young Soo;Moon, Hyun Sil;Kim, Jae Kyeong
    • Journal of Information Technology Services / v.19 no.1 / pp.103-112 / 2020
  • Job seekers make various efforts to find a good company, and companies attempt to recruit good people. Self-introduction essays are now one of the most widely used steps in the job search process. Companies spend time and money reviewing the numerous self-introduction essays they receive, and job seekers worry about whether their essays will be accepted. This research builds a classification model and conducts experiments to classify self-introduction essays as pass or fail using deep learning and decision tree techniques. Stratified sampling was applied to the real-world data to alleviate the imbalance between passed and failed essays. Documents were embedded using the Doc2Vec method, which was developed from Word2Vec, and were classified using logistic regression. A decision tree model was chosen as the benchmark, and K-fold cross-validation was used for performance evaluation. Across several experiments, the area under the curve (AUC) of PV-DM was better than that of the other Doc2Vec models, i.e., PV-DBOW and Concatenate. Furthermore, PV-DM classifies both passed and failed essays, whereas PV-DBOW cannot classify passed essays even though it classifies failed essays well. In addition, the logistic regression model with PV-DM embeddings outperforms the decision tree-based classification model. The experimental results imply that companies can reduce the cost of recruiting good job seekers. In addition, the suggested model can help job candidates pre-evaluate their self-introduction essays.

Application of Cokriging for the Estimation of Groundwater Level Distribution at the Nanjido Waste Landfill Area (난지도 매립지 일대의 지하수위 분포 추정을 위한 복합 크리깅의 응용)

  • 정상용;이강근
    • Journal of the Korean Society of Groundwater Environment / v.2 no.2 / pp.58-63 / 1995
  • Cokriging was applied to estimate the water levels of the basal leachate and the surrounding groundwater at the Nanjido waste landfill area. When groundwater level is estimated in a high-relief area, using groundwater level and elevation data together gives good results, because groundwater level is correlated with topography. This study determined the best semivariogram model for 87 groundwater levels and 144 elevations through a cross-validation test, and produced contour maps of groundwater levels using ordinary kriging and universal kriging. The two contour maps do not differ much at the waste site because this area has a large number of groundwater level data. However, they differ considerably in the upper left part of the study area, which has high relief and few sample data. They also differ in the southern area near the Han River. When topography is considered for both areas, the cokriging contour map is thought to be closer to the real groundwater distribution than the kriging map.

Implementing the Urban Effect in an Interpolation Scheme for Monthly Normals of Daily Minimum Temperature (도시효과를 고려한 일 최저기온의 월별 평년값 분포 추정)

  • 최재연;윤진일
    • Korean Journal of Agricultural and Forest Meteorology / v.4 no.4 / pp.203-212 / 2002
  • This study was conducted to remove the urban heat island effect embedded in the interpolated surfaces of daily minimum temperature in the Korean Peninsula. Fifty-six standard weather stations are usually used to generate the gridded temperature surface in South Korea. Since most of the weather stations are located in heavily populated and urbanized areas, the observed minimum temperature data are contaminated with the so-called urban heat island effect. Without an appropriate correction, temperature estimates over rural areas or forests might deviate significantly from the actual values. We simulated the spatial pattern of population distribution within any single population reporting district (city or county) by allocating the reported population to the "urban" pixels of a land cover map with a 30 by 30 m spacing. By using this "digital population model" (DPM), we can simulate the horizontal diffusion of the urban effect, which is not possible with the spatially discontinuous population statistics for each city or county. The temperature estimation error from the existing interpolation scheme, which considers both the distance and the altitude effects, was regressed on the DPMs smoothed at 5 different scales, i.e., radial extents of 0.5, 1.5, 2.5, 3.5 and 5.0 km. Optimum regression models were used in conjunction with the distance-altitude interpolation to predict monthly normals of daily minimum temperature in South Korea for the 1971-2000 period. Cross-validation showed a reduction of around 50% in RMSE and MAE over all months compared with the conventional method.
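The cross-validation scores mentioned in this abstract (RMSE and MAE of an interpolation scheme) can be illustrated with a leave-one-out loop over stations. The sketch below assumes leave-one-out validation and substitutes plain inverse distance weighting for the paper's distance-altitude scheme, which the abstract does not specify in detail; all function names are hypothetical.

```python
import math


def idw_predict(target, stations, power=2.0):
    """Inverse-distance-weighted estimate at `target` = (x, y).

    `stations` is a list of (x, y, value) tuples. Plain IDW is a stand-in
    here, not the distance-altitude scheme used in the paper.
    """
    numerator = denominator = 0.0
    for x, y, value in stations:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


def leave_one_out_errors(stations, power=2.0):
    """Predict each station from all the others; return prediction errors."""
    errors = []
    for i, (x, y, value) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        errors.append(idw_predict((x, y), others, power) - value)
    return errors


def rmse(errors):
    """Root mean squared error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)
```

Comparing `rmse` and `mae` for two interpolators on the same stations gives the kind of relative reduction the abstract reports.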

A Melon Fruit Grading Machine Using a Miniature VIS/NIR Spectrometer: 1. Calibration Models for the Prediction of Soluble Solids Content and Firmness

  • Suh, Sang-Ryong;Lee, Kyeong-Hwan;Yu, Seung-Hwa;Shin, Hwa-Sun;Choi, Young-Soo;Yoo, Soo-Nam
    • Journal of Biosystems Engineering / v.37 no.3 / pp.166-176 / 2012
  • Purpose: This study was conducted to investigate the potential of the interactance mode of NIR spectroscopy for estimating the soluble solids content (SSC) and firmness of muskmelons. Methods: Melon samples were taken from local greenhouses in three different harvesting seasons (experiments 1, 2, and 3). The fruit attributes were measured at the 6 points on the equator of each sample where the spectral data were collected. The prediction models were developed using the original spectral data and the spectral data sets preprocessed by 20 methods, and the performance of the models was compared. Results: In the prediction of SSC, the highest cross-validation coefficient of determination ($R_{cv}^2$) was 0.755 (standard error of prediction, SEP = 0.89 $^{\circ}$Brix), obtained with range normalization as preprocessing in experiment 1. The highest coefficient of determination in the robustness tests, $R_{rt}^2$ = 0.650 (SEP = 1.03 $^{\circ}$Brix), was found when the best model of experiment 3 was evaluated with the data set of experiment 2. The best $R_{cv}^2$ for the prediction of firmness was 0.715 (SEP = 3.63 N) when no preprocessing was applied in experiment 1. The highest $R_{rt}^2$ was 0.404 (SEP = 5.30 N) when the best model of experiment 3 was applied to the data set of experiment 1. Conclusions: From the test results, it can be concluded that the interactance mode of VIS/NIR spectroscopy has great potential for measuring the SSC and firmness of thick-skinned muskmelons.

Quality Level Classification of ECG Measured using Non-Constraint Approach (무구속적 방법으로 측정된 심전도의 신뢰도 판별)

  • Kim, Y.J.;Heo, J.;Park, K.S.;Kim, S.
    • Journal of Biomedical Engineering Research / v.37 no.5 / pp.161-167 / 2016
  • Recent technological advances in sensor fabrication and bio-signal processing have enabled non-constraint and non-intrusive measurement of human bio-signals. In particular, non-constraint measurement of ECG makes it possible to estimate various human health parameters such as heart rate. Additionally, non-constraint ECG measurement of wheelchair users provides real-time health parameter information for emergency response. For accurate emergency response with a low false alarm rate, it is necessary to discriminate the quality levels of ECG measured using a non-constraint approach, because health parameters acquired from low-quality ECG yield inaccurate information. Thus, in this study, a machine learning based approach for three-class classification of ECG quality level is suggested. Three sensors are embedded in the back seat, chest belt, and handle of an automatic wheelchair. For the two sensors embedded in the back seat and chest belt, capacitively coupled electrodes were used. The accuracy of quality level classification was estimated using Monte Carlo cross-validation. The proposed approach demonstrated accuracies of 94.01%, 95.57%, and 96.94% for the channels of the three sensors. Furthermore, the implemented algorithm enables classification of user posture by detecting which electrodes are in contact. The accuracy of posture estimation was 94.57%. The proposed algorithm will contribute to non-constraint and robust estimation of the health parameters of wheelchair users.
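Monte Carlo cross-validation, the procedure named in this abstract, repeatedly draws a random train/test split and averages the test accuracy over the repetitions. The sketch below illustrates that idea only: the number of splits, the test fraction, and the one-dimensional nearest-mean classifier are all assumptions standing in for details the abstract does not give.

```python
import random


def monte_carlo_cv(features, labels, train_fn, predict_fn,
                   n_splits=100, test_fraction=0.3, seed=0):
    """Average test accuracy over repeated random train/test splits."""
    rng = random.Random(seed)
    n = len(labels)
    n_test = max(1, int(round(n * test_fraction)))
    accuracies = []
    for _ in range(n_splits):
        indices = list(range(n))
        rng.shuffle(indices)  # a fresh random split on every repetition
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = train_fn([features[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, features[i]) == labels[i]
                      for i in test_idx)
        accuracies.append(correct / n_test)
    return sum(accuracies) / len(accuracies)


def train_nearest_mean(xs, ys):
    """Placeholder classifier: store the mean feature value of each class."""
    sums = {}
    for x, y in zip(xs, ys):
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}


def predict_nearest_mean(means, x):
    """Assign x to the class whose stored mean is closest."""
    return min(means, key=lambda label: abs(x - means[label]))
```

Unlike k-fold cross-validation, the test sets of different repetitions may overlap, and the number of repetitions is chosen independently of the test-set size.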

Dynamic Bayesian Network based Two-Hand Gesture Recognition (동적 베이스망 기반의 양손 제스처 인식)

  • Suk, Heung-Il;Sin, Bong-Kee
    • Journal of KIISE:Software and Applications / v.35 no.4 / pp.265-279 / 2008
  • The idea of using hand gestures for human-computer interaction is not new and has been studied intensively during the last decade, with a significant amount of qualitative progress that has nevertheless fallen short of expectations. This paper describes a dynamic Bayesian network (DBN) based approach to both two-hand and one-hand gestures. Unlike wired glove-based approaches, the success of camera-based methods depends greatly on the image processing and feature extraction results. The proposed DBN-based inference is therefore preceded by fail-safe steps of skin extraction and modeling, and motion tracking. A new gesture recognition model for a set of one-hand and two-hand gestures is then proposed based on the dynamic Bayesian network framework, which makes it easy to represent the relationships among features and to incorporate new information into a model. In an experiment with ten isolated gestures, we obtained a recognition rate upwards of 99.59% with cross-validation. The proposed model and the related approach are believed to have strong potential for successful application to other related problems such as sign languages.

Determination of Sasang Constitution from Artery Pulse Waves (요골 맥파를 이용한 사상체질 판별)

  • Cho, Jae Kyong
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.2 / pp.359-365 / 2020
  • Sasang Constitution data classified by the QSCCII (Questionnaire for the Sasang Constitution Classification II) and artery pulse waves at Chon, Guan, and Chuck, measured using an electronic manometer, were obtained from 732 subjects who visited an oriental hospital. The pulse width, peak height, and number of peaks were extracted from the pulse waves as feature variables. Validity and reliability analyses were performed on the feature variables, and those with high validity and reliability were selected as the discriminant variables. The pulse wave data were divided into training and prediction samples by applying a fivefold cross-validation technique. Discriminant analysis was performed on the training sample to obtain discriminant functions. The discriminant functions were then applied to the prediction sample to predict the Sasang Constitution. The accuracy of prediction was estimated by comparing the predicted Sasang Constitution with that obtained by QSCCII. The accuracy of the predicted Sasang Constitution before (after) age and sex calibration was 73.6 % (70.4 %), 68.4 % (84.2 %), and 74.2 % (67.7 %) for Taeumin, Soumin, and Soyangin, respectively, and 72.5 % (73.8 %) in total.

Structural performance of timber frame joints - Full scale tests and numerical validation

  • Aejaz, S.A.;Dar, A.R.;Bhat, J.A.
    • Structural Engineering and Mechanics / v.74 no.4 / pp.457-470 / 2020
  • The force-resisting ability of a connection has direct implications for the overall response of a timber framed structure to various actions, thereby governing the integrity and safety of such constructions. Many researchers have studied the behavior of timber framed structures by testing full-scale connections in timber frames in order to establish consistent design provisions. However, much of this work has been unidirectional, focusing on a particular connection configuration, with no research output emphasizing the refinement of existing connection details to optimize their performance. In this regard, adding adhesive to dowelled timber connections is an economically effective technique with the potential to improve their performance. This paper therefore presents a comparative study evaluating the performance under tensile loading of various full-scale timber frame nailed connections (Bridled Tenon, Cross Halved, Dovetail Halved and Mortise Tenon) supplemented by adhesive, relative to their nailed-only counterparts. The measured load-deformation values were used to calculate stiffness, load capacity and ductility for both connection forms (with and without adhesive), which were in turn compared with other configurations along with the observed failure modes. The observed load capacity of the tested models was also compared with the design strengths predicted by the National Design Specifications (NDS-2018) for timber construction. Additionally, the experimental behavior was validated by developing non-linear finite element models in ABAQUS. All the results showed that incorporating adhesive is an efficient and economical technique that significantly enhances the performance of various timber nailed connections under tensile action. This research is thus novel in that it not only explores the tensile behavior of different nailed joint configurations common in timber construction but also emphasizes improving them in a logical manner, which makes its approach distinctive.

Application of artificial neural networks to predict total dissolved solids in the river Zayanderud, Iran

  • Gholamreza, Asadollahfardi;Afshin, Meshkat-Dini;Shiva, Homayoun Aria;Nasrin, Roohani
    • Environmental Engineering Research / v.21 no.4 / pp.333-340 / 2016
  • Artificial neural networks, including a Radial Basis Function (RBF) network and a Time Delay Neural Network (TDNN), were used to predict total dissolved solids (TDS) in the river Zayanderud. Water quality parameters in the river for ten years, 2001-2010, were prepared from data monitored by the Isfahan Regional Water Authority. A factor analysis was applied to select the input water quality parameters, which yielded total hardness, bicarbonate, chloride and calcium. Input data to the neural networks were pH, $Na^+$, $Mg^{2+}$, carbonate ($CO_3^{2-}$), $HCO_3^-$, $Cl^-$, $Ca^{2+}$ and total hardness. For the learning process, 5-fold cross-validation was applied. In the best case, the TDNN contained 2 hidden layers of 15 neurons each and the RBF had one hidden layer with 100 neurons. The Mean Squared Error and the Mean Bias Error during the training process were 0.0006 and 0.0603 for the TDNN and 0.0001 and 0.0006 for the RBF neural network, respectively. In the RBF, the coefficient of determination ($R^2$) and the index of agreement (IA) between the observed and predicted data were 0.997 and 0.999, respectively. In the TDNN, the $R^2$ and the IA between the observed and predicted data were 0.957 and 0.985, respectively. The sensitivity analysis showed that the $Ca^{2+}$ and $SO_4^{2-}$ parameters had the greatest effect on the TDS prediction.
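The four scores reported in this abstract can be computed from paired observed and predicted values as sketched below. The definitions are assumptions, since the abstract does not state them: the bias is taken as predicted minus observed, $R^2$ as one minus the ratio of residual to total sum of squares (some studies use the squared correlation instead), and the index of agreement in Willmott's form.

```python
def mean(values):
    return sum(values) / len(values)


def mean_squared_error(observed, predicted):
    """MSE: average of squared prediction errors."""
    return mean([(p - o) ** 2 for o, p in zip(observed, predicted)])


def mean_bias_error(observed, predicted):
    """MBE: average signed error (positive means over-prediction)."""
    return mean([p - o for o, p in zip(observed, predicted)])


def r_squared(observed, predicted):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed form)."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def index_of_agreement(observed, predicted):
    """Willmott's index of agreement, ranging from 0 to 1."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    potential = sum((abs(p - obs_mean) + abs(o - obs_mean)) ** 2
                    for o, p in zip(observed, predicted))
    return 1.0 - ss_res / potential
```

A perfect prediction gives MSE = 0, MBE = 0, $R^2$ = 1 and IA = 1; a constant offset leaves the MBE equal to that offset while lowering both $R^2$ and IA.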