• Title/Summary/Keyword: Model Generalization

GCNXSS: An Attack Detection Approach for Cross-Site Scripting Based on Graph Convolutional Networks

  • Pan, Hongyu;Fang, Yong;Huang, Cheng;Guo, Wenbo;Wan, Xuelin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.12
    • /
    • pp.4008-4023
    • /
    • 2022
  • Since machine learning was introduced into cross-site scripting (XSS) attack detection, many researchers have conducted related studies and achieved significant results, such as saving time and labor costs by eliminating the rule database that traditional XSS attack detection methods must maintain. However, these approaches still suffer from problems such as poor generalization ability and significant false negative rate (FNR) and false positive rate (FPR). Moreover, the automatic clustering property of graph convolutional networks (GCN) has attracted the attention of researchers. In the field of natural language processing (NLP), the results of graph embedding based on GCN are automatically clustered in space without any training, which means that text data can be classified just by the embedding process based on GCN. Previously, other methods required training with labeled data after embedding to complete data classification. Using the GCN auto-clustering feature and labeled data, this research proposes an approach to detect XSS attacks (called GCNXSS) that mines the dependencies between the units that constitute an XSS payload. First, GCNXSS transforms a URL into a homogeneous word graph based on word co-occurrence relationships. Then, GCNXSS inputs the graph into the GCN model for graph embedding and obtains the classification results. Experimental results show that GCNXSS achieved accuracy, precision, recall, F1-score, FNR, FPR, and prediction time of 99.97%, 99.75%, 99.97%, 99.86%, 0.03%, 0.03%, and 0.0461 ms, respectively. Compared with existing methods, GCNXSS has a lower FNR and FPR with stronger generalization ability.
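
The first GCNXSS step, turning a URL into a word co-occurrence graph, can be sketched roughly as follows. The abstract does not specify the tokenizer or the co-occurrence rule, so the regular-expression split, the sliding window of size 3, and the function names `tokenize` and `cooccurrence_graph` are all assumptions made for illustration, not the authors' implementation.

```python
import re
from collections import Counter


def tokenize(url):
    """Split a URL into lower-case alphanumeric tokens (assumed tokenizer)."""
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", url) if t]


def cooccurrence_graph(tokens, window=3):
    """Return {(u, v): count} for distinct tokens that appear together
    within `window` consecutive tokens (assumed co-occurrence rule).

    Edge keys are sorted pairs, so the graph is undirected.
    """
    edges = Counter()
    for i, u in enumerate(tokens):
        # Pair token i with the next (window - 1) tokens.
        for v in tokens[i + 1 : i + window]:
            if u != v:
                edges[tuple(sorted((u, v)))] += 1
    return edges


tokens = tokenize("http://example.com/search?q=<script>alert(1)</script>")
graph = cooccurrence_graph(tokens)
```

The resulting weighted edge list is the kind of structure that would then be converted to an adjacency matrix and passed to a GCN for embedding; that stage is not shown here.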

A Study on Generalization of Security Policies for Enterprise Security Management System (통합보안관리시스템을 위한 보안정책 일반화에 관한 연구)

  • Choi, Hyun-H.;Chung, Tai-M.
    • The KIPS Transactions:PartC
    • /
    • v.9C no.6
    • /
    • pp.823-830
    • /
    • 2002
  • An enterprise security management system, proposed to properly manage heterogeneous security products, is a security management infrastructure designed to avoid needless duplication of management tasks and to interoperate those security products effectively. In this paper, we propose a model of generalized security policies. It is designed to help security managers build invulnerable security policies that can unify various existing management infrastructures of security policies. Its goal is not only to improve security strength and increase management efficiency and convenience but also to make it possible to include different security management infrastructures while building security policies. In the generalization process of security policies, we first diagnose the security status of monitored networks by analyzing security goals, requirements, and security-related information that security agents collect. Next, we decide the security mechanisms and objects for security policies, and then evaluate their appropriateness on the basis of security goals, requirements, and a policy list. With the generalization process, it is possible to integrate heterogeneous security policies and guarantee their integrity by avoiding conflicts or duplications among security policies. Furthermore, it makes it more convenient to manage the many security products existing in large networks.

Geometrically and Topographically Consistent Map Conflation for Federal and Local Governments (Geometry 및 Topology측면에서 일관성을 유지한 방법을 이용한 연방과 지방정부의 공간데이터 융합)

  • Kang, Ho-Seok
    • Journal of the Korean Geographical Society
    • /
    • v.39 no.5 s.104
    • /
    • pp.804-818
    • /
    • 2004
  • As spatial data resources become more abundant, the potential for conflict among them increases. Those conflicts can exist between two or more spatial datasets covering the same area and categories. Therefore, it becomes increasingly important to be able to relate these spatial data sources to one another effectively and then create new spatial datasets with matching geometry and topology. One extensive spatial dataset is the US Census Bureau's TIGER file, which includes census tracts, block groups, and blocks. At present, however, census maps often carry information that conflicts with municipally maintained detailed spatial information. Therefore, in order to fully utilize census maps and their valuable demographic and economic information, the locational information of the census maps must be reconciled with the more accurate municipally maintained reference maps and imagery. This paper formulates a conceptual framework and two map models of map conflation to make source maps geometrically and topologically consistent with the reference maps. The first model is based on the cell model of a map, in which a map is a cell complex consisting of 0-cells, 1-cells, and 2-cells. The second map model is based on a different set of primitive objects that remain homeomorphic even after map generalization. A new hierarchy-based map conflation is also presented that incorporates physical, logical, and mathematical boundaries and reduces the complexity and computational load. Map conflation principles with iteration are formulated, and census maps are used as a conflation example. The principles consist of attribute embedding, finding meaningful nodes, cartographic 0-cell matching, cartographic 1-cell matching, and map transformation.

Edge Computing Model based on Federated Learning for COVID-19 Clinical Outcome Prediction in the 5G Era

  • Ruochen Huang;Zhiyuan Wei;Wei Feng;Yong Li;Changwei Zhang;Chen Qiu;Mingkai Chen
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.4
    • /
    • pp.826-842
    • /
    • 2024
  • As 5G and AI continue to develop, their use in the healthcare industry has surged. The COVID-19 pandemic has posed immense challenges to the global health system. This study proposes an edge computing model based on federated learning (FL) for predicting clinical outcomes of COVID-19 patients during hospitalization. The model aims to address the challenges posed by the pandemic, such as the need for sophisticated predictive models, privacy concerns, and the non-IID nature of COVID-19 data. The model utilizes the FATE framework, known for its privacy-preserving technologies, to enhance predictive precision while ensuring data privacy and effectively managing data heterogeneity. The model's ability to generalize across diverse datasets and its adaptability in real-world clinical settings are highlighted by the use of SHAP values, which streamline the training process by identifying influential features, thus reducing computational overhead without compromising predictive precision. The study demonstrates that the proposed model achieves precision comparable to specific machine learning models when dataset sizes are identical and surpasses traditional models when larger training data volumes are employed. The model's performance is further improved when trained on datasets from diverse nodes, leading to superior generalization and overall performance, especially in scenarios with insufficient node features. The integration of FL with edge computing contributes significantly to the reliable prediction of COVID-19 patient outcomes with greater privacy. The research contributes to healthcare technology by providing a practical solution for early intervention and personalized treatment plans, leading to improved patient outcomes and efficient resource allocation during public health crises.

Simultaneous Optimization of KNN Ensemble Model for Bankruptcy Prediction (부도예측을 위한 KNN 앙상블 모형의 동시 최적화)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.139-157
    • /
    • 2016
  • Bankruptcy involves considerable costs, so it can have significant effects on a country's economy. Thus, bankruptcy prediction is an important issue. Over the past several decades, many researchers have addressed topics associated with bankruptcy prediction. Early research on bankruptcy prediction employed conventional statistical methods such as univariate analysis, discriminant analysis, multiple regression, and logistic regression. Later on, many studies began utilizing artificial intelligence techniques such as inductive learning, neural networks, and case-based reasoning. Currently, ensemble models are being utilized to enhance the accuracy of bankruptcy prediction. Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving the generalization ability of the classifier. Base classifiers in the ensemble must be as accurate and diverse as possible in order to enhance the generalization ability of an ensemble model. Commonly used methods for constructing ensemble classifiers include bagging, boosting, and random subspace. The random subspace method selects a random feature subset for each classifier from the original feature space to diversify the base classifiers of an ensemble. Each ensemble member is trained by a randomly chosen feature subspace from the original feature set, and predictions from each ensemble member are combined by an aggregation method. The k-nearest neighbors (KNN) classifier is robust with respect to variations in the dataset but is very sensitive to changes in the feature space. For this reason, KNN is a good classifier for the random subspace method. The KNN random subspace ensemble model has been shown to be very effective for improving an individual KNN model. 
The k parameter of KNN base classifiers and the feature subsets selected for base classifiers play an important role in determining the performance of the KNN ensemble model. However, few studies have focused on optimizing the k parameter and feature subsets of base classifiers in the ensemble. This study proposed a new ensemble method that improves upon the performance of the KNN ensemble model by optimizing both the k parameters and the feature subsets of base classifiers. A genetic algorithm was used to optimize the KNN ensemble model and improve its prediction accuracy. The proposed model was applied to a bankruptcy prediction problem by using a real dataset from Korean companies. The research data included 1800 externally non-audited firms, of which 900 filed for bankruptcy and 900 did not. Initially, the dataset consisted of 134 financial ratios. Prior to the experiments, 75 financial ratios were selected based on an independent sample t-test of each financial ratio as an input variable and bankruptcy or non-bankruptcy as an output variable. Of these, 24 financial ratios were selected by using a logistic regression backward feature selection method. The complete dataset was separated into two parts: training and validation. The training dataset was further divided into two portions: one for training the model and the other for avoiding overfitting. The prediction accuracy on the latter portion was used as the fitness value in order to avoid overfitting. The validation dataset was used to evaluate the effectiveness of the final model. A 10-fold cross-validation was implemented to compare the performances of the proposed model and other models. To evaluate the effectiveness of the proposed model, its classification accuracy was compared with that of other models. The Q-statistic values and average classification accuracies of base classifiers were investigated.
The experimental results showed that the proposed model outperformed other models, such as the single model and random subspace ensemble model.
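
A KNN random subspace ensemble in which every base classifier has its own feature subset and its own k can be sketched as below. This is a minimal illustration only: the genetic algorithm that the study uses to search over these (feature subset, k) pairs is not shown, and the function names, the binary 0/1 labels, the Euclidean distance, and the majority-vote aggregation are assumptions rather than details taken from the paper.

```python
import numpy as np


def knn_predict(X_train, y_train, X_test, k):
    """Plain k-nearest-neighbors for binary labels (0/1), Euclidean distance."""
    # Pairwise distances: shape (n_test, n_train).
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    # Majority vote among the k neighbors; ties go to class 1.
    return (y_train[nearest].mean(axis=1) >= 0.5).astype(int)


def random_subspace_knn(X_train, y_train, X_test, members):
    """Combine base KNN classifiers by majority vote.

    members: list of (feature_indices, k) pairs, one per base classifier.
    An optimizer such as a genetic algorithm would search over this list.
    """
    votes = np.stack([
        knn_predict(X_train[:, idx], y_train, X_test[:, idx], k)
        for idx, k in members
    ])
    return (votes.mean(axis=0) >= 0.5).astype(int)


# Toy data: class 0 near the origin, class 1 near (10, 10, 10).
X_train = np.array([[0, 0, 0], [1, 0, 1], [0, 1, 0],
                    [10, 10, 10], [9, 10, 9], [10, 9, 10]], dtype=float)
y_train = np.array([0, 0, 0, 1, 1, 1])
X_test = np.array([[0.5, 0.5, 0.5], [9.5, 9.5, 9.5]])

members = [([0, 1], 3), ([1, 2], 1), ([0, 2], 3)]
pred = random_subspace_knn(X_train, y_train, X_test, members)
```

Encoding the ensemble as a list of (feature subset, k) pairs makes it straightforward to treat the whole list as one candidate solution whose fitness is the ensemble's accuracy on held-out data.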

Generalization and implementation of hardening soil constitutive model in ABAQUS code

  • Bo Song;Jun-Yan Liu;Yan Liu;Ping Hu
    • Geomechanics and Engineering
    • /
    • v.36 no.4
    • /
    • pp.355-366
    • /
    • 2024
  • The original elastoplastic Hardening Soil model is formulated only partly under the hexagonal pyramidal Mohr-Coulomb failure criterion and can be used only for specific stress paths. It must be completely generalized under the Mohr-Coulomb criterion before it is used in engineering practice. A set of generalized constitutive equations under this criterion, including shear and volumetric yield surfaces and hardening laws, is proposed for the Hardening Soil model in principal stress space. On the other hand, a Mohr-Coulomb type yield surface in principal stress space comprises six corners and an apex that create singularities for the normal integration approach of constitutive equations. Given the isotropic nature of the material, a technique for processing these singularities by means of Koiter's rule, along with a transformation approach between both stress spaces for both the stress tensor and the consistent stiffness matrix based on the spectral decomposition method, is introduced to develop the generalized Hardening Soil model in the finite element analysis code ABAQUS. The implemented model is verified by comparison with the original simulations of oedometer and triaxial tests, for volumetric and shear hardening respectively. Results from the simulation of the oedometer test show a primary loading curve similar in shape to the original one, while the maximum vertical strain is slightly overestimated, by about 0.5%, probably due to the selection of relationships for cap parameters. In the simulation of the triaxial test, the stress-strain and dilation curves are both in very good agreement with the original curves as well as the test data.

Coprime Factor Reduction of Parameter Varying Controller

  • Saragih, Roberd;Widowati, Widowati
    • International Journal of Control, Automation, and Systems
    • /
    • v.6 no.6
    • /
    • pp.836-844
    • /
    • 2008
  • This paper presents an approach to order reduction of linear parameter varying controllers for polytopic models. Feasible solutions that satisfy the relevant linear matrix inequalities for constructing a full-order parameter varying controller, evaluated at each polytopic vertex, are first found. Next, sufficient conditions are derived for the existence of a right coprime factorization of the parameter varying controller. Furthermore, a singular perturbation approximation for time-invariant systems is generalized to reduce the full-order parameter varying controller via a parameter varying right coprime factorization. This generalization is based on solutions of the parameter varying Lyapunov inequalities. The closed-loop performance resulting from the reduced-order controller is then derived. To examine the performance of the reduced-order parameter varying controller, the proposed method is applied to reduce the vibration of flexible structures having transverse-torsional coupled vibration modes.

Simultaneous optimization method of feature transformation and weighting for artificial neural networks using genetic algorithm : Application to Korean stock market

  • Kim, Kyoung-jae;Ingoo Han
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 1999.10a
    • /
    • pp.323-335
    • /
    • 1999
  • In this paper, we propose a new hybrid model of artificial neural networks (ANNs) and a genetic algorithm (GA) for optimal feature transformation and feature weighting. Previous research proposed several variants of hybrid ANN and GA models, including feature weighting, feature subset selection, and network structure optimization. In the vast majority of these studies, however, ANNs did not learn the patterns of data well because they employed the GA only in a simple way. In this study, we incorporate the GA in a simultaneous manner to improve the learning and generalization ability of ANNs: the GA optimizes feature weighting and feature transformation simultaneously. Globally optimized feature weighting overcomes the well-known limitations of the gradient descent algorithm, and globally optimized feature transformation reduces the dimensionality of the feature space and eliminates irrelevant factors in modeling ANNs. By this procedure, we can improve the performance and enhance the generalizability of ANNs.

Optimal Cognitive System Modeling Using the Stimulus-Response Matrix (자극-반응 행렬을 이용한 인지 시스템 최적화 모델)

  • Choe, Gyeong-Hyeon;Park, Min-Yong;Im, Eun-Yeong
    • Journal of the Ergonomics Society of Korea
    • /
    • v.19 no.1
    • /
    • pp.11-22
    • /
    • 2000
  • In this research report, we present several optimization models for cognitive systems using the stimulus-response matrix (S-R matrix). Stimulus-response matrices are widely used for tabulating results from various experiments and for designing cognition systems in which the recognition and confusability of stimuli are of interest. This paper analyzes the relevant optimization/mathematical programming models. The weaknesses and restrictions of the existing models are resolved by a generalization that considers the average confusion of each subset of stimuli. Also, clustering strategies are used in the extended model to obtain cluster centers in terms of minimal confusion as well as the character of each cluster.

New Analysis on the Generalization of SC Systems for the Reception of M-ary Signals over Rayleigh Fading Channels

  • Yoon Jae-Yeun;Kim Chang-Hwan;Chin Yong-Ok
    • Journal of electromagnetic engineering and science
    • /
    • v.4 no.4
    • /
    • pp.175-182
    • /
    • 2004
  • When an M-ary signal experiences Rayleigh fading, diversity schemes can reduce the effect of fading, since the probability that all the signal components will fade simultaneously is reduced considerably. The symbol error probabilities for various M-ary signals, such as MDPSK (M-ary DPSK) and MPSK (M-ary PSK), are mathematically derived for the Selection Combining 2 (SC-2) and Selection Combining 3 (SC-3) demodulation systems, which require a less complex receiver than Maximum Ratio Combining (MRC). The propagation model used in this paper is the frequency-nonselective slow Rayleigh fading channel corrupted by Additive White Gaussian Noise (AWGN). The numerical results presented in this paper are expected to provide information for the design of radio systems using M-ary modulation methods in the above-mentioned channel environment.
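
The claim that simultaneous fading becomes much less likely with diversity can be made concrete with the textbook result for $L$ independent, identically distributed Rayleigh branches. The symbols $\Gamma$ (average SNR per branch), $\gamma_{\mathrm{th}}$ (fade threshold), and $L$ are introduced here for illustration only. The expression describes classical selection of the single strongest branch, which is not necessarily the SC-2/SC-3 scheme analyzed in the paper.

```latex
% Per-branch instantaneous SNR under Rayleigh fading is exponentially distributed:
%   P(\gamma_i \le \gamma_{\mathrm{th}}) = 1 - e^{-\gamma_{\mathrm{th}}/\Gamma}
% Probability that all L independent branches are in a fade at the same time:
\[
  P\!\left(\max_{1 \le i \le L} \gamma_i \le \gamma_{\mathrm{th}}\right)
  = \left(1 - e^{-\gamma_{\mathrm{th}}/\Gamma}\right)^{L}
\]
% Worked example for \gamma_{\mathrm{th}}/\Gamma = 0.1 (threshold 10 dB below the mean):
%   L = 1: \approx 9.5 \times 10^{-2}
%   L = 2: \approx 9.1 \times 10^{-3}
%   L = 3: \approx 8.6 \times 10^{-4}
```

Each added branch lowers the outage probability by roughly another factor of ten in this example, which is the effect the abstract describes qualitatively.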