• Title/Summary/Keyword: Feature Weighting


Matchmaker: Fuzzy Vault Scheme for Weighted Preference (매치메이커: 선호도를 고려한 퍼지 볼트 기법)

  • Purevsuren, Tuvshinkhuu;Kang, Jeonil;Nyang, DaeHun;Lee, KyungHee
    • Journal of the Korea Institute of Information Security & Cryptology, v.26 no.2, pp.301-314, 2016
  • Juels and Sudan's fuzzy vault scheme has been applied in a wide range of research because of its error-tolerance property. However, the fuzzy vault scheme does not consider differences among people's preferences, even though the authors used a movie lover's case as an example in their paper. Separately, to build a secure, high-performance face authentication system, Nyang and Lee introduced the so-called fuzzy face vault, which has a specially designed association structure between face features and an ordinary fuzzy vault so that each face feature can carry a different weight. However, because the underlying feature extraction methods optimize intra-/inter-class differences, we can expect that this face authentication system does not successfully reduce face authentication failures. In this paper, to ensure flexible use of the fuzzy vault scheme, we introduce the bucket structure, which implements the weighting idea of Nyang and Lee's face authentication system in a different way, and three distribution functions, which formalize the relation between a user's preference weights and the system implementation. In addition, we propose a matchmaker scheme based on them and confirm its computational performance using a movie database.

A Study on Random Selection of Pooling Operations for Regularization and Reduction of Cross Validation (정규화 및 교차검증 횟수 감소를 위한 무작위 풀링 연산 선택에 관한 연구)

  • Ryu, Seo-Hyeon
    • Journal of the Korea Academia-Industrial cooperation Society, v.19 no.4, pp.161-166, 2018
  • In this paper, we propose a method for the random selection of pooling operations for the regularization and reduction of cross validation in convolutional neural networks. The pooling operation in convolutional neural networks is used to reduce the size of the feature map and for its shift invariant properties. In the existing pooling method, one pooling operation is applied in each pooling layer. Because this method fixes the convolution network, the network suffers from overfitting, which means that it excessively fits the models to the training samples. In addition, to find the best combination of pooling operations to maximize the performance, cross validation must be performed. To solve these problems, we introduce the probability concept into the pooling layers. The proposed method does not select one pooling operation in each pooling layer. Instead, we randomly select one pooling operation among multiple pooling operations in each pooling region during training, and for testing purposes, we use probabilistic weighting to produce the expected output. The proposed method can be seen as a technique in which many networks are approximately averaged using a different pooling operation in each pooling region. Therefore, this method avoids the overfitting problem, as well as reducing the amount of cross validation. The experimental results show that the proposed method can achieve better generalization performance and reduce the need for cross validation.
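The train/test behaviour described in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes only two candidate operations (max and average), non-overlapping square pooling regions, and fixed selection probabilities. The function name `mixed_pool` and its parameters are hypothetical.

```python
import random


def _max_pool(region):
    return max(region)


def _avg_pool(region):
    return sum(region) / len(region)


# Candidate pooling operations (assumption: only max and average).
OPS = [_max_pool, _avg_pool]


def mixed_pool(fmap, size=2, probs=(0.5, 0.5), training=True, rng=random):
    """Pool a 2-D feature map (list of lists) over non-overlapping regions.

    Training: one operation is drawn at random for every pooling region.
    Testing: the outputs of all operations are averaged, weighted by the
    selection probabilities, to give the expected output.
    """
    rows, cols = len(fmap), len(fmap[0])
    out = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            region = [fmap[r + i][c + j]
                      for i in range(size) for j in range(size)]
            if training:
                op = rng.choices(OPS, weights=probs, k=1)[0]
                out_row.append(op(region))
            else:
                out_row.append(
                    sum(p * op(region) for p, op in zip(probs, OPS)))
        out.append(out_row)
    return out


fmap = [[1, 2], [3, 4]]
train_out = mixed_pool(fmap, training=True, rng=random.Random(0))
test_out = mixed_pool(fmap, training=False)  # 0.5 * 4 + 0.5 * 2.5 = 3.25
```

Drawing the operation per region rather than per layer is what lets a single trained network stand in for many pooling configurations at once.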

Optimal supervised LSA method using selective feature dimension reduction (선택적 자질 차원 축소를 이용한 최적의 지도적 LSA 방법)

  • Kim, Jung-Ho;Kim, Myung-Kyu;Cha, Myung-Hoon;In, Joo-Ho;Chae, Soo-Hoan
    • Science of Emotion and Sensibility, v.13 no.1, pp.47-60, 2010
  • Most classification research has used kNN (k-Nearest Neighbor) and SVM (Support Vector Machine), known as learning-based models, and the Bayesian classifier and NNA (Neural Network Algorithm), known as statistics-based methods. However, these approaches face space and time limitations when classifying the very large number of web pages on today's Internet. Moreover, most classification studies use a unigram feature representation, which does not capture the real meaning of words well. Korean web page classification poses an additional problem because many Korean words have multiple meanings (polysemy). For these reasons, LSA (Latent Semantic Analysis) has been proposed for classification in such environments (large data sets and polysemous words). LSA uses SVD (Singular Value Decomposition), which decomposes the original term-document matrix into three matrices and reduces their dimensions. This yields a new low-dimensional semantic space for representing vectors, which makes classification efficient and allows the latent meaning of words or documents (or web pages) to be analyzed. Although LSA is good at classification, it has a drawback: when SVD reduces the dimensions of the matrix and creates the new semantic space, it considers which dimensions represent the vectors well, not which dimensions discriminate between them well. This is why LSA does not improve classification performance as much as expected. In this paper, we propose a new LSA that selects the optimal dimensions for both discriminating and representing vectors, minimizing this drawback and improving performance. The proposed method shows better and more stable performance than other LSA methods in low-dimensional spaces. In addition, we obtain further improvement in classification by creating and selecting features through stopword removal and statistical weighting of specific values.

Research on text mining based malware analysis technology using string information (문자열 정보를 활용한 텍스트 마이닝 기반 악성코드 분석 기술 연구)

  • Ha, Ji-hee;Lee, Tae-jin
    • Journal of Internet Computing and Services, v.21 no.1, pp.45-55, 2020
  • With the development of information and communication technology, the number of new and variant malicious codes is increasing rapidly every year, and various types of malicious code are spreading with the growth of Internet of Things and cloud computing technology. In this paper, we propose a malware analysis method based on string information, which can be used regardless of the operating system environment and which represents library call information related to malicious behavior. Attackers can easily create malware by reusing existing code or by using automated authoring tools, and the generated malware operates in a similar way to existing malware. Since most of the strings that can be extracted from malicious code carry information closely related to malicious behavior, they are processed by weighting data features with a text-mining-based method so that they can serve as effective features for malware analysis. Based on the processed data, models are constructed using various machine learning algorithms, and experiments are performed on malware detection and on classification of malware groups. The data were compared and verified against all files used on Windows and Linux operating systems. The accuracy of malware detection is about 93.5%, and the accuracy of group classification is about 90%. The proposed technique has a wide range of applications because it is relatively simple and fast, and because it is operating-system independent as a single model: there is no need to build a separate model for each group when classifying malware groups. In addition, since the string information is extracted through static analysis, it can be processed faster than analysis methods that directly execute the code.
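The abstract does not say which text-mining weighting is used, so the sketch below assumes plain TF-IDF as one common choice. It treats each file as a list of extracted strings; the function name `tfidf_weights` and the sample strings are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(samples):
    """Return one {string: weight} dict per sample.

    `samples` is a list of string lists, one list per analysed file.
    TF-IDF is an assumption here, not the weighting stated in the paper.
    """
    n = len(samples)
    # Document frequency: in how many samples does each string occur?
    df = Counter()
    for strings in samples:
        df.update(set(strings))

    weights = []
    for strings in samples:
        tf = Counter(strings)
        total = len(strings)
        weights.append({
            s: (count / total) * math.log(n / df[s])
            for s, count in tf.items()
        })
    return weights


# Hypothetical strings extracted by static analysis of two files.
samples = [
    ["CreateRemoteThread", "kernel32.dll"],
    ["kernel32.dll", "printf"],
]
weights = tfidf_weights(samples)
```

A string that appears in every sample, such as `kernel32.dll` here, gets weight zero, while strings specific to a few samples are emphasised. The resulting vectors could then be fed to any standard classifier.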

A Comprehensive Groundwater Modeling using Multicomponent Multiphase Theory: 1. Development of a Multidimensional Finite Element Model (다중 다상이론을 이용한 통합적 지하수 모델링: 1. 다차원 유한요소 모형의 개발)

  • Joon Hyun Kim
    • Journal of Korea Soil Environment Society, v.1 no.1, pp.89-102, 1996
  • An integrated model is presented to describe underground flow and mass transport using a multicomponent multiphase approach. The comprehensive governing equation is derived by considering the mass and force balances of chemical species over four phases (water, oil, air, and soil) in a schematic elementary volume. Compact and systematic notations for the relevant variables and equations are introduced to facilitate the inclusion of complex migration and transformation processes and variable spatial dimensions. The resulting nonlinear system is solved by a multidimensional finite element code. The developed code, with dynamic array allocation, is flexible enough to work across a wide spectrum of computers, including an IBM ES 9000/900 vector facility, an SP2 cluster machine, Unix workstations, and PCs, for one-, two-, and three-dimensional problems. To reduce computation time and storage requirements, the system equations are decoupled and solved using a banded global matrix solver, with vector and parallel processing on the IBM 9000. To avoid numerical oscillations in the nonlinear problems when transport is convection-dominated, the techniques of upstream weighting, mass lumping, and element-wise parameter evaluation are applied. The instability and convergence criteria of the nonlinear problems are studied for the one-dimensional analogues of FEM and FDM. The modeling capacity is demonstrated by simulating three-dimensional composite multiphase TCE migration. The comprehensive simulation features of the code are presented in a companion paper in this issue for specific groundwater flow and contamination problems.
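To illustrate why upstream weighting suppresses oscillations in convection-dominated transport, here is a minimal one-dimensional finite-difference sketch. It is not the paper's finite element code: it assumes pure advection with positive velocity, explicit time stepping, and fixed boundary values. The function name `advect` and the weighting parameter `alpha` (0 for central, 1 for fully upstream) are hypothetical.

```python
def advect(conc, courant, alpha, steps):
    """Advance a 1-D concentration profile by explicit pure advection.

    courant: velocity * dt / dx (flow assumed in the +x direction).
    alpha:   upstream weighting factor; 0 = central, 1 = fully upstream.
    The first and last nodes are held fixed as boundary values.
    """
    c = list(conc)
    for _ in range(steps):
        new = c[:]
        for i in range(1, len(c) - 1):
            upstream = c[i] - c[i - 1]
            downstream = c[i + 1] - c[i]
            # Blend the upstream and downstream differences.
            grad = 0.5 * (1 + alpha) * upstream + 0.5 * (1 - alpha) * downstream
            new[i] = c[i] - courant * grad
        c = new
    return c


# A sharp concentration front: 1.0 upstream, 0.0 downstream.
front = [1.0] * 5 + [0.0] * 15

upwinded = advect(front, courant=0.5, alpha=1.0, steps=10)
central = advect(front, courant=0.5, alpha=0.0, steps=1)
```

With `alpha=1.0` each update is a convex combination of neighbouring values, so the profile stays between 0 and 1. With `alpha=0.0` the front overshoots above 1 after a single step, which is the kind of oscillation the abstract refers to. The price of upstream weighting is added numerical diffusion, which smears the front.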

ESD(Exponential Standard Deviation) Band centered at Exponential Moving Average (지수이동평균을 중심으로 하는 ESD밴드)

  • Lee, Jungyoun;Hwang, Sunmyung
    • Journal of Intelligence and Information Systems, v.22 no.2, pp.115-125, 2016
  • The Bollinger Band, which indicates the current price position within the recent price action range, is obtained by adding/subtracting the simple standard deviation (SSD) to/from the simple moving average (SMA). In this paper, we first compare the characteristics of the SMA and the exponential moving average (EMA) from the operator's point of view. A basic equation is obtained between the interval length N of the SMA operator and the weighting factor $\rho$ of the EMA operator that makes the centers of the first-order moments of the two operators' impulse responses identical. For equivalent N and $\rho$, frequency response examples are obtained and compared using the discrete-time Fourier transform. Based on the observation that the SMA operator reacts more excessively than the EMA operator, we propose a novel exponential standard deviation (ESD) band centered at the EMA and derive an auto-recursive formula for the proposed ESD band. Practical examples show that the ESD band has a smoother bound on the price action range than the Bollinger Band. Comparisons are also made on the gap-corrected chart to show the advantage of the ESD band even when gaps occur. Trading techniques developed for the Bollinger Band can be applied straightforwardly to the ESD band.
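A recursive band of this kind can be sketched as below. Several things are assumptions rather than the paper's formulas: $\rho$ is taken as the weight on the previous EMA value, so the EMA impulse response is $(1-\rho)\rho^k$ with centroid $\rho/(1-\rho)$; equating that to the SMA centroid $(N-1)/2$ gives $\rho = (N-1)/(N+1)$. The variance update is a standard exponentially weighted recursion, and the band multiplier `k` and function names are hypothetical.

```python
import math


def rho_from_window(n):
    """EMA weight whose impulse-response centroid matches an N-point SMA.

    Assumes rho multiplies the previous EMA value:
    ema[t] = rho * ema[t-1] + (1 - rho) * price[t].
    """
    return (n - 1) / (n + 1)


def esd_band(prices, rho, k=2.0):
    """Return (lower, ema, upper) per price using recursive updates.

    The variance recursion is a common exponentially weighted form,
    used here as a stand-in for the paper's auto-recursive formula.
    """
    ema = prices[0]
    var = 0.0
    bands = []
    for x in prices:
        diff = x - ema
        ema = ema + (1 - rho) * diff           # same as rho*ema + (1-rho)*x
        var = rho * (var + (1 - rho) * diff * diff)
        sd = math.sqrt(var)
        bands.append((ema - k * sd, ema, ema + k * sd))
    return bands


rho = rho_from_window(19)  # (19 - 1) / (19 + 1) = 0.9
bands = esd_band([100.0, 101.0, 99.5, 102.0, 103.5], rho)
```

Because both the center and the width depend only on the previous state and the newest price, the band needs no sliding window of stored prices, unlike the SMA-based Bollinger Band.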

Zoning Permanent Basic Farmland Based on Artificial Immune System coupling with spatial constraints

  • Hua, Wang;Mengyu, Wang;Yuxin, Zhu;Jiqiang, Niu;Xueye, Chen;Yang, Zhang
    • KSII Transactions on Internet and Information Systems (TIIS), v.15 no.5, pp.1666-1689, 2021
  • The red line of Permanent Basic Farmland is the most important part of the "three-line" demarcation in China's national territorial development plan. The scientific and reasonable delineation of the red line is a major strategic measure being taken by China to improve its ability to safeguard the practical interests of farmers and guarantee national food security. The delineation of Permanent Basic Farmland zoning (DPBFZ) is essentially a multi-objective optimization problem. However, the traditional method of demarcation does not take into account the synergistic development goals of conservation of cultivated land utilization, ecological conservation, or urban expansion. Therefore, this research introduces the idea of artificial immune optimization and proposes a multi-objective model of DPBFZ red line delineation based on a clone selection algorithm. This research proposes an objective function system consisting of three sub-objectives: optimal quality of cropland, spatially concentrated distribution, and stability of cropland. It also takes into consideration constraints such as the red line of ecological protection, topography, and space for major development projects. The formal mathematical expressions for the objectives and constraints are given in the paper, and a multi-objective optimal decision model with multiple constraints for the DPBFZ problem is constructed based on the clone selection algorithm. An antibody coding scheme was designed according to the spatial pattern of DPBFZ zoning. In addition, the antibody-antigen affinity function, the clone mechanism, and the mutation strategy were constructed and improved to solve the DPBFZ problem, which has a spatial optimization character. Finally, Tongxu County in Henan province was selected as the study area, and a controlled experiment was set up according to different target preferences. The results show that the model proposed in this paper is operational in the work of delineating DPBFZ. It not only avoids the adverse effects of subjective factors in the delineation process but also provides decision makers with multiple DPBFZ layout scenarios by adjusting the weighting of the objective function.
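How adjusting the objective weights yields different layout scenarios can be sketched with a simple weighted sum. This is only an illustration: it assumes the three sub-objectives are already normalised to [0, 1] with larger values being better, and it does not reproduce the paper's affinity function or the clone selection search. All names and numbers are hypothetical.

```python
def weighted_affinity(objectives, weights):
    """Weighted average of normalised objective scores.

    objectives: (quality, concentration, stability), each in [0, 1].
    weights:    non-negative preferences for the three objectives.
    """
    total = sum(weights)
    return sum(w * o for w, o in zip(weights, objectives)) / total


def best_layout(candidates, weights):
    """Return the name of the candidate layout with the highest score."""
    return max(candidates,
               key=lambda name: weighted_affinity(candidates[name], weights))


# Hypothetical candidate layouts: (quality, concentration, stability).
candidates = {
    "layout_A": (0.9, 0.4, 0.6),
    "layout_B": (0.5, 0.9, 0.7),
}

quality_first = best_layout(candidates, weights=(0.6, 0.2, 0.2))
compact_first = best_layout(candidates, weights=(0.2, 0.6, 0.2))
```

Shifting weight from cropland quality to spatial concentration changes the preferred layout from `layout_A` to `layout_B`, which is the sense in which different target preferences produce different scenarios.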