Search | Korea Science

Sequential Pattern Mining for Intrusion Detection System with Feature Selection on Big Data

Fidalcastro, A;Baburaj, E
- KSII Transactions on Internet and Information Systems (TIIS)
- /
- v.11 no.10
- /
- pp.5023-5038
- /
- 2017
Big data is an emerging technology that deals with a wide range of data sets whose sizes are beyond the capability of commonly used data-processing software tools. In a huge network, a large amount of generated network information must be processed; it consists of both normal and abnormal activity logs in a large volume of multi-dimensional data. An Intrusion Detection System (IDS) is required to monitor the network and to detect malicious nodes and activities in it. The massive amount of data makes it difficult to detect threats and attacks. Sequential pattern mining, which has become a popular trend because it considers the quantities, profits, and time orders of items, may be used to identify the patterns of malicious activities. Here we propose a sequential pattern mining algorithm with fuzzy logic feature selection and fuzzy weighted support for huge volumes of network logs, to be implemented in Apache Hadoop YARN, which solves the problem of speed and time constraints. Fuzzy logic feature selection selects important features from the feature set. Fuzzy weighted support provides weights to the inputs and avoids multiple scans. In our simulation we use the attack log from an NS-2 MANET environment and compare the proposed algorithm with the state-of-the-art sequential pattern mining algorithm SPADE and with a Support Vector Machine in a Hadoop environment.
https://doi.org/10.3837/tiis.2017.10.018 KSCI

Trend-based Sequential Pattern Discovery from Time-Series Data (시계열 데이터로부터의 경향성 기반 순차패턴 탐색)

오용생;이동하;남도원;이전영
- Journal of Intelligence and Information Systems
- /
- v.7 no.1
- /
- pp.27-45
- /
- 2001
Sequential pattern discovery from time-series data has mainly been concerned with events or item sets. Recently, the research has started to be applied to numerical data. An example is sensor information generated by checking a machine state. Numerical data hardly ever have the same values when forming patterns. So it is important to extract a suitable number of pattern features, which can be transformed into events or item sets and applied to sequential pattern mining tasks. The popular methods for extracting the patterns are sliding windows and clustering. The results of these methods are sensitive to the window size or clustering parameters, which forces users to apply the data mining task repeatedly and to interpret the results each time. This paper suggests a method for retrieving pattern features by converting numerical data into vectors of an angle and a magnitude. The pattern features retrieved with this method make the results easy to understand and make finding sequential patterns fast. We define an inclusion relation among pattern features using the angles and magnitudes of the vectors. Using this relation, we can find sequential patterns faster than other methods, which use all the data, by reducing the data size.
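
The abstract above describes turning numerical data into vectors of an angle and a magnitude. A minimal sketch of one way to do this follows; it assumes evenly spaced samples and one vector per pair of consecutive points. The function name `trend_features` and the `dt` parameter are hypothetical, and the paper's own definition may differ.

```python
import math

def trend_features(values, dt=1.0):
    """Convert a numeric series into (angle, magnitude) pairs.

    Assumption: samples are evenly spaced by dt, and each feature is the
    vector from one sample to the next. This is illustrative only.
    """
    features = []
    for prev, curr in zip(values, values[1:]):
        dy = curr - prev
        angle = math.atan2(dy, dt)      # direction of the change, in radians
        magnitude = math.hypot(dt, dy)  # length of the change vector
        features.append((angle, magnitude))
    return features
```

A rising segment gives a positive angle, a flat segment gives an angle of zero, and a falling segment gives a negative angle, so each feature can be discretized into an event-like symbol before mining.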

A Sequential Monte Carlo inference for longitudinal data with bluespotted mud hopper data (짱뚱어 자료로 살펴본 장기 시계열 자료의 순차적 몬테 칼로 추론)

Choi, Il-Su
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.9 no.6
- /
- pp.1341-1345
- /
- 2005
Sequential Monte Carlo techniques are a set of powerful and versatile simulation-based methods for performing optimal state estimation in nonlinear non-Gaussian state-space models. We can use Monte Carlo particle filters adaptively, i.e., so that they simultaneously estimate the parameters and the signal. However, Sequential Monte Carlo techniques require the use of special particle filtering techniques, which suffer from several drawbacks. We consider here an alternative approach combining particle filtering and Sequential Hybrid Monte Carlo. We give some examples of applications in fisheries (bluespotted mud hopper data).
KSCI

WIS: Weighted Interesting Sequential Pattern Mining with a Similar Level of Support and/or Weight

Yun, Un-Il
- ETRI Journal
- /
- v.29 no.3
- /
- pp.336-352
- /
- 2007
Sequential pattern mining has become an essential task with broad applications. Most sequential pattern mining algorithms use a minimum support threshold to prune the combinatorial search space. This strategy provides basic pruning; however, it cannot mine correlated sequential patterns with similar support and/or weight levels. If the minimum support is low, many spurious patterns having items with different support levels are found; if the minimum support is high, meaningful sequential patterns with low support levels may be missed. We present a new algorithm, weighted interesting sequential (WIS) pattern mining based on a pattern growth method in which new measures, sequential s-confidence and w-confidence, are suggested. Using these measures, weighted interesting sequential patterns with similar levels of support and/or weight are mined. The WIS algorithm gives a balance between the measures of support and weight, and considers correlation between items within sequential patterns. A performance analysis shows that WIS is efficient and scalable in weighted sequential pattern mining.
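
The minimum support threshold mentioned in the abstract above can be sketched as follows. This illustrates only the basic support-based pruning that the abstract contrasts WIS with; it does not implement WIS or its s-confidence and w-confidence measures. As a simplifying assumption, each sequence is a list of single items rather than itemsets, and all function names are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if pattern's items occur in sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` consumes the iterator, which enforces the ordering.
    return all(item in remaining for item in pattern)

def support(pattern, database):
    """Fraction of sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database) / len(database)

def prune_by_min_support(candidates, database, min_support):
    """Keep only candidate patterns whose support reaches the threshold."""
    return [p for p in candidates if support(p, database) >= min_support]
```

With a database of four sequences, a pattern contained in three of them has support 0.75 and survives a threshold of 0.5, while a pattern contained in only one has support 0.25 and is pruned.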

Associative Memory Model for Time Series Data (시계열정보 처리를 위한 연상기억 모델)

박철영
- Journal of Korea Society of Industrial Information Systems
- /
- v.6 no.3
- /
- pp.29-34
- /
- 2001
In this paper, a new associative memory system for analog time-sequential data processing is proposed. This system effectively associates time-sequential data by matching not only with present data but also with past data. Furthermore, in order to improve its error correction ability, a weight that varies in the time domain is introduced into the system. The network is simulated with several periodic time-sequential input patterns that include noise. The results show that the proposed system has the ability to correct input errors. We expect that the proposed system may be applied to real-time processing of analog time-sequential information.

Enhanced Robust Cooperative Spectrum Sensing in Cognitive Radio

Zhu, Feng;Seo, Seung-Woo
- Journal of Communications and Networks
- /
- v.11 no.2
- /
- pp.122-133
- /
- 2009
As wireless spectrum resources become scarcer while some portions of the frequency bands suffer from low utilization, cognitive radio (CR) has recently been advocated; it allows secondary users to use licensed bands opportunistically without interfering with primary users. Spectrum sensing is fundamental for a secondary user to find a specific available spectrum hole. Cooperative spectrum sensing is more accurate and more widely used since it obtains helpful reports from nodes in different locations. However, if some nodes are compromised and deliberately report false sensing data to the fusion center, the accuracy of the decisions made by the fusion center can be heavily impaired. The weighted sequential probability ratio test (WSPRT), based on a credit evaluation system that restricts the damage caused by malicious nodes, was proposed to address such a spectrum sensing data falsification (SSDF) attack, at the price of four times more samples. In this paper, we propose two new schemes, named enhanced weighted sequential probability ratio test (EWSPRT) and enhanced weighted sequential zero/one test (EWSZOT), which are robust against SSDF attacks. By incorporating a new weight module and a new test module, both schemes need far fewer samples than WSPRT. Simulation results show that, at comparable error rates, the sampling numbers of EWSPRT and EWSZOT are 40% and 75% lower than that of WSPRT, respectively. We also provide theoretical analysis models to support the performance improvement estimates of the new schemes.
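
For orientation, the schemes above build on the sequential probability ratio test. The sketch below is the plain, unweighted Wald SPRT applied to binary sensing reports; it is not WSPRT, EWSPRT, or EWSZOT, and it has no credit or weight module. The probabilities `p0` and `p1` (chance of a "busy" report under each hypothesis), the error targets `alpha` and `beta`, and the function name are assumptions made for illustration.

```python
import math

def sprt(reports, p0, p1, alpha=0.05, beta=0.05):
    """Plain Wald SPRT on binary reports (1 = 'busy', 0 = 'idle').

    H0: each report is 1 with probability p0; H1: with probability p1.
    Returns (decision, number_of_reports_used).
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this
    llr = 0.0                              # running log-likelihood ratio
    for n, report in enumerate(reports, start=1):
        if report:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", len(reports)
```

The test stops as soon as the accumulated evidence crosses either threshold, which is why the number of samples used, rather than a fixed sample size, is the quantity compared in the abstract.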
KSCI

Efficiency and Minimaxity of Bayes Sequential Procedures in Simple versus Simple Hypothesis Testing for General Nonregular Models

Hyun Sook Oh;Anirban DasGupta
- Journal of the Korean Statistical Society
- /
- v.25 no.1
- /
- pp.95-110
- /
- 1996
We consider the question of efficiency of the Bayes sequential procedure with respect to the optimal fixed sample size Bayes procedure in a simple vs. simple testing problem for data coming from a general nonregular density b(θ)h(x)I(x < θ). Efficiency is defined in two different ways in these calculations. Also, the minimax sequential risk (and minimax sequential strategy) is studied as a function of the cost of sampling.

Learning Multidimensional Sequential Patterns Using Hellinger Entropy Function (Hellinger 엔트로피를 이용한 다차원 연속패턴의 생성방법)

Lee, Chang-Hwan
- The KIPS Transactions:PartB
- /
- v.11B no.4
- /
- pp.477-484
- /
- 2004
Sequential pattern mining generates a set of inter-transaction patterns residing in time-dependent data. This paper proposes a new method for generating sequential patterns using the Hellinger measure. While current methods generate single-dimensional sequential patterns within a single attribute, the proposed method is able to detect multi-dimensional patterns among different attributes. A number of heuristics, based on the characteristics of the Hellinger measure, are proposed to reduce the computational complexity of sequential pattern systems. Some experimental results are presented.
https://doi.org/10.3745/KIPSTB.2004.11B.4.477 KSCI

An Efficient Mining Algorithm for Generating Probabilistic Multidimensional Sequential Patterns (확률적 다차원 연속패턴의 생성을 위한 효율적인 마이닝 알고리즘)

Lee Chang-Hwan
- Journal of KIISE:Software and Applications
- /
- v.32 no.2
- /
- pp.75-84
- /
- 2005
Sequential pattern mining is an important data mining problem with broad applications. While current methods generate sequential patterns within a single attribute, the proposed method is able to detect them among different attributes. By incorporating these additional attributes, the sequential patterns found are richer and more informative to the user. This paper proposes a new method for generating multi-dimensional sequential patterns using the Hellinger entropy measure. Unlike previously used methods, the proposed method can calculate the significance of each sequential pattern. Two theorems are proposed to reduce the computational complexity of the proposed system. The proposed method is tested on some synthesized purchase transaction databases.
KSCI

Parallel and Sequential Implementation to Minimize the Time for Data Transmission Using Steiner Trees

Anand, V.;Sairam, N.
- Journal of Information Processing Systems
- /
- v.13 no.1
- /
- pp.104-113
- /
- 2017
In this paper, we present an approach to transmit data from the source to the destination through a minimal path (least-cost path) in a computer network of n nodes. The motivation behind our approach is to address the problem of finding a minimal path between the source and destination. From the work we have studied, we found that a Steiner tree with bounded Steiner vertices offers a good solution. A novel algorithm to construct a Steiner tree with vertices and bounded Steiner vertices is proposed in this paper. The algorithm finds a path from each source to each destination at a minimum cost and minimum number of Steiner vertices. We propose both the sequential and parallel versions. We also conducted a comparative study of sequential and parallel versions based on time complexity, which proved that parallel implementation is more efficient than sequential.
https://doi.org/10.3745/JIPS.03.0061 KSCI

Search Result 1,105, Processing Time 0.025 seconds
