• Title/Summary/Keyword: Sequential Pattern

WIS: Weighted Interesting Sequential Pattern Mining with a Similar Level of Support and/or Weight

  • Yun, Un-Il
    • ETRI Journal / v.29 no.3 / pp.336-352 / 2007
  • Sequential pattern mining has become an essential task with broad applications. Most sequential pattern mining algorithms use a minimum support threshold to prune the combinatorial search space. This strategy provides basic pruning; however, it cannot mine correlated sequential patterns with similar support and/or weight levels. If the minimum support is low, many spurious patterns having items with different support levels are found; if the minimum support is high, meaningful sequential patterns with low support levels may be missed. We present a new algorithm, weighted interesting sequential (WIS) pattern mining based on a pattern growth method in which new measures, sequential s-confidence and w-confidence, are suggested. Using these measures, weighted interesting sequential patterns with similar levels of support and/or weight are mined. The WIS algorithm gives a balance between the measures of support and weight, and considers correlation between items within sequential patterns. A performance analysis shows that WIS is efficient and scalable in weighted sequential pattern mining.

Tree-based Navigation Pattern Analysis

  • Choi, Hyun-Jip
    • Communications for Statistical Applications and Methods / v.8 no.1 / pp.271-279 / 2001
  • Sequential pattern discovery is one of the main interests in web usage mining. The technique of sequential pattern discovery attempts to find inter-session patterns such that the presence of a set of items is followed by another item in a time-ordered set of server sessions. In this paper, a tree-based sequential pattern finding method is proposed in order to discover navigation patterns in server sessions. At each learning step, the suggested method learns the navigation patterns of each server session and summarizes them in a modified Rymon's tree.

A Novel Approach for Mining High-Utility Sequential Patterns in Sequence Databases

  • Ahmed, Chowdhury Farhan;Tanbeer, Syed Khairuzzaman;Jeong, Byeong-Soo
    • ETRI Journal / v.32 no.5 / pp.676-686 / 2010
  • Mining sequential patterns is an important research issue in data mining and knowledge discovery with broad applications. However, the existing sequential pattern mining approaches consider only binary frequency values of items in sequences and equal importance/significance values of distinct items. Therefore, they cannot adequately represent many real-world scenarios. In this paper, we propose a novel framework for mining high-utility sequential patterns, which extracts information more applicable to real life from sequence databases with non-binary frequency values of items in sequences and different importance/significance values for distinct items. Moreover, for mining high-utility sequential patterns, we propose two new algorithms: UtilityLevel, a high-utility sequential pattern mining algorithm with a level-wise candidate generation approach, and UtilitySpan, a high-utility sequential pattern mining algorithm with a pattern growth approach. Extensive performance analyses show that our algorithms are very efficient and scalable for mining high-utility sequential patterns.

IMPLEMENTATION OF SUBSEQUENCE MAPPING METHOD FOR SEQUENTIAL PATTERN MINING

  • Trang, Nguyen Thu;Lee, Bum-Ju;Lee, Heon-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / v.2 / pp.627-630 / 2006
  • Sequential pattern mining addresses the problem of discovering the maximal frequent sequences in a given database. In daily and scientific life, sequential data are available and used everywhere in forms such as text, weather data, satellite data streams, business transactions, telecommunications records, experimental runs, DNA sequences, and medical record histories. Discovering sequential patterns can assist users or scientists in predicting upcoming activities, interpreting recurring phenomena, or extracting similarities. To that end, the core of sequential pattern mining is finding the sequences that occur frequently across the data sequences. Besides the discovery of frequent itemsets, sequential pattern mining requires arranging those itemsets into sequences and discovering which of those sequences are frequent. So before mining sequences, the main task is checking whether one sequence is a subsequence of another sequence in the database. In this paper, we implement the subsequence matching method as a preprocessing step for sequential pattern mining. The matched sequences in our implementation are normalized sequences in the form of number chains. The result given by this method is a review of the matching information between the input mapped sequences.

Implementation of Subsequence Mapping Method for Sequential Pattern Mining

  • Trang Nguyen Thu;Lee Bum-Ju;Lee Heon-Gyu;Park Jeong-Seok;Ryu Keun-Ho
    • Korean Journal of Remote Sensing / v.22 no.5 / pp.457-462 / 2006
  • Sequential pattern mining addresses the problem of discovering the maximal frequent sequences in a given database. In daily and scientific life, sequential data are available and used everywhere in forms such as text, weather data, satellite data streams, business transactions, telecommunications records, experimental runs, DNA sequences, and medical record histories. Discovering sequential patterns can assist users or scientists in predicting upcoming activities, interpreting recurring phenomena, or extracting similarities. To that end, the core of sequential pattern mining is finding the sequences that occur frequently across the data sequences. Besides the discovery of frequent itemsets, sequential pattern mining requires arranging those itemsets into sequences and discovering which of those sequences are frequent. So before mining sequences, the main task is checking whether one sequence is a subsequence of another sequence in the database. In this paper, we implement the subsequence matching method as a preprocessing step for sequential pattern mining. The matched sequences in our implementation are normalized sequences in the form of number chains. The result given by this method is a review of the matching information between the input mapped sequences.
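
The subsequence check that this abstract describes as the main preprocessing task can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `is_subsequence` and the choice to represent each sequence as a list of sets of integer item codes (one reading of "number chain") are assumptions made here.

```python
from typing import AbstractSet, Sequence


def is_subsequence(candidate: Sequence[AbstractSet[int]],
                   sequence: Sequence[AbstractSet[int]]) -> bool:
    """Return True if `candidate` is contained in `sequence`.

    Each itemset of `candidate` must be a subset of a distinct itemset of
    `sequence`, and the matched itemsets must appear in the same order.
    Greedy left-to-right matching is sufficient for this containment test.
    """
    pos = 0  # index of the next itemset of `sequence` that may be matched
    for itemset in candidate:
        # Skip itemsets of `sequence` that do not contain the current itemset.
        while pos < len(sequence) and not itemset <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False  # ran out of itemsets before matching everything
        pos += 1  # the matched itemset cannot be reused
    return True


# Hypothetical data: items are already mapped to integer codes.
data_sequence = [{1, 4}, {5}, {2, 3, 6}]
print(is_subsequence([{1}, {2, 3}], data_sequence))  # order preserved
print(is_subsequence([{2}, {1}], data_sequence))     # order violated
```

Counting the data sequences for which this test succeeds gives the support of a candidate pattern.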

Sequential Pattern Mining with Optimization Calling MapReduce Function on MapReduce Framework (맵리듀스 프레임웍 상에서 맵리듀스 함수 호출을 최적화하는 순차 패턴 마이닝 기법)

  • Kim, Jin-Hyun;Shim, Kyu-Seok
    • The KIPS Transactions:PartD / v.18D no.2 / pp.81-88 / 2011
  • Sequential pattern mining, which determines frequent patterns appearing in a given set of sequences, is an important data mining problem with broad applications. For example, sequential pattern mining can find web access patterns, customers' purchase patterns, and DNA sequences related to specific diseases. In this paper, we develop sequential pattern mining algorithms using the MapReduce framework. Our algorithms distribute input data to several machines and find frequent sequential patterns in parallel. With synthetic data sets, we conducted a comprehensive performance study while varying several parameters. Our experimental results show that our algorithms achieve linear speedup as the number of machines increases.
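
The general idea of counting pattern support with map and reduce steps can be illustrated in a single process. The sketch below is not the algorithm of this paper: it only counts length-2 patterns, it does not use any MapReduce library, and the names `map_phase` and `reduce_phase` are hypothetical. In a real deployment the map step would run on separate machines over partitions of the input.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]


def map_phase(sequence: str) -> Set[Pair]:
    """Emit every distinct ordered pair (a, b) with a occurring before b.

    Returning a set means each pair is counted at most once per sequence,
    which matches the usual definition of support.
    """
    return set(combinations(sequence, 2))


def reduce_phase(mapped: Iterable[Set[Pair]], min_support: int) -> Dict[Pair, int]:
    """Sum the per-sequence emissions and keep the frequent pairs."""
    counts: Counter = Counter()
    for pairs in mapped:
        counts.update(pairs)
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical input: each string is one sequence of single-character items.
sequences = ["abc", "acb", "ab"]
frequent = reduce_phase((map_phase(s) for s in sequences), min_support=2)
print(frequent)
```

Because each call to `map_phase` depends only on its own sequence, the map step parallelizes trivially; only the counts have to be combined in the reduce step.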

Finding Weighted Sequential Patterns over Data Streams via a Gap-based Weighting Approach (발생 간격 기반 가중치 부여 기법을 활용한 데이터 스트림에서 가중치 순차패턴 탐색)

  • Chang, Joong-Hyuk
    • Journal of Intelligence and Information Systems / v.16 no.3 / pp.55-75 / 2010
  • Sequential pattern mining aims to discover interesting sequential patterns in a sequence database, and it is one of the essential data mining tasks, widely used in application fields such as Web access pattern analysis, customer purchase pattern analysis, and DNA sequence analysis. In general sequential pattern mining, only the generation order of data elements in a sequence is considered, so it can easily find simple sequential patterns but is limited in finding the more interesting sequential patterns that are widely used in real-world applications. One of the essential research topics that addresses this limitation is weighted sequential pattern mining. In weighted sequential pattern mining, not only the generation order of data elements but also their weights are considered to obtain more interesting sequential patterns. Recently, data has increasingly taken the form of continuous data streams rather than finite stored data sets in various application fields, and the database research community has begun focusing its attention on processing data streams. A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. In data stream processing, each data element should be examined at most once to analyze the data stream, and the memory usage for data stream analysis should be bounded even though new data elements are continuously generated. Moreover, newly generated data elements should be processed as fast as possible to produce an up-to-date analysis result of the data stream, so that it can be used instantly upon request. To satisfy these requirements, data stream processing sacrifices some correctness of its analysis result by allowing some error. Considering the changes in the form of data generated in real-world application fields, much research has been actively performed to find various kinds of knowledge embedded in data streams.
These studies mainly focus on efficient mining of frequent itemsets and sequential patterns over data streams, which have been proven useful in conventional data mining for finite data sets. In addition, mining algorithms have been proposed that efficiently reflect the changes of data streams over time in their mining results. However, they have targeted naively interesting patterns such as frequent patterns and simple sequential patterns, which are found intuitively, and have taken no interest in mining novel interesting patterns that better express the characteristics of the target data streams. Therefore, defining novel interesting patterns and developing a mining method that finds them can be a valuable research topic in the field of mining data streams, and such patterns can be used effectively to analyze recent data streams. This paper proposes a gap-based weighting approach for sequential patterns and a mining method for weighted sequential patterns over sequence data streams based on that approach. A gap-based weight of a sequential pattern can be computed from the gaps of data elements in the sequential pattern without any pre-defined weight information. That is, in this approach, the gaps of data elements in each sequential pattern as well as their generation orders are used to obtain the weight of the sequential pattern, so it can help to find more interesting and useful sequential patterns. Recently, most computer application fields generate data in the form of data streams rather than finite data sets. Considering this change, the proposed method mainly focuses on sequence data streams.

A Fusion of Data Mining Techniques for Predicting Movement of Mobile Users

  • Duong, Thuy Van T.;Tran, Dinh Que
    • Journal of Communications and Networks / v.17 no.6 / pp.568-581 / 2015
  • Predicting locations of users with portable devices such as IP phones, smart-phones, iPads and iPods in public wireless local area networks (WLANs) plays a crucial role in location management and network resource allocation. Many techniques in machine learning and data mining, such as sequential pattern mining and clustering, have been widely used. However, these approaches have two deficiencies. First, because they are based on profiles of individual mobility behaviors, a sequential pattern technique may fail to predict new users or users with movement on novel paths. Second, using similar mobility behaviors in a cluster for predicting the movement of users may cause significant degradation in accuracy owing to indistinguishable regular movement and random movement. In this paper, we propose a novel fusion technique that utilizes mobility rules discovered from multiple similar users by combining clustering and sequential pattern mining. The proposed technique with two algorithms, named the clustering-based-sequential-pattern-mining (CSPM) and sequential-pattern-mining-based-clustering (SPMC), can deal with the lack of information in a personal profile and avoid some noise due to random movements by users. Experimental results show that our approach outperforms existing approaches in terms of efficiency and prediction accuracy.

Sequential pattern load modeling and warning-system plan in modular falsework

  • Peng, Jui-Lin;Wu, Cheng-Lung;Chan, Siu-Lai
    • Structural Engineering and Mechanics / v.16 no.4 / pp.441-468 / 2003
  • This paper investigates the structural behavior of modular falsework systems under sequential pattern loads. Based on studies of 25 construction sites, the pattern load sequence modeling is defined as models R (rectangle), L and U. The study focuses on the system critical loads, the regions of largest reaction forces, the discrepancy between the pattern load and the uniform load, and the warning-system plan. The analysis results show that the critical loads of modular falsework systems with sequential pattern loads are very close to those with the uniform load used in design. The regions of largest reaction forces are smaller than those calculated by the uniform load. However, the regions of largest reaction forces of the three models under sequential pattern loads can be considered as the crucial positions of the warning system based on the measured index of loading. The positions of the sensors for the warning system for these three different models are not identical.

Searching Sequential Patterns by Approximation Algorithm (근사 알고리즘을 이용한 순차패턴 탐색)

  • Sarlsarbold, Garawagchaa;Hwang, Young-Sup
    • Journal of the Korea Society of Computer and Information / v.14 no.5 / pp.29-36 / 2009
  • Sequential pattern mining, which discovers frequent subsequences as patterns in a sequence database, is an important data mining problem with broad applications. Since a sequential pattern in DNA sequences can be a motif, we studied how to find sequential patterns in DNA sequences. Most previously proposed mining algorithms follow exact matching against a sequential pattern definition. In practice, they cannot cope with noisy environments and inaccurate data. These problems occur frequently in DNA sequences, which are biological data. We investigated an approximate matching method to deal with those cases. Our idea is based on the observation that all occurrences of a frequent pattern can be classified into groups, which we call approximated patterns. The existing PrefixSpan algorithm can successfully find sequential patterns in a long sequence. We improved the PrefixSpan algorithm to find approximate sequential patterns. The experimental results showed that the number of repeats from the proposed method was 5 times more than that of PrefixSpan when the pattern length is 4.
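
For orientation, the exact-matching baseline that this abstract builds on can be sketched as a basic PrefixSpan-style pattern growth over strings of single-character items such as DNA bases. This is a simplified illustration under those assumptions; it does not include the approximate matching that the paper adds, and the function name `prefixspan` and the example data are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def prefixspan(sequences: List[str], min_support: int) -> List[Tuple[str, int]]:
    """Return (pattern, support) pairs for all frequent subsequences.

    Each sequence is a string of single-character items. A pattern is
    frequent if it is a (not necessarily contiguous) subsequence of at
    least `min_support` input sequences.
    """
    results: List[Tuple[str, int]] = []

    def grow(prefix: str, projected: List[str]) -> None:
        # Count each item at most once per projected suffix.
        counts: Counter = Counter()
        for suffix in projected:
            counts.update(set(suffix))
        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + item
            results.append((pattern, support))
            # Project on the first occurrence of `item` in each suffix.
            next_projected = [
                suffix[suffix.index(item) + 1:]
                for suffix in projected
                if item in suffix
            ]
            grow(pattern, next_projected)

    grow("", list(sequences))
    return results


# Hypothetical DNA-like input.
print(prefixspan(["ACGT", "ACT", "AGT"], min_support=3))
```

Projecting each suffix past the first occurrence of the chosen item keeps the search confined to sequences that already contain the current prefix, which is what distinguishes pattern growth from level-wise candidate generation.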