IMPLEMENTATION OF SUBSEQUENCE MAPPING METHOD FOR SEQUENTIAL PATTERN MINING

  • Trang, Nguyen Thu (Database and Bioinformatics Laboratory, Chungbuk National University)
  • Lee, Bum-Ju (Database and Bioinformatics Laboratory, Chungbuk National University)
  • Lee, Heon-Gyu (Database and Bioinformatics Laboratory, Chungbuk National University)
  • Ryu, Keun-Ho (Database and Bioinformatics Laboratory, Chungbuk National University)
  • Published : 2006.11.02

Abstract

Sequential pattern mining addresses the problem of discovering the maximal frequent sequences in a given database. In daily and scientific life, sequential data are available and used everywhere, in forms such as text, weather data, satellite data streams, business transactions, telecommunications records, experimental runs, DNA sequences, and medical record histories. Discovering sequential patterns can help users and scientists predict upcoming activities, interpret recurring phenomena, or extract similarities. To that end, the core of sequential pattern mining is finding the frequent sequences, that is, the sequences that occur frequently across the data sequences in the database. Besides discovering frequent itemsets, sequential pattern mining requires arranging those itemsets into sequences and determining which of those sequences are frequent. Before mining sequences, therefore, the main task is to check whether one sequence is a subsequence of another sequence in the database. In this paper, we implement the subsequence matching method as a preprocessing step for sequential pattern mining. The sequences matched in our implementation are normalized sequences in the form of number chains. The result produced by this method is an overview of the matching information between the input mapped sequences.
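The subsequence check described above can be illustrated with a minimal sketch. This is not the implementation from the paper: the abstract does not specify the data layout, so the sketch assumes that each normalized sequence is a list of itemsets whose items are integers, and that a candidate is contained in a data sequence when its itemsets can be matched, in order, to supersets in that data sequence. The names `is_subsequence` and `support` are hypothetical.

```python
from typing import FrozenSet, Sequence

# Assumption: an item is an integer code, an itemset is a frozenset of codes,
# and a sequence is an ordered list of itemsets.
Itemset = FrozenSet[int]


def is_subsequence(candidate: Sequence[Itemset], data: Sequence[Itemset]) -> bool:
    """Return True if `candidate` is contained in `data` with order preserved.

    Each itemset of `candidate` must be a subset of a distinct itemset of
    `data`, and the matched itemsets must appear in the same order.
    Greedy left-to-right matching is sufficient for this test.
    """
    pos = 0  # index of the next candidate itemset still to be matched
    for element in data:
        if pos == len(candidate):
            break
        if candidate[pos] <= element:  # subset test on frozensets
            pos += 1
    return pos == len(candidate)


def support(candidate: Sequence[Itemset], database: Sequence[Sequence[Itemset]]) -> int:
    """Count the data sequences in `database` that contain `candidate`."""
    return sum(1 for seq in database if is_subsequence(candidate, seq))


# Small illustrative database (made-up values, not data from the paper).
database = [
    [frozenset({1}), frozenset({2, 3}), frozenset({4})],
    [frozenset({1, 2}), frozenset({3})],
    [frozenset({2}), frozenset({1}), frozenset({3, 4})],
]

forward = [frozenset({1}), frozenset({3})]
backward = [frozenset({3}), frozenset({1})]
```

Under these assumptions, `support(forward, database)` counts every data sequence in which item 1 appears before item 3, while `backward` is not counted anywhere because the order of the itemsets matters. A candidate would then be called frequent when its support reaches a user-chosen minimum threshold.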

Keywords