Discovering Frequent Itemsets Reflected User Characteristics Using Weighted Batch based on Data Stream

Discovering Frequent Itemsets That Reflect User Characteristics Using Batch Weights in a Data Stream Environment

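  • 서복일 (School of Electronics and Computer Engineering, Chonnam National University) ;
  • 김재인 (School of Electronics and Computer Engineering, Chonnam National University) ;
  • 황부현 (School of Electronics and Computer Engineering, Chonnam National University)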
  • Received : 2010.12.06
  • Accepted : 2011.01.03
  • Published : 2011.01.28


Because a data stream is unbounded and continuous, it is difficult to discover frequent itemsets from the data as a whole. A specialized data mining method that reflects both the properties of the data and the requirements of users is therefore needed. In this paper, we propose FIMWB, a method for discovering frequent itemsets that reflects the property that recent events are more important than old ones. The data stream is split into batches according to a given time interval, and our method assigns a weight to each batch, which reflects the user's interest in recent events. FP-Digraph then discovers the frequent itemsets from the result of FIMWB. Experimental results show that FIMWB reduces the generation of useless items, and that FP-Digraph is better suited to real-time environments than a tree-based method (FP-Tree).
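
The batch-weighting idea can be illustrated with a minimal sketch. The abstract does not give the FIMWB weighting function, so the geometric decay used here (`decay ** age`), the function names, the `max_size` limit, and the brute-force itemset enumeration are all assumptions chosen for illustration. This is not the authors' algorithm, and it does not implement FP-Digraph.

```python
from collections import defaultdict
from itertools import combinations


def split_into_batches(events, interval):
    """Group (timestamp, items) events into consecutive batches of `interval` time units."""
    grouped = defaultdict(list)
    for timestamp, items in events:
        grouped[int(timestamp // interval)].append(frozenset(items))
    if not grouped:
        return []
    first, last = min(grouped), max(grouped)
    # Keep empty intervals so that a batch's position reflects its real age.
    return [grouped.get(key, []) for key in range(first, last + 1)]


def weighted_frequent_itemsets(batches, min_support, decay=0.5, max_size=2):
    """Return itemsets whose weighted support ratio is at least `min_support`.

    Assumed weighting (not from the paper): the newest batch has weight 1,
    and the weight is multiplied by `decay` for every step of age.
    """
    newest = len(batches) - 1
    weighted_count = defaultdict(float)
    total_weight = 0.0
    for index, batch in enumerate(batches):
        weight = decay ** (newest - index)
        total_weight += weight * len(batch)
        for transaction in batch:
            for size in range(1, max_size + 1):
                for itemset in combinations(sorted(transaction), size):
                    weighted_count[itemset] += weight
    return {
        itemset: count / total_weight
        for itemset, count in weighted_count.items()
        if count / total_weight >= min_support
    }


# Hypothetical stream: two old transactions containing "a", two recent ones without it.
events = [
    (0, ["a", "b"]),
    (1, ["a", "c"]),
    (10, ["b", "c"]),
    (11, ["b", "c"]),
]
batches = split_into_batches(events, interval=10)
result = weighted_frequent_itemsets(batches, min_support=0.5)
```

In this toy stream, item `a` appears in half of all transactions and would meet a 0.5 threshold on unweighted support, but it occurs only in the older batch, so its weighted support falls to 1/3 and it is dropped. The items `b` and `c` and the pair `(b, c)` remain frequent.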


Supported by: National Research Foundation of Korea

