Mining Frequent Itemsets with Normalized Weight in Continuous Data Streams

Kim, Young-Hee; Kim, Won-Young; Kim, Ung-Mo

doi:10.3745/JIPS.2010.6.1.079

Journal of Information Processing Systems

Volume 6 Issue 1 / Pages 79-90 / 2010 / 1976-913X (pISSN) / 2092-805X (eISSN)

Korea Information Processing Society (한국정보처리학회)

Kim, Young-Hee (School of Information and Communication Engineering, Sungkyunkwan University);
Kim, Won-Young (School of Information and Communication Engineering, Sungkyunkwan University);
Kim, Ung-Mo (School of Information and Communication Engineering, Sungkyunkwan University)

Published : 2010.03.31

https://doi.org/10.3745/JIPS.2010.6.1.079

Abstract

A data stream is a massive, unbounded sequence of data elements generated continuously at a rapid rate. Because streaming data arrives continuously, knowledge discovery requires algorithms that scan the stream only once. Data mining over data streams should also support a flexible trade-off between processing time and mining accuracy. In many application areas, weighted frequent itemset mining has been suggested as a way to find important frequent itemsets by taking the weights of itemsets into account. In this paper, we present WSFI-Mine (WSFI: Weighted Support Frequent Itemsets), an efficient algorithm for mining frequent itemsets with normalized weights over data streams. We also propose a novel tree structure, the Weighted Support FP-Tree (WSFP-Tree), which stores crucial information about frequent itemsets in compressed form. Empirical results show that our algorithm outperforms the comparative algorithms under the windowed streaming model.
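
To make the setting concrete, the following is a minimal Python sketch of one-scan, sliding-window itemset counting with normalized item weights. It is not WSFI-Mine and does not build a WSFP-Tree; the abstract gives no definitions, so everything here is an illustrative assumption: min-max scaling as the normalization, weighted support taken as window support multiplied by the average normalized weight of the items, and the hypothetical names `normalize_weights`, `SlidingWindowMiner`, and `max_len`.

```python
from collections import deque
from itertools import combinations


def normalize_weights(raw_weights):
    """Min-max scale raw item weights into [0, 1] (an assumed normalization)."""
    lo, hi = min(raw_weights.values()), max(raw_weights.values())
    if hi == lo:
        return {item: 1.0 for item in raw_weights}
    return {item: (w - lo) / (hi - lo) for item, w in raw_weights.items()}


class SlidingWindowMiner:
    """Counts itemsets over the most recent `window_size` transactions.

    A plain dictionary stands in for a compressed tree structure; this is
    a teaching sketch, not the paper's data structure.
    """

    def __init__(self, weights, window_size, max_len=2):
        self.weights = weights          # normalized item weights
        self.window_size = window_size  # number of transactions kept
        self.max_len = max_len          # largest itemset size enumerated
        self.window = deque()
        self.counts = {}

    def _itemsets(self, transaction):
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            yield from combinations(items, k)

    def add(self, transaction):
        """Process one arriving transaction; it is never rescanned later
        except once, when it expires from the window."""
        self.window.append(transaction)
        for itemset in self._itemsets(transaction):
            self.counts[itemset] = self.counts.get(itemset, 0) + 1
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            for itemset in self._itemsets(expired):
                self.counts[itemset] -= 1
                if self.counts[itemset] == 0:
                    del self.counts[itemset]

    def weighted_support(self, itemset):
        """Assumed definition: window support times average item weight."""
        if not self.window:
            return 0.0
        key = tuple(sorted(itemset))
        support = self.counts.get(key, 0) / len(self.window)
        avg_weight = sum(self.weights[i] for i in key) / len(key)
        return support * avg_weight

    def frequent(self, min_wsup):
        """Itemsets whose weighted support meets the threshold."""
        return {s for s in self.counts if self.weighted_support(s) >= min_wsup}


weights = normalize_weights({"a": 10, "b": 5, "c": 0})
miner = SlidingWindowMiner(weights, window_size=3)
for transaction in (["a", "b"], ["a", "c"], ["a", "b", "c"], ["b", "c"]):
    miner.add(transaction)
print(sorted(miner.frequent(0.3)))
```

Enumerating every itemset up to `max_len` keeps the sketch short but grows combinatorially with transaction length, which is the kind of cost a compressed prefix-tree structure is meant to reduce.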
