An Efficient Algorithm for Mining Frequent Closed Itemsets Using Transaction Link Structure

  • Han, Kyong Rok (Department of Industrial Engineering, Hanyang University) ;
  • Kim, Jae Yearn (Department of Industrial Engineering, Hanyang University)
  • Published : 2006.09.30

Abstract

Data mining is the exploration and analysis of large amounts of data to discover meaningful patterns, and association rule mining is one of its most important problems. Recent studies of association rule mining have proposed a closure mechanism: it is no longer necessary to mine all frequent itemsets and their association rules, because mining the frequent closed itemsets and their corresponding rules is sufficient. A number of previous algorithms for mining frequent closed itemsets have been based on items. In this paper, we instead use the transactions themselves and propose an efficient algorithm based on a link structure between transactions. Our experimental results show that the algorithm is faster than previously proposed methods and that it is significantly more efficient for dense databases.
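
To make the notion of a frequent closed itemset concrete, the following is a minimal brute-force sketch; it is not the link-structure algorithm of the paper, which the abstract does not describe in detail. It relies on a standard property: an itemset is closed exactly when it equals the intersection of all transactions that contain it, so every non-empty closed itemset can be reached by intersecting transactions. The function name, the toy database, and the `min_support` threshold are illustrative assumptions.

```python
def frequent_closed_itemsets(transactions, min_support):
    """Brute-force illustration (hypothetical helper, not the paper's method).

    Builds every non-empty intersection of transactions, which yields all
    closed itemsets, then keeps those whose support meets min_support.
    """
    tsets = [frozenset(t) for t in transactions]

    # Start from the transactions themselves and close under intersection.
    closed = set(tsets)
    frontier = set(tsets)
    while frontier:
        new = set()
        for itemset in frontier:
            for t in tsets:
                common = itemset & t
                if common and common not in closed:
                    new.add(common)
        closed |= new
        frontier = new

    # Support = number of transactions that contain the itemset.
    result = {}
    for itemset in closed:
        support = sum(1 for t in tsets if itemset <= t)
        if support >= min_support:
            result[itemset] = support
    return result


# Toy database (assumed for illustration only).
transactions = [
    {"a", "c", "d"},
    {"b", "c", "e"},
    {"a", "b", "c", "e"},
    {"b", "e"},
    {"a", "b", "c", "e"},
]
result = frequent_closed_itemsets(transactions, min_support=2)
for itemset, support in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

On this toy database the frequent itemset {b} is not closed, because every transaction containing b also contains e; its information is carried by the closed itemset {b, e}. This is the sense in which the closed itemsets are sufficient. The sketch can generate a very large number of intersections on real data and is meant only to illustrate the definition.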
