Adaptive Frequent Pattern Algorithm using CAWFP-Tree based on RHadoop Platform

  • Park, In-Kyu (Dept. of Game Software, College of Engineering, Joongbu University)
  • Received : 2017.05.01
  • Accepted : 2017.06.20
  • Published : 2017.06.28

Abstract

An efficient frequent pattern algorithm is essential for mining association rules and for many other mining tasks, and its applications span a very broad spectrum. Models for mining patterns have been proposed that use an FP-tree to store compressed information about frequent patterns. In this paper, we propose a centroid frequent pattern growth algorithm, which we call "CAWFP-Growth", that enhances the FP-Growth algorithm by taking the centroid of the weights and frequencies of the itemsets. Because the conventional maximum weighted support constraint is no longer needed to maintain the downward closure property, the algorithm is more likely to reduce both the search time and the information loss of the frequent patterns. The experimental results show that, by using the centroid of the items, the proposed algorithm achieves better performance than other algorithms without sacrificing accuracy or increasing processing time. A MapReduce framework model is provided to handle large amounts of data in a pseudo-distributed computing environment. The proposed algorithm still needs to be modeled in fully distributed mode.
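
As a rough illustration of the ideas named in the abstract, the sketch below counts item frequencies in a map/reduce style and computes a weighted support for an itemset. It assumes the common definition of weighted support (mean item weight times support). The abstract does not specify the centroid formulation used by CAWFP-Growth, so that is not reproduced here. The function names, transactions, and weights are hypothetical, and the phases run in-process rather than on Hadoop.

```python
from collections import Counter
from itertools import chain


def map_phase(transaction):
    """Emit one (item, 1) pair per distinct item in a transaction."""
    return [(item, 1) for item in set(transaction)]


def reduce_phase(pairs):
    """Sum the counts emitted for each item."""
    counts = Counter()
    for item, n in pairs:
        counts[item] += n
    return dict(counts)


def weighted_support(itemset, transactions, weights):
    """Assumed definition: mean item weight times the fraction of
    transactions that contain every item of the itemset."""
    items = set(itemset)
    support = sum(1 for t in transactions if items <= set(t)) / len(transactions)
    mean_weight = sum(weights[i] for i in items) / len(items)
    return mean_weight * support


# Hypothetical toy data, for illustration only.
transactions = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
weights = {"a": 0.5, "b": 1.0, "c": 0.75}

# Simulate the map and reduce phases locally.
item_counts = reduce_phase(chain.from_iterable(map_phase(t) for t in transactions))
ws_ac = weighted_support(["a", "c"], transactions, weights)
```

In a real Hadoop job the framework would shuffle the emitted pairs by key between the two phases. Here the pairs are simply chained together, which is enough to show the counting logic.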

Keywords

Data Mining; Weighted Frequent Pattern; Centroid; Downward Closure; MapReduce

Acknowledgement

Supported by: Joongbu University
