Contribution to Improve Database Classification Algorithms for Multi-Database Mining

Miloudi, Salim;Rahal, Sid Ahmed;Khiat, Salim;

doi:10.3745/JIPS.04.0075

Journal of Information Processing Systems

Volume 14, Issue 3 / Pages 709-726 / 2018 / 1976-913X (pISSN) / 2092-805X (eISSN)

Korea Information Processing Society (한국정보처리학회)

Miloudi, Salim (Dept. of Computer Science, Faculty of Computer Science and Mathematics, University of Sciences and Technology-Mohamed Boudiaf (USTOMB)) ;
Rahal, Sid Ahmed (Dept. of Computer Science, Faculty of Computer Science and Mathematics, University of Sciences and Technology-Mohamed Boudiaf (USTOMB)) ;
Khiat, Salim (Dept. of Computer Science, Faculty of Computer Science and Mathematics, University of Sciences and Technology-Mohamed Boudiaf (USTOMB))

Received : 2015.11.20
Accepted : 2016.08.31
Published : 2018.06.30

https://doi.org/10.3745/JIPS.04.0075

Abstract

Database classification is an important preprocessing step for multi-database mining (MDM). When a multi-branch company needs to explore its distributed data for decision making, it must first group its databases into clusters of similar databases before analyzing the data. To search for the best classification of a set of n databases, existing algorithms generate from 1 to $(n^2-n)/2$ candidate classifications. Although each candidate classification is included in the next one (i.e., clusters in the current classification are subsets of clusters in the next classification), existing algorithms generate each classification independently, without reusing the clusters of the previous classification. Consequently, they become time-consuming as the number of candidate classifications grows. To overcome this problem, we propose an efficient approach that recasts the classification of multiple databases as the problem of identifying the connected components of an undirected weighted graph. Theoretical analysis and experiments on public databases confirm that our algorithm is more efficient than existing work and that it avoids this growth in execution time.
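The abstract does not spell out the algorithm itself, so the following Python is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical symmetric similarity matrix `sim` and assumes that two databases fall in the same cluster whenever a chain of pairs with similarity at or above the current threshold links them. Each of the $(n^2-n)/2$ database pairs is treated as a weighted edge; edges are added in decreasing order of similarity, and a union-find structure merges the clusters of the previous classification instead of rebuilding them. The function name `candidate_classifications` and the choice of union-find are illustrative assumptions, and the selection of the best classification is left out.

```python
from itertools import combinations


def candidate_classifications(sim):
    """Return (threshold, clusters) for each distinct similarity level.

    sim is a hypothetical symmetric n x n matrix (list of lists) in which
    sim[i][j] is the similarity between databases i and j.
    """
    n = len(sim)
    parent = list(range(n))  # union-find forest: one tree per cluster

    def find(x):
        # Follow parent links to the cluster representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Every pair of databases is an edge of the complete weighted graph,
    # so there are (n^2 - n) / 2 edges in total.
    edges = sorted(
        ((sim[i][j], i, j) for i, j in combinations(range(n), 2)),
        reverse=True,
    )

    results = []
    k = 0
    while k < len(edges):
        level = edges[k][0]
        # Add all edges at this similarity level; this only merges the
        # clusters already built for higher levels.
        while k < len(edges) and edges[k][0] == level:
            _, i, j = edges[k]
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri
            k += 1
        # Read off the connected components as the current classification.
        groups = {}
        for db in range(n):
            groups.setdefault(find(db), []).append(db)
        results.append((level, sorted(groups.values())))
    return results
```

With four databases whose pairwise similarities are 0.8 for (0, 1), 0.6 for (2, 3), and lower for the remaining pairs, the first two classifications returned are {0, 1}, {2}, {3} and then {0, 1}, {2, 3}, which illustrates how each classification is contained in the next.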
