Comprehensive review on Clustering Techniques and its application on High Dimensional Data

Alam, Afroj; Muqeem, Mohd; Ahmad, Sultan

doi:10.22937/IJCSNS.2021.21.6.31

International Journal of Computer Science & Network Security

Volume 21, Issue 6, Pages 237-244, 2021, ISSN 1738-7906 (pISSN)

Alam, Afroj (Department of Computer Application Integral University) ;
Muqeem, Mohd (Department of Computer Application Integral University) ;
Ahmad, Sultan (Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University)

Received : 2021.06.05
Published : 2021.06.30

Abstract

Clustering is one of the most powerful unsupervised machine learning techniques for dividing instances into homogeneous groups, called clusters. It is mainly used to generate good-quality clusters through which hidden patterns and knowledge can be discovered in large datasets. It has wide application in fields such as medicine, healthcare, gene expression, image processing, agriculture, fraud detection, and profitability analysis. The goal of this paper is to explore both hierarchical and partitioning clustering and to understand their problems, along with the various approaches to solving them. Among the clustering methods, K-means compares favorably with the others because of its linear time complexity. The paper also focuses on data mining for high-dimensional datasets, the problems that arise with them, and the existing approaches and their relevance.
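
As an illustration of the linear time complexity mentioned above, the following is a minimal sketch of the standard K-means (Lloyd's) iteration in plain Python. It is not taken from the paper; the function name `kmeans`, its parameters, and the fixed iteration count are assumptions chosen for illustration. Each iteration performs about n x k distance evaluations over d dimensions, so the cost grows linearly with the number of instances n when k and d are fixed.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Illustrative Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as initial centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: n * k squared-distance evaluations, each over d coordinates.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: for each of the k clusters, average its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical toy data: two well-separated groups in two dimensions.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(data, k=2)
```

A fixed number of iterations is used here for brevity; practical implementations usually stop once the assignments no longer change.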

Keywords

Acknowledgement

The authors would like to thank the Deanship of Scientific Research at Prince Sattam Bin Abdulaziz University, Alkharj, Saudi Arabia for the assistance.
