• Title/Summary/Keyword: clustering techniques

Search Result 528, Processing Time 0.026 seconds

Analysis of deep learning-based deep clustering method (딥러닝 기반의 딥 클러스터링 방법에 대한 분석)

  • Hyun Kwon;Jun Lee
    • Convergence Security Journal
    • /
    • v.23 no.4
    • /
    • pp.61-70
    • /
    • 2023
  • Clustering is an unsupervised learning method that involves grouping data based on features such as distance metrics, using data without known labels or ground truth values. This method has the advantage of being applicable to various types of data, including images, text, and audio, without the need for labeling. Traditional clustering techniques involve applying dimensionality reduction methods or extracting specific features to perform clustering. However, with the advancement of deep learning models, research on deep clustering techniques using techniques such as autoencoders and generative adversarial networks, which represent input data as latent vectors, has emerged. In this study, we propose a deep clustering technique based on deep learning. In this approach, we use an autoencoder to transform the input data into latent vectors, and then construct a vector space according to the cluster structure and perform k-means clustering. We conducted experiments using the MNIST and Fashion-MNIST datasets in the PyTorch machine learning library as the experimental environment. The model used is a convolutional neural network-based autoencoder model. The experimental results show an accuracy of 89.42% for MNIST and 56.64% for Fashion-MNIST when k is set to 10.

Design of Fuzzy System with Hierarchical Classifying Structures and its Application to Time Series Prediction (계층적 분류구조의 퍼지시스템 설계 및 시계열 예측 응용)

  • Bang, Young-Keun;Lee, Chul-Heui
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.19 no.5
    • /
    • pp.595-602
    • /
    • 2009
  • Fuzzy rules, which represent the behavior of their system, are sensitive to fuzzy clustering techniques. If the classification abilities of such clustering techniques are improved, their systems can work for the purpose more accurately because the capabilities of the fuzzy rules and parameters are enhanced by the clustering techniques. Thus, this paper proposes a new hierarchically structured clustering algorithm that can enhance the classification abilities. The proposed clustering technique consists of two clusters based on correlationship and statistical characteristics between data, which can perform classification more accurately. In addition, this paper uses difference data sets to reflect the patterns and regularities of the original data clearly, and constructs multiple fuzzy systems to consider various characteristics of the differences suitably. To verify effectiveness of the proposed techniques, this paper applies the constructed fuzzy systems to the field of time series prediction, and performs prediction for nonlinear time series examples.

Empirical Comparisons of Clustering Algorithms using Silhouette Information

  • Jun, Sung-Hae;Lee, Seung-Joo
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.10 no.1
    • /
    • pp.31-36
    • /
    • 2010
  • Many clustering algorithms have been used in diverse fields. When we need to group given data set into clusters, many clustering algorithms based on similarity or distance measures are considered. Most clustering works have been based on hierarchical and non-hierarchical clustering algorithms. Generally, for the clustering works, researchers have used clustering algorithms case by case from these algorithms. Also they have to determine proper clustering methods subjectively by their prior knowledge. In this paper, to solve the subjective problem of clustering we make empirical comparisons of popular clustering algorithms which are hierarchical and non hierarchical techniques using Silhouette measure. We use silhouette information to evaluate the clustering results such as the number of clusters and cluster variance. We verify our comparison study by experimental results using data sets from UCI machine learning repository. Therefore we are able to use efficient and objective clustering algorithms.

Evaluation of Subtractive Clustering based Adaptive Neuro-Fuzzy Inference System with Fuzzy C-Means based ANFIS System in Diagnosis of Alzheimer

  • Kour, Haneet;Manhas, Jatinder;Sharma, Vinod
    • Journal of Multimedia Information System
    • /
    • v.6 no.2
    • /
    • pp.87-90
    • /
    • 2019
  • Machine learning techniques have been applied in almost all the domains of human life to aid and enhance the problem solving capabilities of the system. The field of medical science has improved to a greater extent with the advent and application of these techniques. Efficient expert systems using various soft computing techniques like artificial neural network, Fuzzy Logic, Genetic algorithm, Hybrid system, etc. are being developed to equip medical practitioner with better and effective diagnosing capabilities. In this paper, a comparative study to evaluate the predictive performance of subtractive clustering based ANFIS hybrid system (SCANFIS) with Fuzzy C-Means (FCM) based ANFIS system (FCMANFIS) for Alzheimer disease (AD) has been taken. To evaluate the performance of these two systems, three parameters i.e. root mean square error (RMSE), prediction accuracy and precision are implemented. Experimental results demonstrated that the FCMANFIS model produce better results when compared to SCANFIS model in predictive analysis of Alzheimer disease (AD).

A Computational Intelligence Based Online Data Imputation Method: An Application For Banking

  • Nishanth, Kancherla Jonah;Ravi, Vadlamani
    • Journal of Information Processing Systems
    • /
    • v.9 no.4
    • /
    • pp.633-650
    • /
    • 2013
  • All the imputation techniques proposed so far in literature for data imputation are offline techniques as they require a number of iterations to learn the characteristics of data during training and they also consume a lot of computational time. Hence, these techniques are not suitable for applications that require the imputation to be performed on demand and near real-time. The paper proposes a computational intelligence based architecture for online data imputation and extended versions of an existing offline data imputation method as well. The proposed online imputation technique has 2 stages. In stage 1, Evolving Clustering Method (ECM) is used to replace the missing values with cluster centers, as part of the local learning strategy. Stage 2 refines the resultant approximate values using a General Regression Neural Network (GRNN) as part of the global approximation strategy. We also propose extended versions of an existing offline imputation technique. The offline imputation techniques employ K-Means or K-Medoids and Multi Layer Perceptron (MLP)or GRNN in Stage-1and Stage-2respectively. Several experiments were conducted on 8benchmark datasets and 4 bank related datasets to assess the effectiveness of the proposed online and offline imputation techniques. In terms of Mean Absolute Percentage Error (MAPE), the results indicate that the difference between the proposed best offline imputation method viz., K-Medoids+GRNN and the proposed online imputation method viz., ECM+GRNN is statistically insignificant at a 1% level of significance. Consequently, the proposed online technique, being less expensive and faster, can be employed for imputation instead of the existing and proposed offline imputation techniques. This is the significant outcome of the study. Furthermore, GRNN in stage-2 uniformly reduced MAPE values in both offline and online imputation methods on all datasets.

Clustering Algorithm to Equalize the Energy Consumption of Neighboring Node with Sink in Wireless Sensor Networks (무선 센서 네트워크에서 싱크 노드와 인접한 노드의 균등한 에너지 소모를 위한 클러스터링 알고리즘)

  • Jung, Jin-Wook;Jin, Kyo-Hong
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2008.05a
    • /
    • pp.465-468
    • /
    • 2008
  • Clustering techniques in wireless sensor networks is developed to minimize the energy consumption of node, show the effect that increases the network lifetime. Existing clustering techniques proposed the method that increases the network lifetime equalizing each node's the energy consumption by rotating the role of CH(Cluster Head), but these algorithm did not present the resolution that minimizes the energy consumption of neighboring nodes with sink. In this paper, we propose the clustering algorithm that prolongs the network lifetime by not including a part of nodes in POS(Personal Operating Space) of the sink in a cluster and communicating with sink directly to reduce the energy consumption of CH closed to sink.

  • PDF

Exploration of Hierarchical Techniques for Clustering Korean Author Names (한글 저자명 군집화를 위한 계층적 기법 비교)

  • Kang, In-Su
    • Journal of Information Management
    • /
    • v.40 no.2
    • /
    • pp.95-115
    • /
    • 2009
  • Author resolution is to disambiguate same-name author occurrences into real individuals. For this, pair-wise author similarities are computed for author name entities, and then clustering is performed. So far, many studies have employed hierarchical clustering techniques for author disambiguation. However, various hierarchical clustering methods have not been sufficiently investigated. This study covers an empirical evaluation and analysis of hierarchical clustering applied to Korean author resolution, using multiple distance functions such as Dice coefficient, Cosine similarity, Euclidean distance, Jaccard coefficient, Pearson correlation coefficient.

Clustering Algorithm to Equalize the Energy Consumption of Neighboring Node on Sink in Wireless Sensor Networks (무선 센서 네트워크에서 싱크노드와 인접한 노드의 균등한 에너지 소모를 위한 클러스터링 알고리즘)

  • Jung, Jin-Wook;Jin, Kyo-Hong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.6
    • /
    • pp.1107-1112
    • /
    • 2008
  • Clustering techniques, which are algorithm to increase the network lifetime in wireless sensor networks, is developed to minimize the energy consumption of nodes. Existing clustering techniques by to increase the network lifetime with equalizing each node's the energy consumption by rotating the role of CH(Cluster Head), but these algorithms did not present the solution that minimizes the energy consumption of neighboring nodes with sink. In this paper, we propose the clustering algorithm that prolongs the network lifetime by not including a part of nodes in POS(Personal Operating Space) of the sink in a cluster and communicating with sink directly to reduce the energy consumption of CH closed to sink.

Robustness Analysis of a Novel Model-Based Recommendation Algorithms in Privacy Environment

  • Ihsan Gunes
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.5
    • /
    • pp.1341-1368
    • /
    • 2024
  • The concept of privacy-preserving collaborative filtering (PPCF) has been gaining significant attention. Due to the fact that model-based recommendation methods with privacy are more efficient online, privacy-preserving memory-based scheme should be avoided in favor of model-based recommendation methods with privacy. Several studies in the current literature have examined ant colony clustering algorithms that are based on non-privacy collaborative filtering schemes. Nevertheless, the literature does not contain any studies that consider privacy in the context of ant colony clustering-based CF schema. This study employed the ant colony clustering model-based PPCF scheme. Attacks like shilling or profile injection could potentially be successful against privacy-preserving model-based collaborative filtering techniques. Afterwards, the scheme's robustness was assessed by conducting a shilling attack using six different attack models. We utilize masked data-based profile injection attacks against a privacy-preserving ant colony clustering-based prediction algorithm. Subsequently, we conduct extensive experiments utilizing authentic data to assess its robustness against profile injection attacks. In addition, we evaluate the resilience of the ant colony clustering model-based PPCF against shilling attacks by comparing it to established PPCF memory and model-based prediction techniques. The empirical findings indicate that push attack models exerted a substantial influence on the predictions, whereas nuke attack models demonstrated limited efficacy.

Machine Learning-Based Transactions Anomaly Prediction for Enhanced IoT Blockchain Network Security and Performance

  • Nor Fadzilah Abdullah;Ammar Riadh Kairaldeen;Asma Abu-Samah;Rosdiadee Nordin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.7
    • /
    • pp.1986-2009
    • /
    • 2024
  • The integration of blockchain technology with the rapid growth of Internet of Things (IoT) devices has enabled secure and decentralised data exchange. However, security vulnerabilities and performance limitations remain significant challenges in IoT blockchain networks. This work proposes a novel approach that combines transaction representation and machine learning techniques to address these challenges. Various clustering techniques, including k-means, DBSCAN, Gaussian Mixture Models (GMM), and Hierarchical clustering, were employed to effectively group unlabelled transaction data based on their intrinsic characteristics. Anomaly transaction prediction models based on classifiers were then developed using the labelled data. Performance metrics such as accuracy, precision, recall, and F1-measure were used to identify the minority class representing specious transactions or security threats. The classifiers were also evaluated on their performance using balanced and unbalanced data. Compared to unbalanced data, balanced data resulted in an overall average improvement of approximately 15.85% in accuracy, 88.76% in precision, 60% in recall, and 74.36% in F1-score. This demonstrates the effectiveness of each classifier as a robust classifier with consistently better predictive performance across various evaluation metrics. Moreover, the k-means and GMM clustering techniques outperformed other techniques in identifying security threats, underscoring the importance of appropriate feature selection and clustering methods. The findings have practical implications for reinforcing security and efficiency in real-world IoT blockchain networks, paving the way for future investigations and advancements.