• Title/Summary/Keyword: k-means clustering (k-평균 클러스터링)

Search Result 110, Processing Time 0.023 seconds

Analysis of deep learning-based deep clustering method (딥러닝 기반의 딥 클러스터링 방법에 대한 분석)

  • Hyun Kwon;Jun Lee
    • Convergence Security Journal / v.23 no.4 / pp.61-70 / 2023
  • Clustering is an unsupervised learning method that groups data by features such as distance metrics, using data without known labels or ground truth values. It has the advantage of being applicable to various types of data, including images, text, and audio, without the need for labeling. Traditional clustering techniques apply dimensionality reduction or extract specific features before clustering. With the advancement of deep learning, however, research has emerged on deep clustering techniques that use models such as autoencoders and generative adversarial networks to represent input data as latent vectors. In this study, we propose a deep clustering technique based on deep learning. In this approach, we use an autoencoder to transform the input data into latent vectors, then construct a vector space according to the cluster structure and perform k-means clustering. We conducted experiments on the MNIST and Fashion-MNIST datasets, using the PyTorch machine learning library as the experimental environment. The model is a convolutional neural network-based autoencoder. The experimental results show an accuracy of 89.42% for MNIST and 56.64% for Fashion-MNIST when k is set to 10.
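The k-means step applied to the latent vectors can be sketched as follows. This is a minimal pure-Python version of Lloyd's algorithm, not the authors' PyTorch code; the function name `kmeans`, the seeded random initialisation, and the toy 2-D vectors standing in for autoencoder outputs are assumptions made for illustration.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means on a list of equal-length float tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k data points
    labels = None
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical 2-D latent vectors; a real encoder would produce these.
latent = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(latent, k=2)
```

In the setting described in the abstract, `points` would hold the encoder outputs and `k` would be 10.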

Performance Improvement of Document Classification by Rule-based Word Clustering (규칙기반 단어 클러스터링에 의한 문서 분류의 성능 향상)

  • Hyun Woo-Seok
    • Proceedings of the Korean Information Science Society Conference / 2006.06b / pp.196-198 / 2006
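  • Classifying unlabeled documents remains a very important problem. CiteSeer, a computer-based document search engine, uses automatic document classification for document indexing. Document classification uses the words of the original document as its primary feature representation. However, this representation leads to high dimensionality and feature sparsity. Word clustering is an effective way to reduce feature dimensionality and feature sparsity, and it improves document classification performance. In this study, we use a domain rule-based word clustering method for cluster feature representation. Clusters are generated from various domain databases and word orthographic features. This cluster feature representation not only achieved a significant dimensionality reduction but also improved the average classification performance on document header lines, and it improved the accuracy of bibliographic field extraction compared with the feature representation based on original document words.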

An Empirical Comparison and Verification Study on the Containerports Clustering Measurement Using K-Means and Hierarchical Clustering(Average Linkage Method Using Cross-Efficiency Metrics, and Ward Method) and Mixed Models (K-Means 군집모형과 계층적 군집(교차효율성 메트릭스에 의한 평균연결법, Ward법)모형 및 혼합모형을 이용한 컨테이너항만의 클러스터링 측정에 대한 실증적 비교 및 검증에 관한 연구)

  • Park, Ro-Kyung
    • Journal of Korea Port Economic Association / v.34 no.3 / pp.17-52 / 2018
  • The purpose of this paper is to measure clustering changes and analyze the empirical results. By applying k-means, hierarchical, and mixed models to Asian container ports over the period 2006-2015, the study aims to form clusters that include the Busan, Incheon, and Gwangyang ports. The models take the number of cranes, depth, berth length, and total area as inputs and container throughput in twenty-foot equivalent units (TEU) as output. The main empirical results are as follows. First, the ranking by rate of increase over the 10-year analysis period shows that the values for average linkage (AL), mixed Ward, rule of thumb (RT) & elbow, Ward, and mixed AL are 42.04% up, 35.01% up, 30.47% up, and 23.65% up, respectively. Second, according to the RT and elbow models, the three Korean ports can be clustered with Asian ports as follows: Busan Port (Hong Kong, Guangzhou, Qingdao, and Singapore), Incheon Port (Tokyo, Nagoya, Osaka, Manila, and Bangkok), and Gwangyang Port (Guangzhou, Ningbo, Qingdao, and Kaohsiung). Third, the optimal numbers of clusters are AL (6), mixed Ward (5), RT & elbow (4), Ward (5), and mixed AL (6). Fourth, the empirical clustering results match the questionnaire results as follows: Busan Port (80%), Incheon Port (17%), and Gwangyang Port (50%). The policy implication is that parties involved with Korean seaports should introduce port improvement plans such as benchmarking the clustered seaports.

Incremental Clustering Algorithm by Modulating Vigilance Parameter Dynamically (경계변수 값의 동적인 변경을 이용한 점층적 클러스터링 알고리즘)

  • 신광철;한상용
    • Journal of KIISE:Software and Applications / v.30 no.11 / pp.1072-1079 / 2003
  • This study proposes a new clustering algorithm that enables incremental categorization of large numbers of documents. The proposed algorithm adopts properties of the spherical k-means algorithm, which clusters large volumes of high-dimensional documents, and of the fuzzy ART (adaptive resonance theory) neural network, which performs clustering incrementally. In short, it combines the vector space model and concept vectors of spherical k-means with the vigilance parameter of fuzzy ART. The new algorithm not only supports incremental clustering and automatically sets an appropriate number of clusters, but also addresses the overfitting caused by outliers and noise. In addition, in terms of the objective function value, which measures cluster coherence and is used to evaluate the quality of the produced clusters, tests on the CLASSIC3 data set showed that the proposed algorithm outperforms spherical k-means by 8.04% on average.
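The interplay between concept vectors and a vigilance threshold can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it uses a fixed vigilance value, whereas the paper modulates it dynamically, and the function names and toy term vectors are hypothetical. Input vectors are assumed to be nonzero and the vigilance value positive.

```python
import math


def normalize(v):
    """Scale a nonzero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def incremental_cluster(docs, vigilance):
    """Assign each vector to the most similar concept vector, or open a new cluster."""
    sums = []    # running sum of unit vectors per cluster
    labels = []
    for doc in docs:
        u = normalize(doc)
        best, best_sim = None, -1.0
        for j, s in enumerate(sums):
            concept = normalize(s)  # concept vector = normalised centroid
            sim = sum(a * b for a, b in zip(u, concept))  # cosine similarity
            if sim > best_sim:
                best, best_sim = j, sim
        if best is not None and best_sim >= vigilance:
            # Similar enough: absorb the document into the existing cluster.
            sums[best] = [a + b for a, b in zip(sums[best], u)]
            labels.append(best)
        else:
            # Below the vigilance threshold: start a new cluster.
            sums.append(u)
            labels.append(len(sums) - 1)
    return labels


# Hypothetical term-weight vectors for four documents.
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
doc_labels = incremental_cluster(docs, vigilance=0.8)
```

Because a new cluster opens whenever no concept vector is similar enough, the number of clusters does not have to be fixed in advance; raising the vigilance value yields more, tighter clusters.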

Smart elevator operation management mobile application design using clustering techniques (클러스터링 기법을 이용한 스마트 엘리베이터 운영관리 모바일 어플리케이션 설계)

  • Park, Hung-Bog;Son, Gwan-yeong;Choi, Geum-kang;Hwang, You-kyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2016.10a / pp.661-662 / 2016
  • Contemporary buildings are trending taller. In line with this, we propose a smart elevator operation management mobile application, designed using clustering techniques, to improve elevator access control and user convenience. Clustering the elevator's call data makes it possible to find the elevator's best position and to check its status from a smartphone. This can reduce users' waiting time and improve their convenience through a more efficient remote control system.

Context-awareness User Analysis based on Clustering Algorithm (클러스터링 알고리즘기반의 상황인식 사용자 분석)

  • Lee, Kang-whan
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.7 / pp.942-948 / 2020
  • In this paper, we propose a clustering algorithm that distinguishes users more efficiently within clusters by using context-aware attribute information. Typically, the data used to classify interrelationships within cluster information becomes a degrading factor when new or newly processed information is treated as contaminated information during comparison. To solve this problem, we developed a clustering algorithm, based on the K-means algorithm, that extracts user recognition information. The proposed algorithm analyzes users' clustering attribute parameters from user clusters using accumulated information and clusters users according to their attributes. Simulation results showed that, with the proposed algorithm, the user management system was more adaptable in classifying and maintaining multiple users in clusters.

Analysis of COVID-19 Context-awareness based on Clustering Algorithm (클러스터링 알고리즘기반의 COVID-19 상황인식 분석)

  • Lee, Kangwhan
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.5 / pp.755-762 / 2022
  • This paper proposes a clustering algorithm that enables more efficient learning-based prediction of COVID-19 disease within clusters by using context-aware attribute information. Typically, clustering of COVID-19 diseases classifies the interrelationships within disease cluster information during the clustering process. The clustering data becomes a degrading factor when new or newly processed information is treated as a contaminating factor in the comparative interrelationship information. To solve this problem, we developed a clustering algorithm, based on the K-means algorithm, that extracts disease correlation information. Using accumulated information and interrelationship clustering, the proposed algorithm analyzes the possible disease correlation clusters and their center points according to the attributes of the disease clusters. In the simulation results, the proposed algorithm showed improved adaptability in the prediction accuracy of the classification management system when learning groups of multiple COVID-19 disease attributes.

Automatic Classification Algorithm for Raw Materials using Mean Shift Clustering and Stepwise Region Merging in Color (컬러 영상에서 평균 이동 클러스터링과 단계별 영역 병합을 이용한 자동 원료 분류 알고리즘)

  • Kim, SangJun;Kwak, JoonYoung;Ko, ByoungChul
    • Journal of Broadcast Engineering / v.21 no.3 / pp.425-435 / 2016
  • In this paper, we propose a classification model that analyzes raw material images recorded with a color CCD camera to automatically separate good from defective raw materials and agricultural products such as rice, coffee, and green tea. Current classification of agricultural products depends mainly on visual selection by skilled laborers. However, classification ability may drop owing to repetitive labor over long periods. To resolve the problems of existing human-dependent commercial products, we propose a vision-based automatic raw material classification method that combines mean shift clustering with a stepwise region merging algorithm. First, the image is divided into N cluster regions by applying the mean shift clustering algorithm to the foreground map image. Second, representative regions among the N cluster regions are selected, and the stepwise region merging method integrates similar cluster regions by comparing both color and positional proximity to neighboring regions. The merged raw material objects are then expressed in 2D color distributions of RG, GB, and BR. Third, a threshold is used to detect good and defective products based on the color distribution ellipse of the merged material objects. Experiments with diverse raw material images show that the proposed method requires less manual intervention by the user than existing clustering and commercial methods, and that classification accuracy on raw materials is improved.
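The mean shift step can be sketched as follows. This is a minimal flat-kernel version in pure Python, not the authors' implementation; the function name `mean_shift`, the `bandwidth` value, and the toy two-channel colour samples are assumptions for illustration, and the stepwise region merging described in the abstract is not reproduced here.

```python
def mean_shift(points, bandwidth, iters=50, tol=1e-6):
    """Flat-kernel mean shift: each point climbs to the mean of its neighbours."""

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    modes = []
    for p in points:
        x = tuple(p)
        for _ in range(iters):
            # All samples inside the window of radius `bandwidth` around x.
            neigh = [q for q in points if dist2(x, q) <= bandwidth ** 2]
            new_x = tuple(sum(c) / len(neigh) for c in zip(*neigh))
            moved = dist2(x, new_x)
            x = new_x
            if moved < tol ** 2:  # the window has stopped moving
                break
        modes.append(x)

    # Modes that ended up within one bandwidth of each other share a label.
    centers, labels = [], []
    for m in modes:
        for j, c in enumerate(centers):
            if dist2(m, c) <= bandwidth ** 2:
                labels.append(j)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels, centers


# Hypothetical (R, G) colour samples from a foreground map.
pixels = [(10, 10), (12, 11), (11, 9), (200, 198), (202, 201), (199, 200)]
pixel_labels, pixel_centers = mean_shift(pixels, bandwidth=20)
```

Unlike k-means, this procedure does not take the number of clusters as input; the number of regions follows from the bandwidth.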

Automatic Left Ventricle Segmentation Algorithm using K-mean Clustering and Graph Searching on Cardiac MRI (K-평균 클러스터링과 그래프 탐색을 통한 심장 자기공명영상의 좌심실 자동분할 알고리즘)

  • Jo, Hyun-Wu;Lee, Hae-Yeoun
    • The KIPS Transactions:PartB / v.18B no.2 / pp.57-66 / 2011
  • To prevent cardiac diseases, it is important in routine clinical practice to quantify cardiac function by analyzing blood volume and ejection fraction. This analysis has been performed manually, so it is costly and varies depending on the operator. In this paper, an automatic left ventricle segmentation algorithm is presented for cardiac magnetic resonance images. After the coil sensitivity of the MRI images is compensated, a K-means clustering scheme is applied to segment the blood area. A graph searching scheme is then employed to correct segmentation errors caused by coil distortion and noise. Using cardiac MRI images from 38 subjects, the presented algorithm was used to calculate blood volume and ejection fraction, and the results were compared with those of manual contouring by experts and of GE MASS software. Based on the results, the presented algorithm achieves an average accuracy of 6.2 ± 5.6 mL, 2.9 ± 3.0 mL, and 2.1 ± 1.5% in diastolic phase, systolic phase, and ejection fraction, respectively. Moreover, the presented algorithm minimizes the rate of user intervention, which has been critical to automating algorithms in previous research.
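How blood volume and ejection fraction follow from segmented blood areas can be sketched as follows. The abstract does not state the volume formula, so disc summation (area times slice thickness, with no inter-slice gap) is an assumption here, as are the function names and the per-slice areas. The ejection fraction itself is the standard ratio of stroke volume to end-diastolic volume.

```python
def blood_volume_ml(slice_areas_mm2, slice_thickness_mm):
    """Disc summation: add up (segmented area x slice thickness) over all slices."""
    return sum(slice_areas_mm2) * slice_thickness_mm / 1000.0  # mm^3 -> mL


def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction in percent from end-diastolic and end-systolic volumes."""
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    return (edv_ml - esv_ml) / edv_ml * 100.0


# Hypothetical per-slice blood areas (mm^2) with 8 mm slices.
edv = blood_volume_ml([1500, 1800, 1700, 1200], 8.0)  # diastolic phase
esv = blood_volume_ml([600, 800, 700, 400], 8.0)      # systolic phase
ef = ejection_fraction(edv, esv)
```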

EM Algorithm based Clustering Method for Internet of Things (IoT) Service (EM 알고리즘을 이용한 사물 인터넷 서비스 클러스터링 기법)

  • Jang, June-Beom;Jo, Jeong-Hoon;Lee, Daewon
    • Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.1315-1317 / 2017
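  • As various IoT (Internet of Things) services emerge and demand for them grows, research on integrated service platforms that manage and control them in a unified way is actively under way. However, because service standards are lacking, reusing and porting IoT service modules is currently not possible. To address this problem, this study applies the EM algorithm to each operation step of an IoT service, extending the operation-based classification technique of [1]. Because the proposed EM-based IoT service classification algorithm classifies services by their similarity, it is expected to increase module reusability and make collaboration between services more efficient. The performance evaluation confirms that services are clustered according to the standard deviation about the mean.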
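The EM iteration behind this kind of clustering can be sketched as follows. This is a generic EM loop for a one-dimensional Gaussian mixture, not the authors' service classification algorithm; the function name `em_gmm_1d`, the initial means, and the toy feature values are hypothetical, and how IoT service operations would be encoded as numbers is not specified in the abstract.

```python
import math


def em_gmm_1d(xs, means, iters=50):
    """EM for a 1-D Gaussian mixture; `means` gives the initial component means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances


# Hypothetical 1-D feature values for six services.
xs = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
weights, means, variances = em_gmm_1d(xs, means=[0.0, 6.0])
```

Each fitted component is summarised by a mean and a variance, so cluster membership can be read off from how far a sample lies from a component mean relative to its standard deviation.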