• Title/Summary/Keyword: K means clustering

Analysis of spatial mixing characteristics of water quality at the confluence using artificial intelligence (인공지능을 활용한 합류부에서 수질의 공간혼합 특성 분석)

  • Lee, Seo Gyeong;Kim, Dongsu;Kim, Kyungdong;Kim, Young Do;Lyu, Siwan
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.482-482 / 2022
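  • At a river confluence, waters of differing quality mix and take on characteristics different from those upstream of the junction. Managing water quality efficiently at a confluence requires identifying how water quality mixes spatially. To analyze these spatial mixing characteristics, this study used three techniques: topological data analysis (TDA), the self-organizing map (SOM), and the k-means clustering algorithm. The three were compared to determine which algorithm shows the water quality changes at the confluence most clearly. The water quality parameters compared include pH, chlorophyll, DO, and turbidity, all measured with a YSI instrument. Measurements were taken where the Nakdong River and the Hwang River meet, by mounting the YSI instrument on a boat and traversing the channel. The three techniques were applied to the measured data in R to compare the water quality changes. TDA is used to extract meaningful information from large, complex data, and SOM performs dimensionality reduction and clustering at the same time. K-means is an unsupervised machine learning algorithm that groups the given data into k clusters. The main purpose of all three methods is clustering. Cluster analysis is a data mining method that classifies objects into several groups with shared properties, based on the characteristics of the given data. Clustering the water quality characteristics of the confluence area with TDA, SOM, and k-means, and analyzing the resulting patterns, could help prevent river water pollution. This study compared the water quality characteristics at the confluence using the three techniques and identified which one shows the characteristics of the confluence most clearly. Once the characteristics of a confluence are understood through clustering, the water quality change patterns are expected to be applicable to other confluence areas.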
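The k-means step described in this abstract (grouping data into k clusters) can be sketched as follows. This is a minimal illustration of Lloyd's algorithm in Python, not the authors' code (the abstract says they used R). The `readings` values are hypothetical (pH, DO) pairs invented for the example, and the function names are assumptions.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iterations=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical (pH, DO) readings from two water masses; not measured data.
readings = [(7.0, 8.0), (7.1, 8.2), (6.9, 7.9), (8.5, 11.0), (8.6, 11.2), (8.4, 10.9)]
labels, centroids = kmeans(readings, k=2)
print(labels)
```

In practice the parameters would usually be standardized first, since k-means is sensitive to the scale of each variable.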

The Shot Change Detection Using a Hybrid Clustering (하이브리드 클러스터링을 이용한 샷 전환 검출)

  • Lee, Ji-Hyun;Kang, Oh-Hyung;Na, Do-Won;Lee, Yang-Won
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.635-638 / 2005
  • The purpose of video segmentation is to divide a video sequence into shots, where each shot is a sequence of frames with the same content, and then to select key frames from each shot for indexing. There are two types of shot change: abrupt and gradual. The major problem in shot change detection is the difficulty of specifying the correct threshold, which determines detection performance. With a clustering approach, the right number of clusters is hard to determine, and different clusterings may lead to completely different results. In this thesis, we propose a video segmentation method using a fuzzy c-means clustering algorithm based on color-$\chi^2$ intensity histograms.
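A chi-square histogram difference between consecutive frames, the kind of feature this abstract feeds to the clustering step, might look like the sketch below. The abstract does not give the exact formula, so the symmetric form used here is an assumption, as are the function names and the tiny hypothetical frames.

```python
def intensity_histogram(pixels, bins=8, max_value=256):
    """Count pixel intensities (0..max_value-1) into equal-width bins."""
    hist = [0] * bins
    for value in pixels:
        hist[value * bins // max_value] += 1
    return hist


def chi_square_distance(hist_a, hist_b):
    """Symmetric chi-square distance: sum of (a - b)^2 / (a + b) over bins."""
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        if a + b > 0:  # skip bins that are empty in both histograms
            total += (a - b) ** 2 / (a + b)
    return total


# Hypothetical 4-pixel grayscale frames: two dark frames, then a bright one.
frame_a = [10, 12, 11, 13]
frame_b = [11, 10, 12, 14]
frame_c = [240, 250, 245, 235]

same_shot = chi_square_distance(intensity_histogram(frame_a), intensity_histogram(frame_b))
cut = chi_square_distance(intensity_histogram(frame_b), intensity_histogram(frame_c))
print(same_shot, cut)
```

A large jump in this distance suggests an abrupt shot change. Clustering the distances, rather than thresholding them by hand, is the idea the abstract describes.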

Priority Demand Assessment for Overseas Construction Information Using Clustering Method (클러스터링 기법을 활용한 해외건설 필요정보 우선순위 수요 조사 평가)

  • Choi, Wonyoung;Kwak, Seing-Jin
    • Journal of the Korean BIBLIA Society for library and Information Science / v.29 no.4 / pp.57-68 / 2018
  • With the domestic construction market expected to stagnate, the Overseas Information System for Construction Engineering (OVICE) is operated to support construction SMEs entering the global market. This study aimed to improve the quality of the information service by setting a direction for information provision, based on a comparison of an expert questionnaire with information system user statistics. Because the system statistics are difficult to prioritize directly, K-means clustering was used to make their analysis more efficient. Analyzing the differences between the survey results and the information system statistics identified points for improving information provision in the system, as well as important content that the survey had not highlighted.

Topic-based Multi-document Summarization Using Non-negative Matrix Factorization and K-means (비음수 행렬 분해와 K-means를 이용한 주제기반의 다중문서요약)

  • Park, Sun;Lee, Ju-Hong
    • Journal of KIISE:Software and Applications / v.35 no.4 / pp.255-264 / 2008
  • This paper proposes a novel method using K-means and non-negative matrix factorization (NMF) for topic-based multi-document summarization. NMF decomposes a weighted term-by-sentence matrix into two sparse non-negative matrices: a semantic feature matrix and a semantic variable matrix. The semantic features obtained are intuitively comprehensible. Weighting the similarity between the topic and the semantic features prevents meaningless sentences that merely resemble the topic from being selected. K-means clustering removes noise from the sentences so that biased semantics of the documents are not reflected in the summaries. In addition, the coherence of the summaries is enhanced by arranging the selected sentences in order of rank. The experimental results show that the proposed method achieves better performance than other methods.
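The NMF decomposition mentioned in this abstract can be illustrated with the standard multiplicative update rules for the squared-error objective. The abstract does not say which NMF variant the authors used, so this choice is an assumption. The matrix `a` is a hypothetical 4-term by 4-sentence weight matrix, `w` plays the role of the semantic feature matrix, and `h` the semantic variable matrix.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(row) for row in zip(*m)]


def squared_error(a, w, h):
    """Squared Frobenius norm of A - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for ra, rb in zip(a, wh) for x, y in zip(ra, rb))


def nmf(a, rank, iterations=200, seed=0, eps=1e-9):
    """Factor A (n x m) into non-negative W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(a), len(a[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T A) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(rank)]
        # W <- W * (A H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)] for i in range(n)]
    return w, h


# Hypothetical term-by-sentence weights with two obvious "topics".
a = [[1, 1, 0, 0], [2, 2, 0, 0], [0, 0, 3, 3], [0, 0, 1, 1]]
w, h = nmf(a, rank=2)
print(squared_error(a, w, h))
```

Because every update multiplies by a non-negative ratio, `w` and `h` stay non-negative throughout, which is what makes the factors readable as additive features.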

Speaker Identification with Estimating the Number of Cluster Based on Boundary Subtractive Clustering (경계 차감 클러스터링에 기반한 클러스터 개수 추정 화자식별)

  • Lee, Youn-Jeong;Choi, Min-Jung;Seo, Chang-Woo;Hahn, Hern-Soo
    • The Journal of the Acoustical Society of Korea / v.26 no.5 / pp.199-206 / 2007
  • In this paper we propose a new clustering algorithm that clusters the feature vectors for speaker identification. Unlike typical clustering approaches, the proposed method requires neither initial guesses of the cluster center locations nor a priori information about the number of clusters. Cluster centers are obtained incrementally, one at a time, through the boundary subtractive clustering algorithm. The number of clusters is obtained by investigating the mutual relationship between clusters. Experimental results on artificial data and the TIMIT database show the effectiveness of the proposed algorithm compared with conventional methods.

Mobile Gesture Recognition using Dynamic Time Warping with Localized Template (지역화된 템플릿기반 동적 시간정합을 이용한 모바일 제스처인식)

  • Choe, Bong-Whan;Min, Jun-Ki;Jo, Seong-Bae
    • Journal of KIISE:Computing Practices and Letters / v.16 no.4 / pp.482-486 / 2010
  • Recently, gesture recognition methods based on dynamic time warping (DTW) have been actively investigated as more mobile devices are equipped with accelerometers. DTW needs no additional training step because it uses the given samples as matching templates. However, DTW is difficult to apply in mobile environments because of the computational complexity of the matching step, in which the input pattern has to be compared with every template. To address this problem, this paper proposes a DTW-based gesture recognition method that uses a localized subset of templates. The k-means clustering algorithm divides each class into subclasses, and the most central sample in each subclass is used as the localized template. This increases recognition speed by reducing the number of matches, while minimizing errors by preserving the diversity of the training patterns. Experimental results showed that the proposed method was about five times faster than DTW with all training samples, and more stable than DTW with randomly selected templates.
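The template-reduction idea in this abstract can be sketched as below: compute DTW distances, keep only the most central sample of each subclass, and classify a query against those few templates. The sketch assumes the subclasses have already been formed (the k-means step is omitted) and uses 1-D sequences for brevity, whereas accelerometer data would normally have three axes. All names and sample values are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def most_central(samples):
    """Return the sample with the smallest total DTW distance to its subclass."""
    return min(samples, key=lambda s: sum(dtw_distance(s, t) for t in samples))


def classify(query, templates):
    """templates is a list of (label, sequence); return the nearest label."""
    return min(templates, key=lambda item: dtw_distance(query, item[1]))[0]


# Hypothetical subclasses, one per gesture class.
up_samples = [[1, 2, 3], [1, 2, 4], [1, 2, 9]]
down_samples = [[4, 2, 1], [5, 2, 1], [9, 2, 1]]
templates = [("up", most_central(up_samples)), ("down", most_central(down_samples))]
print(classify([1, 3, 4], templates))
```

Matching against one template per subclass instead of every training sample is where the speed-up reported in the abstract would come from.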

Proposal of a Monitoring System to Determine the Possibility of Contact with Confirmed Infectious Diseases Using K-means Clustering Algorithm and Deep Learning Based Crowd Counting (K-평균 군집화 알고리즘 및 딥러닝 기반 군중 집계를 이용한 전염병 확진자 접촉 가능성 여부 판단 모니터링 시스템 제안)

  • Lee, Dongsu;ASHIQUZZAMAN, AKM;Kim, Yeonggwang;Sin, Hye-Ju;Kim, Jinsul
    • Smart Media Journal / v.9 no.3 / pp.122-129 / 2020
  • The possibility that an asymptomatic COVID-19 carrier anywhere in the world, unaware of the infection, spreads it to surrounding people remains an important issue, as the public is not free from anxiety and fear over the spread of the epidemic. This paper proposes using the K-means clustering algorithm and deep learning-based crowd counting to determine the possibility of contact with confirmed cases of infectious diseases. After 300 iterations over all input training images, the PSNR value was 21.51 and the final MAE value for the entire data set was 67.984. This is the mean absolute error between observations for scenes of fewer than 4,000 people per CCTV scene, and the system includes the calculation of the distance and infection rate between the confirmed patient and surrounding persons, the net group of potential patient movements, and the prediction of the infection rate.
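For readers unfamiliar with the two metrics reported here, the sketch below gives their usual textbook definitions. The abstract does not state how the authors computed them, so the function names, the 8-bit `max_value` default, and the sample counts are all assumptions for illustration.

```python
import math


def mean_absolute_error(predicted, actual):
    """MAE: average of |predicted - actual| over all samples."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def psnr(image_a, image_b, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-size pixel lists."""
    mse = sum((x - y) ** 2 for x, y in zip(image_a, image_b)) / len(image_a)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical crowd counts for three scenes: predicted versus ground truth.
print(mean_absolute_error([120, 340, 95], [110, 360, 100]))
```

A lower MAE means the predicted head counts are closer to the true counts, while a higher PSNR means the two images being compared are more alike.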

A Study on the Clustering method for Analysis of Zeus Botnet Attack Types in the Cloud Environment (클라우드 환경에서 제우스 Botnet 공격 유형 분석을 위한 클러스터링 방안 연구)

  • Bae, Won-il;Choi, Suk-June;Kim, Seong-Jin;Kim, Hyeong-Cheon;Kwak, Jin
    • Journal of Internet Computing and Services / v.18 no.1 / pp.11-20 / 2017
  • Recently, cloud computing technology has been adopted in a variety of fields. As the demand for cloud computing services increases, security threats in cloud computing environments are increasing as well. In particular, when interconnected hosts in a cloud environment are infected by malware and the infection propagates, the resources of other hosts can be affected and further threats can follow, such as leakage of personal information and data deletion. Malware analysis to respond to these threats is therefore being actively studied. This paper proposes a method for clustering Zeus botnet attack types using the k-means clustering algorithm, for analysis of malware that occurs in cloud environments. Clustering malicious activity by Zeus botnet type makes it possible to determine whether an activity is malware. Future work aims to respond to new types of Zeus botnet attack that may occur in cloud environments.

Document Clustering using Clustering and Wikipedia (군집과 위키피디아를 이용한 문서군집)

  • Park, Sun;Lee, Seong Ho;Park, Hee Man;Kim, Won Ju;Kim, Dong Jin;Chandra, Abel;Lee, Seong Ro
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2012.10a / pp.392-393 / 2012
  • This paper proposes a new document clustering method using clustering and Wikipedia. The proposed method represents the concepts of cluster topics well by means of NMF. It addresses the "bag of words" problem, in which meaningful relationships between documents and clusters are not considered, by expanding the important terms of each cluster with synonyms from Wikipedia. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.

Improved FCM Clustering Image Segmentation (개선된 FCM 클러스터링 영상 분할)

  • Lee, Kwang-Kyug
    • Journal of IKEEE / v.24 no.1 / pp.127-131 / 2020
  • The fuzzy C-means (FCM) algorithm is frequently used as a representative clustering-based image segmentation method. FCM divides the image space into cluster regions with similar pixel values, which requires a long segmentation time. Processing speed matters even more when analyzing the varied patterns of current web users. To solve this speed problem, this paper proposes an improved FCM (IFCM) algorithm that segments the image by combining the Otsu threshold with FCM. In the proposed method, the Otsu threshold that maximizes the between-class variance is determined and applied to FCM, and the image is then segmented. Experiments show that IFCM improves performance by shortening image segmentation time compared with conventional FCM.
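The Otsu step named in this abstract, choosing the threshold that maximizes the between-class variance, can be sketched as follows. How the authors pass the threshold on to FCM is not stated, so only the threshold search is shown. The function name and the 8-level histogram are hypothetical.

```python
def otsu_threshold(histogram):
    """Return the level t maximizing between-class variance.

    Class 0 holds levels <= t and class 1 holds levels > t.
    """
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))
    best_t, best_variance = 0, -1.0
    weight0, sum0 = 0, 0
    for t, count in enumerate(histogram):
        weight0 += count
        sum0 += t * count
        if weight0 == 0:
            continue  # class 0 is still empty
        weight1 = total - weight0
        if weight1 == 0:
            break  # class 1 is empty from here on
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        # Between-class variance, up to the constant factor 1 / total^2.
        variance = weight0 * weight1 * (mean0 - mean1) ** 2
        if variance > best_variance:
            best_t, best_variance = t, variance
    return best_t


# Hypothetical 8-level histogram with a dark mode and a bright mode.
print(otsu_threshold([5, 5, 0, 0, 0, 0, 5, 5]))
```

Because the search is a single pass over the histogram, it is cheap compared with iterating FCM over every pixel, which is consistent with the speed motivation in the abstract.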