• Title/Summary/Keyword: k-mean clustering

Differential Evolution with Numerous Strategies (수많은 전략을 가진 차등 진화)

  • Oh, Suk-Kyong;Shin, Seong-Yoon
    • Proceedings of the Korean Society of Computer Information Conference / 2020.01a / pp.243-244 / 2020
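  • This paper proposes KSDE, a differential evolution algorithm with numerous strategies that migrates subpopulation information through the Soft Island Model (SIM). First, the whole population is divided into k subpopulations by the k-means clustering algorithm. Second, a mutation strategy is selected at random from a strategy pool to perform the mutation operation on each subpopulation. Finally, subpopulation information is migrated through SIM to improve the population diversity of the algorithm.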
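
The first two steps of the abstract above (k-means partitioning, then a randomly chosen mutation strategy per subpopulation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the two strategies in `STRATEGY_POOL`, the greedy replacement rule, the fallback to the full population for small clusters, and all function names are assumptions, and the SIM migration step and crossover are omitted because the abstract does not specify them.

```python
import random

def kmeans_partition(population, k, iters=10, rng=random):
    """Split vectors into k index groups with plain Lloyd's k-means."""
    centers = [list(p) for p in rng.sample(population, k)]
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for idx, p in enumerate(population):
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            groups[d.index(min(d))].append(idx)
        for j, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous center
                dim = len(centers[j])
                centers[j] = [sum(population[i][t] for i in members) / len(members)
                              for t in range(dim)]
    return groups

def rand_1(donors, best, F, rng):
    """DE/rand/1 mutation: a + F * (b - c) with three random donors."""
    a, b, c = rng.sample(donors, 3)
    return [x + F * (y - z) for x, y, z in zip(a, b, c)]

def best_1(donors, best, F, rng):
    """DE/best/1 mutation: best + F * (a - b) with two random donors."""
    a, b = rng.sample(donors, 2)
    return [x + F * (y - z) for x, y, z in zip(best, a, b)]

STRATEGY_POOL = [rand_1, best_1]  # hypothetical pool; the paper's pool is not listed

def ksde_generation(population, fitness, k=3, F=0.5, rng=random):
    """One simplified generation (minimization); needs at least 3 individuals."""
    groups = kmeans_partition(population, k, rng=rng)
    new_pop = list(population)
    for members in groups:
        if not members:
            continue
        sub = [population[i] for i in members]
        # Assumption: fall back to the whole population when a cluster is too small.
        donors = sub if len(sub) >= 3 else population
        best = min(sub, key=fitness)
        strategy = rng.choice(STRATEGY_POOL)  # one random strategy per subpopulation
        for i in members:
            trial = strategy(donors, best, F, rng)
            if fitness(trial) <= fitness(population[i]):  # greedy replacement
                new_pop[i] = trial
    return new_pop
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which helps when comparing strategy pools.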

Automatic Left Ventricle Segmentation Algorithm using K-mean Clustering and Graph Searching on Cardiac MRI (K-평균 클러스터링과 그래프 탐색을 통한 심장 자기공명영상의 좌심실 자동분할 알고리즘)

  • Jo, Hyun-Wu;Lee, Hae-Yeoun
    • The KIPS Transactions:PartB / v.18B no.2 / pp.57-66 / 2011
  • Quantifying cardiac function by analyzing blood volume and ejection fraction is important in routine clinical practice for preventing cardiac diseases. This work has been performed manually, so it is costly and the results vary depending on the operator. In this paper, an automatic left ventricle segmentation algorithm is presented to segment the left ventricle on cardiac magnetic resonance images. After the coil sensitivity of the MRI images is compensated, a K-means clustering scheme is applied to segment the blood area. A graph searching scheme is then employed to correct segmentation errors caused by coil distortion and noise. Using cardiac MRI images from 38 subjects, the presented algorithm is used to calculate blood volume and ejection fraction, and the results are compared with manual contouring by experts and with GE MASS software. Based on the results, the presented algorithm achieves an average accuracy of 6.2 $\pm$ 5.6 mL, 2.9 $\pm$ 3.0 mL and 2.1 $\pm$ 1.5% in the diastolic phase, the systolic phase and ejection fraction, respectively. Moreover, the presented algorithm minimizes the user intervention rate, which was a critical issue for automating the algorithms in previous research.
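
For orientation, the two quantities compared in this abstract can be computed from a segmentation as sketched below. The ejection fraction formula is the standard definition; the slice-summation volume and the names `volume_ml` and `ejection_fraction` are illustrative assumptions and are not taken from the paper.

```python
def volume_ml(blood_pixel_counts, pixel_area_mm2, slice_thickness_mm):
    """Blood volume in mL from per-slice counts of segmented blood pixels.

    Assumes contiguous slices of equal thickness (simple slice summation).
    """
    volume_mm3 = sum(blood_pixel_counts) * pixel_area_mm2 * slice_thickness_mm
    return volume_mm3 / 1000.0  # 1 mL = 1000 mm^3

def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction in percent: 100 * (EDV - ESV) / EDV."""
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("end-systolic volume must lie between 0 and EDV")
    return 100.0 * (edv_ml - esv_ml) / edv_ml
```

For example, an end-diastolic volume of 120 mL and an end-systolic volume of 48 mL give an ejection fraction of 60%.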

Clustering Weather Data for Study of Local Distinction (기상자료 군집화를 통한 지형적 특성 연구)

  • Kim, Min-Jin;Lee, Il-Byeong
    • Proceedings of the Korean Information Science Society Conference / 2008.06c / pp.412-415 / 2008
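  • The vast amount of weather data produced every day represents the current state of the atmosphere, but it also reflects the topographic characteristics of the region. Based on daily weather data from the Suwon area, this study investigates the topographic characteristics and the associated weather phenomena (wind, fog). Specific weather phenomena were clustered using K-means and compared with the topographic characteristics.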

Finding Genes Discriminating Smokers from Non-smokers by Applying a Growing Self-organizing Clustering Method to Large Airway Epithelium Cell Microarray Data

  • Shahdoust, Maryam;Hajizadeh, Ebrahim;Mozdarani, Hossein;Chehrei, Ali
    • Asian Pacific Journal of Cancer Prevention / v.14 no.1 / pp.111-116 / 2013
  • Background: Cigarette smoking is the major risk factor for development of lung cancer. Identifying the effects of tobacco on airway gene expression may provide insight into the causes. This research aimed to compare gene expression of large airway epithelium cells in normal smokers (n=13) and non-smokers (n=9) in order to find genes that discriminate the two groups and to assess the effects of cigarette smoking on large airway epithelium cells. Materials and Methods: Genes discriminating smokers from non-smokers were identified by applying a neural network clustering method, growing self-organizing maps (GSOM), to microarray data according to class discrimination scores. An index was computed based on the difference between the mean gene expression in the two groups. This clustering approach made it possible to compare thousands of genes simultaneously. Results: The applied approach compared the means of 7,129 genes in smokers and non-smokers simultaneously and classified the genes of large airway epithelium cells that were differentially expressed in smokers compared with non-smokers. Seven genes were identified with the largest expression differences between smokers and non-smokers: NQO1, H19, ALDH3A1, AKR1C1, ABHD2, GPX2 and ADH7. Most (NQO1, ALDH3A1, AKR1C1, H19 and GPX2) are known to be clinically notable in lung cancer studies. Furthermore, statistical discriminant analysis showed that these genes could classify samples into smokers and non-smokers with 100% accuracy. In the resulting GSOM map, other nodes with high average discrimination scores included genes with alterations strongly related to lung cancer, such as AKR1C3, CYP1B1, UCHL1 and AKR1B10. Conclusions: Clustering by comparing the expression of thousands of genes at the same time revealed alterations in normal smokers. Most of the identified genes are strongly relevant to lung cancer in the existing literature. The genes may be used to identify smokers with increased risk for lung cancer. A large-sample study is now recommended to determine the relations between the genes ABHD2 and ADH7 and smoking.

Text Extraction in HIS Color Space by Weighting Scheme

  • Le, Thi Khue Van;Lee, Gueesang
    • Smart Media Journal / v.2 no.1 / pp.31-36 / 2013
  • Robust and efficient text extraction is very important for the accuracy of Optical Character Recognition (OCR) systems. Natural scene images with degradations such as uneven illumination, perspective distortion, complex backgrounds and multi-color text pose many challenges to computer vision tasks, especially text extraction. In this paper, we propose a method for extracting text from signboard images based on a combination of the mean shift algorithm and a weighting scheme of hue and saturation in HSI color space for the clustering algorithm. The number of clusters is determined automatically by mean shift-based density estimation, in which local clusters are estimated by repeatedly searching for higher-density points in the feature vector space. The weighting scheme of hue and saturation is used to formulate a new distance measure in cylindrical coordinates for text extraction. Experimental results on various natural scene images are presented to demonstrate the effectiveness of our approach.
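
To illustrate what a distance measure in cylindrical HSI coordinates can look like, the sketch below treats hue as an angle, saturation as a radius and intensity as a height. The weights `w_chroma` and `w_intensity` and the function name are hypothetical stand-ins; the paper's actual weighting scheme is not given in the abstract.

```python
import math

def hsi_distance(p, q, w_chroma=1.0, w_intensity=1.0):
    """Weighted distance between two (hue_radians, saturation, intensity) pixels.

    The hue/saturation part is the chord length between two points in polar
    coordinates (law of cosines), so hue wraps around correctly at 2*pi.
    """
    h1, s1, i1 = p
    h2, s2, i2 = q
    chroma_sq = s1 * s1 + s2 * s2 - 2.0 * s1 * s2 * math.cos(h1 - h2)
    chroma_sq = max(0.0, chroma_sq)  # guard against tiny negative round-off
    return math.sqrt(w_chroma * chroma_sq + w_intensity * (i1 - i2) ** 2)
```

Setting `w_intensity` low makes the measure less sensitive to brightness changes, which is one plausible way to cope with uneven illumination.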

Improvement of the PFCM(Possibilistic Fuzzy C-Means) Clustering Method (PFCM 클러스터링 기법의 개선)

  • Heo, Gyeong-Yong;Choe, Se-Woon;Woo, Young-Woon
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.1 / pp.177-185 / 2009
  • Cluster analysis, or clustering, is an unsupervised learning method in which a set of data points is divided into a given number of homogeneous groups. Fuzzy clustering, one of the most popular clustering methods, allows a point to belong to all clusters with different degrees of membership, and so produces more intuitive and natural clusters than hard clustering does. Moreover, some fuzzy clustering variants are immune to noise. In this paper, we improve Possibilistic Fuzzy C-Means (PFCM), which generates a membership matrix as well as a typicality matrix, using the Gath-Geva (GG) method. The proposed method focuses on the boundaries of clusters, unlike most other methods, which focus on the centers of clusters. The generated membership values are suitable for classification-type applications. Because the typicality values generated by the algorithm have a distribution similar to the values of a Gaussian density function, they are useful for Gaussian-type density estimation. Furthermore, the GG method can handle clusters with different numbers of data points, which the other well-known method, by Gustafson and Kessel, cannot. All of these points are evident in the experimental results.
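
The idea that a point belongs to every cluster with a different degree can be made concrete with the membership update of standard Fuzzy C-Means, sketched below. This is the textbook FCM rule only, not the PFCM or Gath-Geva variant the paper improves, and the function name is an assumption.

```python
def fcm_memberships(distances, m=2.0):
    """Standard FCM memberships of one point, given its distance to each center.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with fuzzifier m > 1.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    zeros = [d == 0 for d in distances]
    if any(zeros):
        # A point lying on a center belongs fully to it (split evenly on ties).
        n = sum(zeros)
        return [1.0 / n if z else 0.0 for z in zeros]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in distances) for di in distances]
```

With `m=2` and distances 1 and 3 to two centers, the memberships are 0.9 and 0.1; they always sum to 1, which is the constraint that possibilistic variants relax through a separate typicality matrix.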

Automatic Clustering on Trained Self-organizing Feature Maps via Graph Cuts (그래프 컷을 이용한 학습된 자기 조직화 맵의 자동 군집화)

  • Park, An-Jin;Jung, Kee-Chul
    • Journal of KIISE:Software and Applications / v.35 no.9 / pp.572-587 / 2008
  • The Self-organizing Feature Map (SOFM), an unsupervised neural network, is a very powerful tool for clustering and visualizing high-dimensional data sets. Although the SOFM has been applied to many engineering problems, it requires a post-processing step that clusters similar weights on the trained SOFM into one class, which is performed manually in many cases. Traditional clustering algorithms such as k-means, however, do not yield satisfactory results on the trained SOFM, especially when clusters have arbitrary shapes. This paper proposes automatic clustering on a trained SOFM that can deal with arbitrary cluster shapes and can be globally optimized by graph cuts. When graph cuts are used, the graph must have two additional vertices, called terminals, and the weights between the terminals and the other vertices of the graph are generally set based on data obtained manually by users. The proposed method sets these weights automatically based on mode-seeking on a distance matrix. Experimental results demonstrate the effectiveness of the proposed method in texture segmentation: it improves precision rates compared with previous traditional clustering algorithms, because graph-theoretic clustering can deal with arbitrary cluster shapes.

Comparison of Blooming Artifact Reduction Using Image Segmentation Method in CT Image (CT영상에서 이미지 분할기법을 적용한 Blooming Artifact Reduction 비교 연구)

  • Kim, Jung-Hun;Park, Ji-Eun;Park, Yu-Jin;Ji, In-Hee;Lee, Jong-Min;Cho, Jin-Ho
    • Journal of Biomedical Engineering Research / v.38 no.6 / pp.295-301 / 2017
  • In this study, we subtracted the calcification blooming artifact from MDCT images of coronary atherosclerosis patients and verified the accuracy and usefulness of the result. We used a coronary artery calcification stenosis phantom and a program that subtracts the calcification blooming artifact by applying 8 different image segmentation methods (Otsu, Sobel, Prewitt, Canny, DoG, Region Growing, Gaussian+K-means clustering, Otsu+DoG). As a result, in the coronary artery calcification stenosis phantom with a 5 mm lumen region, the calcification blooming artifact was subtracted when the combination of Gaussian filtering and the K-means clustering algorithm was applied, and the value was close to the actual calcification region. These results may help to accurately diagnose coronary artery calcification stenosis.

Design and Implementation of Distributed In-Memory DBMS-based Parallel K-Means as In-database Analytics Function (분산 인 메모리 DBMS 기반 병렬 K-Means의 In-database 분석 함수로의 설계와 구현)

  • Kou, Heymo;Nam, Changmin;Lee, Woohyun;Lee, Yongjae;Kim, HyoungJoo
    • KIISE Transactions on Computing Practices / v.24 no.3 / pp.105-112 / 2018
  • As data sizes increase, a single database is no longer enough to serve the current volume of tasks. Since data is partitioned and stored across multiple databases, analysis should also support parallelism in order to increase efficiency. However, traditional analysis requires data to be transferred out of the database to the nodes where the analytic service runs, and the user is required to know both the database and the analytic framework. In this paper, we propose an efficient way to perform the K-means clustering algorithm inside a distributed column-based database and a relational database. We also suggest an efficient way to optimize the K-means algorithm within a relational database.

Construction of Large Library of Protein Fragments Using Inter Alpha-carbon Distance and Binet-Cauchy Distance (내부 알파탄소간 거리와 비네-코시 거리를 사용한 대규모 단백질 조각 라이브러리 구성)

  • Chi, Sang-mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.12 / pp.3011-3016 / 2015
  • Representing a protein's three-dimensional structure as a concatenated sequence of protein fragments is useful for the analysis, modeling, search, and prediction of protein structures. This paper investigates an effective combination of distance measures that can exploit a large protein structure database in order to construct a protein fragment library that represents native protein structures accurately. A clustering method was used to construct the library. The initial clustering stage used the inter alpha-carbon distance, which has low time complexity, and the cluster extension stage used a combination of the inter alpha-carbon distance, the Binet-Cauchy distance, and root mean square deviation. The protein fragment library was constructed from a large protein structure database using the proposed combination of distance measures. In experiments that represent protein structures with protein fragments, this library gives a low root mean square deviation.
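
One way to read the inter alpha-carbon distance is as a comparison of the pairwise C-alpha distances inside each fragment, which avoids the superposition step that coordinate RMSD normally requires. The sketch below follows that reading; the exact definition used in the paper is not stated in the abstract, so both function names and the root-mean-square aggregation are assumptions.

```python
import math

def inter_ca_distances(fragment):
    """All pairwise distances between the C-alpha coordinates of one fragment."""
    n = len(fragment)
    return [math.dist(fragment[i], fragment[j])
            for i in range(n) for j in range(i + 1, n)]

def inter_ca_dissimilarity(frag_a, frag_b):
    """RMS difference between the internal distance lists of two fragments.

    Invariant to rotation and translation, so no structural alignment is needed.
    Both fragments must contain the same number (at least 2) of C-alpha atoms.
    """
    da, db = inter_ca_distances(frag_a), inter_ca_distances(frag_b)
    if len(da) != len(db) or not da:
        raise ValueError("fragments must have equal length of at least 2")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(da, db)) / len(da))
```

A translated copy of a fragment has dissimilarity 0 under this measure, while a stretched copy does not.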