• Title/Summary/Keyword: Optimal Clustering


Application of Genetic and Local Optimization Algorithms for Object Clustering Problem with Similarity Coefficients (유사성 계수를 이용한 군집화 문제에서 유전자와 국부 최적화 알고리듬의 적용)

  • Yim, Dong-Soon;Oh, Hyun-Seung
    • Journal of Korean Institute of Industrial Engineers / v.29 no.1 / pp.90-99 / 2003
  • Object clustering classifies a set of objects into groups such that objects within a group have similar characteristics and objects in different groups have dissimilar ones. It has been applied in diverse areas such as information retrieval, data mining, and group technology. This study considers an object-clustering problem defined by similarity coefficients between objects. First, an evaluation function for the optimization problem is defined. Then, a genetic algorithm and heuristic local optimization techniques are proposed to obtain near-optimal solutions. Solutions from the genetic algorithm are improved by local optimization techniques based on object relocation and cluster merging. The validity and effectiveness of the proposed algorithms are tested through extensive experiments.
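
The relocation step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the evaluation function, so the one used here (sum of `similarity - threshold` over same-cluster pairs) and the names `evaluate`, `relocate`, and `threshold` are assumptions for illustration only, not the authors' method.

```python
from itertools import combinations

def evaluate(sim, labels, threshold=0.5):
    """Hypothetical evaluation function (an assumption, not from the paper):
    sum of (similarity - threshold) over all pairs placed in the same cluster."""
    return sum(sim[i][j] - threshold
               for i, j in combinations(range(len(labels)), 2)
               if labels[i] == labels[j])

def relocate(sim, labels, threshold=0.5):
    """Local search: move one object at a time to the cluster (or a new
    singleton cluster) that strictly improves the evaluation function."""
    labels = list(labels)
    improved = True
    while improved:
        improved = False
        for i in range(len(labels)):
            # Existing clusters plus one fresh label for a new singleton.
            candidates = set(labels) | {max(labels) + 1}
            best_label = labels[i]
            best_score = evaluate(sim, labels, threshold)
            for c in candidates:
                trial = labels[:i] + [c] + labels[i + 1:]
                score = evaluate(sim, trial, threshold)
                if score > best_score + 1e-12:  # accept strict improvements only
                    best_label, best_score = c, score
            if best_label != labels[i]:
                labels[i] = best_label
                improved = True
    return labels

# Toy similarity matrix: objects 0,1 are alike and objects 2,3 are alike.
sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
result = relocate(sim, [0, 0, 0, 0])  # start with everything in one cluster
```

Because only strict improvements are accepted and the number of partitions is finite, the loop terminates. A genetic algorithm would supply the starting `labels`; here a single all-in-one cluster stands in for that.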

Identification of Regression Outliers Based on Clustering of LMS-residual Plots

  • Kim, Bu-Yong;Oh, Mi-Hyun
    • Communications for Statistical Applications and Methods / v.11 no.3 / pp.485-494 / 2004
  • An algorithm is proposed to identify multiple outliers in linear regression. It is based on clustering the residuals from least median of squares (LMS) estimation. A cut-height criterion for the hierarchical cluster tree is suggested, which yields the optimal clustering of the regression outliers. The procedures are compared on classic and artificial data sets, and the proposed algorithm is shown to be superior to the one based on least squares estimation. In particular, the proposed algorithm deals very well with the masking and swamping effects, while the other does not.
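
As a rough illustration of cutting a cluster tree of residuals at a given height, the sketch below groups one-dimensional residuals. It assumes single linkage (for 1-D data, cutting the tree at height `cut_height` is the same as splitting the sorted values wherever a gap exceeds that height), assumes the residuals come from an already-fitted robust regression, and assumes the largest cluster is the clean one. The paper's actual linkage and cut-height criterion are not stated in the abstract, and the function names are hypothetical.

```python
def cut_single_linkage(values, cut_height):
    """Cluster 1-D values; equivalent to cutting a single-linkage
    dendrogram at cut_height. Returns lists of original indices."""
    if not values:
        return []
    order = sorted(range(len(values)), key=lambda i: values[i])
    clusters = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if values[cur] - values[prev] > cut_height:
            clusters.append([cur])      # gap too large: start a new cluster
        else:
            clusters[-1].append(cur)    # close enough: same cluster
    return clusters

def flag_outliers(residuals, cut_height):
    """Assumption: the largest cluster holds the regular observations;
    every other cluster is reported as a set of outlier candidates."""
    clusters = cut_single_linkage(residuals, cut_height)
    if not clusters:
        return []
    main = max(clusters, key=len)
    return sorted(i for c in clusters if c is not main for i in c)

# Residuals assumed to come from a robust (e.g. LMS) fit.
residuals = [0.1, -0.2, 0.05, 4.8, 0.3, 5.1, -0.1, -6.0]
outliers = flag_outliers(residuals, cut_height=1.0)
```

With these toy residuals, the observations at indices 3, 5, and 7 fall outside the main cluster.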

Colorectal Cancer Staging Using Three Clustering Methods Based on Preoperative Clinical Findings

  • Pourahmad, Saeedeh;Pourhashemi, Soudabeh;Mohammadianpanah, Mohammad
    • Asian Pacific Journal of Cancer Prevention / v.17 no.2 / pp.823-827 / 2016
  • Determination of the colorectal cancer stage is possible only after surgery, based on pathology results. However, sometimes this may prove impossible. The aim of the present study was to determine colorectal cancer stage using three clustering methods based on preoperative clinical findings. All patients referred to the Colorectal Research Center of Shiraz University of Medical Sciences for colorectal cancer surgery from 2006 to 2014 were enrolled in the study, giving 117 cases. Three clustering algorithms were used: k-means, hierarchical, and fuzzy c-means clustering. External validity measures such as sensitivity, specificity, and accuracy were used to evaluate the methods. The results showed maximum accuracy and sensitivity for the hierarchical method and maximum specificity for the fuzzy c-means method. Furthermore, according to the internal validity measures for the present data set, the optimal number of clusters was two (silhouette coefficient), and the fuzzy c-means algorithm was more appropriate than the k-means approach as the number of clusters increased.
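
The silhouette coefficient used above to choose the number of clusters can be computed as in the following minimal sketch. It uses Euclidean distance and the standard definition s(i) = (b - a) / max(a, b), with singleton clusters scored 0. The toy points and the `silhouette` function name are illustrative assumptions, not data or code from the study.

```python
from math import dist

def silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton cluster contributes a score of 0
        # a: mean distance to the other members of the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(dist(p, q) for q in other) / len(other)
                for k, other in clusters.items() if k != l)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)

# Toy data: two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
score_two = silhouette(points, [0, 0, 1, 1])    # natural two-cluster split
score_three = silhouette(points, [0, 0, 1, 2])  # one pair split in two
```

Comparing the score across candidate labelings (for example, from k-means runs with different k) and keeping the highest is one common way to pick the number of clusters; here the two-cluster labeling scores higher than the three-cluster one.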

Mobile Base Station Placement with BIRCH Clustering Algorithm for HAP Network (HAP 네트워크에서 BIRCH 클러스터링 알고리즘을 이용한 이동 기지국의 배치)

  • Chae, Jun-Byung;Song, Ha-Yoon
    • Journal of KIISE:Computing Practices and Letters / v.15 no.10 / pp.761-765 / 2009
  • This research aims at an optimal placement of Mobile Base Stations (MBS) in HAP-based network configurations under the restrictions of HAP capabilities. Mobile ground nodes are clustered with a BIRCH-based clustering algorithm, and the centroid of each cluster becomes the location of an MBS. The hierarchical structure of BIRCH enables mobile node management through the CF tree, and the restrictions on the maximum number of nodes per MBS and the maximum radio coverage are met by splitting and merging clusters. Mobility models based on Jeju Island are used for simulations, and the results show that these restrictions are met with proper placement of MBSs.
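
The clustering feature (CF) that BIRCH stores in its CF tree is the triple (N, LS, SS): the point count, the linear sum, and the sum of squared norms. The sketch below shows why this is convenient for the splitting and merging described above: CFs add component-wise, and the centroid and radius follow directly from the triple. The 2-D `CF` class and the `fits` check with its `max_nodes` and `max_radius` parameters are hypothetical stand-ins for the paper's per-MBS restrictions, not its implementation.

```python
from dataclasses import dataclass
from math import sqrt

@dataclass(frozen=True)
class CF:
    """BIRCH clustering feature for 2-D points: (N, LS, SS)."""
    n: int = 0
    ls: tuple = (0.0, 0.0)   # linear sum of coordinates
    ss: float = 0.0          # sum of squared norms

    def add(self, p):
        """Return a new CF that also absorbs point p."""
        return CF(self.n + 1,
                  (self.ls[0] + p[0], self.ls[1] + p[1]),
                  self.ss + p[0] ** 2 + p[1] ** 2)

    def merge(self, other):
        """CFs are additive, so merging clusters is component-wise addition."""
        return CF(self.n + other.n,
                  (self.ls[0] + other.ls[0], self.ls[1] + other.ls[1]),
                  self.ss + other.ss)

    def centroid(self):
        return (self.ls[0] / self.n, self.ls[1] / self.n)

    def radius(self):
        """Root-mean-square distance of members from the centroid."""
        cx, cy = self.centroid()
        return sqrt(max(self.ss / self.n - (cx * cx + cy * cy), 0.0))

    def fits(self, max_nodes, max_radius):
        """Hypothetical constraint check: node count and coverage limits."""
        return self.n <= max_nodes and self.radius() <= max_radius

def cf_of(points):
    cf = CF()
    for p in points:
        cf = cf.add(p)
    return cf

nodes = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
whole = cf_of(nodes)
merged = cf_of(nodes[:2]).merge(cf_of(nodes[2:]))
```

In this toy case the centroid (1, 1) would be the candidate base-station location, and a cluster that fails `fits` would be a candidate for splitting.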

Feature Weighting in Projected Clustering for High Dimensional Data (고차원 데이타에 대한 투영 클러스터링에서 특성 가중치 부여)

  • Park, Jong-Soo
    • Journal of KIISE:Databases / v.32 no.3 / pp.228-242 / 2005
  • Projected clustering seeks to find clusters in different subspaces within a high-dimensional dataset. We propose an algorithm that discovers near-optimal projected clusters without user-specified parameters such as the number of output clusters and the average cardinality of the subspaces of projected clusters. The objective function of the algorithm computes the projected energy, the quality, and the number of outliers at each step of clustering. To minimize the projected energy and maximize the clustering quality, we first find the best subspace of each cluster based on the density of input points, by comparing standard deviations over the full set of dimensions. A weighting factor for each dimension of the subspace is used to get rid of probable error in measuring projected distances. Our extensive experiments show that the algorithm discovers projected clusters accurately and scales to large data sets.

The Document Clustering using Multi-Objective Genetic Algorithms (다목적 유전자 알고리즘을 이용한문서 클러스터링)

  • Lee, Jung-Song;Park, Soon-Cheol
    • Journal of Korea Society of Industrial Information Systems / v.17 no.2 / pp.57-64 / 2012
  • In this paper, a multi-objective genetic algorithm is proposed for document clustering, an important task in the text mining field. The most important function of a document clustering algorithm is to group similar documents in a corpus. So far, k-means clustering and genetic algorithms have been widely studied in this field. However, k-means clustering depends too much on the initial centroids, and the genetic algorithm has the disadvantage of easily falling into a local optimum depending on the fitness function. In this paper, the multi-objective genetic algorithm is applied to document clustering to compensate for these disadvantages, and its accuracy is analyzed and compared with that of the existing algorithms. In our experimental results, the multi-objective genetic algorithm improves document clustering accuracy over k-means clustering (by about 20%) and the general genetic algorithm (by about 17%).

Optimal Parameter Analysis and Evaluation of Change Detection for SLIC-based Superpixel Techniques Using KOMPSAT Data (KOMPSAT 영상을 활용한 SLIC 계열 Superpixel 기법의 최적 파라미터 분석 및 변화 탐지 성능 비교)

  • Chung, Minkyung;Han, Youkyung;Choi, Jaewan;Kim, Yongil
    • Korean Journal of Remote Sensing / v.34 no.6_3 / pp.1427-1443 / 2018
  • Object-based image analysis (OBIA) allows higher computational efficiency and better use of the information inherent in an image, as it reduces the complexity of the image while maintaining its properties. Superpixel methods oversegment the image into units smaller than an ordinary object segment and preserve the edges of the image well. SLIC (simple linear iterative clustering) is known for outperforming previous superpixel methods with high image segmentation quality. Although the input parameter of SLIC, the number of superpixels, has considerable influence on image segmentation results, the impact of this parameter has not been sufficiently investigated. In this study, we performed optimal parameter analysis and evaluation of change detection for SLIC-based superpixel techniques using KOMPSAT data. For superpixel generation, three superpixel methods (SLIC; SLIC0, the zero-parameter version of SLIC; and SNIC, simple non-iterative clustering) were used with superpixel sizes ranging from $5 \times 5$ pixels to $50 \times 50$ pixels. Then, the image segmentation results were analyzed for how well they preserve the edges of the change detection reference data. Based on the optimal parameter analysis, image segmentation boundaries were obtained from the difference image of the bi-temporal images. Then, DBSCAN (density-based spatial clustering of applications with noise) was applied to cluster the superpixels into objects of a certain size for change detection. Changes in features were detected for each superpixel and compared with the reference data for evaluation. The change detection results showed that better change detection can be achieved even with a bigger superpixel size if the superpixels are generated with high regularity of size and shape.

An Experimental Study on Multi-Document Summarization for Question Answering (질의응답을 위한 복수문서 요약에 관한 실험적 연구)

  • Choi, Sang-Hee;Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.21 no.3 / pp.289-303 / 2004
  • This experimental study proposes a multi-document summarization method that produces optimal summaries in which users can find answers to their queries. To identify the most effective method for this purpose, the performance of three summarization methods was compared: sentence clustering, passage extraction through spreading activation, and a clustering-passage extraction hybrid. The effectiveness of each summarization method was evaluated by two criteria that measure the accuracy and the redundancy of a summary. The passage extraction method using the sequential bnb search algorithm proved to be the most effective in summarizing multiple documents with regard to summarization precision. This study proposes the passage extraction method as the optimal multi-document summarization method.

Optimal design of water distribution system using modified hybrid vision correction algorithm (Modified hybrid vision correction algorithm을 활용한 상수관망 최적설계)

  • Ryu, Yong Min;Lee, Eui Hoon
    • Journal of Korea Water Resources Association / v.55 no.spc1 / pp.1271-1282 / 2022
  • The optimal design of a Water Distribution System (WDS) is used in various ways according to the purpose set by the user, such as minimizing costs or minimizing the energy generated when manufacturing pipes. In this study, a cost-optimal design was conducted for various WDSs based on the Modified Hybrid Vision Correction Algorithm (MHVCA). We also propose a new evaluation index, Best Rate (BR), which was developed based on the K-means clustering algorithm. BR was used to compare how likely each algorithm applied to the optimal design of WDS is to find the optimal design. The results of MHVCA for WDS were compared with those of the Vision Correction Algorithm (VCA) and the Hybrid Vision Correction Algorithm (HVCA). MHVCA produced lower-cost designs than VCA and HVCA, and it also showed a higher probability of finding lower-cost designs. MHVCA is expected to give good results when applied to the optimal design of WDS for various purposes, not only for the cost minimization considered in this study.

A Fuzzy C Elliptic Shells Clustering

  • 김대진
    • Journal of the Korean Institute of Intelligent Systems / v.8 no.4 / pp.18-22 / 1998
  • This paper presents a fuzzy c elliptic shells algorithm that detects clusters that can be expressed by hyperellipsoidal shells. The algorithm is computationally efficient since the prototypes of shell clusters are determined by a simple matrix inversion instead of by solving several nonlinear equations. The algorithm also works when the detected shells are partial or the optimal number of clusters is initially unknown. A set of simulation results validates the proposed clustering method.
