• Title/Summary/Keyword: and clustering


Usability Analysis of Structured Abstracts in Journal Articles for Document Clustering (문서 클러스터링을 위한 학술지 논문의 구조적 초록 활용성 연구)

  • Choi, Sang-Hee;Lee, Jae-Yun
    • Journal of the Korean Society for Information Management
    • /
    • v.29 no.1
    • /
    • pp.331-349
    • /
    • 2012
  • Structured abstracts are regarded as an essential source of information for representing the topics of journal articles. This study offers an unconventional way to use structured abstracts by analyzing their subfields in depth. A structured abstract was segmented into four fields: purpose, design, findings, and values/implications. The fields were compared in terms of document clustering performance. As a result, the purpose statement of an abstract affected the performance of journal article clustering more than any other field. Furthermore, certain types of keywords were identified whose exclusion improved clustering performance, especially with the within-group average clustering method. These keywords were more strongly related to a specific abstract field, such as research design, than to the topic of the article.

Colorectal Cancer Staging Using Three Clustering Methods Based on Preoperative Clinical Findings

  • Pourahmad, Saeedeh;Pourhashemi, Soudabeh;Mohammadianpanah, Mohammad
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.17 no.2
    • /
    • pp.823-827
    • /
    • 2016
  • Determination of the colorectal cancer stage is possible only after surgery, based on pathology results. However, sometimes this may prove impossible. The aim of the present study was to determine colorectal cancer stage using three clustering methods based on preoperative clinical findings. All patients referred to the Colorectal Research Center of Shiraz University of Medical Sciences for colorectal cancer surgery from 2006 to 2014 were enrolled in the study, giving 117 cases. Three clustering algorithms were used: k-means, hierarchical, and fuzzy c-means clustering. External validity measures such as sensitivity, specificity, and accuracy were used to evaluate the methods. The results revealed maximum accuracy and sensitivity values for the hierarchical method and a maximum specificity value for the fuzzy c-means method. Furthermore, according to the internal validity measures for the present data set, the optimal number of clusters was two (silhouette coefficient), and the fuzzy c-means algorithm was more appropriate than the k-means approach as the number of clusters increased.
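
The abstract above uses the silhouette coefficient to choose the number of clusters. As a minimal sketch of how that measure works (not the authors' code; the function name, the Euclidean distance, and the toy points are all assumptions made for illustration), the mean silhouette can be computed as follows:

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (higher is better)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            # Convention: a point in a singleton cluster scores 0.
            continue
        # a: mean distance to the other members of the same cluster
        # (the point itself contributes distance 0 to the sum).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        m = max(a, b)
        total += (b - a) / m if m > 0 else 0.0
    return total / len(points)


# Hypothetical toy data: two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
two_clusters = [0, 0, 1, 1]
three_clusters = [0, 0, 1, 2]
best = max([two_clusters, three_clusters],
           key=lambda labels: silhouette_score(points, labels))
```

On this toy data the two-cluster labeling scores higher than the three-cluster one, which is the kind of comparison used to pick a cluster count.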

LVQ_Merge Clustering Algorithm for Cell Image Extraction (세포 영상 추출을 위한 LVQ_Merge 군집화 알고리즘)

  • Kwon, Hee Yong;Kim, Min Su;Choi, Kyung Wan;Kwack, Ho Jic;Yu, Suk Hyun
    • Journal of Korea Multimedia Society
    • /
    • v.20 no.6
    • /
    • pp.845-852
    • /
    • 2017
  • In this paper, we propose a binarization algorithm using the LVQ-Merge clustering method for fast and accurate extraction of cells from cell images. The proposed method clusters the pixel data of a given image using LVQ to remove noise, then divides the result into two clusters by applying a hierarchical clustering algorithm to improve the accuracy of binarization. As a result, the execution speed is somewhat slower than that of the conventional LVQ or Otsu algorithm. However, the binarization results are of very good quality and are almost identical to those judged by the human eye. In particular, the bigger and more complex the image, the better the binarization quality. This suggests that the proposed method is useful in the medical image processing field, where huge high-resolution medical images must be processed in real time. In addition, the method can produce many clusters instead of two, so it can be used to complement a hierarchical clustering algorithm.

The Clustering Scheme for Load-Balancing in Mobile Ad-hoc Network (이동 애드혹 네트워크에서 로드 밸런싱을 위한 클러스터링 기법)

  • Lim, Won-Taek;Kim, Gu-Su;Kim, Moon-Jeong;Eom, Young-Ik
    • The KIPS Transactions:PartC
    • /
    • v.13C no.6 s.109
    • /
    • pp.757-766
    • /
    • 2006
  • A Mobile Ad-hoc Network (MANET) is an autonomous network consisting of mobile hosts. A considerable number of studies on MANETs have been conducted alongside studies of ubiquitous computing. Several studies have addressed clustering schemes that manage the network hierarchically to improve on the flat architecture of MANETs. However, conventional schemes lack support for multi-hop clustering and load balancing. This paper proposes a clustering scheme that supports multi-hop clustering and considers load balancing between cluster heads. We define the split of clusters and the states of a cluster, and propose join, merge, divide, and cluster head election schemes for load balancing between cluster heads.

Clustering of Web Document Exploiting with the Co-link in Hypertext (동시링크를 이용한 웹 문서 클러스터링 실험)

  • 김영기;이원희;권혁철
    • Journal of Korean Library and Information Science Society
    • /
    • v.34 no.2
    • /
    • pp.233-253
    • /
    • 2003
  • Knowledge organization is the way we humans understand the world. Two types of information organization mechanisms are studied in information retrieval: classification and clustering. Classification organizes entities by pigeonholing them into predefined categories, whereas clustering organizes information by grouping similar or related entities together. Internet information retrieval systems extract keywords from the words that appear in a web document and build an inverted file. Term clustering based on grouping related terms, however, did not prove overly successful and was mostly abandoned for documents written in different languages or for doorway pages composed only of anchor text. This study examines informetric analysis and the possibility of clustering web documents based on the co-link topology of web pages.

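
The co-link idea in the abstract above can be illustrated with a short sketch. It assumes that two pages are "co-linked" when the same source page links to both of them, and that a higher co-link count suggests the pages are related; that reading, the function name, and the link graph are assumptions for illustration, not details taken from the paper.

```python
from collections import Counter
from itertools import combinations


def colink_counts(outlinks):
    """Count, for every pair of pages, how many source pages link to both."""
    counts = Counter()
    for source, targets in outlinks.items():
        # Deduplicate, drop self-links, and sort so each pair has one ordering.
        unique = sorted(set(targets) - {source})
        for a, b in combinations(unique, 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical link graph: page -> pages it links to.
outlinks = {
    "hub1": ["a", "b", "c"],
    "hub2": ["a", "b"],
    "hub3": ["b", "c"],
}
counts = colink_counts(outlinks)
```

A matrix of such pair counts could then serve as the similarity input to a clustering method, independent of the language or text content of the pages.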

Applying Particle Swarm Optimization for Enhanced Clustering of DNA Chip Data (DNA Chip 데이터의 군집화 성능 향상을 위한 Particle Swarm Optimization 알고리즘의 적용기법)

  • Lee, Min-Soo
    • The KIPS Transactions:PartD
    • /
    • v.17D no.3
    • /
    • pp.175-184
    • /
    • 2010
  • Experiments and research on genes have become very convenient with DNA chips, which provide large amounts of data from various experiments. The data provided by DNA chips can be represented as a two-dimensional matrix, in which one axis represents genes and the other represents samples. Efficient, high-quality clustering of such data can make the classification work that follows more efficient and accurate. In this paper, we use a bio-inspired algorithm called Particle Swarm Optimization (PSO) to propose an efficient clustering mechanism for large amounts of DNA chip data, and show through experimental results that the clustering technique using the PSO algorithm produces results faster, while maintaining good quality, than other existing clustering solutions.

Energy-Efficient Cluster Head Selection Method in Wireless Sensor Networks (무선 센서 네트워크에서 에너지 효율적 클러스터 헤드 선정 기법)

  • Nam, Choon-Sung;Jang, Kyung-Soo;Shin, Ho-Jin;Shin, Dong-Ryeol
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.10 no.2
    • /
    • pp.25-30
    • /
    • 2010
  • A wireless sensor network is composed of many similar sensor nodes with limited resources. The nodes are randomly scattered over a specific area and self-organize the network. To guarantee network lifetime, load balancing, and scalability, sensor networks need a clustering algorithm that partitions the network into local clusters. In existing clustering algorithms, the cluster head selection method has two problems. One is the additional communication cost of finding the location and energy of nodes. The other is unequal clustering. To solve them, this paper proposes a novel cluster head selection algorithm that revises a previous clustering algorithm, LEACH. The simulation results show that energy consumption is reduced compared with the previous clustering method.
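
For context on the baseline that the abstract above revises, the sketch below shows the randomized cluster head election of the original LEACH protocol, in which a node becomes a head when a random draw falls below the threshold T(n) = P / (1 - P (r mod 1/P)). It does not reproduce the paper's revised method; the function names, parameters, and node IDs are assumptions for illustration.

```python
import random


def leach_threshold(p, round_no, was_head_recently):
    """Election threshold T(n) of the original LEACH protocol.

    p is the desired fraction of cluster heads per round. Nodes that already
    served as head in the current rotation cycle are not eligible.
    """
    if was_head_recently:
        return 0.0
    period = round(1 / p)  # number of rounds in one rotation cycle
    return p / (1 - p * (round_no % period))


def elect_heads(node_ids, p, round_no, recent_heads, rng=random):
    """Return the nodes that elect themselves cluster head this round."""
    return [
        n for n in node_ids
        if rng.random() < leach_threshold(p, round_no, n in recent_heads)
    ]


# Hypothetical network of 20 nodes; nodes 0-4 already served this cycle.
rng = random.Random(42)
heads = elect_heads(range(20), p=0.1, round_no=3,
                    recent_heads={0, 1, 2, 3, 4}, rng=rng)
```

The threshold grows over the cycle, so nodes that have not yet served become increasingly likely to be elected. Note that this baseline uses no location or residual energy information.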

Document Clustering using Clustering and Wikipedia (군집과 위키피디아를 이용한 문서군집)

  • Park, Sun;Lee, Seong Ho;Park, Hee Man;Kim, Won Ju;Kim, Dong Jin;Chandra, Abel;Lee, Seong Ro
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2012.10a
    • /
    • pp.392-393
    • /
    • 2012
  • This paper proposes a new document clustering method using clustering and Wikipedia. The proposed method represents the concepts of cluster topics well by means of non-negative matrix factorization (NMF). It addresses the "bag of words" problem, in which the meaningful relationships between documents and clusters are not considered, by expanding the important terms of each cluster with synonyms from Wikipedia. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.


Feature Weighting in Projected Clustering for High Dimensional Data (고차원 데이타에 대한 투영 클러스터링에서 특성 가중치 부여)

  • Park, Jong-Soo
    • Journal of KIISE:Databases
    • /
    • v.32 no.3
    • /
    • pp.228-242
    • /
    • 2005
  • Projected clustering seeks to find clusters in different subspaces within a high-dimensional dataset. We propose an algorithm to discover near-optimal projected clusters without user-specified parameters such as the number of output clusters and the average cardinality of the subspaces of projected clusters. The objective function of the algorithm computes projected energy, quality, and the number of outliers in each step of clustering. To minimize the projected energy and maximize the quality of clustering, we start by finding the best subspace of each cluster based on the density of input points, comparing standard deviations over the full dimension. The weighting factor for each dimension of the subspace is used to get rid of probable errors in measuring projected distances. Our extensive experiments show that the algorithm discovers projected clusters accurately and scales to large data sets.

Clustering Algorithm Considering Sensor Node Distribution in Wireless Sensor Networks

  • Yu, Boseon;Choi, Wonik;Lee, Taikjin;Kim, Hyunduk
    • Journal of Information Processing Systems
    • /
    • v.14 no.4
    • /
    • pp.926-940
    • /
    • 2018
  • In clustering-based approaches, cluster heads closer to the sink are usually burdened with much more relay traffic and thus tend to die early. To address this problem, distance-aware clustering approaches, such as energy-efficient unequal clustering (EEUC), that adjust the cluster size according to the distance between the sink and each cluster head have been proposed. However, the network lifetime of such approaches is highly dependent on the distribution of the sensor nodes because, in randomly distributed sensor networks, the approaches do not guarantee that the cluster energy consumption will be proportional to the cluster size. To address this problem, we propose a novel approach called CACD (Clustering Algorithm Considering node Distribution), which is not only distance-aware but also node-density-aware. In CACD, each cluster is allowed a limited number of member nodes, determined by the distance between the sink and the cluster head. Simulation results show that CACD is 20%-50% more energy-efficient than previous work in terms of network lifetime under various operational conditions.