• Title/Abstract/Keywords: Co-clustering

221 search results (processing time: 0.021 s)

Researcher Clustering Technique based on Weighted Researcher Network

  • 문현정;이상민;우용태
    • 디지털산업정보학회논문지
    • /
    • Vol. 5, No. 2
    • /
    • pp.1-11
    • /
    • 2009
  • This study presents the HCWS algorithm for grouping researchers on a weighted researcher network. The weights represent the intensity of connections among researchers, based on the number of co-authors and the number of co-authored research papers. To validate the proposed technique, the study ran an experiment on about 80 research papers. The results indicate that the HCWS algorithm produces more realistic clusters than the HCS algorithm, which represents semantic relations among researchers as simple connections. The HCWS algorithm also addresses a problem of the existing HCS algorithm, in which researchers are disconnected because strong connections are classified as weak, and vice versa. The technique can be applied to efficiently build social networks of researchers that account for relations such as collaboration history, or to create researcher communities.
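The abstract does not give the HCWS weight formula, so the following is only a minimal sketch of one way to build a weighted co-author network. It assumes a Newman-style weighting in which a paper with n authors adds 1/(n-1) to each author pair, so that pairs gain weight with every shared paper but less from papers with many co-authors. The function name `coauthor_weights` and the input layout are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def coauthor_weights(papers):
    """Return {(author_a, author_b): weight} for a list of author lists.

    Assumed weighting (not taken from the paper): each paper with n >= 2
    authors contributes 1 / (n - 1) to every pair of its authors.
    """
    weights = defaultdict(float)
    for authors in papers:
        unique = sorted(set(authors))  # ignore duplicate names on one paper
        n = len(unique)
        if n < 2:
            continue  # single-author papers create no edges
        for a, b in combinations(unique, 2):
            weights[(a, b)] += 1.0 / (n - 1)
    return dict(weights)


if __name__ == "__main__":
    demo = [["A", "B"], ["A", "B", "C"]]
    for pair, w in sorted(coauthor_weights(demo).items()):
        print(pair, w)
```

Under this assumed scheme, two researchers who share a two-author paper and a three-author paper end up with a heavier edge than a pair who share only the three-author paper.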

PathTalk: Interpretation of Microarray Gene-Expression Clusters in Association with Biological Pathways

  • Chung, Tae-Su;Chung, Hee-Joon;Kim, Ju-Han
    • Genomics & Informatics
    • /
    • Vol. 5, No. 3
    • /
    • pp.124-128
    • /
    • 2007
  • Microarray technology makes it possible to measure the expression of tens of thousands of genes simultaneously under various experimental conditions. Clustering analysis is one of the most successful methods for analyzing microarray data, under the assumption that co-expressed genes may be co-regulated. It is important to extract meaningful clusters from a long unordered list of clusters and to evaluate the functional homogeneity and heterogeneity of clusters. Many quality measures for clustering results have been suggested for different conditions. In the present study, we treat biological pathways as a collection of biological knowledge and use them as a reference for measuring the quality of clustering results and their functional homogeneity. PathTalk visualizes and evaluates functional relationships between gene clusters and biological pathways.

Cluster Analysis-based Approach for Manufacturing Cell Formation

  • 심영학;황정윤
    • 산업경영시스템학회지
    • /
    • Vol. 36, No. 1
    • /
    • pp.24-35
    • /
    • 2013
  • A cell formation approach based on cluster analysis is developed for the configuration of manufacturing cells. Cell formation, which groups machines and parts into machine cells and their associated part families, is used to add flexibility and efficiency to manufacturing systems. To obtain an efficient clustering procedure, this paper proposes a cluster analysis-based approach that incorporates and modifies two cluster analysis methods: a hierarchical clustering method and a non-hierarchical clustering method. The objective of the proposed approach is to minimize intercellular movements and maximize machine utilization within clusters. The approach is tested on cell formation problems and compared with other well-known methodologies from the literature. The results show that the proposed approach yields good-quality solutions whether the data sets are ill-structured or well-structured.

Moving Object Tracking Using Co-occurrence Features of Objects

  • Kim, Seongdong;Chin, Seongah;Choo, Moonwon
    • 지능정보연구
    • /
    • Vol. 8, No. 2
    • /
    • pp.1-13
    • /
    • 2002
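  • This paper proposes a system that identifies the motion regions formed by moving objects in a continuously input stream of color images and tracks, across the image sequence, the direction of movement of moving objects such as pedestrians or cars. With a fixed camera and moving objects, the system captures objects entering the camera's field of view, extracts each object's region by binarizing it through difference-image analysis, and extracts full-color RGB feature vectors from the co-occurrence matrix of each extracted region. Moving objects are tracked by analyzing the extracted color feature vectors to determine the correspondence between moving-object regions in adjacent frames. In the clustering stage, a fuzzy dynamic object matching algorithm is applied between moving-object regions to examine clustering across adjacent frames, using only the energy and entropy among the features extracted in the previous stage. When center values are computed from membership functions for the objects in the motion regions of adjacent frames, regions belonging to the same object cluster around a center value, and on this basis the paper proposes a way to extract moving objects.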
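As an illustration of the energy and entropy features mentioned above, here is a minimal sketch that computes them from a co-occurrence matrix for each RGB channel of a region. The quantization to 8 levels, the horizontal pixel offset, and the function names are assumptions for this example. The abstract does not specify these details, and the fuzzy matching step is not shown.

```python
import numpy as np


def cooccurrence_matrix(channel, levels=8, offset=(0, 1)):
    """Normalized co-occurrence matrix of one 8-bit channel.

    `levels` and `offset` are illustrative choices, not values from the paper.
    """
    # Quantize 0..255 intensities into `levels` bins.
    q = (channel.astype(np.int64) * levels) // 256
    dr, dc = offset
    rows, cols = q.shape
    # `a` holds reference pixels, `b` their neighbors at (row+dr, col+dc).
    a = q[max(0, -dr):rows - max(0, dr), max(0, -dc):cols - max(0, dc)]
    b = q[max(0, dr):rows - max(0, -dr), max(0, dc):cols - max(0, -dc)]
    glcm = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(glcm, (a.ravel(), b.ravel()), 1.0)  # count each level pair
    return glcm / glcm.sum()


def energy_entropy(glcm):
    """Energy = sum of p^2; entropy = -sum of p*log2(p) over nonzero p."""
    energy = float(np.sum(glcm ** 2))
    nonzero = glcm[glcm > 0]
    entropy = float(-np.sum(nonzero * np.log2(nonzero)))
    return energy, entropy


def rgb_features(region):
    """Concatenate (energy, entropy) for the R, G and B channels."""
    feats = []
    for c in range(3):
        feats.extend(energy_entropy(cooccurrence_matrix(region[..., c])))
    return np.array(feats)
```

A flat, single-color region gives energy 1 and entropy 0 in every channel, while textured regions give lower energy and higher entropy, which is what makes the pair usable as a compact region descriptor.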


Nearest neighbor and validity-based clustering

  • Son, Seo H.;Seo, Suk T.;Kwon, Soon H.
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 4, No. 3
    • /
    • pp.337-340
    • /
    • 2004
  • The clustering problem can be formulated as the problem of finding the number of clusters and a partition matrix for a given data set using iterative or non-iterative algorithms. The authors propose a nearest neighbor and validity-based clustering algorithm in which each data point is linked with its nearest neighbor to form initial clusters, and each initial cluster is then linked with its nearest neighbor cluster to form a new cluster. Linking between clusters continues until no further linking is possible. An optimal set of clusters is identified using a conventional cluster validity index. Experimental results on well-known data sets are provided to show the effectiveness of the proposed clustering algorithm.
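The first step described above, linking each point to its nearest neighbor to form initial clusters, can be sketched as follows. This is an illustrative reading of the abstract, assuming Euclidean distance and at least two points. The later cluster-to-cluster linking and the validity index are not shown, and the function name is hypothetical.

```python
import numpy as np


def nearest_neighbor_clusters(points):
    """Label points so that each point shares a cluster with its nearest neighbor."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise Euclidean distances; a point is never its own neighbor.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nearest = dist.argmin(axis=1)

    # Union-find: merge each point with its nearest neighbor.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in enumerate(nearest):
        ri, rj = find(i), find(int(j))
        if ri != rj:
            parent[rj] = ri

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    roots = [find(i) for i in range(n)]
    relabel = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]
```

The connected groups produced by these links serve as the initial clusters; every such group contains at least two points because each point is linked to some other point.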

Projection Pursuit K-Means Visual Clustering

  • Kim, Mi-Kyung;Huh, Myung-Hoe
    • Journal of the Korean Statistical Society
    • /
    • Vol. 31, No. 4
    • /
    • pp.519-532
    • /
    • 2002
  • K-means clustering is a well-known method for partitioning multivariate observations. The method has recently been implemented widely in data mining software because of its computational efficiency in handling large data sets. However, it does not yield a suitable visual display of multivariate observations, which is especially important in the exploratory stage of data analysis. The aim of this study is to develop a K-means clustering method that enables visual display of multivariate observations in a low-dimensional space, for which the projection pursuit method is adopted. We propose a computationally inexpensive and reliable algorithm and provide two numerical examples.

Reorganizing Social Issues from R&D Perspective Using Social Network Analysis

  • Shun Wong, William Xiu;Kim, Namgyu
    • Journal of Information Technology Applications and Management
    • /
    • Vol. 22, No. 3
    • /
    • pp.83-103
    • /
    • 2015
  • The rapid development of internet technologies and social media over the last few years has generated a huge amount of unstructured text data, which contains a great deal of valuable information and issues. As a result, text mining (extracting meaningful information from unstructured text data) has gained attention from researchers in various fields. Topic analysis is a text mining application used to determine the main issues in a large volume of text documents. However, it is difficult to identify related issues or meaningful insights because the number of issues derived through topic analysis is too large. Furthermore, traditional issue-clustering methods rely only on the co-occurrence frequency of issue keywords across documents, so an association between issues with a low co-occurrence frequency cannot be recognized, even if those issues are strongly related from other perspectives. This research therefore proposes a methodology for reorganizing social issues from a research and development (R&D) perspective using social network analysis. Using an R&D perspective lexicon, issues that consistently share the same R&D keywords can be identified through social network analysis. In this study, the R&D keywords associated with a particular issue represent the key technology elements needed to solve that issue. Issue clustering can then be performed based on the analysis results, and the relationships between issues that share the same R&D keywords can be reorganized more systematically by grouping them into clusters according to the R&D perspective lexicon. We expect that our methodology will contribute to establishing efficient R&D investment policies at the national level by enhancing the reusability of R&D knowledge, based on issue clustering using the R&D perspective lexicon. In addition, companies could use the results to align their R&D with their business strategy, helping them develop innovative products and new technologies that sustain innovative business models.

Clustering Based Adaptive Power Control for Interference Mitigation in Two-Tier Femtocell Networks

  • Wang, Hong;Song, Rongfang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 8, No. 4
    • /
    • pp.1424-1441
    • /
    • 2014
  • Two-tier femtocell networks, consisting of a conventional cellular network underlaid with femtocell hotspots, play an important role in the indoor coverage and capacity of cellular networks. However, under universal frequency reuse, cross-tier and co-tier interference can degrade users' quality of service (QoS) to an unacceptable level. In this paper, we propose a novel downlink interference mitigation strategy for spectrum-shared two-tier femtocell networks. The proposed solution is composed of three parts. The first is femtocell clustering, which maximizes the distance between femtocells using the same slot resource to mitigate co-tier interference. The second is to assign macrocell users (MUEs) to clusters by a max-min criterion, by which each MUE can avoid using the same resource as the nearest femtocell. The third is a novel adaptive power control scheme in which the femtocell downlink transmit power is adjusted adaptively based on the signal to interference plus noise ratio (SINR) level of neighboring users. Simulation results show that the proposed scheme can effectively increase the successful transmission ratio and ergodic capacity of femtocells, while guaranteeing the QoS of the macrocell.

A Study on Technology Forecasting based on Co-occurrence Network of Keyword in Multidisciplinary Journals

  • 김현욱;안상진;정우성
    • 한국경영과학회지
    • /
    • Vol. 40, No. 4
    • /
    • pp.49-63
    • /
    • 2015
  • Keywords indexed in multidisciplinary journals show trends in science and technology innovation. Nature and Science were selected as the multidisciplinary journals for our analysis. To reduce the effect of plural keyword forms, a stemming algorithm was applied. After this step, we fitted the growth curve of each keyword (stem) to the Bass model, a well-known model of diffusion processes. The Bass model is useful for expressing growth patterns because it assumes both innovative and imitative activities in the spread of innovation. In addition, we constructed a keyword co-occurrence network and calculated network measures such as centrality indices and the local clustering coefficient. Based on the network metrics and the yearly frequency of keywords, time series analysis was conducted to test for statistical causality between these measures. In some cases, the local clustering coefficient appears to Granger-cause the yearly frequency of a keyword. We expect that the local clustering coefficient could serve as a supporting indicator of emerging science and technology.
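To make the network measure concrete, here is a minimal sketch of building a keyword co-occurrence network and computing the local clustering coefficient of a keyword. It assumes an unweighted, undirected network in which two keywords are linked when they appear on the same paper. The function names and sample keywords are hypothetical, and the stemming, Bass model fitting, and Granger causality steps are not shown.

```python
from itertools import combinations


def cooccurrence_graph(keyword_lists):
    """Adjacency sets linking keywords that appear together on a paper."""
    adj = {}
    for keywords in keyword_lists:
        for a, b in combinations(sorted(set(keywords)), 2):
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    neighbors = adj.get(node, set())
    k = len(neighbors)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbors; 0 by convention
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


if __name__ == "__main__":
    papers = [["graphene", "battery", "catalyst"], ["graphene", "laser"]]
    graph = cooccurrence_graph(papers)
    for kw in sorted(graph):
        print(kw, round(local_clustering(graph, kw), 3))
```

In the sample data, "graphene" has three neighbors of which only one pair is linked, so its coefficient is 1/3, while "battery" sits in a fully connected triangle and scores 1.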

Empirical Comparison of Word Similarity Measures Based on Co-Occurrence, Context, and a Vector Space Model

  • Kadowaki, Natsuki;Kishida, Kazuaki
    • Journal of Information Science Theory and Practice
    • /
    • Vol. 8, No. 2
    • /
    • pp.6-17
    • /
    • 2020
  • Word similarity is often measured to enhance system performance in the information retrieval field and other related areas. This paper reports on an experimental comparison of values for word similarity measures that were computed for 50 intentionally selected words from a Reuters corpus. Three kinds of measures were compared: (1) co-occurrence-based similarity measures (for which the co-occurrence frequency is counted as the number of documents or sentences), (2) context-based distributional similarity measures obtained from latent Dirichlet allocation (LDA), nonnegative matrix factorization (NMF), and the Word2Vec algorithm, and (3) similarity measures computed from the tf-idf weights of each word according to a vector space model (VSM). The Pearson correlation coefficient was highest between the VSM-based similarity measures and the co-occurrence-based similarity measures counted by number of documents. Group-average agglomerative hierarchical clustering was also applied to the similarity matrices computed by the individual measures. An evaluation of the cluster sets against an answer set revealed that the VSM- and LDA-based similarity measures performed best.