• Title/Summary/Keyword: hierarchic clustering

Search Result 6, Processing Time 0.018 seconds

Hierarchic Document Clustering in OPAC (OPAC에서 자동분류 열람을 위한 계층 클러스터링 연구)

  • 노정순
    • Journal of the Korean Society for information Management
    • /
    • v.21 no.1
    • /
    • pp.93-117
    • /
    • 2004
  • This study is to develop a hierarchic clustering model fur document classification and browsing in OPAC systems. Two automatic indexing techniques (with and without controlled terms), two term weighting methods (based on term frequency and binary weight), five similarity coefficients (Dice, Jaccard, Pearson, Cosine, and Squared Euclidean). and three hierarchic clustering algorithms (Between Average Linkage, Within Average Linkage, and Complete Linkage method) were tested on the document collection of 175 books and theses on library and information science. The best document clusters resulted from the Between Average Linkage or Complete Linkage method with Jaccard or Dice coefficient on the automatic indexing with controlled terms in binary vector. The clusters from Between Average Linkage with Jaccard has more likely decimal classification structure.

The Effectiveness of Hierarchic Clustering on Query Results in OPAC (OPAC에서 탐색결과의 클러스터링에 관한 연구)

  • Ro, Jung-Soon
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.38 no.1
    • /
    • pp.35-50
    • /
    • 2004
  • This study evaluated the applicability of the static hierarchic clustering model to clustering query results in OPAC. Two clustering methods(Between Average Linkage(BAL) and Complete Linkage(CL)) and two similarity coefficients(Dice and Jaccard) were tested on the query results retrieved from 16 title-based keyword searchings. The precision of optimal dusters was improved more than 100% compared with title-word searching. There was no difference between similarity coefficients but clustering methods in optimal cluster effectiveness. CL method is better in precision ratio but BAL is better in recall ratio at the optimal top-level and bottom-level clusters. However the differences are not significant except higher recall ratio of BAL at the top-level duster. Small number of clusters and long chain of hierarchy for optimal cluster resulted from BAL could not be desirable and efficient.

A Study of Designing the Automatic Information Retrieval System based on Natural Language (자연어를 이용한 자동정보검색시스템 구축에 관한 연구)

  • Seo, Hwi
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.35 no.4
    • /
    • pp.141-160
    • /
    • 2001
  • This study is to develop a new system for conducting the information retrieval automatically. The system in this study is programmed by Delphi 4.0(PASCAL) and consists of automatic indexing, clustering technique, establishing and expressing term hierarchic relation, and automatic information retrieval technique. Thus this browser system can automatically control all the processes of information searching such as representation, generation and extension of queries and construction of searching strategy and feedback searching.

  • PDF

Intelligent Clustering Mechanism for Efficient Energy Management in Sensor Network (센서 네트워크에서의 효율적 에너지 관리를 위한 지능형 클러스터링 기법)

  • Seo, Sung-Yun;Jung, Won-Soo;Oh, Young-Hwan
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.44 no.4
    • /
    • pp.40-48
    • /
    • 2007
  • MANET constructs a network that is free and independent between sensor nodes without infrastructure. Also, there are a lot of difficulties to manage data process, control etc.. back efficiently from change of topology by transfer of sensor node that compose network. Especially, because each sensor node must consider mobility certainly, problem about energy use happens. To solve these problem, mechanisms that compose cluster of cluster header and hierarchic structure between member were suggested. However, accompanies inefficient energy consumption because sensing power level of sensor node is fixed and brings energy imbalance of sensor network and shortening of survival time. In this paper, I suggested intelligent clustering mechanism for efficient energy management to solve these problem of existent Clustering mechanism. Proposed mechanism corresponds fast in network topology change by transfer of sensor node, and compares in existent mechanism in circumstance that require serial sensing and brings elevation survival time of sensor node.Please put the abstract of paper here.

A DB Pruning Method in a Large Corpus-Based TTS with Multiple Candidate Speech Segments (대용량 복수후보 TTS 방식에서 합성용 DB의 감량 방법)

  • Lee, Jung-Chul;Kang, Tae-Ho
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.6
    • /
    • pp.572-577
    • /
    • 2009
  • Large corpus-based concatenating Text-to-Speech (TTS) systems can generate natural synthetic speech without additional signal processing. To prune the redundant speech segments in a large speech segment DB, we can utilize a decision-tree based triphone clustering algorithm widely used in speech recognition area. But, the conventional methods have problems in representing the acoustic transitional characteristics of the phones and in applying context questions with hierarchic priority. In this paper, we propose a new clustering algorithm to downsize the speech DB. Firstly, three 13th order MFCC vectors from first, medial, and final frame of a phone are combined into a 39 dimensional vector to represent the transitional characteristics of a phone. And then the hierarchically grouped three question sets are used to construct the triphone trees. For the performance test, we used DTW algorithm to calculate the acoustic similarity between the target triphone and the triphone from the tree search result. Experimental results show that the proposed method can reduce the size of speech DB by 23% and select better phones with higher acoustic similarity. Therefore the proposed method can be applied to make a small sized TTS.