• Title/Summary/Keyword: Automatic Clustering

A Study of Designing the Automatic Information Retrieval System based on Natural Language (자연어를 이용한 자동정보검색시스템 구축에 관한 연구)

  • Seo, Hwi
    • Journal of the Korean Society for Library and Information Science / v.35 no.4 / pp.141-160 / 2001
  • This study develops a new system that conducts information retrieval automatically. The system is programmed in Delphi 4.0 (Pascal) and consists of automatic indexing, a clustering technique, the establishment and expression of hierarchical relations among terms, and an automatic information retrieval technique. The resulting browser system can automatically control all the processes of information searching, such as the representation, generation, and extension of queries, the construction of search strategies, and feedback searching.

A Study of Designing the Han-Guel Thesaurus Browser for Automatic Information Retrieval (자동정보검색을 위한 한글 시소러스 브라우저 구축에 관한 연구)

  • Seo, Whee
    • Journal of Korean Library and Information Science Society / v.31 no.2 / pp.279-302 / 2000
  • This study develops a new automatic system for a Korean thesaurus browser that can automatically control all the processes of query searching, such as query representation, generation, and extension, the construction of search strategies, and feedback searching. The system is programmed in Delphi 4.0 (Pascal) and consists of a database system, automatic indexing, a clustering technique, the establishment and expression of a thesaurus, and an automatic information retrieval technique. The results obtained with this system are as follows: 1) With the new automatic thesaurus browser, built on the new algorithm, we can perform information retrieval, automatic indexing, clustering, thesaurus establishment and expression, and retrieval feedback. Thus even a beginner can easily access the specialized terms of a specific subject field. 2) The thesaurus browser in this paper is easy to establish, convenient to use, and gives good information retrieval results in terms of the rate of speed, degree, and regeneration. Thus, it turns out to be very pragmatic.

Nucleus Recognition of Uterine Cervical Pap-Smears using FCM Clustering Algorithm

  • Kim, Kwang-Baek
    • Journal of information and communication convergence engineering / v.6 no.1 / pp.94-99 / 2008
  • Segmentation of the nucleus region in uterine cervical cytodiagnosis images is known as the most difficult and important part of an automatic cervical cancer recognition system. In this paper, the nucleus region is extracted from a uterine cervical cytodiagnosis image using the HSI model. The characteristics of the nucleus are extracted by analyzing morphometric, densitometric, colorimetric, and textural features of the detected nucleus region. The classification criterion for a nucleus is defined according to the standard categories of the Bethesda system. The fuzzy C-means clustering algorithm is applied to the extracted nucleus, and the results show that the proposed method is efficient in nucleus recognition and uterine cervical Pap-smear extraction.

Automatic Extraction of Blood Flow Area in Brachial Artery for Suspicious Hypertension Patients from Color Doppler Sonography with Fuzzy C-Means Clustering

  • Kim, Kwang Baek;Song, Doo Heon;Yun, Sang-Seok
    • Journal of information and communication convergence engineering / v.16 no.4 / pp.258-263 / 2018
  • Color Doppler sonography is a useful tool for examining blood flow and related indices. However, it must be performed by a well-trained operator; that is, operator subjectivity exists. In this paper, we propose a method for automatically extracting the blood flow area from the brachial artery, which would be an essential building block of a computer-aided color Doppler analyzer. Specifically, our concern is to examine patients with suspected hypertension (prehypertension), who might progress to established hypertension in the future. The proposed method uses fuzzy C-means clustering as a quantization engine, with the number of clusters carefully seeded from histogram analysis. The experiment verifies that the proposed method is feasible: the successful extraction rate is 96% (48 out of 50 test cases), and it performed better than a K-means-based method in specificity and sensitivity analysis. However, the proposed method should be further refined, as the retrospective analysis pointed out.
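
As a rough illustration of fuzzy C-means used as a quantization engine, the sketch below clusters scalar intensity values and maps each value to the center of its strongest cluster. It is not the authors' implementation: the histogram-based seeding is not reproduced, and the function names, the fuzzifier `m = 2.0`, the fixed iteration count, and the sample values are all assumptions made for the example.

```python
def memberships_for(values, centers, m=2.0, eps=1e-9):
    """Return one row of cluster memberships per value; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        # eps keeps a distance of exactly zero from causing a division by zero.
        dists = [abs(x - c) + eps for c in centers]
        rows.append([1.0 / sum((d_k / d_j) ** power for d_j in dists)
                     for d_k in dists])
    return rows


def fuzzy_c_means_1d(values, initial_centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    centers = list(initial_centers)
    for _ in range(iterations):
        rows = memberships_for(values, centers, m)
        # Each center becomes the mean of all values weighted by membership**m.
        centers = [
            sum((row[k] ** m) * x for row, x in zip(rows, values))
            / sum(row[k] ** m for row in rows)
            for k in range(len(centers))
        ]
    return centers, memberships_for(values, centers, m)


def quantize(centers, rows):
    """Replace each value by the center of its highest-membership cluster."""
    return [centers[max(range(len(centers)), key=lambda k: row[k])]
            for row in rows]


# Hypothetical intensities: a dark group and a bright group.
values = [10, 12, 11, 200, 205, 198]
centers, rows = fuzzy_c_means_1d(values, initial_centers=[0.0, 255.0])
quantized = quantize(centers, rows)
```

The number of initial centers fixes the number of clusters, which is the quantity the abstract says is seeded from histogram analysis.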

A Theoretical Study of Designing Thesaurus Browser by Clustering Algorithm (클러스터링을 이용한 시소러스 브라우저의 설계에 대한 이론적 연구)

  • Seo, Hwi
    • Journal of Korean Library and Information Science Society / v.30 no.3 / pp.427-456 / 1999
  • This paper deals with the problems of information retrieval from full-text databases, which arise both from the deficiency of searching strategies or methods on the part of the information searcher and from the difficulties of query representation, generation, extension, etc. In order to solve these problems, we should use automatic retrieval instead of the manual retrieval of the past. One way to narrow the gap between the terms used by writers and the queries of searchers is to search with the terms that the writers use. Thus, the precondition for solving these problems is that all areas of information retrieval, such as content analysis, information structure, query formation, and query evaluation, should be handled in one coherent way. We need to deal with all the areas of automatic information retrieval for the sake of retrieval efficiency, though this paper addresses the design of a thesaurus browser. Thus, this paper presents theoretical analyses of the forms of information retrieval, automatic indexing, clustering techniques, the establishment and expression of a thesaurus, and information retrieval techniques. As a result of these analyses, this paper presents a theoretical model, namely the thesaurus browser based on a clustering algorithm. The result of this paper will serve as a theoretical basis for a new retrieval algorithm.

Repeated Clustering to Improve the Discrimination of Typical Daily Load Profile

  • Kim, Young-Il;Ko, Jong-Min;Song, Jae-Ju;Choi, Hoon
    • Journal of Electrical Engineering and Technology / v.7 no.3 / pp.281-287 / 2012
  • The customer load profile clustering method is used to build the TDLP (Typical Daily Load Profile), which estimates the quarter-hourly load profile of non-AMR (Automatic Meter Reading) customers. This study examines how the repeated clustering method improves the ability to discriminate among the TDLPs of each cluster. The k-means algorithm is a well-known clustering technique in data mining. Repeated clustering divides a cluster into sub-clusters with the k-means algorithm, chooses the sub-cluster that has the maximum average error, and repeats the clustering until the final cluster count is reached.
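
The repeated clustering loop described in this abstract can be sketched roughly as follows. This is one reading of the abstract, not the authors' code: splitting into exactly two sub-clusters per round, measuring "average error" as the mean Euclidean distance to the cluster mean, the deterministic seeding, and all function names are assumptions made for the example.

```python
def mean_profile(profiles):
    """Element-wise mean of equally long load profiles."""
    return [sum(column) / len(profiles) for column in zip(*profiles)]


def distance(a, b):
    """Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_error(cluster):
    """Mean distance of the cluster's profiles to the cluster mean (assumed error measure)."""
    center = mean_profile(cluster)
    return sum(distance(p, center) for p in cluster) / len(cluster)


def two_means(profiles, iterations=20):
    """k-means with k=2, seeded with the first profile and the one farthest from it."""
    first = profiles[0]
    centers = [first, max(profiles, key=lambda p: distance(p, first))]
    groups = [list(profiles), []]
    for _ in range(iterations):
        groups = [[], []]
        for p in profiles:
            nearest = 0 if distance(p, centers[0]) <= distance(p, centers[1]) else 1
            groups[nearest].append(p)
        if not groups[0] or not groups[1]:
            break  # the profiles could not be separated any further
        centers = [mean_profile(g) for g in groups]
    return [g for g in groups if g]


def repeated_clustering(profiles, final_count):
    """Keep splitting the cluster with the largest average error."""
    clusters = [list(profiles)]
    while len(clusters) < final_count:
        splittable = [c for c in clusters if len(c) > 1]
        if not splittable:
            break
        worst = max(splittable, key=average_error)
        parts = two_means(worst)
        if len(parts) < 2:
            break
        clusters.remove(worst)
        clusters.extend(parts)
    return clusters


# Hypothetical two-point "profiles" forming three visible groups.
profiles = [[1, 1], [1.2, 0.9], [10, 10], [10.1, 9.8], [20, 1], [19.8, 1.1]]
clusters = repeated_clustering(profiles, final_count=3)
```

Real daily load profiles would carry 96 quarter-hourly values per customer rather than two; the logic is unchanged by the profile length.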

A Theoretical Study on Indexing Methods using the Metadata for the Automatic Construction of a Thesaurus Browser (시소러스 브라우저 자동구현을 위한 Metadata를 이용한 색인어 처리방안에 대한 연구)

  • Seo, Whee
    • Journal of Korean Library and Information Science Society / v.35 no.4 / pp.451-467 / 2004
  • This paper presents theoretical analyses of automatic indexing, which is vital in the process of constructing a thesaurus browser, and of clustering algorithms for constructing hierarchical relations among terms, as well as methods for the automatic construction of a thesaurus browser. Methods for automatically selecting index terms in web documents are studied by surveying methods for analyzing and processing metadata, which in web documents plays the bibliographical role found in traditional paper documents. The study also suggests adding metadata to web documents with an automatic metadata editor, because most web documents do not list metadata.

Unconstrained Object Segmentation Using GrabCut Based on Automatic Generation of Initial Boundary

  • Na, In-Seop;Oh, Kang-Han;Kim, Soo-Hyung
    • International Journal of Contents / v.9 no.1 / pp.6-10 / 2013
  • Foreground estimation in object segmentation has been an important issue for the last few decades. In this paper, we propose a GrabCut-based automatic foreground estimation method using block clustering. GrabCut is one of the popular algorithms for segmenting 2D images. However, GrabCut is a semi-automatic algorithm, so it requires the user to input a rough boundary between foreground and background. Typically, the user manually draws a rectangle around the object of interest. The goal of the proposed method is to generate the initial rectangle automatically. To create the initial rectangle, we use a Gabor filter and a saliency map, and then use four features (amount of area, variance, amount of class with boundary area, and amount of class with saliency map) to categorize foreground and background. The experimental results show that the proposed algorithm achieves satisfactory accuracy in object segmentation without any prior information from the user.

Typical Daily Load Profile Generation using Load Profile of Automatic Meter Reading Customer (자동검침 고객의 부하패턴을 이용한 일일 대표 부하패턴 생성)

  • Kim, Young-Il;Shin, Jin-Ho;Yi, Bong-Jae;Yang, Il-Kwon
    • The Transactions of The Korean Institute of Electrical Engineers / v.57 no.9 / pp.1516-1521 / 2008
  • Recently, electric utilities have been researching distribution load analysis using AMR (Automatic Meter Reading) data. A load analysis method based on the AMR system generates typical load profiles from the load data of AMR customers, estimates the load profiles of non-AMR customers, and analyzes the peak load and load profile of distribution circuits and sectors at intervals of 15 minutes, an hour, a day, a week, or a month. A typical load profile is generated by an algorithm that calculates the average power consumption of each group of customers with similar load patterns. The traditional customer clustering mechanism uses only the contract type code as a key. This mechanism has low accuracy because many customers with the same contract code have different load patterns. In this research, we propose a customer clustering mechanism that uses the k-means algorithm with the contract type code and AMR data.
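
The averaging step that produces a typical load profile per group can be sketched as below. The grouping itself (k-means on the contract type code and AMR data) is taken as given, and the function name, the group labels, and the sample readings are hypothetical.

```python
from collections import defaultdict


def typical_load_profiles(profiles, group_ids):
    """Average the load profiles of all customers that share a group id."""
    grouped = defaultdict(list)
    for profile, group_id in zip(profiles, group_ids):
        grouped[group_id].append(profile)
    # Element-wise mean: one average consumption value per time slot.
    return {
        group_id: [sum(slot) / len(slot) for slot in zip(*members)]
        for group_id, members in grouped.items()
    }


# Hypothetical readings with two time slots per customer.
profiles = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]]
group_ids = ["a", "a", "b"]
print(typical_load_profiles(profiles, group_ids))
# prints {'a': [2.0, 3.0], 'b': [10.0, 10.0]}
```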

Automatic Target Detection Using the Extended Fuzzy Clustering (확장된 Fuzzy Clustering 알고리즘을 이용한 자동 목표물 검출)

  • 김수환;강경진;이태원
    • Journal of the Korean Institute of Telematics and Electronics B / v.28B no.10 / pp.842-913 / 1991
  • Automatic target detection, which automatically identifies the location of a target in an input image, is one of the significant subjects in the image processing field. Several problems must be solved to detect the target automatically from the input image. First of all, the ambiguity of the boundary between targets, or between a target and the background, should be resolved, and the target should be searched for adaptively. In other words, the target should be identified by its brightness relative to the background, not by its absolute brightness. To solve these problems, this paper proposes a new algorithm that can identify the target automatically. The algorithm uses fuzzy sets to resolve the ambiguity at the boundaries and, by weighting according to the brightness of the data in the input image, identifies the target adaptively by its brightness relative to the background. Applying this algorithm to real images shows experimentally that it can be effectively applied to automatic target detection.
