• Title/Summary/Keyword: Hierarchical Clustering Model

Search Result 89, Processing Time 0.023 seconds

Hierarchical and Incremental Clustering for Semi Real-time Issue Analysis on News Articles (준 실시간 뉴스 이슈 분석을 위한 계층적·점증적 군집화)

  • Kim, Hoyong;Lee, SeungWoo;Jang, Hong-Jun;Seo, DongMin
    • The Journal of the Korea Contents Association
    • /
    • v.20 no.6
    • /
    • pp.556-578
    • /
    • 2020
  • There are many different researches about how to analyze issues based on real-time news streams. But, there are few researches which analyze issues hierarchically from news articles and even a previous research of hierarchical issue analysis make clustering speed slower as the increment of news articles. In this paper, we propose a hierarchical and incremental clustering for semi real-time issue analysis on news articles. We trained siamese neural network based weighted cosine similarity model, applied this model to k-means algorithm which is used to make word clusters and converted news articles to document vectors by using these word clusters. Finally, we initialized an issue cluster tree from document vectors, updated this tree whenever news articles happen, and analyzed issues in semi real-time. Through the experiment and evaluation, we showed that up to about 0.26 performance has been improved in terms of NMI. Also, in terms of speed of incremental clustering, we also showed about 10 times faster than before.

Comparisons on Clustering Methods: Use of LMS Log Variables on Academic Courses

  • Jo, Il-Hyun;PARK, Yeonjeong;SONG, Jongwoo
    • Educational Technology International
    • /
    • v.18 no.2
    • /
    • pp.159-191
    • /
    • 2017
  • Academic analytics guides university decision-makers to assign limited resources more effectively. Especially, diverse academic courses clustered by the usage patterns and levels on Learning Management System(LMS) help understanding instructors' pedagogical approach and the integration level of technologies. Further, the clustering results can contribute deciding proper range and levels of financial and technical supports. However, in spite of diverse analytic methodologies, clustering analysis methods often provide different results. The purpose of this study is to present implications by using three different clustering analysis including Gaussian Mixture Model, K-Means clustering, and Hierarchical clustering. As a case, we have clustered academic courses based on the usage levels and patterns of LMS in higher education using those three clustering techniques. In this study, 2,639 courses opened during 2013 fall semester in a large private university located in South Korea were analyzed with 13 observation variables that represent the characteristics of academic courses. The results of analysis show that the strengths and weakness of each clustering analysis and suggest that academic leaders and university staff should look into the usage levels and patterns of LMS with more elaborated view and take an integrated approach with different analytic methods for their strategic decision on development of LMS.

Emergent damage pattern recognition using immune network theory

  • Chen, Bo;Zang, Chuanzhi
    • Smart Structures and Systems
    • /
    • v.8 no.1
    • /
    • pp.69-92
    • /
    • 2011
  • This paper presents an emergent pattern recognition approach based on the immune network theory and hierarchical clustering algorithms. The immune network allows its components to change and learn patterns by changing the strength of connections between individual components. The presented immune-network-based approach achieves emergent pattern recognition by dynamically generating an internal image for the input data patterns. The members (feature vectors for each data pattern) of the internal image are produced by an immune network model to form a network of antibody memory cells. To classify antibody memory cells to different data patterns, hierarchical clustering algorithms are used to create an antibody memory cell clustering. In addition, evaluation graphs and L method are used to determine the best number of clusters for the antibody memory cell clustering. The presented immune-network-based emergent pattern recognition (INEPR) algorithm can automatically generate an internal image mapping to the input data patterns without the need of specifying the number of patterns in advance. The INEPR algorithm has been tested using a benchmark civil structure. The test results show that the INEPR algorithm is able to recognize new structural damage patterns.

Multi-class Support Vector Machines Model Based Clustering for Hierarchical Document Categorization in Big Data Environment (빅 데이터 환경에서 계층적 문서 유형 분류를 위한 클러스터링 기반 다중 SVM 모델)

  • Kim, Young Soo;Lee, Byoung Yup
    • The Journal of the Korea Contents Association
    • /
    • v.17 no.11
    • /
    • pp.600-608
    • /
    • 2017
  • Recently data growth rates are growing exponentially according to the rapid expansion of internet. Since users need some of all the information, they carry a heavy workload for examination and discovery of the necessary contents. Therefore information retrieval must provide hierarchical class information and the priority of examination through the evaluation of similarity on query and documents. In this paper we propose an Multi-class support vector machines model based clustering for hierarchical document categorization that make semantic search possible considering the word co-occurrence measures. A combination of hierarchical document categorization and SVM classifier gives high performance for analytical classification of web documents that increase exponentially according to extension of document hierarchy. More information retrieval systems are expected to use our proposed model in their developments and can perform a accurate and rapid information retrieval service.

A Multiple Model Approach to Fuzzy Modeling and Control of Nonlinear Systems

  • Lee, Chul-Heui;Seo, Seon-Hak;Ha, Young-Ki
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 1998.06a
    • /
    • pp.453-458
    • /
    • 1998
  • In this paper, a new approach to modeling of nonlinear systems using fuzzy theory is presented. So as to handle a variety of nonlinearity and reflect the degree of confidence in the informations about system, we combine multiple model method with hierarchical prioritized structure. The mountain clustering technique is used in partition of system, and TSK rule structure is adopted to form the fuzzy rules. Back propagation algorithm is used for learning parameters in the rules. Computer simulations are performed to verify the effectiveness of the proposed method. It is useful for the treatment fo the nonlinear system of which the quantitative math-approach is difficult.

  • PDF

Spatial Region Estimation for Autonomous CoT Clustering Using Hidden Markov Model

  • Jung, Joon-young;Min, Okgee
    • ETRI Journal
    • /
    • v.40 no.1
    • /
    • pp.122-132
    • /
    • 2018
  • This paper proposes a hierarchical dual filtering (HDF) algorithm to estimate the spatial region between a Cloud of Things (CoT) gateway and an Internet of Things (IoT) device. The accuracy of the spatial region estimation is important for autonomous CoT clustering. We conduct spatial region estimation using a hidden Markov model (HMM) with a raw Bluetooth received signal strength indicator (RSSI). However, the accuracy of the region estimation using the validation data is only 53.8%. To increase the accuracy of the spatial region estimation, the HDF algorithm removes the high-frequency signals hierarchically, and alters the parameters according to whether the IoT device moves. The accuracy of spatial region estimation using a raw RSSI, Kalman filter, and HDF are compared to evaluate the effectiveness of the HDF algorithm. The success rate and root mean square error (RMSE) of all regions are 0.538, 0.622, and 0.75, and 0.997, 0.812, and 0.5 when raw RSSI, a Kalman filter, and HDF are used, respectively. The HDF algorithm attains the best results in terms of the success rate and RMSE of spatial region estimation using HMM.

Comparison of clustering with yeast microarray gene expression data (효모 마이크로어레이 유전자발현 데이터에 대한 군집화 비교)

  • Lee, Kyung-A;Kim, Jae-Hee
    • Journal of the Korean Data and Information Science Society
    • /
    • v.22 no.4
    • /
    • pp.741-753
    • /
    • 2011
  • We accomplish clustering analyses for yeast cell cycle microarray expression data. We compare model-based clustering, K-means, PAM, SOM and hierarchical Ward method with yeast data. As the validity measure for clustering results, connectivity, Dunn Index and silhouette values are computed and compared.

Power Prediction of Mobile Processors based on Statistical Analysis of Performance Monitoring Events (성능 모니터링 이벤트들의 통계적 분석에 기반한 모바일 프로세서의 전력 예측)

  • Yun, Hee-Sung;Lee, Sang-Jeong
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.7
    • /
    • pp.469-477
    • /
    • 2009
  • In mobile systems, energy efficiency is critical to extend battery life. Therefore, power consumption should be taken into account to develop software in addition to performance, Efficient software design in power and performance is possible if accurate power prediction is accomplished during the execution of software, In this paper, power estimation model is developed using statistical analysis, The proposed model analyzes processor behavior Quantitatively using the data of performance monitoring events and power consumption collected by executing various benchmark programs, And then representative hardware events on power consumption are selected using hierarchical clustering, The power prediction model is established by regression analysis in which the selected events are independent variables and power is a response variable, The proposed model is applied to a PXA320 mobile processor based on Intel XScale architecture and shows average estimation error within 4% of the actual measured power consumption of the processor.

Relevance Feedback Method of an Extended Boolean Model using Hierarchical Clustering Techniques (계층적 클러스터링 기법을 이용한 확장 불리언 모델의 적합성 피드백 방법)

  • 최종필;김민구
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.10
    • /
    • pp.1374-1385
    • /
    • 2004
  • The relevance feedback process uses information obtained from a user about an initially retrieved set of documents to improve subsequent search formulations and retrieval performance. In the extended Boolean model, the relevance feedback Implies not only that new query terms must be identified, but also that the terms must be connected with the Boolean AND/OR operators properly Salton et al. proposed a relevance feedback method for the extended Boolean model, called the DNF (disjunctive normal form) method. However, this method has a critical problem in generating a reformulated queries. In this study, we investigate the problem of the DNF method and propose a relevance feedback method using hierarchical clustering techniques to solve the problem. We show the results of experiments which are performed on two data sets: the DOE collection in TREC 1 and the Web TREC 10 collection.

Designing Hierarchical User Interface Model for Browsing the Knowledge Structure of a Single Document Using MDS (MDS를 이용한 개별문서의 계층적 지식구조 브라우징 인터페이스 설계)

  • Han, Seung-Hee;Lee, Jae-Yun
    • Journal of Information Management
    • /
    • v.35 no.3
    • /
    • pp.125-138
    • /
    • 2004
  • The purpose of this study is to propose a hierarchical user interfaces for browsing the knowledge structure of a single document. To generate the hierarchical knowledge structure, hierarchical term clustering and cluster representative term selection were performed with a single thesis in information science field, and the result was applied to design the interfaces which browse a single document hierarchically using multidimensional scaling. The interfaces can be applied to develop the user-friendly information retrieval system.