• Title/Abstract/Keywords: Time-based Clustering

Search results: 716 items; processing time: 0.028 seconds

Clustering Algorithm using a Center Of Gravity for Grid-based Sample

  • 박희창;유지현
    • Korean Data and Information Science Society: Conference Proceedings / Korean Data and Information Science Society 2003 Spring Conference / pp. 77-88 / 2003
  • Cluster analysis has been widely used in many applications, such as data analysis, pattern recognition, and image processing. However, clustering can take many hours to produce the desired clusters, because it is primitive and exploratory in nature and is usually applied to large amounts of data. In this paper we propose a new clustering method, 'Clustering algorithm using a center of gravity for grid-based sample'. It is faster than traditional clustering methods and maintains accuracy. It reduces running time by using a grid-based sample and keeps accuracy by using a representative point, the center of gravity.
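
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: bucket the points into grid cells and keep one center of gravity per cell as that cell's representative. The function name, the 2-D restriction, and the `cell_size` parameter are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def grid_centers_of_gravity(points, cell_size):
    """Group 2-D points by grid cell and return one representative per cell.

    Returns a dict mapping (col, row) -> ((mean_x, mean_y), count).
    `cell_size` is a hypothetical tuning parameter, not a value from the paper.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for x, y in points:
        # Floor division assigns each point to the cell that contains it.
        key = (int(x // cell_size), int(y // cell_size))
        acc = sums[key]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    # The center of gravity of a cell is the mean of the points inside it.
    return {key: ((sx / n, sy / n), n) for key, (sx, sy, n) in sums.items()}


sample = [(0.1, 0.1), (0.3, 0.5), (1.2, 1.8)]
representatives = grid_centers_of_gravity(sample, cell_size=1.0)
```

A downstream clustering step could then run on the much smaller set of representatives, optionally weighting each one by its point count; how the paper itself does this is not stated in the abstract.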

Unsupervised Clustering of Multivariate Time Series Microarray Experiments based on Incremental Non-Gaussian Analysis

  • Ng, Kam Swee;Yang, Hyung-Jeong;Kim, Soo-Hyung;Kim, Sun-Hee;Anh, Nguyen Thi Ngoc
    • International Journal of Contents / Vol. 8, No. 1 / pp. 23-29 / 2012
  • Multiple expression levels of genes obtained using time series microarray experiments have been exploited effectively to enhance understanding of a wide range of biological phenomena. However, microarray data usually take the form of large, high-dimensional gene expression matrices. Among the huge number of genes presented in microarrays, only a small number are expected to be effective for performing a certain task. Hence, discounting the majority of unaffected genes is the crucial goal of gene selection to improve accuracy for disease diagnosis. In this paper, a non-Gaussian weight matrix obtained from an incremental model is proposed to extract useful features of multivariate time series microarrays. The proposed method can automatically identify a small number of significant features by discovering hidden variables from a huge number of features. A representative unsupervised hierarchical clustering method is then used to evaluate the effectiveness of the proposed methodology. The proposed method achieves promising results in terms of the predictive accuracy of clustering compared to existing methods of analysis. Furthermore, the proposed method offers a robust approach with low memory and computation costs.

Volatility clustering in data breach counts

  • Shim, Hyunoo;Kim, Changki;Choi, Yang Ho
    • Communications for Statistical Applications and Methods / Vol. 27, No. 4 / pp. 487-500 / 2020
  • Insurers face increasing demands for cyber liability, entailed in part by a variety of new forms of data breach risk. As data breach occurrences develop, understanding the volatility of data breach counts has become as important as understanding their expected occurrences. Volatility clustering, the tendency of large changes in a random variable to cluster together in time, is frequently observed in many financial asset prices and asset returns, and it is questioned whether the volatility of data breach occurrences is also clustered in time. We present a volatility analysis based on INGARCH models, i.e., integer-valued generalized autoregressive conditional heteroskedasticity time series models, for frequency counts due to data breaches. Using the INGARCH(1, 1) model with data breach samples, we show evidence of temporal volatility clustering for data breaches. In addition, we show that the firms' volatilities are correlated between some they belong to and that such a clustering effect remains even after excluding the effect of financial covariates, such as the VIX and the S&P 500 stock return, that have their own volatility clustering.
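
For readers unfamiliar with INGARCH(1, 1), the standard Poisson form has counts X_t that are Poisson distributed with conditional mean lambda_t = omega + alpha * X_{t-1} + beta * lambda_{t-1}, so a large count raises the expected level of the next count. The sketch below assumes that Poisson form (the abstract does not say which conditional distribution the authors use); the function names and parameter values are illustrative only.

```python
import math
import random


def ingarch_intensities(counts, omega, alpha, beta, lam0):
    """Conditional means: lambda_t = omega + alpha * X_{t-1} + beta * lambda_{t-1}."""
    lams = [lam0]
    for x_prev in counts[:-1]:
        lams.append(omega + alpha * x_prev + beta * lams[-1])
    return lams


def poisson_draw(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_ingarch(n, omega, alpha, beta, seed=0):
    """Simulate n counts from a Poisson INGARCH(1, 1); requires alpha + beta < 1."""
    rng = random.Random(seed)
    lam = omega / (1.0 - alpha - beta)  # start at the stationary mean
    counts = []
    for _ in range(n):
        x = poisson_draw(lam, rng)
        counts.append(x)
        lam = omega + alpha * x + beta * lam
    return counts


# Illustrative parameters only; not estimates from the paper.
simulated = simulate_ingarch(50, omega=1.0, alpha=0.5, beta=0.25)
```

Plotting the output of `ingarch_intensities` against the counts is one simple way to see clustering: runs of high counts keep the conditional mean elevated for several periods.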

A Cell-based Clustering Method for Large High-dimensional Data in Data Mining

  • 진두석;장재우
    • Journal of KISS: Databases / Vol. 28, No. 4 / pp. 558-567 / 2001
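  • Recently, data mining applications have come to require large, high-dimensional data. However, most existing data mining algorithms are inefficient for large high-dimensional data because of the so-called curse of dimensionality [1] and the limits of available memory. This paper therefore proposes a cell-based clustering method to solve these problems. The proposed cell-based clustering method provides a cell construction algorithm for handling large high-dimensional data efficiently, together with a storage index structure based on filtering. We compare the proposed cell-based clustering method with the CLIQUE method in terms of clustering time, precision, and retrieval time. Finally, the experimental results show that the proposed cell-based clustering method outperforms the CLIQUE method.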

Hierarchical and Incremental Clustering for Semi Real-time Issue Analysis on News Articles

  • 김호용;이승우;장홍준;서동민
    • The Journal of the Korea Contents Association / Vol. 20, No. 6 / pp. 556-578 / 2020
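  • Various studies have analyzed issues from news articles that are published in real time. However, few studies analyze issues hierarchically by category, and the approaches proposed in existing work on hierarchical issue analysis also suffer from clustering that slows down as the number of news articles grows. This paper therefore proposes a hierarchical and incremental clustering method that analyzes the issues of news articles in semi real time. The proposed method represents each news article as a document vector through a word-cluster-based document representation, where the word clusters are obtained with a k-means algorithm based on a weighted cosine similarity model that uses a Siamese neural network. It then builds an initial issue cluster tree from the document vectors and adds newly published news articles to that tree incrementally, so that the hierarchical issues of news articles are analyzed in semi real time. Finally, a performance evaluation against existing methods shows that the proposed method improves accuracy by about 0.26 in terms of NMI and improves speed by a factor of about 10 or more.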
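
The incremental step, in which a new article is attached to an existing cluster instead of re-clustering everything, can be illustrated with a much simpler stand-in than the paper's method. The sketch below is flat rather than hierarchical and uses plain cosine similarity instead of the Siamese-network weighted similarity; the function names and the `threshold` value are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Plain (unweighted) cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def add_document(clusters, vec, threshold=0.8):
    """Attach vec to the most similar cluster, or open a new one.

    `clusters` is a list of dicts with keys 'centroid' and 'size'.
    Returns the index of the cluster that received the document.
    """
    best, best_sim = None, threshold
    for i, cluster in enumerate(clusters):
        sim = cosine(cluster["centroid"], vec)
        if sim >= best_sim:
            best, best_sim = i, sim
    if best is None:
        clusters.append({"centroid": list(vec), "size": 1})
        return len(clusters) - 1
    cluster = clusters[best]
    n = cluster["size"]
    # Running mean: update the centroid without revisiting earlier documents.
    cluster["centroid"] = [(m * n + x) / (n + 1) for m, x in zip(cluster["centroid"], vec)]
    cluster["size"] = n + 1
    return best


clusters = []
for doc_vector in ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]):
    add_document(clusters, doc_vector)
```

Because each new document costs one pass over the current centroids, the work per article does not grow with the number of articles already processed, which is the property the abstract attributes to incremental clustering.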

An Adaptive Clustering Protocol Based on Position of Base-Station for Sensor Networks

  • 국중진;박영충;박병하;홍지만
    • Journal of the Korea Society of Computer and Information / Vol. 16, No. 12 / pp. 247-255 / 2011
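  • In wireless sensor networks, cluster-based hierarchical routing protocols aim to keep the lifetimes of all nodes balanced and thereby maximize the lifetime of the sensor network. This paper proposes an adaptive clustering protocol that takes changes in the sink position into account. The proposed protocol builds symmetric hierarchical clusters in which the cluster size is limited according to the level in the cluster tree. This lets it adapt to changes in the sink position while improving the survival time of all clusters and keeping those survival times balanced. To demonstrate the efficiency of this scheme, we compared by simulation the energy consumption of the representative existing clustering protocols LEACH and EEUC with that of the proposed adaptive clustering protocol, and obtained better performance in the balance between energy consumption and network lifetime.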

Review on Energy Efficient Clustering based Routing Protocol

  • Kanu Patel;Hardik Modi
    • International Journal of Computer Science & Network Security / Vol. 23, No. 10 / pp. 169-178 / 2023
  • Wireless sensor networks are widely used for IoT applications. A sensor node is considered a physical device in the IoT architecture. All of these sensor nodes are battery operated, and their power consumption is very high during data communication and low while sensing the environment. Without proper planning of data communication, the network may die very early, so the primary objective of a cluster-based routing protocol is to extend battery life and keep the application running longer. In this paper we present a comprehensive review of twenty research papers on clustering-based routing protocols. We compare them on basic information, network simulation parameters, and performance parameters. In particular, the comparison of all 20 protocols covers clustering manner, node deployment, scalability, data aggregation, power consumption, implementation cost, and many more points. Along with the basic information, we consider network simulation parameters such as number of nodes, simulation time, simulator name, initial energy, and communication range, as well as performance parameters such as energy consumption, throughput, network lifetime, packet delivery ratio, jitter, and fault tolerance. Finally, we summarize the technical aspects and a few common parameters that must be fulfilled or considered when designing an energy-efficient cluster-based routing protocol.

An expanded Matrix Factorization model for real-time Web service QoS prediction

  • Hao, Jinsheng;Su, Guoping;Han, Xiaofeng;Nie, Wei
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 11 / pp. 3913-3934 / 2021
  • Real-time prediction of the quality of service (QoS) of Web services makes Web services in cloud environments more convenient, but real-time QoS prediction faces severe challenges, especially in the cold-start situation. The existing literature on real-time QoS prediction ignores the fact that the QoS of a user/service is related to the QoS of other users/services. For example, users/services belonging to the same category group will have similar QoS values. All of these methods ignore the group relationship because of the complexity of the model. Based on this, we propose a real-time Matrix Factorization based Clustering model (MFC), which uses category information as a new regularization term of the loss function. Specifically, to meet the real-time requirement of the prediction model and to minimize the complexity of the model, we first map the QoS values of a large number of users/services to a lower-dimensional space with the PCA method, then use the K-means algorithm to calculate user/service category information, and use the average result to obtain a stable final clustering result. Extensive experiments on real-world datasets demonstrate that MFC outperforms other state-of-the-art prediction algorithms.
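
The abstract does not give the MFC loss function. As a purely illustrative sketch of what "category information as a regularization term" can look like in matrix factorization, one common construction pulls each latent vector toward the mean vector of its cluster. Every symbol below is an assumption made for illustration: q_ij is an observed QoS value, u_i and s_j are user and service latent vectors, c(·) is the K-means cluster assignment, the barred vectors are cluster means, and lambda and gamma are regularization weights.

```latex
\mathcal{L}
  = \sum_{(i,j)\in\Omega} \bigl(q_{ij} - u_i^{\top} s_j\bigr)^2
  + \lambda \bigl(\lVert U \rVert_F^2 + \lVert S \rVert_F^2\bigr)
  + \gamma \Bigl(\sum_i \lVert u_i - \bar{u}_{c(i)} \rVert^2
               + \sum_j \lVert s_j - \bar{s}_{c(j)} \rVert^2\Bigr)
```

With gamma = 0 this reduces to ordinary regularized matrix factorization. A larger gamma makes users and services in the same cluster share more of their latent representation, which is one way such a term could help in the cold-start case the abstract mentions.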

Evaluating the Performance of Four Selections in Genetic Algorithms-Based Multispectral Pixel Clustering

  • Kutubi, Abdullah Al Rahat;Hong, Min-Gee;Kim, Choen
    • Korean Journal of Remote Sensing / Vol. 34, No. 1 / pp. 151-166 / 2018
  • This paper compares the performance of four selection methods used when applying genetic algorithms (GAs) to automatically optimize multispectral pixel clusters for unsupervised classification of KOMPSAT-3 data, since selection, one of the three main types of operators along with crossover and mutation, is the driving force that determines the overall operation of clustering GAs. Experimental results demonstrate that tournament selection performs better than the other selections, especially in the number of generations and the convergence rate. However, it is computationally more expensive than elitism selection, which has the slowest convergence rate in the comparison and a lower probability of finding optimum cluster centers than the other selections. Rank-based selection and proportional roulette wheel selection show similar performance in the average Euclidean distance of the pixel clustering, even though rank-based selection is computationally much more expensive than proportional roulette wheel selection. With respect to finding the global optimum, tournament selection has a higher potential to reach it sooner than rank-based selection, which spends a lot of computational time on fitness smoothing. The tournament selection-based clustering GA is used to successfully classify the KOMPSAT-3 multispectral data, achieving sufficient thematic accuracy (namely, a Kappa coefficient of 0.923).
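
The four selection schemes named in the abstract can be sketched in their textbook forms as follows. This is not the authors' implementation: the function names are hypothetical, and fitness is assumed to be non-negative and maximized (for pixel clustering it could be, for example, an inverse of the average Euclidean distance to cluster centers).

```python
import random


def tournament_select(population, fitness, k, rng):
    """Draw k distinct contenders at random and return the fittest one."""
    contenders = rng.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: fitness[i])
    return population[winner]


def roulette_select(population, fitness, rng):
    """Pick an individual with probability proportional to its fitness."""
    pick = rng.random() * sum(fitness)
    running = 0.0
    for individual, fit in zip(population, fitness):
        running += fit
        if pick < running:
            return individual
    return population[-1]  # fallback when all fitness values are zero


def rank_select(population, fitness, rng):
    """Roulette wheel over ranks (1 = worst) instead of raw fitness values."""
    order = sorted(range(len(population)), key=lambda i: fitness[i])
    ranks = [0] * len(population)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return roulette_select(population, ranks, rng)


def elitism_select(population, fitness, n_elite):
    """Return the n_elite fittest individuals unchanged."""
    order = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    return [population[i] for i in order[:n_elite]]


rng = random.Random(0)
population = ["a", "b", "c", "d"]
fitness = [1.0, 4.0, 2.0, 3.0]
parent = tournament_select(population, fitness, k=2, rng=rng)
```

Rank-based selection needs a sort of the whole population on every call in this form, which is consistent with the abstract's remark that it costs more than proportional roulette wheel selection.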

Security Clustering Algorithm Based on Integrated Trust Value for Unmanned Aerial Vehicles Network

  • Zhou, Jingxian;Wang, Zengqi
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14, No. 4 / pp. 1773-1795 / 2020
  • Unmanned aerial vehicle (UAV) networks are a very vibrant research area nowadays. They have many military and civil applications. Limited bandwidth, high mobility, and secure communication are the three main problems of micro UAVs. In this paper, we try to address these problems by means of secure clustering, and a security clustering algorithm based on an integrated trust value for UAV networks is proposed. First, an improved k-means++ algorithm is presented that determines the optimal number of clusters from the network bandwidth parameter, which ensures the optimal use of network bandwidth. Second, we consider variables representing the link expiration time to improve node clustering, and use the integrated trust value to rapidly detect malicious nodes and establish a head list. Node clustering reduces the impact of high mobility, and the head list enhances the security of the clustering algorithm. Finally, the remaining energy ratio, relative mobility, and relative degree of the nodes are combined to select the best cluster head. Simulation results show that the proposed clustering algorithm incurs a smaller computational load and provides higher network security.
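
The final step, combining several node metrics to pick a cluster head from trusted candidates, can be sketched as a weighted score. The abstract does not state how the three quantities are combined, so the linear form, the weights, the trust threshold, and the field names below are all hypothetical; each metric is assumed to be normalized to [0, 1], with higher energy and degree preferred and lower mobility preferred.

```python
def head_score(node, weights=(0.5, 0.3, 0.2)):
    """Hypothetical linear score: reward energy and degree, penalize mobility."""
    w_energy, w_mobility, w_degree = weights
    return (w_energy * node["energy_ratio"]
            - w_mobility * node["relative_mobility"]
            + w_degree * node["relative_degree"])


def select_cluster_head(nodes, trust_threshold=0.5):
    """Return the best-scoring node whose trust value meets the threshold."""
    trusted = [n for n in nodes if n["trust"] >= trust_threshold]
    if not trusted:
        return None
    return max(trusted, key=head_score)


nodes = [
    {"id": "A", "energy_ratio": 0.9, "relative_mobility": 0.1, "relative_degree": 0.5, "trust": 0.9},
    {"id": "B", "energy_ratio": 1.0, "relative_mobility": 0.0, "relative_degree": 1.0, "trust": 0.2},
    {"id": "C", "energy_ratio": 0.4, "relative_mobility": 0.5, "relative_degree": 0.5, "trust": 0.8},
]
head = select_cluster_head(nodes)
```

In this example node B has the best raw metrics but is filtered out by its low trust value, which mirrors the abstract's idea of using trust to keep malicious nodes off the head list.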