• Title/Summary/Keyword: Time-based Clustering

Search Result 716, Processing Time 0.022 seconds

Comparison of time series clustering methods and application to power consumption pattern clustering

  • Kim, Jaehwi;Kim, Jaehee
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.6
    • /
    • pp.589-602
    • /
    • 2020
  • The development of smart grids has enabled the easy collection of a large amount of power data. There are some common patterns that make it useful to cluster power consumption patterns when analyzing s power big data. In this paper, clustering analysis is based on distance functions for time series and clustering algorithms to discover patterns for power consumption data. In clustering, we use 10 distance measures to find the clusters that consider the characteristics of time series data. A simulation study is done to compare the distance measures for clustering. Cluster validity measures are also calculated and compared such as error rate, similarity index, Dunn index and silhouette values. Real power consumption data are used for clustering, with five distance measures whose performances are better than others in the simulation.

A Study on Density-Based Clustering Method Considering Directionality (방향성을 고려한 밀도 기반 클러스터링 기법에 관한 연구)

  • Jinman Kim;Joongjin Kook
    • Journal of the Semiconductor & Display Technology
    • /
    • v.23 no.2
    • /
    • pp.38-44
    • /
    • 2024
  • This research proposed DBSCAN-D, which is a clustering technique for locating POI based on existing density-based clustering research, such as GPS data, generated by moving objects. This method is designed based on 'staying time' and 'directionality' extracted from the relationship between GPS data. The staying time can be extracted through the difference in the reception time between data using the time at which the GPS data is received. Directionality can be expressed by moving the area of data generated later in the direction of the position of the previously generated data by concentrating on the point where the GPS data is sequentially generated. Through these two properties, it is possible to perform clustering suitable for the data set generated by the moving object.

  • PDF

A Determination of an Optimal Clustering Method Based on Data Characteristics

  • Kim, Jeong-Hun;Yoo, Kwan-Hee;Nasridinov, Aziz
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.7 no.8
    • /
    • pp.305-314
    • /
    • 2017
  • Clustering is a method that collects data objects into groups based on their similary. Performance of the state-of-the-art clustering methods is different according to the data characteristics. There have been numerous studies that performed experiments to compare the accuracy of the state-of-the-art clustering methods by applying various kinds of datasets. A common problem of these studies is that they only consider clustering algorithms that yield the most accurate results for a particular dataset. They do not consider what factors affect the execution time of each clustering method and how they are affected. Nevertheless, execution time is an important factor in clustering performance if there is no significant difference in accuracy. In order to solve the problems of the existing research, through a series of experiments using various types of datasets, we compare the accuracy of four representative clustering methods. In addition, we perform practical clustering performance comparisons by deriving time complexity and identifying factors that influences to its performance.

A study on electricity demand forecasting based on time series clustering in smart grid (스마트 그리드에서의 시계열 군집분석을 통한 전력수요 예측 연구)

  • Sohn, Hueng-Goo;Jung, Sang-Wook;Kim, Sahm
    • The Korean Journal of Applied Statistics
    • /
    • v.29 no.1
    • /
    • pp.193-203
    • /
    • 2016
  • This paper forecasts electricity demand as a critical element of a demand management system in Smart Grid environment. We present a prediction method of using a combination of predictive values by time series clustering. Periodogram-based normalized clustering, predictive analysis clustering and dynamic time warping (DTW) clustering are proposed for time series clustering methods. Double Seasonal Holt-Winters (DSHW), Trigonometric, Box-Cox transform, ARMA errors, Trend and Seasonal components (TBATS), Fractional ARIMA (FARIMA) are used for demand forecasting based on clustering. Results show that the time series clustering method provides a better performances than the method using total amount of electricity demand in terms of the Mean Absolute Percentage Error (MAPE).

Consensus Clustering for Time Course Gene Expression Microarray Data

  • Kim, Seo-Young;Bae, Jong-Sung
    • Communications for Statistical Applications and Methods
    • /
    • v.12 no.2
    • /
    • pp.335-348
    • /
    • 2005
  • The rapid development of microarray technologies enabled the monitoring of expression levels of thousands of genes simultaneously. Recently, the time course gene expression data are often measured to study dynamic biological systems and gene regulatory networks. For the data, biologists are attempting to group genes based on the temporal pattern of their expression levels. We apply the consensus clustering algorithm to a time course gene expression data in order to infer statistically meaningful information from the measurements. We evaluate each of consensus clustering and existing clustering methods with various validation measures. In this paper, we consider hierarchical clustering and Diana of existing methods, and consensus clustering with hierarchical clustering, Diana and mixed hierachical and Diana methods and evaluate their performances on a real micro array data set and two simulated data sets.

Path based K-means Clustering for RFID Data Sets

  • Yun, Hong-Won
    • Journal of information and communication convergence engineering
    • /
    • v.6 no.4
    • /
    • pp.434-438
    • /
    • 2008
  • Massive data are continuously produced with a data rate of over several terabytes every day. These applications need effective clustering algorithms to achieve an overall high performance computation. In this paper, we propose ancestor as cluster center based approach to clustering, the K-means algorithm using ancestor. We modify the K-means algorithm. We present a clustering architecture and a clustering algorithm that minimize of I/Os and show a performance with excellent. In our experimental performance evaluation, we present that our algorithm can improve the I/O speed and the query processing time.

Clustering non-stationary advanced metering infrastructure data

  • Kang, Donghyun;Lim, Yaeji
    • Communications for Statistical Applications and Methods
    • /
    • v.29 no.2
    • /
    • pp.225-238
    • /
    • 2022
  • In this paper, we propose a clustering method for advanced metering infrastructure (AMI) data in Korea. As AMI data presents non-stationarity, we consider time-dependent frequency domain principal components analysis, which is a proper method for locally stationary time series data. We develop a new clustering method based on time-varying eigenvectors, and our method provides a meaningful result that is different from the clustering results obtained by employing conventional methods, such as K-means and K-centres functional clustering. Simulation study demonstrates the superiority of the proposed approach. We further apply the clustering results to the evaluation of the electricity price system in South Korea, and validate the reform of the progressive electricity tariff system.

Improving Real-Time Efficiency of Case Retrieving Process for Case-Based Reasoning

  • Park, Yoon-Joo
    • Asia pacific journal of information systems
    • /
    • v.25 no.4
    • /
    • pp.626-641
    • /
    • 2015
  • Conventional case-based reasoning (CBR) does not perform efficiently for high-volume datasets because of case retrieval time. To overcome this problem, previous research suggested clustering a case base into several small groups and retrieving neighbors within a corresponding group to a target case. However, this approach generally produces less accurate predictive performance than the conventional CBR. This paper proposes a new case-based reasoning method called the clustering-merging CBR (CM-CBR). The CM-CBR method dynamically indexes a search pool to retrieve neighbors considering the distance between a target case and the centroid of a corresponding cluster. This method is applied to three real-life medical datasets. Results show that the proposed CM-CBR method produces similar or better predictive performance than the conventional CBR and clustering-CBR methods in numerous cases with significantly less computational cost.

AN EFFICIENT DENSITY BASED ANT COLONY APPROACH ON WEB DOCUMENT CLUSTERING

  • M. REKA
    • Journal of applied mathematics & informatics
    • /
    • v.41 no.6
    • /
    • pp.1327-1339
    • /
    • 2023
  • World Wide Web (WWW) use has been increasing recently due to users needing more information. Lately, there has been a growing trend in the document information available to end users through the internet. The web's document search process is essential to find relevant documents for user queries.As the number of general web pages increases, it becomes increasingly challenging for users to find records that are appropriate to their interests. However, using existing Document Information Retrieval (DIR) approaches is time-consuming for large document collections. To alleviate the problem, this novel presents Spatial Clustering Ranking Pattern (SCRP) based Density Ant Colony Information Retrieval (DACIR) for user queries based DIR. The proposed first stage is the Term Frequency Weight (TFW) technique to identify the query weightage-based frequency. Based on the weight score, they are grouped and ranked using the proposed Spatial Clustering Ranking Pattern (SCRP) technique. Finally, based on ranking, select the most relevant information retrieves the document using DACIR algorithm.The proposed method outperforms traditional information retrieval methods regarding the quality of returned objects while performing significantly better in run time.

Evolutionary Computation-based Hybird Clustring Technique for Manufacuring Time Series Data (제조 시계열 데이터를 위한 진화 연산 기반의 하이브리드 클러스터링 기법)

  • Oh, Sanghoun;Ahn, Chang Wook
    • Smart Media Journal
    • /
    • v.10 no.3
    • /
    • pp.23-30
    • /
    • 2021
  • Although the manufacturing time series data clustering technique is an important grouping solution in the field of detecting and improving manufacturing large data-based equipment and process defects, it has a disadvantage of low accuracy when applying the existing static data target clustering technique to time series data. In this paper, an evolutionary computation-based time series cluster analysis approach is presented to improve the coherence of existing clustering techniques. To this end, first, the image shape resulting from the manufacturing process is converted into one-dimensional time series data using linear scanning, and the optimal sub-clusters for hierarchical cluster analysis and split cluster analysis are derived based on the Pearson distance metric as the target of the transformation data. Finally, by using a genetic algorithm, an optimal cluster combination with minimal similarity is derived for the two cluster analysis results. And the performance superiority of the proposed clustering is verified by comparing the performance with the existing clustering technique for the actual manufacturing process image.