• Title/Summary/Keyword: Density-based Clustering

A Study on Representative Skyline Using Connected Component Clustering

  • Choi, Jong-Hyeok;Nasridinov, Aziz
    • Journal of Multimedia Information System / v.6 no.1 / pp.37-42 / 2019
  • Skyline queries are used in a variety of fields to make optimal decisions. However, as the volume and dimensionality of the data increase, the number of skyline points grows, and so does the time needed to discover them. Because the number of skyline points matters in many real-life applications, various studies have addressed this problem. However, previous studies have used k-parameter methods such as top-k and k-means to discover representative skyline points (RSPs) from the entire skyline point set, resulting in high query response time and reduced representativeness due to the dependency on k. To solve this problem, we propose a new Connected Component Clustering based Representative Skyline Query (3CRS) that can discover RSPs quickly even in high-dimensional data through connected component clustering. 3CRS performs fast discovery and clustering of skyline points through hash indexes and connected components and selects RSPs from each cluster. This paper demonstrates the superiority of the proposed method by comparing it with representative skyline queries using k-means and DBSCAN on a real-world dataset.
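
For readers unfamiliar with skyline queries, the dominance test behind them can be sketched as follows. This is a naive quadratic skyline computation only, not the 3CRS method described in the abstract; the `hotels` data and the convention that smaller values are preferred in every dimension are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b everywhere and strictly better somewhere.

    Assumption: smaller values are preferred in every dimension.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (price, distance) pairs; lower is better for both.
hotels = [(50, 8), (60, 3), (80, 2), (70, 5), (90, 9)]
print(skyline(hotels))
```

Here `(70, 5)` is dominated by `(60, 3)` and `(90, 9)` by `(50, 8)`, so neither is a skyline point. Selecting a few representatives from the remaining skyline is the problem the paper addresses.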

A Token Based Clustering Algorithm Considering Uniform Density Cluster in Wireless Sensor Networks (무선 센서 네트워크에서 균등한 클러스터 밀도를 고려한 토큰 기반의 클러스터링 알고리즘)

  • Lee, Hyun-Seok;Heo, Jeong-Seok
    • The KIPS Transactions:PartC / v.17C no.3 / pp.291-298 / 2010
  • In wireless sensor networks, energy is the most important consideration because the lifetime of a sensor node is limited by its battery. Clustering is one of the methods used to manage network energy consumption efficiently, and LEACH (Low-Energy Adaptive Clustering Hierarchy) is one of the best-known clustering algorithms. LEACH uses randomized rotation of the cluster head to distribute the energy load evenly among the sensor nodes in the network. However, random cluster-head selection does not guarantee that the number of cluster heads produced in each round equals the expected optimal value, and the cluster head of a high-density cluster becomes overloaded. In this paper, we propose both a token-based cluster-head selection algorithm that guarantees the number of cluster heads and a cluster selection algorithm that yields uniform-density clusters. Simulations show that the proposed algorithm improves network lifetime by about 9.3% compared with LEACH.
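
The randomness that the abstract criticizes comes from the standard LEACH election threshold, T = P / (1 - P (r mod 1/P)), applied to nodes that have not yet served as cluster head in the current period. A minimal simulation sketch of that baseline follows; it is not the token-based algorithm proposed in the paper, and the function names, node count, and seed are assumptions chosen for illustration.

```python
import random
from typing import List


def leach_threshold(period: int, round_no: int) -> float:
    """LEACH threshold for a still-eligible node.

    With period = 1/P an integer, P / (1 - P * (r mod 1/P)) simplifies
    algebraically to 1 / (period - (r mod period)).
    """
    return 1.0 / (period - (round_no % period))


def simulate_leach(num_nodes: int, period: int, rounds: int, seed: int = 0) -> List[int]:
    """Return the number of cluster heads elected in each round."""
    rng = random.Random(seed)
    eligible = set(range(num_nodes))
    counts = []
    for r in range(rounds):
        if r % period == 0:
            # A new period starts: every node becomes eligible again.
            eligible = set(range(num_nodes))
        t = leach_threshold(period, r)
        heads = {n for n in eligible if rng.random() < t}
        eligible -= heads  # a node serves as head at most once per period
        counts.append(len(heads))
    return counts


# Hypothetical network: 100 nodes, P = 0.05 (period of 20 rounds).
counts = simulate_leach(num_nodes=100, period=20, rounds=20)
print(counts)
```

Every node serves exactly once per period, so the counts sum to the number of nodes, but the per-round count is a random variable around the expected value of 5 rather than a guaranteed number.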

Customer Load Pattern Analysis using Clustering Techniques (클러스터링 기법을 이용한 수용가별 전력 데이터 패턴 분석)

  • Ryu, Seunghyoung;Kim, Hongseok;Oh, Doeun;No, Jaekoo
    • KEPCO Journal on Electric Power and Energy / v.2 no.1 / pp.61-69 / 2016
  • Understanding load patterns and classifying customers is a basic step in analyzing the behavior of electricity consumers. To that end, there have been many studies on clustering customers' daily load data. Nowadays, the deployment of advanced metering infrastructure (AMI) and big-data technologies makes it easier to study customers' load data. In this paper, we study load clustering from the viewpoint of yearly and daily load patterns. We compare four clustering methods: K-means clustering, hierarchical clustering (average linkage and Ward's method), and DBSCAN (Density-Based Spatial Clustering of Applications with Noise). We also discuss the relationship between clustering results and the Korean Standard Industrial Classification (KSIC), which is one possible label for customers' load data. We find that hierarchical clustering with Ward's method is suitable for clustering load data and that KSIC can be characterized well by daily load patterns, but not as well by yearly load patterns.

Performance Analysis of User Clustering Algorithms against User Density and Maximum Number of Relays for D2D Advertisement Dissemination (최대 전송횟수 제한 및 사용자 밀집도 변화에 따른 사용자 클러스터링 알고리즘 별 D2D 광고 확산 성능 분석)

  • Han, Seho;Kim, Junseon;Lee, Howon
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.4 / pp.721-727 / 2016
  • In this paper, to address the reduced efficiency of conventional D2D (device-to-device) advertisement dissemination algorithms, we propose advertisement dissemination algorithms based on several clustering algorithms: a modified single linkage algorithm (MSL), the K-means algorithm, and the expectation maximization algorithm with a Gaussian mixture model (EM). The proposed clustering algorithms group target areas into several target groups. D2D advertisements are then distributed consecutively using a routing algorithm based on the geographical distribution of the target areas and a relay selection algorithm based on the distance between the D2D sender and the D2D receiver. Through extensive MATLAB simulations, we analyze the performance of the proposed algorithms with respect to the maximum number of relay transmissions and the D2D user density ratio between a target area and a non-target area.

DBSCAN-based Energy-Efficient Algorithm for Base Station Mode Control (에너지 효율성 향상을 위한 DBSCAN 기반 기지국 모드 제어 알고리즘)

  • Lee, Howon;Lee, Wonseok
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.12 / pp.1644-1649 / 2019
  • With the rapid development of mobile communication systems, various mobile convergence services are appearing and data traffic is growing explosively. Because the number of base stations needed to support this surge in devices is also increasing, reducing the energy consumption of mobile communication networks is one of the most important issues from a network provider's point of view. Therefore, in this paper, we apply the DBSCAN (density-based spatial clustering of applications with noise) algorithm, one of the representative density-based clustering algorithms, to extract areas with high user density, and we apply a thinning process to each extracted sub-network to control the mode of the base stations efficiently. Extensive simulations show that the proposed algorithm outperforms conventional algorithms with respect to area throughput and energy efficiency.
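
The dense-area extraction step can be illustrated with a minimal pure-Python DBSCAN sketch. It covers only the clustering of user positions, not the thinning process or base station mode control from the paper; the `users` coordinates and the `eps` and `min_pts` values are hypothetical.

```python
from math import dist
from typing import List, Optional, Sequence, Tuple

NOISE = -1


def region_query(points: Sequence[Tuple[float, float]], i: int, eps: float) -> List[int]:
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points: Sequence[Tuple[float, float]], eps: float, min_pts: int) -> List[int]:
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    labels: List[Optional[int]] = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return [label if label is not None else NOISE for label in labels]


# Hypothetical user positions: two dense areas and one isolated user.
users = [(0, 0), (0, 1), (1, 0), (1, 1),
         (10, 10), (10, 11), (11, 10), (11, 11),
         (5, 5)]
print(dbscan(users, eps=1.5, min_pts=3))
```

The two groups of four users form separate clusters, and the isolated user at `(5, 5)` is labeled as noise, which is how sparse regions are separated from dense ones.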

Data Clustering Method Using a Modified Gaussian Kernel Metric and Kernel PCA

  • Lee, Hansung;Yoo, Jang-Hee;Park, Daihee
    • ETRI Journal / v.36 no.3 / pp.333-342 / 2014
  • Most hyper-ellipsoidal clustering (HEC) approaches use the Mahalanobis distance as a distance metric. It has been proven that HEC, under this condition, cannot be realized since the cost function of partitional clustering is a constant. We demonstrate that HEC with a modified Gaussian kernel metric can be interpreted as a problem of finding condensed ellipsoidal clusters (with respect to the volumes and densities of the clusters) and propose a practical HEC algorithm that can efficiently handle clusters that are ellipsoidal in shape and of different sizes and densities. We then refine the HEC algorithm by utilizing ellipsoids defined on the kernel feature space to deal with more complex-shaped clusters. The proposed methods lead to a significant improvement in the clustering results over the K-means algorithm, the fuzzy C-means algorithm, the GMM-EM algorithm, and the HEC algorithm based on minimum-volume ellipsoids using the Mahalanobis distance.

Character Extraction Using Wavelet Transform and Fuzzy Clustering (웨이브렛 변환과 퍼지 군집화를 활용한 문자추출)

  • Hwang, Jung-Won;Hwang, Jae-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.4 s.316 / pp.93-100 / 2007
  • In this paper, a novel approach based on the wavelet transform is proposed to process scraped characters represented in a digital image. The basic idea is that a scraped character is described by its textured neighborhood and is decomposed, together with its background region, into multiresolution features at different levels. The image is first decomposed into sub-bands by applying Daubechies wavelets. Character features are extracted from the low-frequency sub-bands by partitioning, FCM clustering, and area-based region processing. The high-frequency sub-bands are activated by applying local energy density over a moving mask. The features are synthesized to reconstruct the original image state through the inverse wavelet transform. The background region is then eliminated and the character is extracted. The experimental results demonstrate the effectiveness of the proposed method.

A Simultaneous Design of TSK - Linguistic Fuzzy Models with Uncertain Fuzzy Output

  • Kwak, Keun-Chang;Kim, Dong-Hwa
    • Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 2005.06a / pp.427-432 / 2005
  • This paper is concerned with the simultaneous design of TSK (Takagi-Sugeno-Kang)-linguistic fuzzy models with uncertain model output and a computationally efficient representation. For this purpose, we use the fundamental idea of linguistic models introduced by Pedrycz and develop a comprehensive design framework for them. The design process consists of several main phases: (a) automatic generation of the linguistic contexts from the probabilistic distribution using the CDF (conditional density function) and PDF (probability density function); (b) context-based fuzzy clustering that preserves homogeneity, based on the concept of fuzzy granulation; (c) addition of a bias term to compensate for bias error; and (d) combination of TSK and linguistic contexts in the consequent part. Finally, we contrast the performance of the enhanced models with that of other fuzzy models on automobile MPG prediction data and a coagulant dosing process in a water purification plant.

Cluster Merging Using Enhanced Density based Fuzzy C-Means Clustering Algorithm (개선된 밀도 기반의 퍼지 C-Means 알고리즘을 이용한 클러스터 합병)

  • Han, Jin-Woo;Jun, Sung-Hae;Oh, Kyung-Whan
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.5 / pp.517-524 / 2004
  • Fuzzy set theory has been widely used for clustering in machine learning and data mining since it was introduced in the 1960s. In particular, the fuzzy C-means algorithm remains a popular fuzzy clustering algorithm. Fuzzy C-means assigns each element to every cluster with a membership value. Like general clustering algorithms such as K-means, it is affected by the location of the initial cluster centers and by the choice of cluster size. These initial settings are subjective, so the results can be improper depending on the circumstances. In this paper, we propose cluster merging using an enhanced density-based fuzzy C-means clustering algorithm to solve this problem. Our algorithm determines the initial cluster size and centers from the properties of the training data, using a grid to decide them. In the experiments, objective machine learning data are used to compare the performance of our algorithm with that of others.
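
As background, the standard fuzzy C-means updates can be sketched as below. This is plain FCM with hand-picked initial centers, which is exactly the subjective choice the abstract refers to; it does not implement the paper's grid-based initialization or cluster merging. The data, the fuzzifier `m = 2`, and the iteration count are assumptions for illustration.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def memberships(points: Sequence[Point], centers: Sequence[Point], m: float = 2.0) -> List[List[float]]:
    """u[j][i] = 1 / sum_k (d_ji / d_jk)^(2/(m-1)): membership of point j in cluster i."""
    exponent = 2.0 / (m - 1.0)
    u = []
    for x in points:
        d = [dist(x, c) for c in centers]
        if 0.0 in d:
            # The point coincides with one or more centers: split membership among them.
            hits = [1.0 if di == 0.0 else 0.0 for di in d]
            row = [h / sum(hits) for h in hits]
        else:
            row = [1.0 / sum((di / dk) ** exponent for dk in d) for di in d]
        u.append(row)
    return u


def update_centers(points: Sequence[Point], u: List[List[float]], m: float = 2.0) -> List[Point]:
    """Each center is the mean of all points weighted by membership^m."""
    dims = len(points[0])
    centers = []
    for i in range(len(u[0])):
        w = [row[i] ** m for row in u]
        total = sum(w)
        centers.append(tuple(sum(wj * x[k] for wj, x in zip(w, points)) / total
                             for k in range(dims)))
    return centers


def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate the membership and center updates for a fixed number of iterations."""
    for _ in range(iters):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return centers, memberships(points, centers, m)


# Hypothetical 1-D data with two groups and hand-picked initial centers.
data = [(0.0,), (0.2,), (0.1,), (5.0,), (5.1,), (4.9,)]
final_centers, u = fuzzy_c_means(data, centers=[(1.0,), (4.0,)])
print(final_centers)
```

Each row of `u` sums to 1, so every point belongs to every cluster to some degree, with points near a center receiving most of their membership from it.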

Density-Based Estimation of POI Boundaries Using Geo-Tagged Tweets (공간 태그된 트윗을 사용한 밀도 기반 관심지점 경계선 추정)

  • Shin, Won-Yong;Vu, Dung D.
    • The Journal of Korean Institute of Communications and Information Sciences / v.42 no.2 / pp.453-459 / 2017
  • Users tend to check in and post their statuses in location-based social networks (LBSNs) to indicate that their interests are related to a point-of-interest (POI). While previous studies on discovering areas-of-interest (AOIs) were conducted mostly with density-based clustering methods on collections of geo-tagged photos from LBSNs, we focus on estimating a POI boundary, which corresponds to the single cluster containing the POI center. Using geo-tagged tweets recorded from Twitter users, this paper introduces a low-complexity, density-based two-phase method that estimates a POI boundary by finding a suitable radius reachable from the POI center. We estimate the boundary of the POI as the convex hull of the geo-tags selected by our two-phase density-based estimation, where each phase uses a different radius increment. We show that our method outperforms the conventional density-based clustering method in terms of computational complexity.