• Title/Summary/Keyword: Clustering Problem

Materialized View Selection Algorithm using Clustering Technique in Data Warehouse (데이터 웨어하우스에서 클러스터링 기법을 이용한 실체화 뷰 선택 알고리즘)

  • Yang, Jin-Hyuk;Chung, In-Jeong
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.8
    • /
    • pp.2273-2286
    • /
    • 2000
  • To obtain precise and fast responses to analytical queries, proper selection of the views to materialize in a data warehouse is crucial. Traditional view selection algorithms consider whole relations as candidates for materialized views. However, materializing whole relations rather than parts of relations results in much worse performance in terms of time and space cost. Therefore, we present an improved algorithm for selecting views to materialize that uses a clustering method to overcome this problem of conventional view selection algorithms. In the presented algorithm, ASVMRT (Algorithm for Selection of Views to Materialize using Reduced Table), we first generate reduced tables in the data warehouse using automatic clustering based on attribute-value density, and then consider combinations of reduced tables as materialized views instead of combinations of the original base relations. To justify the proposed algorithm, we show experimental results in which both time and space costs are approximately 1.8 times better than those of the conventional algorithms.

Determination of Optimal Cluster Size Using Bootstrap and Genetic Algorithm (붓스트랩 기법과 유전자 알고리즘을 이용한 최적 군집 수 결정)

  • Park, Min-Jae;Jun, Sung-Hae;Oh, Kyung-Whan
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.13 no.1
    • /
    • pp.12-17
    • /
    • 2003
  • The choice of cluster size (the number of clusters) affects the result of clustering. In the K-means algorithm, clustering performance varies greatly with the initial K. However, in most clustering processes the initial cluster size is determined by prior knowledge or subjective judgment, and this subjective determination may not be optimal. In this paper, a genetic-algorithm-based approach is proposed for determining the cluster size automatically and improving the clustering result. The initial population is generated based on attributes in order to search for the optimal cluster size. The fitness value is defined as the inverse of the summed dissimilarity, so the search converges toward improved overall performance. The mutation operation is used to address the local minima problem. Finally, bootstrap re-sampling is used to reduce the computational time cost.
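
As a rough illustration of two ingredients named in this abstract, the sketch below draws a bootstrap sample and scores a candidate set of cluster centers by the inverse of the summed dissimilarity. It is a minimal sketch, not the authors' implementation: the 1-D data, the absolute-difference dissimilarity, and the names `bootstrap_sample` and `fitness` are assumptions, and the genetic operators (selection, crossover, mutation) are omitted.

```python
import random


def bootstrap_sample(data, size, rng):
    """Draw `size` items from `data` with replacement (bootstrap re-sampling)."""
    return [rng.choice(data) for _ in range(size)]


def fitness(data, centers):
    """Inverse of the summed dissimilarity between each point and its nearest center.

    Assumption: dissimilarity is the absolute difference on 1-D data.
    """
    total = sum(min(abs(x - c) for c in centers) for x in data)
    return 1.0 / (total + 1e-12)  # small constant avoids division by zero


rng = random.Random(0)
data = [0.0, 1.0, 10.0, 11.0]

# A bootstrap sample is cheaper to score than the full data set.
sample = bootstrap_sample(data, size=3, rng=rng)

# Score two hypothetical candidates: one center versus two centers.
one_cluster = fitness(data, [5.5])
two_clusters = fitness(data, [0.5, 10.5])
```

Here `two_clusters` is larger than `one_cluster`, because two centers leave a smaller summed dissimilarity than one.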

Initialization of Fuzzy C-Means Using Kernel Density Estimation (커널 밀도 추정을 이용한 Fuzzy C-Means의 초기화)

  • Heo, Gyeong-Yong;Kim, Kwang-Baek
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.8
    • /
    • pp.1659-1664
    • /
    • 2011
  • Fuzzy C-Means (FCM) is one of the most widely used clustering algorithms and has been applied successfully in many applications. However, FCM has some shortcomings, and initial prototype selection is one of them. As FCM is only guaranteed to converge to a local optimum, different initial prototypes result in different clusterings. Therefore, much care should be given to the selection of the initial prototypes. In this paper, a new initialization method for FCM using kernel density estimation (KDE) is proposed to resolve the initialization problem. KDE can be used to estimate a data distribution non-parametrically and is useful for estimating local density. After KDE, the proposed method places one initial point at the densest region and reduces the density of that region. By iterating this process, the initial prototypes are obtained. Experimental results demonstrated that the initial prototypes thus obtained gave better results than the randomly selected ones commonly used in FCM.
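
The iteration described in this abstract (place a point at the densest region, reduce the density there, repeat) can be sketched as follows. This is an illustrative sketch only: the Gaussian kernel, the fixed `bandwidth`, the rule used to reduce the density around a chosen peak, and the function name `kde_init` are all assumptions, not details taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def gaussian_kernel(u):
    """Unnormalised Gaussian kernel."""
    return math.exp(-0.5 * u * u)


def kde_init(points, n_prototypes, bandwidth=1.0):
    """Pick initial prototypes at successive density peaks of the data."""
    # Unnormalised kernel density estimate, evaluated at every data point.
    density = [
        sum(gaussian_kernel(euclidean(p, q) / bandwidth) for q in points)
        for p in points
    ]
    prototypes = []
    for _ in range(n_prototypes):
        best = max(range(len(points)), key=lambda i: density[i])
        peak = density[best]
        prototypes.append(points[best])
        # Assumed reduction rule: subtract the peak's kernel-weighted
        # influence so that nearby points are unlikely to be chosen next.
        density = [
            d - peak * gaussian_kernel(euclidean(p, points[best]) / bandwidth)
            for d, p in zip(density, points)
        ]
    return prototypes


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
prototypes = kde_init(points, n_prototypes=2, bandwidth=1.0)
```

With these points, the first prototype falls in the three-point group near the origin and the second in the group near (5, 5). The returned prototypes would then be passed to FCM in place of random initial values.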

Function Approximation for Reinforcement Learning using Fuzzy Clustering (퍼지 클러스터링을 이용한 강화학습의 함수근사)

  • Lee, Young-Ah;Jung, Kyoung-Sook;Chung, Tae-Choong
    • The KIPS Transactions:PartB
    • /
    • v.10B no.6
    • /
    • pp.587-592
    • /
    • 2003
  • Many real-world control problems have continuous states and actions. When the state space is continuous, reinforcement learning problems involve a very large state space and suffer from the memory and time required to learn all individual state-action values. These problems need function approximators that infer actions for new states from previously experienced states. We introduce Fuzzy Q-Map, a function approximator for 1-step Q-learning that is based on fuzzy clustering. Fuzzy Q-Map groups similar states, and it chooses an action and looks up the Q value according to membership degree. The centroid and Q value of the winner cluster are updated using the membership degree and the TD (Temporal Difference) error. We applied Fuzzy Q-Map to the mountain car problem and obtained faster learning.
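
A minimal sketch of the kind of update this abstract describes is given below. It is not the authors' Fuzzy Q-Map: the class name `FuzzyQMap`, the fuzzy c-means membership formula, the learning rates `alpha` and `beta`, and the exact form of the centroid and Q-value updates are assumptions chosen for illustration. Only the 1-step Q-learning TD error, r + γ·max Q(s', a') − Q(s, a), is standard.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two states given as sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class FuzzyQMap:
    """Hypothetical sketch: each fuzzy cluster holds a centroid and one Q value per action."""

    def __init__(self, centroids, n_actions, alpha=0.5, beta=0.1, gamma=0.9, m=2.0):
        self.centroids = [list(c) for c in centroids]
        self.q = [[0.0] * n_actions for _ in centroids]
        self.alpha, self.beta, self.gamma, self.m = alpha, beta, gamma, m

    def memberships(self, state):
        # Fuzzy c-means style membership degrees; they sum to 1.
        d = [max(euclidean(state, c), 1e-12) for c in self.centroids]
        p = 2.0 / (self.m - 1.0)
        return [1.0 / sum((di / dj) ** p for dj in d) for di in d]

    def q_values(self, state):
        # Membership-weighted Q value of each action.
        u = self.memberships(state)
        n_actions = len(self.q[0])
        return [sum(ui * qi[a] for ui, qi in zip(u, self.q)) for a in range(n_actions)]

    def update(self, state, action, reward, next_state):
        u = self.memberships(state)
        w = max(range(len(u)), key=lambda i: u[i])  # winner cluster
        # 1-step Q-learning TD error.
        td_error = (reward + self.gamma * max(self.q_values(next_state))
                    - self.q_values(state)[action])
        # Assumed rule: scale both updates by the winner's membership degree.
        self.q[w][action] += self.alpha * u[w] * td_error
        self.centroids[w] = [c + self.beta * u[w] * (s - c)
                             for c, s in zip(self.centroids[w], state)]
        return td_error


fqm = FuzzyQMap(centroids=[(0.0,), (1.0,)], n_actions=2)
td = fqm.update(state=(0.0,), action=0, reward=1.0, next_state=(0.0,))
```

Because only the winner cluster is updated, the table grows with the number of clusters rather than with the number of distinct states visited.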

A new Clustering Algorithm for GPS Trajectories with Maximum Overlap Interval (최대 중첩구간을 이용한 새로운 GPS 궤적 클러스터링)

  • Kim, Taeyong;Park, Bokuk;Park, Jinkwan;Cho, Hwan-Gue
    • KIISE Transactions on Computing Practices
    • /
    • v.22 no.9
    • /
    • pp.419-425
    • /
    • 2016
  • In navigation systems, keeping map data up to date is an important task. Manual updates involve a substantial cost and make it difficult to reflect changes immediately. In this paper, we present a method for trajectory-center extraction, which is essential for automatic road map generation from GPS data. Although clustered trajectories are necessary to extract the center road, real trajectories are not clustered. To address this problem, this paper proposes a new method using the maximum overlapping interval and trajectory clustering. Finally, we apply the Virtual Running method to extract the center road from the clustered trajectories. We conducted experiments on massive real taxi GPS data sets collected throughout Gang-Nam-Gu, Sung-Nam city and all parts of Seoul city. Experimental results showed that our method is stable and efficient for extracting the center trajectory of real roads.

Cluster-based Energy-aware Data Sharing Scheme to Support a Mobile Sink in Solar-Powered Wireless Sensor Networks (태양 에너지 수집형 센서 네트워크에서 모바일 싱크를 지원하기 위한 클러스터 기반 에너지 인지 데이터 공유 기법)

  • Lee, Hong Seob;Yi, Jun Min;Kim, Jaeung;Noh, Dong Kun
    • Journal of KIISE
    • /
    • v.42 no.11
    • /
    • pp.1430-1440
    • /
    • 2015
  • In contrast with battery-based wireless sensor networks (WSNs), solar-powered WSNs can operate for a long time, assuming that there is no hardware fault. Meanwhile, a mobile sink can reduce the energy consumption of a WSN, but its ineffective movement may waste a great deal of energy, not only its own but also that of the entire network. To solve this problem, many approaches in which a mobile sink visits only cluster-head nodes have been proposed. However, the clustering scheme has its own problems, such as energy imbalance and data instability. In this study, therefore, a cluster-based energy-aware data-sharing scheme (CE-DSS) is proposed to effectively support a mobile sink in a solar-powered WSN. By utilizing redundant energy efficiently, CE-DSS shares the gathered data among cluster heads while minimizing unexpected black-out time. The simulation results show that CE-DSS increases data reliability and conserves the energy of the mobile sink.

Device-to-Device assisted user clustering for Multiple Access in MIMO WLAN

  • Hongyi, Zhao;Weimin, Wu;li, Lu;Yingzhuang, Liu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.7
    • /
    • pp.2972-2991
    • /
    • 2016
  • WLAN is the best choice in places where a complex network is hard to set up. Intelligent terminals are now increasingly concentrated in some areas. However, according to IEEE 802.11n/802.11ac, the access point (AP) can serve only one user on a single frequency channel, so spectrum efficiency urgently needs to be improved. In theory, an AP with multiple antennas can serve multiple users if these users do not interfere with each other. In this paper, we propose a user clustering scheme that achieves multi-user selection through mutual cooperation among users. We focus on two points: one is to achieve multi-user communication on a single frequency channel with a multiple-antenna technique, and the other is to use distributed collaboration among users to determine the multi-user selection for user clustering. First, we use the CSMA/CA protocol to select the first user, and then we set this user as a source node that uses cooperation among users to search for other suitable users. With the help of the users' broadcast cooperation, we can search for and select other appropriate users (the number of access users is limited by the number of antennas at the AP) to access the AP simultaneously with the first user. For the network node search, we propose a maximum degree energy routing search algorithm, which uses the shortest time and traverses as many users as possible. We carried out analysis and simulation to demonstrate the feasibility of the scheme. We hope this work may provide a new idea for solving the multiple access problem.

Speech Recognition Performance Improvement using a convergence of GMM Phoneme Unit Parameter and Vocabulary Clustering (GMM 음소 단위 파라미터와 어휘 클러스터링을 융합한 음성 인식 성능 향상)

  • Oh, SangYeob
    • Journal of Convergence for Information Technology
    • /
    • v.10 no.8
    • /
    • pp.35-39
    • /
    • 2020
  • Although DNNs yield a smaller error than conventional speech recognition systems, they are difficult to train in parallel, require a large amount of computation, and need a large amount of data. To solve this problem efficiently, this paper generates phoneme units by estimating the GMM parameters of each phoneme model. It also suggests a way to improve performance by clustering a specific vocabulary so that these models can be applied effectively. To this end, vocabulary models were built from three types of word speech databases, and features extracted after noise processing with a Wiener filter were used in the speech recognition experiments. The proposed method showed a recognition rate of 97.9% in speech recognition. Additional studies are needed to address the overfitting problem.

Decision Tree for Likely phoneme model schema support (유사 음소 모델 스키마 지원을 위한 결정 트리)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence
    • /
    • v.11 no.10
    • /
    • pp.367-372
    • /
    • 2013
  • In speech recognition systems, problems with phonemes during model training cause the stored model to be regenerated, which takes additional time and cost. In this paper, we propose a likely phoneme model schema method using decision tree clustering. The proposed system builds a robust and correct sound model by applying decision tree clustering when generating the model; it therefore reduces the regeneration process and supports retrieval of phoneme units in the probability model. The proposed system also provides an additional likely phoneme model and configures a robust, correct sound model. In the performance evaluation, the system achieved a vocabulary-dependent recognition rate of 98.3% and a vocabulary-independent recognition rate of 98.4%.

Modified LEACH Protocol improving the Time of Topology Reconfiguration in Container Environment (컨테이너 환경에서 토플로지 재구성 시간을 개선한 변형 LEACH 프로토콜)

  • Lee, Yang-Min;Yi, Ki-One;Kwark, Gwang-Hoon;Lee, Jae-Kee
    • The KIPS Transactions:PartC
    • /
    • v.15C no.4
    • /
    • pp.311-320
    • /
    • 2008
  • In general, routing algorithms that have been applied to ad-hoc networks are not suitable for environments with many nodes, on the order of several thousand or more. To solve this problem, hierarchical management of these nodes and clustering-based protocols for stable maintenance of the topology are used. In this paper, we propose a clustering-based modified LEACH protocol that can be applied to an environment in which metal containers carrying communication nodes move around. In the proposed protocol, we implemented a module for detecting node movement on top of the clustering-based LEACH protocol and remedied the shortcomings of LEACH in an environment with movable nodes. We also showed that effective communication is possible by adjusting the multi-hop configuration method. We compared the proposed protocol with LEACH from four points of view: gradual network composition time, topology reconfiguration time, communication success ratio in a container environment, and routing overhead. In conclusion, we verified that the proposed protocol performs better than the original LEACH protocol in the metal container environment.