• Title/Summary/Keyword: Two Phase Clustering

Unification of neural network with a hierarchical pattern recognition

  • Park, Chang-Mock; Wang, Gi-Nam
    • Proceedings of the ESK Conference / 1996.10a / pp.197-205 / 1996
  • Unification of a neural network with hierarchical pattern recognition is presented for recognizing a large set of objects. A two-step identification procedure is developed for pattern recognition: coarse and fine identification. The coarse identification is designed to find the class of an object, while the fine identification procedure identifies a specific object. During the training phase, a coarse neural network is trained to cluster the large set of reference objects into a number of groups. For fine identification, an expert neural network is also trained to identify a specific object within each group. The presented idea can be interpreted as two-step identification. Experimental results are given to verify the proposed methodology.
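
As a rough illustration of the coarse/fine idea above, the following sketch clusters reference vectors into groups and trains one expert classifier per group. It is not the authors' implementation; the group count, network sizes, toy data and the scikit-learn classes are assumptions.

```python
# Hypothetical sketch of two-step (coarse/fine) identification with scikit-learn.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 16))          # reference object features (toy data)
y = rng.integers(0, 30, size=600)       # specific object identities

# Coarse step: cluster the reference set into a small number of groups.
coarse = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)
groups = coarse.labels_

# Fine step: one expert network per group, trained only on that group's objects.
experts = {}
for g in np.unique(groups):
    idx = groups == g
    experts[g] = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                               random_state=0).fit(X[idx], y[idx])

def identify(x):
    """Two-step identification: pick a group, then ask that group's expert."""
    g = int(coarse.predict(x.reshape(1, -1))[0])
    return int(experts[g].predict(x.reshape(1, -1))[0])

print(identify(X[0]), "reference label:", y[0])
```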

Two-Phase Clustering Method Considering Mobile App Trends (모바일 앱 트렌드를 고려한 2단계 군집화 방법)

  • Heo, Jeong-Man; Park, So-Young
    • Journal of the Korea Society of Computer and Information / v.20 no.4 / pp.17-23 / 2015
  • In this paper, we propose a mobile app clustering method using word clusters. Considering the rapid change of mobile app trends, the proposed method divides the mobile apps into sets of semantically similar apps by applying a clustering algorithm to the mobile app set, rather than relying on a predefined category system. In order to alleviate the data sparseness problem in the short mobile app description texts, the proposed method additionally utilizes the unigrams, bigrams, trigrams, and the cluster of each word. To cluster mobile apps accurately, the proposed method avoids exceedingly small or large mobile app clusters by using the word clusters. Experimental results show that the proposed method improves overall accuracy by 22.18 percentage points, from 57.48% to 79.66%, by using the word clusters.
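
A minimal sketch of the n-gram clustering step is given below. The toy descriptions, vectorizer settings and cluster count are assumptions, and the paper's word-cluster features and cluster-size control are not reproduced.

```python
# Hypothetical sketch: cluster short app descriptions using uni/bi/trigram features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

descriptions = [
    "photo editor with filters and collage maker",
    "selfie camera with beauty filters",
    "budget tracker and expense manager",
    "personal finance and expense tracking",
    "step counter and workout tracker",
    "running tracker with calorie counter",
]

# Unigram + bigram + trigram features help with very short description texts.
vec = TfidfVectorizer(ngram_range=(1, 3))
X = vec.fit_transform(descriptions)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
for label, text in zip(km.labels_, descriptions):
    print(label, text)
```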

Cluster-based Delay-adaptive Sensor Scheduling for Energy-saving in Wireless Sensor Networks (센서네트워크에서 클러스터기반의 에너지 효율형 센서 스케쥴링 연구)

  • Choi, Wook; Lee, Yong; Chung, Yoo-Jin
    • Journal of the Korea Society for Simulation / v.18 no.3 / pp.47-59 / 2009
  • Due to the application-specific nature of wireless sensor networks, the sensitivity to a requirement such as data reporting latency may vary depending on the type of application, thus requiring application-specific algorithm and protocol design paradigms that help maximize energy conservation and thus the network lifetime. In this paper, we propose a novel delay-adaptive sensor scheduling scheme for energy-saving data gathering based on two-phase clustering (TPC). The ultimate goal is to extend the network lifetime by providing sensors with high adaptability to application-dependent and time-varying delay requirements. TPC requests sensors to construct two types of links: direct and relay links. The direct links are used for control and for forwarding time-critical sensed data. The relay links, on the other hand, are used only for data forwarding based on the user delay constraints, thus allowing the sensors to opportunistically use the most energy-saving links and to form a multi-hop path. Simulation results demonstrate that the cluster-based delay-adaptive data gathering strategy (CD-DGS) saves a significant amount of energy in dense sensor networks by adapting to the user delay constraints.
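
The direct-versus-relay decision can be pictured with the small sketch below. The latency figures, energy costs and selection rule are illustrative assumptions, not the CD-DGS protocol itself.

```python
# Hypothetical sketch: pick the most energy-saving link that still meets the
# application's delay constraint, falling back to the direct link otherwise.
from dataclasses import dataclass

@dataclass
class Link:
    kind: str          # "direct" (to cluster head) or "relay" (multi-hop)
    latency_ms: float  # expected data reporting latency
    energy_mj: float   # transmission energy cost

def choose_link(links, delay_budget_ms):
    feasible = [l for l in links if l.latency_ms <= delay_budget_ms]
    if not feasible:                      # no link meets the budget: use the fastest
        return min(links, key=lambda l: l.latency_ms)
    return min(feasible, key=lambda l: l.energy_mj)

links = [Link("direct", latency_ms=20, energy_mj=5.0),
         Link("relay",  latency_ms=90, energy_mj=1.2)]

print(choose_link(links, delay_budget_ms=150).kind)  # relaxed delay -> relay
print(choose_link(links, delay_budget_ms=30).kind)   # tight delay   -> direct
```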

A Geometric Constraint Solver for Parametric Modeling

  • Jae Yeol Lee; Kwangsoo Kim
    • Korean Journal of Computational Design and Engineering / v.3 no.4 / pp.211-222 / 1998
  • Parametric design is an important modeling paradigm in CAD/CAM applications, enabling efficient design modifications and variations. One of the major issues in parametric design is to develop a geometric constraint solver that can handle a large set of geometric configurations efficiently and robustly. In this paper, we propose a new approach to geometric constraint solving that employs a graph-based method to solve the ruler-and-compass constructible configurations and a numerical method to solve the ruler-and-compass non-constructible configurations, in a way that combines the advantages of both methods. The geometric constraint solving process consists of two phases: 1) a planning phase and 2) an execution phase. In the planning phase, a sequence of construction steps is generated by clustering the constrained geometric entities and reducing the constraint graph in sequence. In the execution phase, each construction step is evaluated to determine the geometric entities, using both approaches. By combining the advantages of the graph-based constructive approach with the universality of the numerical approach, the proposed approach maximizes the efficiency, robustness, and extensibility of the geometric constraint solver.
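
To make the planning/execution split concrete, here is a heavily simplified sketch for distance-constrained points only. The constraint representation, planning rule and construction step are assumptions; the actual solver handles much richer constraint graphs and falls back to a numerical method for non-constructible cases.

```python
# Hypothetical sketch: plan a construction order over a distance-constraint graph,
# then execute each step by circle-circle intersection (a ruler-and-compass step).
import math

# Distance constraints between named points; A and B are fixed anchor points.
constraints = {("A", "B"): 4.0, ("A", "C"): 3.0, ("B", "C"): 3.0,
               ("A", "D"): 2.0, ("C", "D"): 2.5}
placed = {"A": (0.0, 0.0), "B": (4.0, 0.0)}

def neighbors(p):
    return {a if b == p else b: d for (a, b), d in constraints.items() if p in (a, b)}

# Planning phase: repeatedly pick a point constrained to two already-planned points.
plan = []
unplaced = {"C", "D"}
while unplaced:
    step = next(p for p in unplaced
                if sum(q in placed for q in neighbors(p)) >= 2)
    plan.append(step)
    placed[step] = None          # reserved; coordinates computed in the execution phase
    unplaced.remove(step)

def circle_intersection(p1, r1, p2, r2):
    """One intersection of two circles (the other differs by the sign of h)."""
    (x1, y1), (x2, y2) = p1, p2
    d = math.hypot(x2 - x1, y2 - y1)
    a = (r1**2 - r2**2 + d**2) / (2 * d)
    h = math.sqrt(max(r1**2 - a**2, 0.0))   # non-constructible cases would need a numerical solver
    xm, ym = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    return (xm + h * (y2 - y1) / d, ym - h * (x2 - x1) / d)

# Execution phase: evaluate each planned construction step.
for p in plan:
    (q1, r1), (q2, r2) = [(q, d) for q, d in neighbors(p).items()
                          if placed.get(q) is not None][:2]
    placed[p] = circle_intersection(placed[q1], r1, placed[q2], r2)

print(placed)
```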

A MapReduce-Based Workflow BIG-Log Clustering Technique (맵리듀스기반 워크플로우 빅-로그 클러스터링 기법)

  • Jin, Min-Hyuck; Kim, Kwanghoon Pio
    • Journal of Internet Computing and Services / v.20 no.1 / pp.87-96 / 2019
  • In this paper, we propose a MapReduce-supported clustering technique for collecting and classifying distributed workflow enactment event logs as a preprocessing tool. We call these distributed workflow enactment event logs Workflow BIG-Logs, because they satisfy and are well fitted to the 5V properties of BIG-Data: Volume, Velocity, Variety, Veracity and Value. The clustering technique we develop in this paper is intentionally devised for the preprocessing phase of a specific workflow process mining and analysis algorithm based upon the Workflow BIG-Logs. In other words, it uses the MapReduce framework as a Workflow BIG-Log processing platform, it supports the IEEE XES standard data format, and it is ultimately dedicated to the preprocessing phase of the ρ-Algorithm, a typical workflow process mining algorithm based on structured information control nets. More precisely, the Workflow BIG-Logs can be classified into two types, activity-based clustering patterns and performer-based clustering patterns, and we implement an activity-based clustering pattern algorithm based upon the MapReduce framework. Finally, we verify the proposed clustering technique by carrying out an experimental study on the workflow enactment event log dataset released by the BPI Challenges.
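
The activity-based grouping can be sketched as a plain map/shuffle/reduce pipeline. The event fields and the in-process execution below are assumptions; the paper runs on an actual MapReduce/XES stack.

```python
# Hypothetical sketch: map workflow enactment events to (activity, event) pairs,
# shuffle by key, and reduce each group into an activity-based cluster.
from collections import defaultdict

events = [  # simplified stand-ins for IEEE XES event records
    {"case": "c1", "activity": "register", "performer": "alice"},
    {"case": "c1", "activity": "approve",  "performer": "bob"},
    {"case": "c2", "activity": "register", "performer": "carol"},
    {"case": "c2", "activity": "approve",  "performer": "bob"},
]

def map_phase(event):                 # emit (key, value) pairs keyed by activity
    yield event["activity"], event

def shuffle(pairs):                   # group values by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):        # summarize one activity-based cluster
    return {"activity": key,
            "cases": sorted({v["case"] for v in values}),
            "performers": sorted({v["performer"] for v in values})}

pairs = (pair for e in events for pair in map_phase(e))
clusters = [reduce_phase(k, vs) for k, vs in shuffle(pairs).items()]
print(clusters)
```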

Density-Based Estimation of POI Boundaries Using Geo-Tagged Tweets (공간 태그된 트윗을 사용한 밀도 기반 관심지점 경계선 추정)

  • Shin, Won-Yong; Vu, Dung D.
    • The Journal of Korean Institute of Communications and Information Sciences / v.42 no.2 / pp.453-459 / 2017
  • Users tend to check in and post their statuses in location-based social networks (LBSNs) to indicate that their interests are related to a point-of-interest (POI). While previous studies on discovering areas-of-interest (AOIs) were conducted mostly on the basis of density-based clustering methods applied to collections of geo-tagged photos from LBSNs, we focus on estimating a POI boundary, which corresponds to only one cluster containing its POI center. Using geo-tagged tweets recorded from Twitter users, this paper introduces a density-based, low-complexity two-phase method to estimate a POI boundary by finding a suitable radius reachable from the POI center. We estimate the boundary of the POI as the convex hull of selected geo-tags through our two-phase density-based estimation, where each phase proceeds with a different size of radius increment. It is shown that our method outperforms the conventional density-based clustering method in terms of computational complexity.
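
A rough sketch of the coarse-then-fine radius search and the convex-hull boundary follows. The density threshold, radius increments and synthetic geo-tags are assumptions.

```python
# Hypothetical sketch: grow a radius around the POI center with a coarse step,
# refine it with a fine step, then return the convex hull of the covered geo-tags.
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(1)
center = np.array([0.0, 0.0])
tags = np.vstack([rng.normal(0, 0.3, size=(300, 2)),      # geo-tags near the POI
                  rng.uniform(-3, 3, size=(200, 2))])     # background geo-tags

def density(r):
    inside = np.sum(np.linalg.norm(tags - center, axis=1) <= r)
    return inside / (np.pi * r * r)

def grow(r, step, min_density, r_max=5.0):
    while r + step <= r_max and density(r + step) >= min_density:
        r += step
    return r

r = grow(0.1, step=0.5, min_density=50)    # phase 1: coarse radius increment
r = grow(r, step=0.05, min_density=50)     # phase 2: fine radius increment

selected = tags[np.linalg.norm(tags - center, axis=1) <= r]
hull = ConvexHull(selected)
print(f"radius={r:.2f}, boundary vertices={len(hull.vertices)}")
```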

Improving Data Accuracy Using Proactive Correlated Fuzzy System in Wireless Sensor Networks

  • Barakkath Nisha, U; Uma Maheswari, N; Venkatesh, R; Yasir Abdullah, R
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.9 / pp.3515-3538 / 2015
  • Data accuracy can be increased by detecting and removing the incorrect data generated in wireless sensor networks. By increasing the data accuracy, the network lifetime can be increased in parallel. Network lifetime, or operational time, is the time during which the WSN is able to fulfill its tasks using microcontrollers with on-chip memory and radio transceivers; distributed sensor nodes send summaries of their data to their cluster heads, which gradually reduces energy consumption. In this paper, a powerful algorithm using a proactive fuzzy system is proposed: a mixture of fuzzy logic and comparative correlation techniques that ensures high data accuracy by detecting incorrect data in distributed wireless sensor networks. The proposed system is implemented in two phases: the first phase creates an input space partitioning by using robust fuzzy c-means clustering, and the second phase detects incorrect data and removes it completely. Experimental results show that the combined correlated fuzzy system (CCFS) detects faulty readings with greater accuracy (99.21%) than the existing approach (98.33%), along with a low false alarm rate.
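
The two phases can be pictured with the numpy sketch below: a textbook fuzzy c-means partitioning followed by a simple distance-based fault check. The thresholds, toy readings and the simplified second phase are assumptions, not the CCFS algorithm itself.

```python
# Hypothetical sketch: phase 1 partitions sensor readings with fuzzy c-means,
# phase 2 flags readings that sit far from every cluster center as faulty.
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # fuzzy memberships, rows sum to 1
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = dist ** (-2.0 / (m - 1))
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

rng = np.random.default_rng(3)
readings = np.vstack([rng.normal([20.0, 40.0], 0.5, size=(50, 2)),   # temp, humidity
                      rng.normal([25.0, 55.0], 0.5, size=(50, 2)),
                      [[40.0, 20.0]]])                               # injected faulty reading

centers, U = fuzzy_c_means(readings, c=2)

# Phase 2: a reading is suspect if it is far from its best-matching center.
d_best = np.min(np.linalg.norm(readings[:, None, :] - centers[None, :, :], axis=2), axis=1)
faulty = d_best > d_best.mean() + 3 * d_best.std()
print("flagged indices:", np.where(faulty)[0])
```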

A Sentiment Classification Approach of Sentences Clustering in Webcast Barrages

  • Li, Jun; Huang, Guimin; Zhou, Ya
    • Journal of Information Processing Systems / v.16 no.3 / pp.718-732 / 2020
  • Conducting sentiment analysis and opinion mining are challenging tasks in natural language processing. Many sentiment analysis and opinion mining applications focus on product reviews, social media reviews, forums and microblogs, whose reviews are topic-similar and opinion-rich. In this paper, we analyze the sentiments of sentences from online webcast reviews that scroll across the screen, which we call live barrages. Contrary to social media comments or product reviews, the topics in live barrages are more fragmented, and there are plenty of invalid comments that must be removed in the preprocessing phase. To extract evaluative sentiment sentences, we propose a novel approach that clusters the barrages from the same commenter to address the scattering of information across individual barrages. The method developed in this paper contains two subtasks: in the data preprocessing phase, we cluster the sentences from the same commenter and remove unavailable sentences; then we use a semi-supervised machine learning approach, the naïve Bayes algorithm, to analyze the sentiment of the barrages. According to our experimental results, the method performs well in analyzing the sentiment of online webcast barrages.
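
The two subtasks can be sketched as follows. The toy barrages, labels and the plain supervised naïve Bayes below are assumptions; the paper uses a semi-supervised variant.

```python
# Hypothetical sketch: merge barrages per commenter, then classify sentiment
# with a bag-of-words naive Bayes model.
from collections import defaultdict
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

barrages = [("user1", "great play"), ("user1", "love this streamer"),
            ("user2", "so boring"), ("user2", "waste of time"),
            ("user3", "666"), ("user3", "amazing comeback")]

# Subtask 1: cluster sentences by commenter and drop very short/invalid ones.
per_user = defaultdict(list)
for user, text in barrages:
    if len(text.split()) >= 2:                 # crude validity filter
        per_user[user].append(text)
docs = {u: " ".join(ts) for u, ts in per_user.items()}

# Subtask 2: naive Bayes sentiment on the merged per-commenter documents.
train_texts = ["great play love this streamer", "so boring waste of time"]
train_labels = ["positive", "negative"]
vec = CountVectorizer()
clf = MultinomialNB().fit(vec.fit_transform(train_texts), train_labels)

for user, doc in docs.items():
    print(user, clf.predict(vec.transform([doc]))[0])
```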

Iterative LBG Clustering for SIMO Channel Identification

  • Daneshgaran, Fred; Laddomada, Massimiliano
    • Journal of Communications and Networks / v.5 no.2 / pp.157-166 / 2003
  • This paper deals with the problem of channel identification for Single Input Multiple Output (SIMO) slow fading channels using clustering algorithms. Due to the intrinsic memory of the discrete-time model of the channel, over short observation periods the received data vectors of the SIMO model are spread in clusters because of the AWGN noise. Each cluster is practically centered around the ideal channel output labels without noise, and the noisy received vectors are distributed according to a multivariate Gaussian distribution. Starting from the Markov SIMO channel model, simultaneous maximum likelihood estimation of the input vector and the channel coefficients reduces to obtaining the values of this pair that minimize the sum of the Euclidean norms between the received and the estimated output vectors. The Viterbi algorithm can be used for this purpose provided the trellis diagram of the Markov model can be labeled with the noiseless channel outputs. The problem of identifying the ideal channel outputs, which is the focus of this paper, is then equivalent to designing a Vector Quantizer (VQ) from a training set corresponding to the observed noisy channel outputs. Linde-Buzo-Gray (LBG)-type clustering algorithms [1] can be used to obtain the noiseless channel output labels from the noisy received vectors. One problem with the use of such algorithms for blind time-varying channel identification is codebook initialization. This paper looks at two critical issues regarding the use of VQ for channel identification. The first deals with the applicability of the technique in general; we present theoretical results for the conditions under which the technique may be applicable. The second aims at overcoming the codebook initialization problem by proposing a novel approach that makes the first phase of the channel estimation faster than classical codebook initialization methods. Sample simulation results are provided confirming the effectiveness of the proposed initialization technique.
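
For reference, here is a compact numpy version of the LBG-style splitting procedure mentioned above, applied to noisy "channel output" vectors. The split perturbation, codebook size and toy data are assumptions; the paper's contribution is a different, faster initialization for the channel-identification setting.

```python
# Hypothetical sketch: classic LBG codebook design by iterative splitting,
# refined with Lloyd iterations at each codebook size.
import numpy as np

def lbg(train, codebook_size, eps=0.05, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    codebook = train.mean(axis=0, keepdims=True)          # start from the centroid
    while len(codebook) < codebook_size:
        jitter = eps * rng.normal(size=codebook.shape)    # split every codeword
        codebook = np.vstack([codebook + jitter, codebook - jitter])
        for _ in range(iters):                            # Lloyd refinement
            d = np.linalg.norm(train[:, None, :] - codebook[None, :, :], axis=2)
            nearest = d.argmin(axis=1)
            codebook = np.array([train[nearest == k].mean(axis=0)
                                 if np.any(nearest == k) else codebook[k]
                                 for k in range(len(codebook))])
    return codebook

rng = np.random.default_rng(7)
ideal = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])  # noiseless outputs
train = ideal[rng.integers(0, 4, size=2000)] + rng.normal(0, 0.2, size=(2000, 2))

print(np.round(lbg(train, codebook_size=4), 2))
```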

Language Model Adaptation Based on Topic Probability of Latent Dirichlet Allocation

  • Jeon, Hyung-Bae; Lee, Soo-Young
    • ETRI Journal / v.38 no.3 / pp.487-493 / 2016
  • Two new methods are proposed for unsupervised adaptation of a language model (LM) with a single sentence for automatic transcription tasks. In the training phase, training documents are clustered by a method known as latent Dirichlet allocation (LDA), and a domain-specific LM is then trained for each cluster. In the test phase, an adapted LM is presented as a linear mixture of the trained domain-specific LMs. Unlike previous adaptation methods, the proposed methods fully utilize a trained LDA model for the estimation of the weight values that are then assigned to the trained domain-specific LMs; therefore, the clustering and weight-estimation algorithms of the trained LDA model are reliable. For continuous speech recognition benchmark tests, the proposed methods outperform other unsupervised LM adaptation methods based on latent semantic analysis, non-negative matrix factorization, and LDA with n-gram counting.
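
A toy sketch of the train/test flow: cluster training documents with LDA, build one LM per topic, and mix the topic LMs with the LDA topic probabilities of the test sentence. The corpus, cluster count and unigram LMs are assumptions; the paper adapts n-gram LMs for speech recognition.

```python
# Hypothetical sketch: LDA-clustered training documents, one unigram LM per topic,
# and a per-sentence mixture of the topic LMs weighted by LDA topic probabilities.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

train_docs = ["stock market prices fell sharply today",
              "investors traded shares on the stock exchange",
              "the team scored a late goal to win the match",
              "fans cheered as the player scored again"]

vec = CountVectorizer()
X = vec.fit_transform(train_docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# One smoothed unigram LM per topic, built from the documents assigned to it.
doc_topics = lda.transform(X).argmax(axis=1)
vocab_size = len(vec.vocabulary_)
topic_counts = np.ones((2, vocab_size))                  # add-one smoothing
for row, topic in zip(X.toarray(), doc_topics):
    topic_counts[topic] += row
topic_lms = topic_counts / topic_counts.sum(axis=1, keepdims=True)

def adapted_unigram_probs(sentence):
    """Mixture of topic LMs weighted by the sentence's LDA topic probabilities."""
    weights = lda.transform(vec.transform([sentence]))[0]
    return weights @ topic_lms                            # adapted unigram LM

probs = adapted_unigram_probs("the player scored a goal")
word_id = vec.vocabulary_["goal"]
print(f"P(goal) under adapted LM: {probs[word_id]:.4f}")
```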