• Title/Summary/Keyword: clustering problem

Clustering Algorithm using the DFP-Tree based on the MapReduce (맵리듀스 기반 DFP-Tree를 이용한 클러스터링 알고리즘)

  • Seo, Young-Won;Kim, Chang-soo
    • Journal of Internet Computing and Services, v.16 no.6, pp.23-30, 2015
  • As big data has come to the fore, many applications that operate on the results of data analysis have been developed; typical examples are product recommendation services in e-commerce systems, search services on search engines, and friend-list recommendation in social network services. In this paper, we propose a decision frequent pattern tree (DFP-Tree) that combines the original frequent pattern tree, an existing data mining technique that mines similar patterns appearing in a data set, with the decision tree from computer science theory. The decision frequent pattern tree algorithm addresses a problem of the frequent pattern tree, which has to generate a very large number of patterns and therefore makes the data hard to analyze. We also propose a model for the MapReduce framework, a programming model that helps the algorithm operate in a distributed environment.

Independent Component Analysis for Clustering Analysis Components by Using Kurtosis (첨도에 의한 분석성분의 군집성을 고려한 독립성분분석)

  • Cho, Yong-Hyun
    • The KIPS Transactions:PartB, v.11B no.4, pp.429-436, 2004
  • This paper proposes independent component analyses (ICAs) that use the fixed-point (FP) algorithm based on the Newton method and the secant method, respectively, each with kurtosis added. The kurtosis is applied to cluster the analyzed components, and the FP algorithm is applied to obtain fast analysis and superior performance independent of learning parameters. The proposed ICAs have been applied to the problems of separating 6 mixed signals of 500 samples and 10 mixed images of $512\times512$ pixels, respectively. The experimental results show that the proposed ICAs always yield a fixed analysis sequence. This overcomes a limitation of conventional ICA without kurtosis, whose sequence varies from one run of the algorithm to the next. In particular, the proposed ICA can be used for classifying and identifying signals or images. The results also show that the secant method has better separation speed and performance than the Newton method, and that its relative improvement grows as the problem size increases.
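
The role of kurtosis in fixing the output order can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes that the separated components are already available as plain lists of samples and that sorting them by excess kurtosis is one possible way to obtain a run-independent sequence. The function names `excess_kurtosis` and `order_by_kurtosis` are hypothetical.

```python
from statistics import fmean


def excess_kurtosis(samples):
    """Excess kurtosis m4 / m2**2 - 3, using population (biased) moments."""
    mu = fmean(samples)
    m2 = fmean([(v - mu) ** 2 for v in samples])
    m4 = fmean([(v - mu) ** 4 for v in samples])
    return m4 / (m2 ** 2) - 3.0


def order_by_kurtosis(components):
    """Assumption: sort components by descending excess kurtosis so that
    the output order no longer depends on the order a run produced them in."""
    return sorted(components, key=excess_kurtosis, reverse=True)


# Two hypothetical separated components.
flat = [-1.0, 1.0, -1.0, 1.0]   # sub-Gaussian, negative excess kurtosis
spiky = [0.0] * 7 + [10.0]      # heavy-tailed, positive excess kurtosis
ordered = order_by_kurtosis([flat, spiky])
```

Whatever order the two components arrive in, the heavy-tailed one is placed first, which is the kind of fixed sequence the abstract describes.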

A New Fast EM Algorithm (새로운 고속 EM 알고리즘)

  • 김성수;강지혜
    • Journal of KIISE:Computer Systems and Theory, v.31 no.10, pp.575-587, 2004
  • In this paper, a new Fast Expectation-Maximization (FEM) algorithm is proposed. First, the K-means algorithm is modified to reduce the number of iterations needed to find the initial values used in the EM process. Conventionally, the initial values in K-means clustering are chosen randomly, which sometimes forces the clustering process to converge to undesired center points. A uniform partitioning method is added to conventional K-means to extract proper initial points for each cluster. Second, the effect of the posterior probability is emphasized so that applying the Maximum Likelihood Posterior (MLP) yields fast convergence. The proposed FEM strengthens conventional EM by increasing the speed of convergence. The superiority of FEM is demonstrated by experimental results that show improved EM results and faster convergence in parameter estimation.
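
The idea of replacing random K-means initialization with a deterministic one can be sketched as follows. The abstract does not define the uniform partitioning method, so this sketch assumes one plausible reading: split the range of one-dimensional data into k equal-width intervals and start from their midpoints. The function names `uniform_partition_init` and `kmeans_1d` are hypothetical, and the EM stage is not shown.

```python
def uniform_partition_init(data, k):
    """Assumed reading of 'uniform partitioning': midpoints of k
    equal-width intervals covering [min(data), max(data)]."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / k
    return [lo + (i + 0.5) * width for i in range(k)]


def kmeans_1d(data, centers, max_iter=100):
    """Plain Lloyd iterations on 1-D data, starting from the given centers."""
    centers = list(centers)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for x in data:
            nearest = min(range(len(centers)), key=lambda j: abs(x - centers[j]))
            clusters[nearest].append(x)
        # An empty cluster keeps its previous center.
        new_centers = [
            sum(members) / len(members) if members else centers[j]
            for j, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


data = [1.0, 1.2, 0.8, 9.0, 9.2, 8.8]
centers = kmeans_1d(data, uniform_partition_init(data, 2))
```

Because the starting points are a fixed function of the data, repeated runs give the same centers, which could then seed the means of an EM fit.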

Underdetermined Blind Source Separation from Time-delayed Mixtures Based on Prior Information Exploitation

  • Zhang, Liangjun;Yang, Jie;Guo, Zhiqiang;Zhou, Yanwei
    • Journal of Electrical Engineering and Technology, v.10 no.5, pp.2179-2188, 2015
  • Recently, much research has addressed the challenging problem of Blind Source Separation (BSS) in underdetermined cases, and the “two-step” method, which estimates the mixing matrix first and then extracts the sources, is widely used. To estimate the mixing matrix, conventional algorithms such as Single-Source-Points (SSPs) detection exploit only the sparsity of the original signals. This paper proposes a new underdetermined mixing matrix estimation method for time-delayed mixtures based on exploiting prior information at the receiver. The prior information is extracted from the specific structure of the complex-valued mixing matrix and is used to derive a special criterion for determining the SSPs. Moreover, after the SSPs are selected, Agglomerative Hierarchical Clustering (AHC) is used to automatically cluster, suppress, and estimate all the elements of the mixing matrix. Finally, a convex-model based subspace method is applied for signal separation. Simulation results show that the proposed algorithm can estimate the mixing matrix and extract the original source signals with higher accuracy, especially in low-SNR environments, and that it does not need to know the number of sources beforehand, which makes it more reliable in real non-cooperative environments.

Assessing the Impact of Advanced Technologies on Utilization Improvement of Substations

  • Han, Dong;Yan, Zheng;Zhang, Dao-Tian;Song, Yi-Qun
    • Journal of Electrical Engineering and Technology, v.10 no.5, pp.1921-1929, 2015
  • The smart substation is the heart of a transmission system and is particularly emphasized as the most significant component of smart grids in China. To assess the functional performance of substation technologies, this paper presents methods for identifying the most promising solutions for smart substation design and for evaluating the technical levels of available technologies. A multi-index optimization model is presented to address the issue of smart substation planning. A mathematical model of the planning decision problem is established with multiple objectives consisting of economic, reliability, and green key indices, and many kinds of concerns, including physical and environmentally friendly operation, are formulated as a set of constraints. To assess the technical level of integrating advanced technologies into a substation, a modified grey whitenization weight function is adopted to construct a novel grey clustering method. The proposed grey clustering approach overcomes the insufficient quantitative assessment capacity of traditional methods. The evaluation of technical effects provides a classification of the development phase and the maturity level of the smart substation. The effectiveness of the proposed approaches in planning decision-making and in evaluating construction efforts is demonstrated with case studies involving actual smart substation projects: the Wenchongkou substation in China Southern Power Grid (CSG) and the Mengzi substation in State Grid Corporation of China (SGCC).

Uncertainty for Privacy and 2-Dimensional Range Query Distortion

  • Sioutas, Spyros;Magkos, Emmanouil;Karydis, Ioannis;Verykios, Vassilios S.
    • Journal of Computing Science and Engineering, v.5 no.3, pp.210-222, 2011
  • In this work, we study the problem of privacy-preserving data publishing in moving objects databases. In particular, the trajectory of a mobile user in a plane is no longer a polyline in a two-dimensional space; instead, it is a two-dimensional surface of fixed width $2A_{min}$, where $A_{min}$ defines the semi-diameter of the minimum spatial circular extent that must replace the real location of the mobile user on the XY-plane in the anonymized (kNN) request. The desired anonymity is not achieved and the entire system becomes vulnerable to attackers, since a malicious attacker can observe that, over time, many of the neighbors' IDs change, except for a small number of users. Thus, we reinforce the privacy model by clustering the mobile users according to their motion patterns in the (u, ${\theta}$) plane, where u and ${\theta}$ define the velocity measure and the motion direction (angle), respectively. In this case, the anonymized (kNN) request looks up neighbors who belong to the same cluster as the mobile requester in (u, ${\theta}$) space. Thus, we know that the trajectory of the k-anonymous mobile user is within this surface, but we do not know exactly where. We transform the surface's boundary polylines to dual points and focus on the information distortion introduced by this space translation. We develop a set of efficient spatiotemporal access methods and experimentally measure the impact of information distortion by comparing the performance of the same spatiotemporal range queries executed on the original database and on the anonymized one.

Mobile Automatic Conversion System using MLP (다층신경망을 이용한 모바일 자동 변환 시스템)

  • Han, Eun-Jung;Jang, Chang-Hyuk;Jung, Kee-Chul
    • Journal of Korea Multimedia Society, v.12 no.2, pp.272-280, 2009
  • In the recent mobile industry, a large amount of online and offline image content is being converted into mobile content for architectural design. However, because of the small size of the mobile screen, it is difficult to provide users with existing online and offline content without adaptation. In existing methods that address this problem, comic content for mobile devices is produced manually with computer software such as Photoshop. In this paper, we describe the Automatic Comics Conversion (ACC) system, which uses a Multi-Layer Perceptron (MLP) to deliver various forms of offline comic content to small-screen mobile devices. ACC produces comic content fitted to the small screen; as a prerequisite stage for preserving semantic meaning, it introduces a clustering method that is useful for various types of comic images and characters. One application is to bring framed pictures, websites, and images to mobile devices and to turn still image content into dynamic image content.

Classification of Regional Innovation Types and Region-based Innovation Policies (지역별 혁신형태 유형화와 지역 기반 혁신 정책)

  • Yoo, Gwangmin;Kim, Dongkwan;Han, Seongho
    • Journal of Korea Technology Innovation Society, v.18 no.1, pp.151-175, 2015
  • The focus of regional innovation policies is shifting from the central government to local governments. No one denies that innovation will lead regional development and should be created in a way that suits regional circumstances. However, the central government and local governments have not yet reached a conclusion on which innovation policies are appropriate for regional circumstances. This leads to inefficiency not only at the national level but also at the regional level. Given this problem, this research aims to identify the characteristics and differences in innovation types among the regions of Korea and, by classifying them, to suggest appropriate policy implications. To this end, the research classified regions using the various indicators of innovation suggested by existing related research and illustrated policies based on the resulting characteristics and differences. Clustering analysis based on multiple factor analysis was applied. Further research is needed on dynamically analyzing the stability of regional innovation types, establishing systematic indicators based on regional innovation theory, and developing additional indicators.

Cluster Analysis on the Management Performance of Major Shipping Companies in the World (세계 주요선사의 경영성과에 대한 군집분석)

  • Do, Thi Minh Hoang;Choi, Kyoung Hoon;Park, Gyei Kark
    • Journal of Korea Port Economic Association, v.33 no.4, pp.17-36, 2017
  • In the modern economic context, there is a strong focus on the importance of the global shipping industry. Recently, the world economic crisis has negatively influenced the industry with regard to both supply and demand, and there has been almost no sign of recovery. The fact that the entire industry is operating with low efficiency and low profit has made all stakeholders anxious. This research examines the financial performance of the world's major shipping lines in order to give maritime stakeholders a closer look into the industry behind the ranking. In addition, the research evaluates the competitiveness of shipping companies in terms of financial ability and provides stakeholders with suggestions for strategic actions. For these purposes, Fuzzy C-Means is used to cluster the selected lines into different groups based on their financial indices, namely liquidity, asset management, debt management, and profitability. Levene's tests, followed by ANOVA tests, are also used to assess the robustness of the clustering outcomes. The results indicate that liquidity, solvency, and profitability act as the main criteria in the classification problem.
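
The Fuzzy C-Means step can be illustrated with a minimal sketch of the standard membership and center updates. It is not the study's implementation: it assumes a single hypothetical financial index per company (one-dimensional data), a fuzzifier m = 2, and hand-picked starting centers. The function names and the sample values are invented for illustration.

```python
def fcm_memberships(points, centers, m=2.0, eps=1e-12):
    """Standard FCM update: u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1))."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        # eps avoids division by zero when a point coincides with a center.
        dists = [abs(x - c) + eps for c in centers]
        memberships.append(
            [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]
        )
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Each center is the mean of the points weighted by membership ** m."""
    centers = []
    for i in range(len(memberships[0])):
        weights = [row[i] ** m for row in memberships]
        centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centers


def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iters):
        centers = fcm_centers(points, fcm_memberships(points, centers, m), m)
    return centers, fcm_memberships(points, centers, m)


# Hypothetical index values for six companies.
ratios = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
centers, memberships = fuzzy_c_means(ratios, centers=[1.0, 4.0])
```

Unlike K-means, each company receives a degree of membership in every group, and each row of `memberships` sums to 1; a hard grouping can be read off by taking the largest entry per row.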

2D-THI: Two-Dimensional Type Hierarchy Index for XML Databases (2D-THI: XML 데이테베이스를 위한 이차원 타입상속 계층색인)

  • Lee, Jong-Hak
    • Journal of Korea Multimedia Society, v.9 no.3, pp.265-278, 2006
  • This paper presents a two-dimensional type inheritance hierarchy index (2D-THI) for XML databases. XML Schema is one of the schema models for XML documents that supports type inheritance. Conventional indexing techniques for XML databases cannot support XML queries on type inheritance hierarchies. We construct a two-dimensional index structure using multidimensional file organizations to support type inheritance hierarchies in XML queries. This indexing technique deals with the problem of clustering index entries in a two-dimensional domain space that consists of a key element domain and a type identifier domain, based on the user query pattern. The index enhances query performance by adjusting the degree of clustering between the two domains. For performance evaluation, we compared the proposed 2D-THI with conventional class hierarchy indexing techniques for object-oriented databases, such as CH-index and CG-tree, through a cost model. The performance evaluation verifies that the proposed two-dimensional type inheritance indexing technique can efficiently support query processing in XML databases according to the query types.