• Title/Summary/Keyword: Multiple clustering

Movie Recommendation Using Co-Clustering by Infinite Relational Models (Infinite Relational Model 기반 Co-Clustering을 이용한 영화 추천)

  • Kim, Byoung-Hee;Zhang, Byoung-Tak
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.4 / pp.443-449 / 2014
  • Users' movie preferences are observable outcomes of various factors related to user attributes and movie features. Movie recommendation therefore requires methods for analyzing the relations among users, movies, and preference patterns. As a relational analysis tool, we focus on the Infinite Relational Model (IRM), which was introduced as a tool for multiple concept search. We show that IRM-based co-clustering of preference patterns and movie descriptors can serve as a first-stage tool for movie recommender methods, especially content-based filtering approaches. By introducing well-defined tag sets for movies and performing three-way co-clustering on a movie-rating matrix and a movie-tag matrix, we discovered various explainable relations among users and movies. We suggest several uses of IRM-based co-clustering, especially for incremental and dynamic recommender systems.

Modeling Planned Maintenance Outage of Generators Based on Advanced Demand Clustering Algorithms (개선된 수요 클러스터링 기법을 이용한 발전기 보수정지계획 모델링)

  • Kim, Jin-Ho;Park, Jong-Bae
    • The Transactions of the Korean Institute of Electrical Engineers A / v.55 no.4 / pp.172-178 / 2006
  • This paper proposes an advanced demand clustering algorithm for analyzing the planned maintenance outage of generators in a changed electricity industry. Its major contribution is the development of long-term estimates of generation availability that account for planned maintenance outage. Two conflicting viewpoints, one reliability-focused and the other economy-focused, are incorporated into the estimates of maintenance outage. Within each demand cluster, the conventional effective outages of generators, which conceptually capture both maintenance and forced outages, are redefined to properly address the characteristics of planned maintenance outage in changed electricity markets. First, initial market demand is classified into multiple demand clusters, which are defined by the effective outage rates of generators and by the inherent characteristics of the initial demand. Then, based on the advanced demand clustering algorithm, the planned maintenance outages and the corresponding effective outages of generators are reevaluated. Finally, the conventional demand clusters are reclassified to reflect the improved effective outages of generation markets. We found that revising the demand clusters can change the number of initial demand clusters, an effect that the conventional demand clustering process cannot capture. Electricity market situations, which can themselves be classified into several groups with similar patterns, can therefore be clustered more accurately. As a result, the fundamental characteristics of power systems can be analyzed more efficiently, and this advanced classification is widely applicable to other technical problems in power systems such as generation scheduling, power flow analysis, and price forecasting.

A Study on Cluster Head Selection Based on Distance from Sensor to Base Station in Wireless Sensor Network (무선센서 네트워크에서 센서와 기지국과의 거리를 고려한 클러스터 헤드 선택기법)

  • Ko, Sung-Won;Cho, Jeong-Hwan
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.27 no.10 / pp.50-58 / 2013
  • In a Wireless Sensor Network (WSN), clustering schemes are used to prolong the network lifetime through efficient use of sensor energy. In a distributed clustering protocol such as LEACH, every sensor in the network plays the cluster head role once during each epoch, which delays the first node death (FND). However, even though every sensor takes a turn as head, the energy consumed by each sensor differs, because the energy consumed grows multiplicatively with the distance to the Base Station. In this paper, we propose a mechanism that selects a head depending on its distance to the Base Station; it delays the occurrence of FND by 68% compared to LEACH and makes the network more stable.
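
As a rough illustration of the idea in this abstract, the sketch below computes the standard LEACH election threshold T(n) = p / (1 - p (r mod 1/p)) and then scales it by distance to the base station. The distance scaling (the function `distance_scaled_threshold` and its weight of 0.5) is a hypothetical stand-in chosen for illustration; the abstract does not give the authors' actual selection rule.

```python
def leach_threshold(p, round_no, was_head_this_epoch):
    """Standard LEACH threshold T(n) for a node in the given round.

    p is the desired fraction of cluster heads per round. A node that
    already served as head in the current epoch gets threshold 0.
    """
    if was_head_this_epoch:
        return 0.0
    epoch_len = round(1 / p)  # rounds per epoch
    return p / (1 - p * (round_no % epoch_len))


def distance_scaled_threshold(p, round_no, was_head, d_to_bs, d_max):
    """Hypothetical variant: lower the threshold for nodes far from the base station.

    The 0.5 weight is an arbitrary illustrative choice, not a value from the paper.
    """
    base = leach_threshold(p, round_no, was_head)
    return base * (1 - 0.5 * d_to_bs / d_max)


near = distance_scaled_threshold(0.1, 0, False, d_to_bs=10, d_max=100)
far = distance_scaled_threshold(0.1, 0, False, d_to_bs=90, d_max=100)
```

In an election round, each node would draw a uniform random number in [0, 1) and become a head when the draw falls below its threshold, so under this assumed scaling distant nodes are elected less often.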

A Study of Association Rule Mining by Clustering through Data Fusion

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.18 no.4 / pp.927-935 / 2007
  • Gyeongnam province currently conducts a social index survey of its residents every year. However, analysis of this survey is limited because different surveys are conducted on a three-year cycle. Data fusion offers a solution to this problem. Data fusion is the process of combining multiple data sets in order to provide information of tactical value to the user. Data fusion is not the final result in itself, however, so efficient analysis of the fused data is also important. In this study, we present a data fusion method for statistical survey data. We also suggest a methodology for applying association rule mining by clustering to the fused statistical survey data.

Clustering Parts Based on the Design and Manufacturing Similarities Using a Genetic Algorithm

  • Lee, Sung-Youl
    • Journal of Korea Society of Industrial Information Systems / v.16 no.4 / pp.119-125 / 2011
  • Part family (PF) formation in cellular manufacturing has been a key issue for the successful implementation of Group Technology (GT). A part has two different kinds of attributes: design and manufacturing. Similarities in the two attributes often conflict with each other. However, both attributes should be taken into account appropriately for the PF to maximize the benefits of the GT implementation. This paper proposes a clustering algorithm that considers the two attributes simultaneously based on Pareto optimal theory. The similarity in each attribute is represented as an individual objective function. The resulting two objective functions are then combined into a Pareto fitness function that assigns a single fitness value to each solution. A genetic algorithm (GA) is used to find the Pareto optimal set of solutions based on the fitness function. A set of hypothetical parts is grouped using the proposed system. The results show that the proposed system is very promising for clustering with multiple objectives.
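
The notion of a Pareto optimal set over two objectives can be sketched as follows. This is a minimal illustration, assuming both objectives (design similarity and manufacturing similarity) are to be maximized; the scores and the domination-count fitness are hypothetical and are not taken from the paper, whose exact fitness function is not given in the abstract.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(solutions):
    """Return the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(t, s) for t in solutions)]


def domination_count(solution, solutions):
    """One simple single-valued fitness: how many solutions dominate this one (0 is best)."""
    return sum(dominates(t, solution) for t in solutions)


# Hypothetical (design similarity, manufacturing similarity) scores for five groupings.
candidates = [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9), (0.5, 0.5), (0.1, 0.1)]
front = pareto_front(candidates)
```

Here the first three candidates form the front: each trades one objective against the other, while the last two are dominated by (0.6, 0.6).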

Clustering with Adaptive weighting of Context-aware Linear regression (상황인식기반 선형회귀의 적응적 가중치를 적용한 클러스터링)

  • Lee, Kang-whan
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.271-273 / 2021
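  • This paper proposes a clustering algorithm that uses adaptive correction weights from deep-learning-based linear regression to provide and maintain more efficient clustering of mobile nodes. In processing clustered data, most approaches provide a classification scheme based on the relationships among the data. In that case, the neighboring mobile node with the highest probability of connecting to the destination node should be selected as the relay node within the cluster. In this study, we present a classification based on adaptive weights in an autonomous-learning-based regression model that understands this context information and considers the degree of correlation between the speed and direction attributes of dynamic mobile nodes. We thus present an adaptive-weight deep learning model, based on autonomous learning, that can understand such context information and maintain clustering.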

Retail Outlet Clustering of the Imported Automobile Distributors in Korea

  • Park, Koo-Woong
    • Journal of Distribution Science / v.16 no.5 / pp.45-59 / 2018
  • Purpose - This paper aims to analyze the distinct clustering pattern of imported automobile distributors and to provide evidence for the phenomenon using Korean data. Research design, data, and methodology - We use data from the Korea Automobile Importers & Distributors Association on 23 foreign automobile brands to evaluate the degree of concentration of showrooms using the locational Gini index. We identify possible causes of the high level of clustering from two perspectives: 1) the distributors' side and 2) the customers' side. Results - We find a very strong locational concentration of imported automobile showrooms within close vicinity of one another in the major cities and districts in Korea. Locational Gini coefficients are 0.1024 at the national level, 0.1836~0.3763 at the city level, and 0.3941~0.4311 at the district level on a [0,0.5] scale. Conclusions - Customers of luxury foreign automobiles tend to shop extensively across multiple brands before selecting their ideal model. Accordingly, imported automobile distributors cluster close to their direct competitors in order to give potential customers a good opportunity for comparison. This maximizes the probability of visits by potential customers and leads to successful sales performance.
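
For readers unfamiliar with a Gini index on a [0, 0.5] scale, the sketch below shows one common construction: regions are sorted by the ratio of their showroom share to a reference share, a Lorenz curve is traced, and the index is the area between that curve and the 45-degree line. Whether the paper uses exactly this definition, and what reference distribution it uses, is an assumption; the input numbers are made up.

```python
def locational_gini(showrooms, reference):
    """Area between the 45-degree line and the Lorenz curve; lies in [0, 0.5).

    showrooms[i] and reference[i] are counts for region i. The reference
    (for example, population) must be positive in every region.
    """
    total_s = sum(showrooms)
    total_r = sum(reference)
    shares = [(s / total_s, r / total_r) for s, r in zip(showrooms, reference)]
    # Order regions from least to most over-represented.
    shares.sort(key=lambda sr: sr[0] / sr[1])

    area_under_lorenz = 0.0
    cum_s = 0.0
    for s, r in shares:
        # Trapezoid of width r between heights cum_s and cum_s + s.
        area_under_lorenz += r * (cum_s + s / 2)
        cum_s += s
    return 0.5 - area_under_lorenz


even = locational_gini([10, 20, 30], [1, 2, 3])      # showrooms match the reference
concentrated = locational_gini([100, 0, 0], [1, 1, 1])  # all showrooms in one region
```

With showrooms distributed exactly like the reference the index is 0, and it approaches 0.5 as showrooms concentrate in regions with a negligible reference share.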

Unsupervised Clustering of Multivariate Time Series Microarray Experiments based on Incremental Non-Gaussian Analysis

  • Ng, Kam Swee;Yang, Hyung-Jeong;Kim, Soo-Hyung;Kim, Sun-Hee;Anh, Nguyen Thi Ngoc
    • International Journal of Contents / v.8 no.1 / pp.23-29 / 2012
  • Multiple expression levels of genes obtained from time series microarray experiments have been exploited effectively to enhance understanding of a wide range of biological phenomena. However, microarray data usually take the form of large, high-dimensional matrices of gene expression values. Among the huge number of genes present in microarrays, only a small number are expected to be relevant to a given task. Hence, discounting the majority of unaffected genes is the crucial goal of gene selection for improving the accuracy of disease diagnosis. In this paper, a non-Gaussian weight matrix obtained from an incremental model is proposed to extract useful features from multivariate time series microarrays. The proposed method can automatically identify a small number of significant features by discovering hidden variables among a huge number of features. A representative unsupervised hierarchical clustering method is then used to evaluate the effectiveness of the proposed methodology. The proposed method achieves promising results in terms of the predictive accuracy of clustering compared to existing analysis methods. Furthermore, it offers a robust approach with low memory and computation costs.

3D Radar Objects Tracking and Reflectivity Profiling

  • Kim, Yong Hyun;Lee, Hansoo;Kim, Sungshin
    • International Journal of Fuzzy Logic and Intelligent Systems / v.12 no.4 / pp.263-269 / 2012
  • The ability to characterize feature objects from radar readings is often limited when one looks only at their still-frame reflectivity, differential reflectivity, and differential phase data. In many cases, a time-series study of these objects' reflectivity profiles is required to properly characterize the feature objects of interest. This paper introduces a novel technique to automatically track multiple 3D radar structures in the C- and S-bands in real time using Doppler radar and to profile their characteristic reflectivity distribution as a time series. The extraction of reflectivity profiles from different radar cluster structures is done in three stages: 1. static frame (zone-linkage) clustering, 2. dynamic frame (evolution-linkage) clustering, and 3. characterization of clusters through time-series profiles of the reflectivity distribution. The two clustering schemes proposed here are applied to composite multi-layer CAPPI (Constant Altitude Plan Position Indicator) radar data covering an altitude range of 0.25 to 10 km and an area spanning hundreds of thousands of km². Discrete numerical simulations show the validity of the proposed technique and that fast and accurate profiling of the time-series reflectivity distribution of deformable 3D radar structures is achievable.

Runtime Prediction Based on Workload-Aware Clustering (병렬 프로그램 로그 군집화 기반 작업 실행 시간 예측모형 연구)

  • Kim, Eunhye;Park, Ju-Won
    • Journal of Korean Society of Industrial and Systems Engineering / v.38 no.3 / pp.56-63 / 2015
  • Several fields of science demand large-scale workflow support, which requires thousands of CPU cores or more. To support such large-scale scientific workflows, high-capacity parallel systems such as supercomputers are widely used. To increase the utilization of these systems, most schedulers use a backfilling policy: small jobs are moved ahead to fill holes in the schedule as long as they do not delay large jobs. Since backfilling requires an estimate of the runtime, most parallel systems use the runtime estimated by the user. However, this estimate is found to be extremely inaccurate because users overestimate the runtime of their jobs. Therefore, in this paper, we propose a novel system for runtime prediction based on workload-aware clustering, with the goal of improving prediction performance. The proposed method for runtime prediction of parallel applications consists of three main phases. First, feature selection based on factor analysis is performed to identify important input features. Then, a clustering analysis of the history data is performed using a self-organizing map, followed by hierarchical clustering to find the cluster boundaries from the weight vectors. Finally, prediction models are constructed using support vector regression on the clustered workload data. Using a separate prediction model for each clustered data pattern can reduce the error rate compared with a single model for the whole data pattern. In the experiments, we use workload logs from parallel systems (i.e., iPSC, LANL-CM5, SDSC-Par95, SDSC-Par96, and CTC-SP2) to evaluate the effectiveness of our approach. Compared with other techniques, experimental results show that the proposed method improves accuracy by up to 69.08%.
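
The claim that one model per cluster can beat a single global model can be illustrated with a toy example. This sketch deliberately simplifies the pipeline in the abstract: cluster labels are given rather than found with a self-organizing map and hierarchical clustering, ordinary least squares replaces support vector regression, and the job records are synthetic values invented for the illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def mean_abs_error(model, xs, ys):
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)


# Synthetic (cluster label, requested cores, runtime) records; two workload patterns.
jobs = [("short", c, 2 * c) for c in (1, 2, 3, 4)] + \
       [("long", c, 10 * c + 100) for c in (1, 2, 3, 4)]

# Single model fitted on all jobs.
all_x = [c for _, c, _ in jobs]
all_y = [t for _, _, t in jobs]
global_mae = mean_abs_error(fit_line(all_x, all_y), all_x, all_y)

# One model per cluster, each evaluated on its own jobs.
total_abs_error = 0.0
for label in ("short", "long"):
    xs = [c for l, c, _ in jobs if l == label]
    ys = [t for l, _, t in jobs if l == label]
    total_abs_error += mean_abs_error(fit_line(xs, ys), xs, ys) * len(xs)
clustered_mae = total_abs_error / len(jobs)
```

On this toy data the per-cluster models fit each pattern exactly, whereas the single model averages the two patterns and is far off for both, which is the effect the abstract attributes to workload-aware clustering.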