• Title/Summary/Keyword: Clustering Problem

A Hybrid Recommendation Method based on Attributes of Items and Ratings (항목 속성과 평가 정보를 이용한 혼합 추천 방법)

  • Kim Byeong Man;Li Qing
    • Journal of KIISE:Software and Applications, v.31 no.12, pp.1672-1683, 2004
  • A recommender system is a web intelligence technique that filters everyday information for people. Researchers have developed collaborative recommenders (social recommenders), content-based recommenders, and some hybrid systems. In this paper, we introduce a new hybrid recommender method, ICHM, in which clustering techniques are applied to the item-based collaborative filtering framework. It provides a way to integrate content information into collaborative filtering, which helps both to reduce the sparsity of the data set and to solve the cold start problem. Extensive experiments have been conducted on MovieLens data to analyze the characteristics of our technique. The results show that our approach improves the prediction quality of item-based collaborative filtering, especially for the cold start problem.

Nonlinear System Modeling Using Genetic Algorithm and FCM-based Fuzzy System (유전알고리즘과 FCM 기반 퍼지 시스템을 이용한 비선형 시스템 모델링)

  • 곽근창;이대종;유정웅;전명근
    • Journal of the Korean Institute of Intelligent Systems, v.11 no.6, pp.491-499, 2001
  • In this paper, an efficient scheme for fuzzy rule generation and fuzzy system construction using a GA (genetic algorithm) and the FCM (fuzzy c-means) clustering algorithm is proposed for TSK (Takagi-Sugeno-Kang) type fuzzy systems. In the structure identification, input data are transformed by PCA (Principal Component Analysis) to reduce the correlation among input data components. Then a set of fuzzy rules is generated for a given criterion by the FCM clustering algorithm. In the parameter identification, premise parameters are optimally searched by the GA, while the consequent parameters are estimated by RLSE (Recursive Least Square Estimate) to reduce the search space. From this, one can systematically obtain the valid number of fuzzy rules that shows satisfactory performance for the given problem. Finally, we applied the proposed method to the Box-Jenkins data and rice taste data modeling problems and obtained better performance than previous works.
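
As a rough illustration of the FCM step mentioned in this abstract, the following is a minimal sketch of plain fuzzy c-means on one-dimensional data. It is not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the fixed iteration count, and the toy data are all assumptions, and the PCA, GA, and RLSE stages are left out. In a TSK setting, each resulting cluster center would typically seed the premise of one fuzzy rule.

```python
def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Plain fuzzy c-means on 1-D data; returns (centers, memberships)."""
    eps = 1e-12  # keeps distances non-zero when a point sits on a center
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iters):
        # Membership update: u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1))
        memberships = []
        for x in points:
            dists = [abs(x - c) + eps for c in centers]
            row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                   for d_i in dists]
            memberships.append(row)
        # Center update: mean of the points weighted by u^m
        centers = [
            sum((u[i] ** m) * x for u, x in zip(memberships, points))
            / sum(u[i] ** m for u in memberships)
            for i in range(len(centers))
        ]
    return centers, memberships


# Toy data with two visible groups (hypothetical, for illustration only).
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, memberships = fuzzy_c_means(points, centers=[0.0, 6.0])
print(centers)
```

Unlike hard k-means, every point keeps a graded membership in every cluster, which is what lets a cluster be read as a fuzzy set.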

CUCE: clustering protocol using node connectivity and node energy (노드 연결도와 에너지 정보를 이용한 개선된 센서네트워크 클러스터링 프로토콜)

  • Choi, Hae-Won
    • Journal of Korea Society of Industrial Information Systems, v.17 no.4, pp.41-50, 2012
  • Network lifetime is a very important issue for wireless sensor networks (WSNs), so it is important to design sensor networks in which sensors use their energy effectively. A-PEGASIS is based on PEGASIS and enhances it in two aspects: an elegant chain generation algorithm and periodic update of chains. However, it has problems in the chain generation mechanism, and there is some possibility of network partitioning or sensing holes in LEACH-related protocols. This dissertation proposes a new clustering protocol to solve the problems shared by the previous protocols. The basic idea of our scheme is to use a table of node connectivity. The results show that the network lifetime would be extended to about 1.8 times that of LEACH and 1.5 times that of PEGASIS-A.

Imputation method for missing data based on clustering and measure of property (군집화 및 특성도를 이용한 결측치 대체 방법)

  • Kim, Sunghyun;Kim, Dongjae
    • The Korean Journal of Applied Statistics, v.31 no.1, pp.29-40, 2018
  • There are various reasons for missing values when collecting data. Missing values have some influence on the analysis and results; consequently, various methods of processing missing values have been studied to solve the problem. It is thought that values at later time points may be affected by the initial time point value in repeated measurement data. However, no existing method imputes missing values using this concept. Therefore, we propose a new missing value imputation method using clustering at the initial time point of the repeated measurement data and the measure of property proposed by Kim and Kim (The Korean Communications in Statistics, 30, 463-473, 2017). We also applied Monte Carlo simulations to compare the performance of the established method and the suggested methods in repeated measurement data.

Two-stage Sampling for Estimation of Prevalence of Bovine Tuberculosis (이단계표본추출을 이용한 소결핵병 유병률 추정)

  • Pak, Son-Il
    • Journal of Veterinary Clinics, v.28 no.4, pp.422-426, 2011
  • For a national survey targeting a wide geographic region or an entire country, a multi-stage sampling approach is widely used to overcome the problems of simple random sampling, to consider both herd- and animal-level factors associated with disease occurrence, and to adjust for the clustering effect of disease in the population when calculating sample size. The aim of this study was to establish the sample size for estimating bovine tuberculosis (TB) in Korea using a stratified two-stage sampling design. The sample size was determined by taking into account the possible clustering of TB-infected animals within individual herds to increase the reliability of survey results. In this study, the country was stratified into nine provinces (administrative units), and the herd, the primary sampling unit, was considered as a cluster. For all analyses, a design effect of 2, a between-cluster prevalence of 50% to yield the maximum sample size, and a mean herd size of 65 were assumed due to the lack of available information. Using a two-stage sampling scheme, the number of cattle sampled per herd was 65, regardless of the confidence level, prevalence, and mean herd size examined. The number of clusters to be sampled at a 95% level of confidence was estimated to be 296, 74, 33, 19, 12, and 9 for desired precision of 0.01, 0.02, 0.03, 0.04, 0.05, and 0.06, respectively. Therefore, the total sample size with a 95% confidence level was 172,872, 43,218, 19,224, 10,818, 6,930, and 4,806 for desired precision ranging from 0.01 to 0.06. The sample size increased with the desired precision and design effect. When the number of cattle sampled per herd was fixed at values from 5 to 40 in 5-head steps, the total sample size with a 95% confidence level was estimated to be 6,480, 10,080, 13,770, 17,280, 20,925, 24,570, 28,350, and 31,680, respectively. The percent increase in total sample size resulting from the use of an intra-cluster correlation coefficient of 0.3 was 22.2, 32.1, 36.3, 39.6, 41.9, 42.9, 42.2, and 44.3%, respectively, in comparison to the use of a coefficient of 0.2.
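
The cluster counts and totals for precision 0.01 to 0.06 can be reproduced with a short calculation. The sketch below is an inference from the numbers, not a formula stated in the abstract: it assumes the textbook sample size z²p(1-p)/d² with z = 1.96, rounds it up, multiplies by the design effect of 2, treats the result as a per-province size, and multiplies by the nine provinces for the total. The function name `survey_size` and its parameters are hypothetical. Exact fractions are used so that rounding up is not disturbed by floating-point error.

```python
from fractions import Fraction
from math import ceil


def survey_size(precision, z=Fraction(196, 100), prevalence=Fraction(1, 2),
                design_effect=2, herd_size=65, strata=9):
    """Return (clusters per stratum, total animals) under the stated assumptions."""
    # Simple-random-sample size, rounded up before the design effect is applied.
    n_srs = ceil(z ** 2 * prevalence * (1 - prevalence) / precision ** 2)
    n_per_stratum = design_effect * n_srs
    # Herds (clusters) needed if every animal in a 65-head herd is sampled.
    clusters = ceil(Fraction(n_per_stratum, herd_size))
    total = n_per_stratum * strata
    return clusters, total


for hundredths in range(1, 7):
    d = Fraction(hundredths, 100)
    print(float(d), survey_size(d))
```

Under these assumptions the output matches the abstract's figures, for example 296 clusters and 172,872 animals at precision 0.01. The second set of totals (fixed numbers of cattle per herd) depends on the intra-cluster correlation and is not covered by this sketch.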

Character Extraction from Color Map Image Using Interactive Clustering (대화식 클러스터링 기법을 이용한 칼라 지도의 문자 영역 추출에 관한 연구)

  • Ahn, Chang;Park, Chan-Jung;Rhee, Sang-Burm
    • The Transactions of the Korea Information Processing Society, v.4 no.1, pp.270-279, 1997
  • The conversion of printed maps into computerized databases is an enormous task, so automation of the conversion process is essential. Efficient computer representation of printed maps and line drawings depends on codes assigned to characters, symbols, and vector representation of the graphics. In many cases, maps are constructed in a number of layers, where each layer is printed in a distinct color and represents a subset of the map information. In order to properly represent the character layer from color map images, an interactive clustering and character extraction technique is proposed. Characters are usually separated from graphics by extracting and classifying connected components in the image. But this procedure fails when characters touch or overlap lines, something that occurs often in land register maps. By vectorizing line segments, the touched characters and numbers are extracted. The algorithm proposed in this paper is intended to contribute towards the solution of the color image clustering and touched character problems.

Distributed Computing Models for Wireless Sensor Networks (무선 센서 네트워크에서의 분산 컴퓨팅 모델)

  • Park, Chongmyung;Lee, Chungsan;Jo, Youngtae;Jung, Inbum
    • Journal of KIISE, v.41 no.11, pp.958-966, 2014
  • Wireless sensor networks offer a distributed processing environment. Many sensor nodes are deployed in fields that have limited resources such as computing power, network bandwidth, and electric power. The sensor nodes construct their own networks automatically, and the collected data are sent to the sink node. In these traditional wireless sensor networks, network congestion due to packet flooding through the networks shortens the network life time. Clustering or in-network technologies help reduce packet flooding in the networks. Many studies have been focused on saving energy in the sensor nodes because the limited available power leads to an important problem of extending the operation of sensor networks as long as possible. However, we focus on the execution time because clustering and local distributed processing already contribute to saving energy by local decision-making. In this paper, we present a cooperative processing model based on the processing timeline. Our processing model includes validation of the processing, prediction of the total execution time, and determination of the optimal number of processing nodes for distributed processing in wireless sensor networks. The experiments demonstrate the accuracy of the proposed model, and a case study shows that our model can be used for the distributed application.

Realignment of Clients in Client-server Database System (클라이언트-서버 데이터베이스에서의 온라인 클라이언트 재배치)

  • Park, Young-B.;Park, J.
    • The KIPS Transactions:PartD, v.10D no.4, pp.639-646, 2003
  • Conventional two-tier databases have shown performance limitations in the presence of many concurrent clients. To address this, a three-tier architecture that exploits similarities in clients' object access behavior has been proposed. In this system, clients are partitioned into clusters, and object requests can then be served in an inter-cluster manner. This is enabled by introducing an intermediate layer between server(s) and clients. In this paper, we introduce the problem of client realignment, in which access behavior changes, and propose on-line client clustering. This system facilitates adaptive reconfiguration and redistribution of sites. The core issue in this paper is to demonstrate the effectiveness of on-line client clustering. We experimentally investigate the performance of the scheme and its necessary costs.

Mobile Gesture Recognition using Dynamic Time Warping with Localized Template (지역화된 템플릿기반 동적 시간정합을 이용한 모바일 제스처인식)

  • Choe, Bong-Whan;Min, Jun-Ki;Jo, Seong-Bae
    • Journal of KIISE:Computing Practices and Letters, v.16 no.4, pp.482-486, 2010
  • Recently, gesture recognition methods based on dynamic time warping (DTW) have been actively investigated as more mobile devices have been equipped with accelerometers. DTW has no additional training step since it uses given samples as the matching templates. However, it is difficult to apply DTW in mobile environments because of the computational complexity of the matching step, in which the input pattern has to be compared with every template. In order to address the problem, this paper proposes a gesture recognition method based on DTW that uses a localized subset of templates. Here, the k-means clustering algorithm is used to divide each class into subclasses, and the most centered sample in each subclass is employed as the localized template. This increases the recognition speed by reducing the number of matches while minimizing errors by preserving the diversity of the training patterns. Experimental results showed that the proposed method was about five times faster than DTW with all training samples, and more stable than randomly selected templates.
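
The idea of matching against one representative per subclass instead of every training sample can be sketched as follows. This is an illustrative sketch, not the paper's code: it uses the classic DTW recurrence on one-dimensional sequences, takes the subclasses as already given (the k-means step is omitted), and interprets "most centered" as the sample with the smallest total DTW distance to the others in its subclass. All function names and the toy gestures are hypothetical.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping on 1-D sequences."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # stretch b
                                    cost[i][j - 1],      # stretch a
                                    cost[i - 1][j - 1])  # advance both
    return cost[len(a)][len(b)]


def most_centered(samples):
    """Pick the sample with the smallest total DTW distance to the others."""
    return min(samples, key=lambda s: sum(dtw_distance(s, t) for t in samples))


def classify(pattern, templates):
    """templates: list of (label, sequence); returns the nearest label."""
    return min(templates, key=lambda lt: dtw_distance(pattern, lt[1]))[0]


# Hypothetical subclasses, standing in for the output of k-means.
subclasses = {
    "up": [[0, 1, 2, 3], [0, 1, 1, 2, 3], [0, 2, 3]],
    "down": [[3, 2, 1, 0], [3, 2, 2, 1, 0], [3, 1, 0]],
}
templates = [(label, most_centered(group)) for label, group in subclasses.items()]
print(classify([0, 1, 2, 2, 3], templates))
```

With two templates instead of six samples, each query needs a third as many DTW evaluations, which is the source of the speed-up the abstract describes.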

Binary Visual Word Generation Techniques for A Fast Image Search (고속 이미지 검색을 위한 2진 시각 단어 생성 기법)

  • Lee, Suwon
    • Journal of KIISE, v.44 no.12, pp.1313-1318, 2017
  • Aggregating local features in a single vector is a fundamental problem in image search. The image search process can be sped up if binary features, which are extracted almost two orders of magnitude faster than gradient-based features, are utilized. However, in order to utilize binary features in an image search, it is necessary to study techniques for clustering binary features to generate binary visual words. This investigation is necessary because traditional clustering techniques for gradient-based features are not compatible with binary features. To this end, this paper studies techniques for clustering binary features for the purpose of generating binary visual words. Through experiments, we analyze the trade-off between the accuracy and computational efficiency of an image search using binary features, and we then compare the proposed techniques. This research is expected to be applied to mobile applications, real-time applications, and web-scale applications that require a fast image search.
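
The abstract does not say which clustering techniques were studied, so the following is only one common way to cluster binary descriptors, shown to make the incompatibility concrete: Euclidean means are replaced by Hamming distance for assignment and a bitwise majority vote for the center update. Descriptors are stored as Python integers, and the names `hamming`, `majority_center`, and `binary_kmeans`, as well as the 8-bit toy data, are assumptions made for illustration.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded descriptors."""
    return bin(a ^ b).count("1")


def majority_center(descriptors, n_bits):
    """Bitwise majority vote: a bit is set if more than half the members set it."""
    center = 0
    for bit in range(n_bits):
        ones = sum((d >> bit) & 1 for d in descriptors)
        if 2 * ones > len(descriptors):
            center |= 1 << bit
    return center


def binary_kmeans(descriptors, centers, n_bits, iters=10):
    """k-means-style loop with Hamming assignment and majority-vote update."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for d in descriptors:
            nearest = min(range(len(centers)),
                          key=lambda i: hamming(d, centers[i]))
            groups[nearest].append(d)
        # An empty group keeps its previous center.
        centers = [majority_center(g, n_bits) if g else c
                   for g, c in zip(groups, centers)]
    return centers


# Hypothetical 8-bit descriptors forming two groups.
descriptors = [0b11110000, 0b11100000, 0b11110001,
               0b00001111, 0b00000111, 0b10001111]
words = binary_kmeans(descriptors, [descriptors[0], descriptors[3]], n_bits=8)
print([format(w, "08b") for w in words])
```

The resulting centers are themselves binary strings, so they can serve as binary visual words and be compared with the same Hamming distance used for the raw features.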