• Title/Summary/Keyword: Cluster-based technique (클러스터기반 기법)

An Enhanced Density and Grid based Spatial Clustering Algorithm for Large Spatial Database (대용량 공간데이터베이스를 위한 확장된 밀도-격자 기반의 공간 클러스터링 알고리즘)

  • Gao, Song;Kim, Ho-Seok;Xia, Ying;Kim, Gyoung-Bae;Bae, Hae-Young
    • The KIPS Transactions:PartD / v.13D no.5 s.108 / pp.633-640 / 2006
  • Spatial clustering, which groups similar objects based on their distance, connectivity, or relative density in space, is an important component of spatial data mining. Density-based and grid-based clustering are two main clustering approaches. The former is known for its ability to discover clusters of various shapes and to eliminate noise, while the latter is well known for its high speed. Clustering large data sets has always been a serious challenge for clustering algorithms, because a huge data set makes the clustering process extremely costly. In this paper, we propose an enhanced Density-Grid based Clustering algorithm for Large spatial databases that sets a default number of intervals and removes outliers effectively with the help of a proper measurement for identifying areas of high density in the input data space. We use a density threshold DT to recognize dense cells before neighboring dense cells are combined to form clusters. When the proposed algorithm is run on a large dataset, a proper granularity for each dimension of the data space and a density threshold for recognizing dense areas can improve its performance. We combine grid-based and density-based methods not only to increase efficiency but also to find clusters of arbitrary shape. Synthetic datasets are used for the experimental evaluation, which shows that the proposed method achieves high performance and accuracy.
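
The grid-then-density idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function name `grid_density_cluster`, the equal-width partitioning, the default of 10 intervals per dimension, and the rule that cells sharing a face or corner count as neighbors are all assumptions made for illustration. Only the use of a density threshold (called DT in the abstract) to mark dense cells before merging neighboring dense cells comes from the text.

```python
from collections import defaultdict, deque
from itertools import product


def grid_density_cluster(points, intervals=10, density_threshold=3):
    """Label each point with a cluster id, or -1 if its cell is not dense.

    points: list of equal-length numeric tuples.
    intervals: number of equal-width intervals per dimension (assumed default).
    density_threshold: minimum number of points for a cell to count as dense
        (plays the role of DT in the abstract).
    """
    if not points:
        return []
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]

    def cell_of(p):
        # Map a point to the index of the grid cell that contains it.
        idx = []
        for d in range(dims):
            span = highs[d] - lows[d]
            if span == 0:
                idx.append(0)
            else:
                i = int((p[d] - lows[d]) / span * intervals)
                idx.append(min(i, intervals - 1))  # clamp the upper boundary
        return tuple(idx)

    # Step 1: assign every point to a cell.
    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[cell_of(p)].append(i)

    # Step 2: keep only cells whose point count reaches the threshold.
    dense = {c for c, members in cells.items() if len(members) >= density_threshold}

    # Step 3: merge neighboring dense cells with a breadth-first search.
    offsets = [o for o in product((-1, 0, 1), repeat=dims) if any(o)]
    labels = [-1] * len(points)
    seen = set()
    cluster_id = 0
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            c = queue.popleft()
            for i in cells[c]:
                labels[i] = cluster_id
            for o in offsets:
                n = tuple(a + b for a, b in zip(c, o))
                if n in dense and n not in seen:
                    seen.add(n)
                    queue.append(n)
        cluster_id += 1
    return labels
```

Points that fall in sparse cells keep the label -1, which is one simple way to treat them as outliers. The cost is driven by the number of occupied cells rather than by pairwise distances between points, which is the usual motivation for grid-based methods.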

Design and Implementation of the Extended SLDS for Real-time Location Based Services (실시간 위치 기반 서비스를 위한 확장 SLDS 설계 및 구현)

  • Lee, Seung-Won;Kang, Hong-Koo;Hong, Dong-Suk;Han, Ki-Joon
    • Journal of Korea Spatial Information System Society / v.7 no.2 s.14 / pp.47-56 / 2005
  • Recently, with the rapid development of mobile computing and wireless positioning technologies and the spread of wireless internet, LBS (Location Based Service), which utilizes location information of moving objects, is being used in many fields. To provide LBS efficiently, a location data server that periodically stores location data of moving objects is required. Formerly, GIS servers were used to store location data of moving objects. However, GIS servers are not suitable for this purpose because they were designed to store static data. Therefore, in this paper, we designed and implemented an extended SLDS (Short-term Location Data Subsystem) for real-time Location Based Services. The extended SLDS builds on the SLDS, a subsystem of the GALIS (Gracefully Aging Location Information System) architecture, which was proposed as a cluster-based distributed computing system architecture for managing location data of moving objects. The extended SLDS guarantees real-time service capabilities using the TMO (Time-triggered Message-triggered Object) programming scheme and efficiently manages large volumes of location data by distributing moving object data over multiple nodes. The extended SLDS also has little search and update overhead because it manages location data in main memory. In addition, we showed through performance evaluation that the extended SLDS stores location data and distributes load more efficiently than the original SLDS.

A Spatial Statistical Approach to Residential Differentiation (II): Exploratory Spatial Data Analysis Using a Local Spatial Separation Measure (거주지 분화에 대한 공간통계학적 접근 (II): 국지적 공간 분리성 측도를 이용한 탐색적 공간데이터 분석)

  • Lee, Sang-Il
    • Journal of the Korean Geographical Society / v.43 no.1 / pp.134-153 / 2008
  • The main purpose of the research is to illustrate the value of the spatial statistical approach to residential differentiation by providing a framework for exploratory spatial data analysis (ESDA) using a local spatial separation measure. ESDA uses a variety of statistical and cartographic visualization techniques to detect patterns, formulate hypotheses, and assess statistical models for spatial data. The research is driven by the realization that ESDA based on local statistics has great potential for substantive research. The main results are as follows. First, a local spatial separation measure is derived from its global counterpart. Second, a set of significance testing methods based on both total and conditional randomization assumptions is provided for the local measure. Third, two mapping techniques, a 'spatial separation scatterplot map' and a 'spatial separation anomaly map', are devised for ESDA utilizing the local measure and the related significance tests. Fourth, a case study of residential differentiation between the highly educated and the least educated in major Korean metropolitan cities shows that the proposed ESDA techniques are useful for identifying bivariate spatial clusters and spatial outliers.

A Novel of Data Clustering Architecture for Outlier Detection to Electric Power Data Analysis (전력데이터 분석에서 이상점 추출을 위한 데이터 클러스터링 아키텍처에 관한 연구)

  • Jung, Se Hoon;Shin, Chang Sun;Cho, Young Yun;Park, Jang Woo;Park, Myung Hye;Kim, Young Hyun;Lee, Seung Bae;Sim, Chun Bo
    • KIPS Transactions on Software and Data Engineering / v.6 no.10 / pp.465-472 / 2017
  • In the past, researchers mainly used supervised machine learning techniques to analyze power data and investigated pattern identification through data mining. Today, however, data analysis research faces the limits of the older classification and analysis techniques, as the size of electric power data has grown and the data can be provided in real time. This study therefore proposes a clustering architecture for analyzing large-scale electric power data. The proposed clustering process addresses the problems of the K-means algorithm, an unsupervised learning technique, and can automate the entire process from the collection of electric power data to its analysis. In the present study, power data were categorized and analyzed at three levels in total: the raw data level, the clustering level, and the user interface level. In addition, the study identified K, the ideal number of clusters, based on principal component analysis and the normal distribution, and proposed a modified K-means algorithm that reduces the data that would be categorized as outliers in order to increase the efficiency of clustering.

Verifying Execution Prediction Model based on Learning Algorithm for Real-time Monitoring (실시간 감시를 위한 학습기반 수행 예측모델의 검증)

  • Jeong, Yoon-Seok;Kim, Tae-Wan;Chang, Chun-Hyon
    • The KIPS Transactions:PartA / v.11A no.4 / pp.243-250 / 2004
  • Monitoring is used to check whether a real-time system provides its service on time. Generally, monitoring for real-time systems focuses on investigating the current status of the system. However, to support stable performance, a real-time system needs not only a function to observe the current status of real-time processes but also a function to predict their execution. The legacy prediction model has some limitations when applied to real-time monitoring. First, it performs a static prediction after a real-time process has finished. Second, it needs a statistical pre-analysis before a prediction. Third, its transition probabilities and clustering data are not based on current data. We propose an execution prediction model based on a learning algorithm to solve these problems and apply it to real-time monitoring. This model eliminates unnecessary pre-processing and supports precise prediction based on current data. In addition, it supports multi-level prediction through a trend analysis of past execution data. Most of all, we designed the model to support dynamic prediction, which is performed during a real-time process's execution. Experimental results show that the judgment accuracy is greater than 80% if the size of the training set is over 10 and that, in multi-level prediction, the prediction difference is minimized if the number of executions is larger than the size of the training set. The proposed execution prediction model has the limitations that it uses the simplest learning algorithm and does not consider the multi-regional space model managing CPU, memory, and I/O data. The execution prediction model based on a learning algorithm proposed in this paper is used in some areas related to real-time monitoring and control.

Performance Comparison of Clustering using Discritization Algorithm (이산화 알고리즘을 이용한 계층적 클러스터링의 실험적 성능 평가)

  • Won, Jae Kang;Lee, Jeong Chan;Jung, Yong Gyu;Lee, Young Ho
    • Journal of Service Research and Studies / v.3 no.2 / pp.53-60 / 2013
  • Various data mining techniques have been developed for obtaining information from large data sets. In recent years, in pattern recognition and machine learning, among the most actively pursued areas, most existing learning algorithms build a rule or decision model based on categorical attributes. However, real-world data in many cases consists of numeric attributes, or contains numeric attributes alongside ordinary categorical ones. In such cases, a process is required to convert the data into values appropriate for the attribute type before it can be used for learning. In this paper, the domain of each numeric attribute is divided into several segments using discretization. Clustering is described together with other data mining techniques. Clustering splits a large number of database records into smaller groups of records with similar characteristics, partitioning a given finite set of patterns in the pattern space. A set of patterns that are close to each other forms a cluster. Patterns are extracted from the given data without specifying a particular category. Data clustering is described as a technique for grouping similar data in order to classify it.
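
As a small illustration of dividing the domain of a numeric attribute into segments, the sketch below uses equal-width binning in Python. The abstract does not say which discretization method the paper uses, so equal-width binning, the function name `equal_width_bins`, and the parameter `n_bins` are assumptions chosen only to show the general idea.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in the range 0..n_bins-1.

    The attribute's domain [min, max] is cut into n_bins segments of equal
    width (an assumed scheme; the abstract does not name a specific method).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so they share a single segment.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # Clamp so that the maximum value falls into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]
```

The resulting bin indices can be treated as categorical values, so learning algorithms that expect categorical attributes can be applied to data that was originally numeric.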

Implementation of the ZigBee-based Homenetwork security system using neighbor detection and ACL (이웃탐지와 ACL을 이용한 ZigBee 기반의 홈네트워크 보안 시스템 구현)

  • Park, Hyun-Moon;Park, Soo-Hyun;Seo, Hae-Moon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.46 no.1 / pp.35-45 / 2009
  • In an open environment such as a home network, a ZigBee cluster comprising a plurality of Ato-cells is required to provide strong security for the transfer of collected and measured data. Against this background, various security issues are currently under discussion concerning master key control policies, Access Control Lists (ACL), and device sources, all of which involve authentication between ZigBee devices. A variety of authentication methods, including the hash chain method, the token-key method, and public key infrastructure, have been studied previously, and some of them have been reflected in standards. In this context, this paper explores whether a new method of searching for neighboring devices in order to detect device replication and Sybil attacks can be applied and extended to the field of security. The neighbor detection method is an authentication method in which the ACL information of a new device and that of its neighbor devices are included and compared, using information on peripheral devices. Accordingly, the new method is designed to detect malicious device attacks such as Sybil attacks and device replication and to prevent hacking. In addition, with reference to ITU-T SG17 and ZigBee Pro, home network equipment is implemented that classifies labels and rules into four categories: user access rights, time, date, and day. Finally, the results demonstrate that the proposed method performs significantly better than existing methods in detecting malicious devices in terms of success rate and time taken.

A Study on the Characteristics of the Spatial Distribution of Sex Crimes: Spatial Analysis based on Environmental Criminology (성폭력 범죄의 공간적 분포 특성에 관한 연구: 환경범죄학에 기반한 공간 분석)

  • Lee, Gunhak;Jin, Chanwoo;Kim, Jiwoo;Kim, Wanhee
    • Journal of the Korean Geographical Society / v.51 no.6 / pp.853-871 / 2016
  • Interest in the prevention of sex crimes and in public safety is growing as the number of sexual offences rises. Although various punitive measures have been introduced so far, sex crimes continue to increase. The effectiveness of legal systems in preventing these crimes is therefore questionable. More recently, environmental criminology has received attention as an approach to reducing criminal opportunities through environmental design and the management of crime. This study examines the spatial distribution of sexual crimes in the context of environmental criminology and empirically examines the correlation between regional environmental factors and the occurrence of sexual crimes. To do this, we mapped sex crimes at the macro scale and explored their spatial distribution and spatial clusters with various spatial statistics, using sex crime data published online by the Ministry of Gender Equality and Family. We also derived the environmental characteristics of sexual crimes through multivariate regression analysis on a large number of explanatory variables describing the regional environment. Our results will help in understanding the current situation and spatial aspects of sex crimes in the nation more realistically. Further, it is expected that our results will provide useful basic information for establishing regional policies and plans for the prevention of sexual crime and for enhanced public policing.

A Topic Analysis of Requested Books by User Types at a University Library for Patron-Driven Acquisition (이용자 요구 기반 장서개발을 위한 대학도서관 희망도서 주제 분석)

  • Sanghee Choi
    • Journal of the Korean Society for Library and Information Science / v.58 no.1 / pp.395-415 / 2024
  • In the development of a university library's collection, patron-driven acquisition refers to a collection strategy that addresses users' direct information needs. In this study, ten years' worth of book requests were analyzed by user type to understand topic preferences for efficient collection development in the university library. In collection development, identifying the subject areas of users' requested books is necessary for librarians to identify key areas of collection development and establish balanced collection development policies. To identify the major subject areas for each user group, KDC (Korean Decimal Classification) subject classifications were used, and network analysis techniques were applied to investigate the relationships between book topics in detail. The analysis revealed that "social sciences" emerged as the major topic across all user groups. However, in the analysis of sub-topics, "medicine" and "psychology" were distinctively identified as the major subject areas for graduate students, setting them apart from other user groups. The results of the network analysis further indicated that undergraduate students showed unique topics such as civil service, job placement, and career, which were not observed as major topic clusters in other user groups. Graduate students, on the other hand, tended to concentrate on a few specialized subjects, forming distinct topic clusters in the analysis.

Cluster-based Geocasting Protocol in Ad-hoc Networks (애드 혹 네트워크에서 클러스터 기반 지오캐스팅 프로토콜)

  • Lee Jung-Hwan;Yoo Sang-Jo
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.5A / pp.407-416 / 2005
  • This paper proposes a new geocasting protocol for transferring geographic packets to a specific region in a MANET. A geocasting protocol is fundamentally different from a conventional multicasting protocol, which requires group addition and maintenance. Geocasting, which uses the position information of mobile nodes, is a new area of multicasting protocols. Existing geocasting protocols have the following problems: it may be impossible to transfer data to some mobile hosts even if alternate routes exist, and adaptability and efficiency are low when the number of mobile hosts increases. The proposed CBG (Cluster-Based Geocasting) uses a proactive routing strategy and a clustering technique based on the location information of mobile hosts. CBG achieves a high successful data transmission ratio and a low data delivery cost to mobile hosts in a specific region.