• Title/Summary/Keyword: phase clustering


A Study on Phase of Arrival Pattern using K-means Clustering Analysis (K-Means 클러스터링을 활용한 선박입항패턴 단계화 연구)

  • Lee, Jeong-Seok;Lee, Hyeong-Tak;Cho, Ik-Soon
    • Proceedings of the Korean Institute of Navigation and Port Research Conference
    • /
    • 2020.11a
    • /
    • pp.54-55
    • /
    • 2020
  • In the Fourth Industrial Revolution, technologies such as artificial intelligence, the Internet of Things, and big data have become closely tied to the maritime industry, which has led to the emergence of autonomous vessels. Because a vessel cannot reduce its speed abruptly, berthing at a port requires complex coordination, including tugboat assistance, the boarding of a pilot, and control of the vessel from an onshore control center. This study uses clustering analysis to determine how control criteria should be established for port entry when autonomous vessels are in operation. K-means clustering was applied to accumulated AIS (Automatic Identification System) data from incoming vessels to stage the arrival pattern quantitatively, and the arrival was divided into six phases using SOG (Speed over Ground), COG (Course over Ground), and ROT (Rate of Turn).

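
The staging step described in the abstract above can be illustrated with a minimal sketch of Lloyd's k-means over (SOG, COG, ROT) triples. This is not the authors' implementation: the sample values, the z-score scaling, and the function names `standardize` and `kmeans` are assumptions made for illustration, and COG is treated as a plain number even though it is a circular quantity. The study reports six phases; the toy data here only support three.

```python
import random

def standardize(rows):
    """Z-score each column so SOG, COG and ROT contribute on a comparable scale."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A zero standard deviation is replaced by 1.0 to avoid division by zero.
    stds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [tuple((x - m) / s for x, m, s in zip(r, means, stds)) for r in rows]

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct points as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical AIS samples: (SOG in knots, COG in degrees, ROT in deg/min).
ais = [(12.1, 45.0, 0.5), (11.8, 47.0, 0.8), (6.2, 90.0, 5.0),
       (5.9, 92.0, 4.5), (1.0, 180.0, 0.2), (0.8, 178.0, 0.1)]
labels, centers = kmeans(standardize(ais), k=3)
print(labels)
```

Each resulting label can then be read as one arrival phase; the ordering of the labels is arbitrary and would need to be sorted, for example by mean SOG, before being interpreted as a sequence.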

User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis (다계층 이원 네트워크를 활용한 사용자 관점의 이슈 클러스터링)

  • Kim, Jieun;Kim, Namgyu;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.93-107
    • /
    • 2014
  • In this paper, we report what we have observed with regard to user-perspective issue clustering based on multi-layered two-mode network analysis. This work is significant in the context of data collection by companies about customer needs. Most companies have failed to uncover such needs for products or services properly in terms of demographic data such as age, income levels, and purchase history. Because of excessive reliance on limited internal data, most recommendation systems do not provide decision makers with appropriate business information for current business circumstances. However, part of the problem is the increasing regulation of personal data gathering and privacy. This makes demographic or transaction data collection more difficult, and is a significant hurdle for traditional recommendation approaches because these systems demand a great deal of personal data or transaction logs. Our motivation for presenting this paper to academia is our strong belief, and evidence, that most customers' requirements for products can be effectively and efficiently analyzed from unstructured textual data such as Internet news text. In order to derive users' requirements from textual data obtained online, the proposed approach in this paper attempts to construct double two-mode networks, such as a user-news network and news-issue network, and to integrate these into one quasi-network as the input for issue clustering. One of the contributions of this research is the development of a methodology utilizing enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis. In order to build multi-layered two-mode networks of news logs, we need some tools such as text mining and topic analysis. We used not only SAS Enterprise Miner 12.1, which provides a text miner module and cluster module for textual data analysis, but also NetMiner 4 for network visualization and analysis. 
Our approach to user-perspective issue clustering is composed of six main phases: crawling, topic analysis, access pattern analysis, network merging, network conversion, and clustering. In the first phase, we collect visit logs for news sites with a crawler. After the unstructured news article data are gathered, the topic analysis phase extracts issues from each news article in order to build an article-issue network. For simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed from access patterns derived from web transaction logs. The two two-mode networks are then merged into a user-issue quasi-network. Finally, in the user-oriented issue-clustering phase, we classify issues through structural equivalence and compare the result with the clustering results from statistical tools and network analysis. An experiment with a large dataset was performed to build a multi-layer two-mode network, after which we compared the issue-clustering results from SAS with those of network analysis. The experimental dataset came from a website ranking site and from the biggest portal site in Korea. The sample dataset contains 150 million transaction logs and 13,652 news articles from 5,000 panelists over one year. User-article and article-issue networks are constructed and merged into a user-issue quasi-network using NetMiner. Our issue-clustering results, obtained with the Partitioning Around Medoids (PAM) algorithm and Multidimensional Scaling (MDS), are consistent with the results from SAS clustering. In spite of extensive efforts to provide user information with recommendation systems, most projects are successful only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support for decision-making in companies because it derives additional user-related data from unstructured textual data.
To overcome the problem of insufficient data from traditional approaches, our methodology infers customers' real interests by utilizing web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
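
The network-merging phase in the abstract above can be pictured as composing two two-mode (bipartite) matrices. The sketch below assumes that merging is a matrix product of a user-article matrix and an article-issue matrix; the abstract does not state how NetMiner performs the merge, so this is one plausible reading, and the name `merge_two_mode` and the toy data are hypothetical.

```python
def merge_two_mode(user_article, article_issue):
    """Compose a user-by-article and an article-by-issue matrix into a
    user-by-issue matrix (assumed here to be an ordinary matrix product)."""
    n_articles = len(article_issue)
    n_issues = len(article_issue[0])
    return [
        [sum(row[a] * article_issue[a][i] for a in range(n_articles))
         for i in range(n_issues)]
        for row in user_article
    ]

# Toy data: 2 users, 3 articles, 2 issues.
user_article = [[1, 1, 0],   # user 0 read articles 0 and 1
                [0, 0, 1]]   # user 1 read article 2
article_issue = [[1, 0],     # article 0 covers issue 0
                 [1, 1],     # article 1 covers issues 0 and 1
                 [0, 1]]     # article 2 covers issue 1
user_issue = merge_two_mode(user_article, article_issue)
print(user_issue)  # [[2, 1], [0, 1]]
```

Under this reading, each entry counts how many of a user's visited articles touch a given issue, and the columns of the result are the issue profiles that a clustering step such as PAM would then group.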

Determining the Number and the Locations of RBF Centers Using Enhanced K-Medoids Clustering and Bi-Section Search Method (보정된 K-medoids 군집화 기법과 이분 탐색기법을 이용한 RBF 네트워크의 중심 개수와 위치와 통합 결정)

  • Lee, Daewon;Lee, Jaewook
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.29 no.2
    • /
    • pp.172-178
    • /
    • 2003
  • Recent research has proposed a variety of ways to determine the locations of RBF centers under the assumption that the number of centers is known, but these methods have many numerical drawbacks. We propose a new method to overcome them. Its strength is that it determines the locations and the number of RBF centers at the same time, without any assumption about the number of centers. The proposed method consists of two phases. The first phase determines the number and the locations of RBF centers using a bi-section search method together with an enhanced k-medoids clustering that overcomes the drawbacks of the standard clustering algorithm. In the second phase, the network weights are computed and the design of the RBF network is completed. The method is applied to several benchmark data sets, and the results show that it is competitive with previously reported approaches to center selection.
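
A bi-section search over the number of centers can be sketched as follows. The abstract does not give the acceptance criterion, so the sketch assumes a hypothetical predicate `fits(k)` that stands in for "run the enhanced k-medoids with k centers and check whether the resulting network is good enough", and it assumes the predicate is monotone: once it holds for some k, it holds for every larger k.

```python
def smallest_k(fits, lo, hi):
    """Return the smallest k in [lo, hi] for which fits(k) is True.

    Assumes fits is monotone (False ... False True ... True) and fits(hi) is True.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if fits(mid):
            hi = mid        # mid is acceptable; look for a smaller k
        else:
            lo = mid + 1    # mid is too small; discard it and everything below
    return lo

# Stand-in criterion: pretend at least 7 centers are needed.
print(smallest_k(lambda k: k >= 7, 1, 20))  # 7
```

The search evaluates the criterion a logarithmic number of times, which matters when each evaluation involves a full clustering run.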

Heuristic for the Pick-up and Delivery Vehicle Routing Problem: Case Study for the Remicon Truck Routing in the Metropolitan Area (배달과 수집을 수행하는 차량경로문제 휴리스틱에 관한 연구: 수도권 레미콘 운송사례)

  • Ji, Chang-Hun;Kim, Mi-Yi;Lee, Young-Hoon
    • Korean Management Science Review
    • /
    • v.24 no.2
    • /
    • pp.43-56
    • /
    • 2007
  • This paper studies a VRP (Vehicle Routing Problem) in which two different kinds of missions must be completed. The objective is to minimize the total vehicle operating distance. A mixed integer programming formulation and a heuristic algorithm for practical use are suggested. The heuristic algorithm consists of three phases: clustering, route construction, and adjustment. In the first phase, customers are clustered so that supply nodes are grouped with the demand nodes to be served by the same vehicle. In the second phase, vehicle routes are generated within each cluster. In the third phase, clusters and routes are adjusted using the UF (unfitness) rule, which is designed to determine which customers and routes should be moved. Computational experiments show that the suggested heuristic yields good performance within a relatively short computation time.

A Study on the Gustafson-Kessel Clustering Algorithm in Power System Fault Identification

  • Abdullah, Amalina;Banmongkol, Channarong;Hoonchareon, Naebboon;Hidaka, Kunihiko
    • Journal of Electrical Engineering and Technology
    • /
    • v.12 no.5
    • /
    • pp.1798-1804
    • /
    • 2017
  • This paper examines the performance of the Gustafson-Kessel (GK) clustering algorithm in fault identification on power transmission lines. The clustering algorithm is incorporated in a scheme that uses a hybrid intelligent technique combining an artificial neural network and a fuzzy inference system, known as the adaptive neuro-fuzzy inference system (ANFIS). The scheme is used to identify the type of fault that occurs on a power transmission line: single line to ground, double line, double line to ground, or three phase. The scheme is also capable of analyzing the fault location without information on line parameters. The estimation error ranges from 0.10 to 0.85 across five values of fault resistance. This paper also compares the performance of the GK clustering algorithm with that of fuzzy c-means (FCM) clustering, which is commonly used for structuring data. Results show that the GK algorithm can be applied to fault identification on power transmission systems and performs better than FCM.

Hierarchical Overlapping Clustering to Detect Complex Concepts (중복을 허용한 계층적 클러스터링에 의한 복합 개념 탐지 방법)

  • Hong, Su-Jeong;Choi, Joong-Min
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.1
    • /
    • pp.111-125
    • /
    • 2011
  • Clustering is a process of grouping similar or relevant documents into a cluster and assigning a meaningful concept to the cluster. By this process, clustering facilitates fast and correct search for the relevant documents by narrowing down the range of searching only to the collection of documents belonging to related clusters. For effective clustering, techniques are required for identifying similar documents and grouping them into a cluster, and discovering a concept that is most relevant to the cluster. One of the problems often appearing in this context is the detection of a complex concept that overlaps with several simple concepts at the same hierarchical level. Previous clustering methods were unable to identify and represent a complex concept that belongs to several different clusters at the same level in the concept hierarchy, and also could not validate the semantic hierarchical relationship between a complex concept and each of simple concepts. In order to solve these problems, this paper proposes a new clustering method that identifies and represents complex concepts efficiently. We developed the Hierarchical Overlapping Clustering (HOC) algorithm that modified the traditional Agglomerative Hierarchical Clustering algorithm to allow overlapped clusters at the same level in the concept hierarchy. The HOC algorithm represents the clustering result not by a tree but by a lattice to detect complex concepts. We developed a system that employs the HOC algorithm to carry out the goal of complex concept detection. This system operates in three phases; 1) the preprocessing of documents, 2) the clustering using the HOC algorithm, and 3) the validation of semantic hierarchical relationships among the concepts in the lattice obtained as a result of clustering. The preprocessing phase represents the documents as x-y coordinate values in a 2-dimensional space by considering the weights of terms appearing in the documents. 
First, the documents are refined by applying stopword removal and stemming to extract index terms. Then, each index term is assigned a TF-IDF weight, and the x-y coordinate value of each document is determined by combining the TF-IDF values of the terms in it. The clustering phase uses the HOC algorithm, in which the similarity between documents is calculated with the Euclidean distance. Initially, a cluster is generated for each document by grouping the documents that are closest to it. Then, the distance between every pair of clusters is measured, and the closest clusters are grouped into a new cluster. This process is repeated until the root cluster is generated. In the validation phase, feature selection is applied to check whether the cluster concepts built by the HOC algorithm have meaningful hierarchical relationships. Feature selection is a method of extracting key features from a document by identifying important and representative terms in the document and assigning weight values to them. To select key features correctly, a method is needed to determine how much each term contributes to the class of the document. Among several methods that achieve this goal, this paper adopts the $\chi^2$ statistic, which measures the degree of dependency between a term t and a class c and represents the relationship between t and c as a numerical value. To demonstrate the effectiveness of the HOC algorithm, a series of performance evaluations is carried out using the well-known Reuters-21578 news collection. The results show that the HOC algorithm contributes greatly to detecting and producing complex concepts by generating the concept hierarchy as a lattice structure.
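
The $\chi^2$ dependency score mentioned in the abstract above is commonly computed from a 2x2 contingency table of document counts. The sketch below uses that widely used text-categorization formulation; the abstract does not state which exact variant the paper applies, so the formula choice, the function name `chi_square`, and the argument names are assumptions.

```python
def chi_square(a, b, c, d):
    """Chi-square dependency between a term t and a class c.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that do not contain the term
    d: documents outside the class that do not contain the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # a degenerate table carries no dependency information
    return n * (a * d - c * b) ** 2 / denom

print(chi_square(10, 0, 0, 10))  # 20.0: the term perfectly predicts the class
print(chi_square(5, 5, 5, 5))    # 0.0: the term and the class are independent
```

A larger value indicates a stronger association between the term and the class, so terms can be ranked by this score when checking whether a cluster's concept label is supported by its documents.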

Data Correlation-Based Clustering Algorithm in Wireless Sensor Networks

  • Yeo, Myung-Ho;Seo, Dong-Min;Yoo, Jae-Soo
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.3 no.3
    • /
    • pp.331-343
    • /
    • 2009
  • Many types of sensor data exhibit strong correlation in both space and time. Both temporal and spatial suppression provide opportunities for reducing the energy cost of sensor data collection. Unfortunately, existing clustering algorithms cannot easily exploit these spatial or temporal opportunities, because they organize clusters based only on the distribution of sensor nodes or the network topology and not on the correlation of sensor data. In this paper, we propose a novel clustering algorithm based on the correlation of sensor data. We modify the advertisement sub-phase and the TDMA scheduling scheme to organize clusters from adjacent sensor nodes that have similar readings. We also propose a spatio-temporal suppression scheme for our clustering algorithm. To show the superiority of our clustering algorithm, we compare it with existing suppression algorithms in terms of the lifetime of the sensor network and the amount of data collected at the base station. Our experimental results show that the amount of data is reduced and the overall network lifetime is prolonged.

Two-phase Content-based Image Retrieval Using the Clustering of Feature Vector (특징벡터의 끌러스터링 기법을 통한 2단계 내용기반 이미지검색 시스템)

  • 조정원;최병욱
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.40 no.3
    • /
    • pp.171-180
    • /
    • 2003
  • A content-based image retrieval (CBIR) system builds an image database using low-level features such as color, shape, and texture, and returns images similar to what the user wants to retrieve when a retrieval request occurs. What the user is interested in is the response time, which covers both the time to build the index database and the time to obtain the retrieval results for the query image. In a content-based image retrieval system, computing the similarity between the query and the images in the database accounts for most of the total response time. In this paper, we propose a two-phase search method that clusters feature vectors in order to minimize the similarity computing time. Experimental results show that this two-phase search method is twice as fast as the conventional full-search method, which uses the original features of all images in the image database, while maintaining the same retrieval relevance. The proposed method also becomes more effective as the number of images increases.
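
A two-phase search over clustered feature vectors can be sketched as follows: first pick the cluster whose centroid is nearest to the query, then rank only the images in that cluster. The abstract does not describe the exact design, so this is one plausible reading rather than the authors' method; the function names, the squared Euclidean distance, and the toy features are assumptions.

```python
def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_phase_search(query, centroids, clusters, top_n=1):
    """Phase 1: choose the nearest centroid. Phase 2: rank that cluster's images.

    centroids: list of feature vectors, one per cluster
    clusters:  list (same order) of lists of (image_id, feature_vector)
    """
    best = min(range(len(centroids)), key=lambda j: dist2(query, centroids[j]))
    ranked = sorted(clusters[best], key=lambda item: dist2(query, item[1]))
    return [image_id for image_id, _ in ranked[:top_n]]

# Toy index with two clusters of 2-D features.
centroids = [(0.0, 0.0), (10.0, 10.0)]
clusters = [
    [("a", (0.0, 1.0)), ("b", (1.0, 0.0))],
    [("c", (9.0, 9.0)), ("d", (11.0, 10.0))],
]
print(two_phase_search((10.0, 9.0), centroids, clusters, top_n=2))  # ['c', 'd']
```

Only one cluster is scanned in full, which is where the time saving comes from; in this simple form a true nearest neighbor that sits just across a cluster boundary could be missed, so a practical system might scan the few nearest clusters instead of one.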

Analysis of Three-Phase Multiple Access with Continual Contention Resolution (TPMA-CCR) for Wireless Multi-Hop Ad Hoc Networks

  • Choi, Yeong-Yoon;Nosratinia, Aria
    • Journal of Communications and Networks
    • /
    • v.13 no.1
    • /
    • pp.43-49
    • /
    • 2011
  • In this paper, a new medium access control (MAC) protocol, three-phase multiple access with continual contention resolution (TPMA-CCR), is proposed for wireless multi-hop ad hoc networks. This work is motivated by the previously known three-phase multiple access (TPMA) scheme of Hou and Tsai [2], which is a suitable MAC protocol for clustered multi-hop ad hoc networks owing to beneficial attributes such as easy collision detection, anonymous acknowledgment (ACK), and a simple signaling format for broadcast-natured networks. The new TPMA-CCR is designed to let all contending nodes participate in contention for medium access more aggressively than in the original TPMA, and with continual resolution procedures as well. A systematic performance analysis of the proposed protocol shows that its maximum throughput is not only superior to that of the original TPMA but also improves on conventional slotted carrier sense multiple access (CSMA) under certain circumstances. Thus, in terms of performance, TPMA-CCR provides an attractive alternative to other contention-based MAC protocols for multi-hop ad hoc networks.

A New Placement Algorithm for Gate Array (새로운 게이트 어레이 배치 알고리듬)

  • Kang, Kyung-Ik;Chong, Jong-Wha
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.26 no.5
    • /
    • pp.117-126
    • /
    • 1989
  • In this paper, a new placement algorithm for gate array layout design is proposed. The proposed algorithm can handle variable-sized macrocells, and by considering the I/O pad locations, it allows the routing between I/O pads and the internal region of a chip to be automated effectively. The algorithm is composed of three parts: initial partitioning, initial placement, and placement improvement. In the initial placement phase, a given circuit is partitioned into 5 sub-circuits by a clustering method that considers the connectivity of cells not only with I/O pads but also with related partitioned groups; the method is applied repeatedly to assign a unique position to each cell. In the placement improvement phase, the concept of probabilistic wiring density is introduced, and a cell-moving algorithm is proposed to even out the density across the chip.
