• Title/Summary/Keyword: Label Clustering

Feature Selection of Fuzzy Pattern Classifier by using Fuzzy Mapping (퍼지 매핑을 이용한 퍼지 패턴 분류기의 Feature Selection)

  • Roh, Seok-Beom;Kim, Yong Soo;Ahn, Tae-Chon
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.6 / pp.646-650 / 2014
  • In this paper, we propose a new feature selection method to avoid the deterioration in pattern classification performance caused by the curse of dimensionality. The proposed method is based on the Fuzzy C-Means clustering algorithm, which divides the data points into several clusters, and on the concept of a function with fuzzy numbers. In a function whose independent variables are fuzzy numbers and whose dependent variable is a class label, each fuzzy number should be related to only one class label. Therefore, a good feature is an independent variable of such a function with fuzzy numbers. Under this assumption, we calculate the goodness of each feature for the pattern classification problem. Finally, machine learning data sets are used to evaluate the classification ability of the proposed pattern classifier.
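
The abstract does not give the goodness formula, so the following is only a minimal sketch of the idea, under stated assumptions: each feature is clustered on its own with a one-dimensional Fuzzy C-Means, and a feature scores well when each fuzzy cluster carries membership mass from mostly one class. The function names (`fcm_1d`, `feature_goodness`), the quantile initialization, and the purity-style score are illustrative assumptions, not the authors' method.

```python
def fcm_1d(values, n_clusters, m=2.0, n_iter=50):
    """Fuzzy C-Means on one feature; returns (centers, memberships)."""
    ordered = sorted(values)
    n = len(values)
    # Assumption: deterministic start, centers spread over the sorted values.
    centers = [ordered[int((i + 0.5) * n / n_clusters)] for i in range(n_clusters)]
    u = []
    for _ in range(n_iter):
        u = []
        for x in values:
            # Small offset avoids division by zero when x sits on a center.
            d = [abs(x - c) + 1e-12 for c in centers]
            u.append([
                1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(n_clusters))
                for i in range(n_clusters)
            ])
        # Standard FCM center update: membership^m weighted mean.
        centers = [
            sum((u[k][i] ** m) * values[k] for k in range(n))
            / sum(u[k][i] ** m for k in range(n))
            for i in range(n_clusters)
        ]
    return centers, u


def feature_goodness(values, labels, n_clusters=2):
    """Hypothetical score in (0, 1]: 1.0 means every fuzzy cluster maps to one class."""
    _, u = fcm_1d(values, n_clusters)
    total = 0.0
    for i in range(n_clusters):
        mass = {}
        for k, y in enumerate(labels):
            mass[y] = mass.get(y, 0.0) + u[k][i]
        total += max(mass.values())  # membership mass of the dominant class
    return total / len(values)


labels = [0, 0, 0, 1, 1, 1]
separating = [0.1, 0.2, 0.3, 5.1, 5.2, 5.3]   # clusters line up with the classes
mixed = [0.1, 5.1, 0.2, 5.2, 0.3, 5.3]        # clusters cut across the classes
print(feature_goodness(separating, labels), feature_goodness(mixed, labels))
```

Features could then be ranked by this score and the lowest-scoring ones dropped; the threshold for dropping is a further design choice the abstract does not specify.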

Unified Labeling and Fine-Grained Verification for Improving Ground-Truth of Malware Analysis (악성코드 분석의 Ground-Truth 향상을 위한 Unified Labeling과 Fine-Grained 검증)

  • Oh, Sang-Jin;Park, Leo-Hyun;Kwon, Tae-Kyoung
    • Journal of the Korea Institute of Information Security & Cryptology / v.29 no.3 / pp.549-555 / 2019
  • According to recent reports by anti-virus vendors, the number of new and modified malware samples has increased exponentially. Malware analysis using machine learning has therefore been actively studied as a replacement for passive analysis methods, which are slow. However, many studies that use supervised learning take the low-reliability malware family names provided by anti-virus vendors as labels. To address the low reliability of malware labels, this paper introduces a new labeling technique, "Unified Labeling", and further verifies the similarity of malicious behavior through fine-grained feature analysis. To validate the approach, various clustering algorithms were applied and the results were compared with existing labeling techniques.

Prediction of ship power based on variation in deep feed-forward neural network

  • Lee, June-Beom;Roh, Myung-Il;Kim, Ki-Su
    • International Journal of Naval Architecture and Ocean Engineering / v.13 no.1 / pp.641-649 / 2021
  • Fuel oil consumption (FOC) must be minimized to determine the economic route of a ship; hence, the ship power must be predicted prior to route planning. For this purpose, a numerical method using test results of a model has been widely used. However, predicting ship power using this method is challenging owing to the uncertainty of the model test. An onboard test should be conducted to solve this problem; however, it requires considerable resources and time. Therefore, in this study, a deep feed-forward neural network (DFN) is used to predict ship power using deep learning methods that involve data pattern recognition. To use data in the DFN, the input data and a label (output of prediction) should be configured. In this study, the input data are configured using ocean environmental data (wave height, wave period, wave direction, wind speed, wind direction, and sea surface temperature) and the ship's operational data (draft, speed, and heading). The ship power is selected as the label. In addition, various treatments have been used to improve the prediction accuracy. First, ocean environmental data related to wind and waves are preprocessed using values relative to the ship's velocity. Second, the structure of the DFN is changed based on the characteristics of the input data. Third, the prediction accuracy is analyzed using a combination comprising five hyperparameters (number of hidden layers, number of hidden nodes, learning rate, dropout, and gradient optimizer). Finally, k-means clustering is performed to analyze the effect of the sea state and ship operational status by categorizing it into several models. The performances of various prediction models are compared and analyzed using the DFN in this study.
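
The abstract says only that wind and wave data are preprocessed "using values relative to the ship's velocity". One plausible reading, sketched below, is to convert true wind into apparent wind as seen from the moving ship. The function name `relative_wind`, the meteorological "direction the wind blows from" convention, and the assumption that wind speed and ship speed share the same unit are all illustrative choices, not details from the paper.

```python
import math


def relative_wind(wind_speed, wind_from_deg, ship_speed, heading_deg):
    """Return (apparent wind speed, apparent wind angle off the bow in degrees).

    Angles are in degrees clockwise from north. wind_from_deg is the direction
    the wind blows FROM; heading_deg is the direction the ship moves TOWARD.
    """
    wf = math.radians(wind_from_deg)
    hd = math.radians(heading_deg)
    # Air velocity over ground as (east, north): it points away from "from".
    wind_e, wind_n = -wind_speed * math.sin(wf), -wind_speed * math.cos(wf)
    ship_e, ship_n = ship_speed * math.sin(hd), ship_speed * math.cos(hd)
    # Air velocity as seen from the ship.
    rel_e, rel_n = wind_e - ship_e, wind_n - ship_n
    speed = math.hypot(rel_e, rel_n)
    # Direction the apparent wind comes from, measured relative to the bow.
    from_deg = math.degrees(math.atan2(-rel_e, -rel_n))
    return speed, (from_deg - heading_deg) % 360.0


# Head wind: 10 units of wind from the north, ship steaming north at 5.
print(relative_wind(10.0, 0.0, 5.0, 0.0))
```

The same vector subtraction could be applied to wave direction; whether the paper does so is not stated in the abstract.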

Dual graph-regularized Constrained Nonnegative Matrix Factorization for Image Clustering

  • Sun, Jing;Cai, Xibiao;Sun, Fuming;Hong, Richang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.5 / pp.2607-2627 / 2017
  • Nonnegative matrix factorization (NMF) has received considerable attention due to its effectiveness in reducing high-dimensional data and its importance in producing a parts-based image representation. Most existing NMF variants build on the assumption that the observed data are distributed on a nonlinear low-dimensional manifold. However, recent research has shown that not only the observed data but also the features lie on low-dimensional manifolds. In addition, a small amount of hard prior label information is often available and helps to uncover the intrinsic geometrical and discriminative structures of the data space. Motivated by these two aspects, we propose a novel algorithm to enhance the effectiveness of image representation, called Dual graph-regularized Constrained Nonnegative Matrix Factorization (DCNMF). The underlying philosophy of the proposed method is that it not only considers the geometric structures of the data manifold and the feature manifold simultaneously, but also mines valuable information from a few known labeled examples. These schemes improve the performance of image representation and thus enhance the effectiveness of image classification. Extensive experiments on common benchmarks demonstrated that DCNMF outperforms state-of-the-art methods in image classification.
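
The abstract does not state the DCNMF objective. As a hedged illustration only, the three ingredients it names (a data-manifold graph, a feature-manifold graph, and hard label constraints) are commonly combined in the following generic form. All symbols here are assumptions: X is the m-by-n data matrix (features by samples), U the basis matrix, V = AZ the sample representation, A a label constraint matrix that forces labeled samples of the same class onto the same row of Z, L_V and L_U the graph Laplacians over samples and features, and lambda, mu the regularization weights.

```latex
\min_{U \ge 0,\; Z \ge 0}\;
\lVert X - U (A Z)^{\top} \rVert_F^{2}
+ \lambda\, \operatorname{Tr}\!\left( (A Z)^{\top} L_V\, (A Z) \right)
+ \mu\, \operatorname{Tr}\!\left( U^{\top} L_U\, U \right)
```

Setting mu = 0 and A = I reduces this form to single-graph regularized NMF, and setting lambda = mu = 0 leaves label-constrained NMF; the paper's actual objective and update rules may differ.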

Image Clustering Using Machine Learning: Study of InceptionV3 with K-means Methods (머신 러닝을 사용한 이미지 클러스터링: K-means 방법을 사용한 InceptionV3 연구)

  • Nindam, Somsauwt;Lee, Hyo Jong
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.681-684 / 2021
  • In this paper, we study image clustering without labels using machine learning techniques. We propose an unsupervised machine learning technique for designing an image clustering model that automatically categorizes images into groups. Our experiments focus on the Inception convolutional neural network (Inception V3) combined with k-means to cluster images. For this, we collect the public datasets Food-K5, Flowers, Handwritten Digit, and Cats-dogs, together with our dataset Rice Germination and the owner dataset Palm print. The experiment consists of three parts. First, all images are stripped of their labels and merged into a single dataset. Second, the dataset is loaded into Inception V3 to extract image features, which are then passed to k-means to form cluster groups over six classes. Lastly, model accuracy is evaluated using a confusion matrix, based on precision, recall, and F1. With this method, we obtain the following results: 1) Handwritten Digit (precision = 1.000, recall = 1.000, F1 = 1.00), 2) Food-K5 (precision = 0.975, recall = 0.945, F1 = 0.96), 3) Palm print (precision = 1.000, recall = 0.999, F1 = 1.00), 4) Cats-dogs (precision = 0.997, recall = 0.475, F1 = 0.64), 5) Flowers (precision = 0.610, recall = 0.982, F1 = 0.75), and our dataset 6) Rice Germination (precision = 0.997, recall = 0.943, F1 = 0.97). Our experiments showed that the model achieves an accuracy of 0.8908; the results indicate that the proposed model is strong enough to differentiate the images and group them into clusters.
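
As a quick consistency check on the figures above, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it from the reported precision and recall values; the dictionary simply transcribes the numbers from the abstract, and the helper name `f1_score` is an illustrative choice.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as listed in the abstract.
reported = {
    "Handwritten Digit": (1.000, 1.000, 1.00),
    "Food-K5": (0.975, 0.945, 0.96),
    "Palm print": (1.000, 0.999, 1.00),
    "Cats-dogs": (0.997, 0.475, 0.64),
    "Flowers": (0.610, 0.982, 0.75),
    "Rice Germination": (0.997, 0.943, 0.97),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: computed F1 = {f1_score(p, r):.2f}, reported F1 = {f1:.2f}")
```

The recomputed values agree with the reported F1 scores to two decimals, which also makes the imbalance visible: Cats-dogs is held back by low recall and Flowers by low precision.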

Building of Database Retrieval System Based on Knowledge using FCM (FCM을 이용한 지식기반 데이터베이스 검색 시스템의 구축)

  • 박계각;서기열;천대일;양원재
    • Journal of the Korean Institute of Intelligent Systems / v.11 no.1 / pp.88-93 / 2001
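  • Existing database retrieval systems can return data only when the database contains data that exactly matches the user's search conditions; when no data exactly satisfies the conditions, they cannot provide suitable data. To solve this problem, this paper proposes a cluster-increment and re-initialization algorithm for FCM, and a retrieval system that presents the data closest to the user's request by linking a knowledge-based database (KDB), built with FCM from the data in the database, with an image database. The proposed method was applied to a gift-selection DB retrieval system based on a post office mail-order catalog, and its effectiveness was confirmed.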


Intelligent Intrusion Detection and Prevention System using Smart Multi-instance Multi-label Learning Protocol for Tactical Mobile Adhoc Networks

  • Roopa, M.;Raja, S. Selvakumar
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.6 / pp.2895-2921 / 2018
  • Security has become one of the major concerns in mobile adhoc networks (MANETs). Data and voice communication among roaming battlefield entities (such as platoons of soldiers, inter-battlefield tanks, and military aircraft) served by MANETs poses several challenges. A complex security strategy is required to address threats such as unauthorized network access, man-in-the-middle attacks, and denial of service, and to provide highly reliable communication among the nodes. An Intrusion Detection and Prevention System (IDPS) is a crucial ingredient in addressing these threats. The IDPS in a MANET is managed by a Command Control Communication and Intelligence (C3I) system, which consists of networked computers in the tactical battle area that give commanders comprehensive situation awareness for timely and optimal decision-making. A key issue in such IDPS mechanisms is the lack of a smart learning engine. We propose a novel behavior-based "Smart Multi-Instance Multi-Label Intrusion Detection and Prevention System (MIML-IDPS)" that follows a distributed and centralized architecture to support a robust C3I system. This protocol is deployed in a virtually clustered non-uniform network topology with dynamic election of several virtual head nodes, each acting as a client intrusion detection agent connected to a centralized server IDPS located at the Command and Control Center. The distributed virtual client nodes serve as intelligent decision processing units, and the centralized IDPS server acts as a smart MIML decision-making unit. Simulation and experimental analysis show that the proposed protocol exhibits computational intelligence with counter-attacks, efficient memory utilization, classification accuracy, and decision convergence in securing a C3I system in a tactical battlefield environment.

Intelligent DB Retrieval System for Marine Accidents Using FCM (FCM을 이용한 지능형 해양사고 DB 검색시스템 구축)

  • Park, Gyei-Kark;Han, Xu;Kim, Young-Ki;Oh, Se-Woong
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.4 / pp.568-573 / 2009
  • Marine accidents have always caused huge economic losses as well as environmental pollution, and their prevention has become a focus of discussion. Analyzing past accident cases and reviewing the experience and lessons they offer is important and necessary for preventing marine accidents. To this end, the Korean Maritime Safety Tribunal provides written judgments on past marine accidents, analyses of those judgments, and an associated retrieval system on its homepage. In this system, the ship name, accident occurrence time, accident pattern, or related keywords are used as search conditions. However, most marine accidents are caused not by a single factor but by several, and one accident often falls under several categories. The retrieval system DB currently used on the Korean Maritime Safety Tribunal homepage was built on a single-category basis and cannot retrieve by multiple causes or multiple categories. A more practical retrieval approach is needed to solve this problem. Therefore, this paper proposes a new retrieval system that analyzes the relational properties between marine accidents, clusters them with the FCM algorithm, uses linguistic labels to describe the clusters, and adds an interface that lets users obtain the results they want by choosing multiple causes or multiple categories.

Visual Model of Pattern Design Based on Deep Convolutional Neural Network

  • Jingjing Ye;Jun Wang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.2 / pp.311-326 / 2024
  • The rapid development of neural network technology promotes the neural network model driven by big data to overcome the texture effect of complex objects. Due to the limitations in complex scenes, it is necessary to establish custom template matching and apply it to the research of many fields of computational vision technology. The dependence on high-quality small label sample database data is not very strong, and the machine learning system of deep feature connection to complete the task of texture effect inference and speculation is relatively poor. The style transfer algorithm based on neural network collects and preserves the data of patterns, extracts and modernizes their features. Through the algorithm model, it is easier to present the texture color of patterns and display them digitally. In this paper, according to the texture effect reasoning of custom template matching, the 3D visualization of the target is transformed into a 3D model. The high similarity between the scene to be inferred and the user-defined template is calculated by the user-defined template of the multi-dimensional external feature label. The convolutional neural network is adopted to optimize the external area of the object to improve the sampling quality and computational performance of the sample pyramid structure. The results indicate that the proposed algorithm can accurately capture the significant target, achieve more ablation noise, and improve the visualization results. The proposed deep convolutional neural network optimization algorithm has good rapidity, data accuracy and robustness. The proposed algorithm can adapt to the calculation of more task scenes, display the redundant vision-related information of image conversion, enhance the powerful computing power, and further improve the computational efficiency and accuracy of convolutional networks, which has a high research significance for the study of image information conversion.

A Study on GPR Image Classification by Semi-supervised Learning with CNN (CNN 기반의 준지도학습을 활용한 GPR 이미지 분류)

  • Kim, Hye-Mee;Bae, Hye-Rim
    • The Journal of Bigdata / v.6 no.1 / pp.197-206 / 2021
  • GPR data are used for underground exploration. The gathered data are interpreted by experts based on experience, as underground facilities often reflect GPR signals. In addition, the noise and characteristics of GPR data vary depending on the equipment, environment, etc. This often results in insufficient data with accurate labels. Generally, a large amount of training data has to be obtained to apply CNN models that exhibit high performance in image classification problems. However, due to the characteristics of GPR data, it is difficult to obtain sufficient data. As a result, neural networks cannot be trained with general supervised learning methods. This paper proposes an image classification method that takes the data characteristics into account to ensure that the accuracy of each label is similar. The proposed method is based on semi-supervised learning: the image is classified using clustering techniques after its feature values are extracted from the neural network. This method can be used not only when the amount of labeled data is insufficient, but also when labels that depend on the data are not highly reliable.
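
The abstract describes classifying images by clustering features extracted from a neural network, without naming the clustering algorithm or the labeling rule. The sketch below illustrates one common cluster-then-label scheme under stated assumptions: plain k-means stands in for the clustering step, toy 2-D vectors stand in for CNN features, each cluster takes the majority label of its few labeled members, and the class names "cavity" and "pipe" are hypothetical.

```python
from collections import Counter


def kmeans(points, k, n_iter=20):
    """Plain k-means; returns the cluster index of each point."""
    # Assumption: deterministic start from evenly spaced points.
    centers = [list(points[i * len(points) // k]) for i in range(k)]
    assign = [0] * len(points)
    for _ in range(n_iter):
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def cluster_then_label(features, labels, k):
    """labels holds a class name for labeled samples and None for unlabeled ones."""
    assign = kmeans(features, k)
    votes = {c: Counter() for c in range(k)}
    for a, y in zip(assign, labels):
        if y is not None:
            votes[a][y] += 1
    # Each cluster takes the majority label of its labeled members, if any.
    cluster_label = {c: (v.most_common(1)[0][0] if v else None) for c, v in votes.items()}
    return [y if y is not None else cluster_label[a] for a, y in zip(assign, labels)]


features = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 5.2], [5.2, 4.9]]
labels = ["cavity", None, None, "pipe", None, None]
print(cluster_then_label(features, labels, k=2))
```

A cluster with no labeled member stays unlabeled (`None`) in this sketch; how the paper handles that case, and how it balances accuracy across labels, is not described in the abstract.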