• Title/Summary/Keyword: K-NN classification model


Enriching Core Ontology with Domain Thesaurus (분야 시소러스를 이용한 코아 온톨로지 확장)

  • Huang, Jin-Xia;Shin, Ji-Ae;Choi, Key-Sun
    • Annual Conference on Human and Language Technology / 2007.10a / pp.31-37 / 2007
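  • This paper proposes a method for enriching a core ontology using the concepts and relations of a domain thesaurus. Concepts of the domain thesaurus are classified into the top-level concepts of the core ontology, and the relations between broader terms (BT) and narrower terms (NT), and between broader terms and related terms (RT), are classified into the semantic relations defined in the core ontology. Concept classification is performed with similarity-based and frequency-based methods, and two approaches are applied to relation classification: (i) when training data are insufficient, a rule-based method classifies BT-NT/RT relations into isa and other (non-isa) relations, and a pattern-based method classifies the non-isa relations into semantic relations for the ontology; (ii) when sufficient training data are available, a classifier based on the maximum entropy model (MEM) is used, with the training data refined by a kNN method. A system was built with the proposed method, and in experiments its performance was comparable to human judgments.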


A Study of the Feature Classification and the Predictive Model of Main Feed-Water Flow for Turbine Cycle (주급수 유량의 형상 분류 및 추정 모델에 대한 연구)

  • Yang, Hac Jin;Kim, Seong Kun;Choi, Kwang Hee
    • Journal of Energy Engineering / v.23 no.4 / pp.263-271 / 2014
  • Thermal power plants require corrective thermal performance analysis to determine the performance status of the turbine cycle. We developed a classification method for main feed-water flow to make precise corrections for performance analysis based on the ASME (American Society of Mechanical Engineers) PTC (Performance Test Code). The classification is based on feature identification of the status of main feed-water flow. We also developed predictive algorithms for corrected main feed-water flow using a Support Vector Machine (SVM) model for each classified feature area. The results were compared with estimates from a Neural Network (NN) and Kernel Regression (KR). The feature classification and predictive model of main feed-water flow provide more practical methods for corrective thermal performance analysis of the turbine cycle.

Stiffness Enhancement of Piecewise Integrated Composite Beam using 3D Training Data Set (3차원 학습 데이터를 이용한 PIC 보의 강성 향상에 대한 연구)

  • Ji, Seungmin;Ham, Seok Woo;Choi, Jin Kyung;Cheon, Seong S.
    • Composites Research / v.34 no.6 / pp.394-399 / 2021
  • Piecewise Integrated Composite (PIC) is a new concept for designing composite structures with multiple stacking angles, both in the in-plane direction and through the thickness, in order to improve stiffness and strength. In the present study, a PIC beam based on 3D training data was proposed in place of 2D data, which capture the beam's behavior only to a limited extent, with the aim of enhancing stiffness and reducing tip deformation. The training data were taken from designated reference finite elements, and a preliminary FE analysis was conducted on regularly distributed reference elements. Triaxiality values were also obtained for each element in order to categorize its loading state as tensile, compressive, or shear. The main FE analysis was then conducted to predict the mechanical characteristics of the PIC beam.
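
The abstract does not say how triaxiality values are turned into loading states. As a minimal sketch, assuming the usual definition of stress triaxiality (mean stress divided by von Mises equivalent stress), uniaxial tension gives about +1/3, uniaxial compression about -1/3, and pure shear 0. The function names and the cut-off `tol` below are illustrative assumptions, not values from the paper.

```python
import math


def triaxiality(s1, s2, s3):
    """Stress triaxiality from three principal stresses: mean stress / von Mises stress."""
    mean = (s1 + s2 + s3) / 3.0
    von_mises = math.sqrt(0.5 * ((s1 - s2) ** 2 + (s2 - s3) ** 2 + (s3 - s1) ** 2))
    if von_mises == 0.0:
        raise ValueError("triaxiality is undefined for a purely hydrostatic state")
    return mean / von_mises


def loading_state(eta, tol=1.0 / 6.0):
    """Label an element by its triaxiality; `tol` is a hypothetical cut-off."""
    if eta > tol:
        return "tensile"
    if eta < -tol:
        return "compressive"
    return "shear"
```

For example, `loading_state(triaxiality(50.0, 0.0, -50.0))` labels a pure-shear stress state as `"shear"`, since its mean stress is zero.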

Building Domain Ontology through Concept and Relation Classification (개념 및 관계 분류를 통한 분야 온톨로지 구축)

  • Huang, Jin-Xia;Shin, Ji-Ae;Choi, Key-Sun
    • Journal of KIISE:Software and Applications / v.35 no.9 / pp.562-571 / 2008
  • To build a domain ontology, this paper proposes a methodology that first builds a core ontology and then enriches it with the concepts and relations of a domain thesaurus. First, the top-level concept taxonomy of the core ontology is built using a domain dictionary and a general-domain thesaurus. Then, the concepts of the domain thesaurus are classified into top-level concepts of the core ontology, and the relations between broader terms (BT) and narrower terms (NT) or related terms (RT) are classified into semantic relations defined for the core ontology. To classify concepts, a two-step approach is adopted in which a frequency-based approach is complemented by a similarity-based approach. To classify relations, two techniques are applied: (i) when training data are insufficient, a rule-based module separates isa relations from non-isa ones, and a pattern-based approach classifies the non-isa relations into non-taxonomic semantic relations; (ii) when training data are sufficient, a maximum-entropy model is adopted for feature-based classification, with a k-NN approach used to filter noise from the training data. A series of experiments shows that the performance of the proposed systems is quite promising and comparable to judgments by human experts.
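
The abstract does not describe how the k-NN filter works. One common form of such filtering, assumed here, drops a training example when its label disagrees with the majority label of its k nearest neighbours. The sketch below uses Euclidean distance over toy feature vectors; the function name `knn_filter`, the distance measure, and `k=3` are all assumptions for illustration.

```python
import math
from collections import Counter


def knn_filter(samples, k=3):
    """Keep samples whose label matches the majority label of their k nearest neighbours.

    `samples` is a list of (feature_vector, label) pairs.
    """
    kept = []
    for i, (x, label) in enumerate(samples):
        # Distances from this sample to every other sample (itself excluded).
        others = [(math.dist(x, y), other_label)
                  for j, (y, other_label) in enumerate(samples) if j != i]
        others.sort(key=lambda pair: pair[0])
        votes = Counter(lbl for _, lbl in others[:k])
        majority_label = votes.most_common(1)[0][0]
        if majority_label == label:
            kept.append((x, label))
    return kept
```

A point labelled "non-isa" that sits in the middle of a cluster of "isa" examples would be removed by this rule, while the examples around it are kept.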

Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach (온라인 리뷰 분석을 통한 상품 평가 기준 추출: LDA 및 k-최근접 이웃 접근법을 활용하여)

  • Lee, Ji Hyeon;Jung, Sang Hyung;Kim, Jun Ho;Min, Eun Joo;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.97-117 / 2020
  • Product evaluation criteria are indicators describing the attributes or values of products, which enable users or manufacturers to measure and understand the products. When companies analyze their products or compare them with those of competitors, appropriate criteria must be selected for objective evaluation. The criteria should show the features of products that consumers considered when they purchased, used, and evaluated the products. However, current evaluation criteria do not reflect how consumers' opinions differ from product to product. Previous studies tried to use online reviews from e-commerce sites, which reflect consumer opinions, to extract the features and topics of products and use them as evaluation criteria. However, these approaches are limited in that they produce criteria irrelevant to the products because extracted or improper words are not refined. To overcome this limitation, this research suggests an LDA-k-NN model, which extracts possible criteria words from online reviews using LDA and refines them with k-nearest neighbor. The proposed approach starts with a preparation phase consisting of 6 steps. First, it collects review data from e-commerce websites. Most e-commerce websites classify their items into high-level, middle-level, and low-level categories. Review data for the preparation phase are gathered from each middle-level category and later merged to represent a single high-level category. Next, nouns, adjectives, adverbs, and verbs are extracted from the reviews using part-of-speech information from a morpheme analysis module. After preprocessing, the words of each topic in the reviews are obtained with LDA, and only the nouns among the topic words are chosen as potential criteria words. Then, the words are tagged according to how likely they are to be criteria for each middle-level category. Next, every tagged word is vectorized with a pre-trained word embedding model. Finally, a k-nearest neighbor case-based approach is used to classify each word with tags.
After the preparation phase is set up, the criteria extraction phase is conducted with low-level categories. This phase starts by crawling reviews in the corresponding low-level category. The same preprocessing as in the preparation phase is conducted using the morpheme analysis module and LDA. Possible criteria words are extracted by taking nouns from the data and are vectorized with the pre-trained word embedding model. Finally, evaluation criteria are extracted by refining the possible criteria words using the k-nearest neighbor approach and the reference proportion of each word in the word set. To evaluate the performance of the proposed model, an experiment was conducted with reviews from '11st', one of the biggest e-commerce companies in Korea. The review data came from the 'Electronics/Digital' section, one of the high-level categories of 11st. For the performance evaluation, the suggested model was compared with three others: the actual criteria of 11st, a model that extracts nouns with the morpheme analysis module and refines them by word frequency, and a model that extracts nouns from LDA topics and refines them by word frequency. The evaluation task was to predict the evaluation criteria of 10 low-level categories with the suggested model and the three models above. The criteria words extracted by each model were combined into a single word set, which was used for survey questionnaires. In the survey, respondents chose every item they considered an appropriate criterion for each category. Each model scored when a chosen word had been extracted by that model. The suggested model had higher scores than the other models in 8 out of 10 low-level categories. Paired t-tests on the scores of each model confirmed that the suggested model performed better in 26 tests out of 30. In addition, the suggested model was the best model in terms of accuracy.
This research proposes a method for extracting evaluation criteria that combines topic extraction using LDA with refinement by the k-nearest neighbor approach. This method overcomes the limits of previous dictionary-based models and frequency-based refinement models. This study can contribute to improving review analysis for deriving business insights in the e-commerce market.
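
The k-nearest neighbor step, which assigns a tag to a vectorized word from its most similar tagged words, could be sketched as follows. This is only an illustration under stated assumptions: toy 2-D vectors stand in for a pre-trained word embedding, and cosine similarity, the tag names, `k=3`, and the example words in the comments are hypothetical choices rather than details given in the abstract.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def knn_tag(vector, tagged, k=3):
    """Return the majority tag among the k tagged vectors most similar to `vector`."""
    ranked = sorted(tagged, key=lambda item: cosine(vector, item[0]), reverse=True)
    votes = Counter(tag for _, tag in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for embedded, hand-tagged words (hypothetical values).
TAGGED = [
    ((0.9, 0.1), "criterion"),      # e.g. "battery"
    ((0.8, 0.2), "criterion"),      # e.g. "screen"
    ((0.7, 0.3), "criterion"),      # e.g. "weight"
    ((0.1, 0.9), "not_criterion"),  # e.g. "yesterday"
    ((0.2, 0.8), "not_criterion"),  # e.g. "friend"
]
```

Calling `knn_tag((0.85, 0.15), TAGGED)` votes among the three closest tagged vectors, so a candidate noun lying near known criteria words is kept as a criterion and one lying near non-criteria words is discarded.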

A Study on Optimization of Partial Discharge Pattern Recognition using Genetic Algorithm (Genetic Algorithm을 이용한 부분방전 패턴인식 최적화 연구)

  • Kim, Seong-Il;Jung, Seung-Yong;Koo, Ja-Yoon;Jang, Yong-Mu
    • Proceedings of the KIEE Conference / 2006.10a / pp.145-146 / 2006
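  • To maximize the pattern recognition rate of partial discharge (PD), this paper applies a genetic algorithm (GA) to optimize the following neural network (NN) parameters: the number of hidden-layer neurons and the step size and decay rate of the momentum. For the experimental study, partial discharges were generated using 16 test cells that artificially simulate defects reported as major causes of gas insulated switchgear (GIS) failures. The PD signals were detected with a sensor developed by our research team and stored in a database, and the training data extracted from it were used to train the following five neural network models: Multilayer Perceptron (MLP), Jordan-Elman Network (JEN), Recurrent Network (RN), Self-Organizing Feature Map (SOFM), and Time-Lag Recurrent Network (TLRN). To analyze the effectiveness of the genetic algorithm, the results of two approaches were compared on the same data: the first applied the selected models alone, and the second applied the models together with the genetic algorithm. Comparing the training error and pattern classification rate across all models confirmed that applying the genetic algorithm improved the PD pattern recognition rate, which suggests that it could be used in reliable GIS partial discharge diagnosis in the future.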


Development of Interactive Content Services through an Intelligent IoT Mirror System (지능형 IoT 미러 시스템을 활용한 인터랙티브 콘텐츠 서비스 구현)

  • Jung, Wonseok;Seo, Jeongwook
    • Journal of Advanced Navigation Technology / v.22 no.5 / pp.472-477 / 2018
  • In this paper, we develop interactive content services for preventing depression in users through an intelligent Internet of Things (IoT) mirror system. For the interactive content services, an IoT mirror device measures attention and meditation data from an EEG headset device and also measures facial expression data, such as "sad", "angry", "disgust", "neutral", "happy", and "surprise", classified by a multi-layer perceptron algorithm through a webcam. It then sends the measured data to a oneM2M-compliant IoT server. Based on the data collected in the IoT server, a machine learning model is built to classify three levels of depression (RED, YELLOW, and GREEN) given by a proposed merge labeling method. Experimental results verified that the k-nearest neighbor (k-NN) model could achieve about 93% accuracy. In addition, according to the classified level, a social network service agent sent a corresponding alert message to family, friends, and social workers. Thus, we were able to provide an interactive content service between users and caregivers.

The PIC Bumper Beam Design Method with Machine Learning Technique (머신 러닝 기법을 이용한 PIC 범퍼 빔 설계 방법)

  • Ham, Seokwoo;Ji, Seungmin;Cheon, Seong S.
    • Composites Research / v.35 no.5 / pp.317-321 / 2022
  • In this study, the PIC design method with machine learning, which automatically assigns different stacking sequences according to loading types, was applied to a bumper beam. The input values and labels of the training data for machine learning were defined as the coordinates and loading types, respectively, of reference elements, which are a subset of all the elements. To compare the 2D and 3D implementation methods, which are two ways of representing coordinate values, training data were generated and machine learning models were trained with each method. The 2D implementation method divides the FE model into faces, generates training data for each face, and trains machine learning models accordingly. The 3D implementation method trains a single machine learning model on training data generated from the entire finite element model. The hyperparameters were tuned to optimal values with a Bayesian algorithm, and the k-NN classification method showed the highest prediction rate and AUC-ROC among the tuned models. The 3D implementation method performed better than the 2D implementation method. The loading type data predicted by the machine learning model were mapped to the finite element model and comparatively verified through FE analysis. The 3D-implementation PIC bumper beam was found to be superior to the 2D-implementation and uni-stacking sequence composite bumper beams.

Morphological Variation Classification of Red Blood Cells using Neural Network Model in the Peripheral Blood Images (말초혈액영상에서 신경망 모델을 이용한 적혈구의 형태학적 변이 분류)

  • Kim, Gyeong-Su;Kim, Pan-Gu
    • The Transactions of the Korea Information Processing Society / v.6 no.10 / pp.2707-2715 / 1999
  • Recently, there has been research on automating the processing and analysis of images in the medical field using image processing techniques, fast communication networks, and high-performance hardware. In this paper, we propose a system that analyzes morphological abnormalities of red blood cells in peripheral blood images using image processing techniques. To do this, we segment red blood cells in blood images acquired from a microscope with a CCD camera and then extract UNL Fourier features to classify them into 15 classes. We reduce the number of multivariate features using PCA to construct a more efficient classifier. Our system achieves the best recognition rate compared with two other algorithms, LVQ3 and k-NN. We thus show that it can be applied to a pathological guided system.


Machine Learning Based Structural Health Monitoring System using Classification and NCA (분류 알고리즘과 NCA를 활용한 기계학습 기반 구조건전성 모니터링 시스템)

  • Shin, Changkyo;Kwon, Hyunseok;Park, Yurim;Kim, Chun-Gon
    • Journal of Advanced Navigation Technology / v.23 no.1 / pp.84-89 / 2019
  • This is a pilot study of a machine learning based structural health monitoring system using flight data from composite aircraft. In this study, the most suitable machine learning algorithm for structural health monitoring was selected, and a dimensionality reduction method was applied with a view to use on actual flight data. For these tasks, an impact test was conducted on a cantilever beam with added mass, which simulates damage in an aircraft wing structure, and a classification model for damage states (damage location and level) was trained. Through a vibration test of the cantilever beam with a fiber Bragg grating (FBG) sensor, data for the normal state and 12 damaged states were acquired, and the most suitable algorithm was selected by comparing algorithms such as tree, discriminant, support vector machine (SVM), kNN, and ensemble. In addition, dimensionality reduction, which is necessary to deal with high-dimensional flight data, was carried out through neighborhood component analysis (NCA) feature selection. As a result, quadratic SVMs performed best, with 98.7% without NCA and 95.9% with NCA. It is also shown that applying NCA improved prediction speed, training time, and model memory.