• Title/Summary/Keyword: classification system (분류시스템)


A Study on Word Semantic Categories for Natural Language Question Type Classification and Answer Extraction (자연어 질의 유형판별과 응답 추출을 위한 어휘 의미체계에 관한 연구)

  • Yoon Sung-Hee
    • Proceedings of the KAIS Fall Conference
    • /
    • 2004.11a
    • /
    • pp.141-144
    • /
    • 2004
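  • What distinguishes a question answering system from an information retrieval system is the question processing step, which identifies the user's intent from a natural language question and classifies the question type. This paper proposes a question type classification method that, rather than relying on complex classification rules or large-scale dictionary information, extracts the words that serve as interrogatives in the question sentence and uses the semantic information of the nouns appearing near them to determine a fine-grained answer type. It also proposes a way to handle questions in which the interrogative is omitted, and a way to improve question type classification performance using synonym and suffix information.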
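
The two-step idea in this abstract (interrogative first, nearby nouns as a fallback) can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the lookup tables, the answer-type labels, and the function name `classify_question` are all hypothetical, and English tokens stand in for the Korean vocabulary the paper works with.

```python
# Hypothetical lexicons; the abstract does not list the paper's actual entries.
INTERROGATIVE_TYPES = {"who": "PERSON", "where": "LOCATION", "when": "TIME"}
NOUN_TYPES = {
    "city": "LOCATION",
    "country": "LOCATION",
    "year": "TIME",
    "author": "PERSON",
    "president": "PERSON",
}


def classify_question(question):
    """Return an answer type for a question using simple lexicon lookups."""
    tokens = question.lower().rstrip("?").split()
    # Step 1: an unambiguous interrogative decides the type directly.
    for tok in tokens:
        if tok in INTERROGATIVE_TYPES:
            return INTERROGATIVE_TYPES[tok]
    # Step 2: the interrogative is ambiguous ("which", "what") or omitted,
    # so fall back to the semantic class of a noun in the question.
    for tok in tokens:
        if tok in NOUN_TYPES:
            return NOUN_TYPES[tok]
    return "UNKNOWN"
```

With these toy tables, a question such as "Which city hosted the 1988 Olympics?" falls through to the noun lookup, and an imperative like "Name the author of Hamlet" is handled the same way even though it has no interrogative.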


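A Study on Deriving Critical Success Factors and Evolution Strategies through Classification of Internet Business Types (인터넷 비즈니스 유형 분류를 통한 핵심 성공 요인 도출 및 진화 전략 연구)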

  • 이기백;최문기
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2000.11a
    • /
    • pp.225-234
    • /
    • 2000
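  • Existing taxonomies of Internet businesses are well suited to characterizing Internet businesses and devising new ones, but they fall short of offering companies meaningful management strategies. This study therefore classifies Internet companies by a new criterion, examines the implications from a management-strategy perspective, and develops a performance model to identify desirable business types for Internet companies. It also derives the critical success factors of Internet business, and the results are expected to help companies improve their performance.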


Sentiment Analysis System Using Stanford Sentiment Treebank (스탠포드 감성 트리 말뭉치를 이용한 감성 분류 시스템)

  • Lee, Songwook
    • Journal of Advanced Marine Engineering and Technology
    • /
    • v.39 no.3
    • /
    • pp.274-279
    • /
    • 2015
  • The main goal of this research is to build a sentiment analysis system that automatically classifies user opinions in the Stanford Sentiment Treebank into three sentiments: positive, negative, and neutral. First, sentiment sentences are POS-tagged and parsed into dependency structures. All nodes of the Treebank and their polarities are automatically extracted from the Treebank. We train two Support Vector Machine models, one for node-level classification and the other for sentence-level classification. We tried various types of features, such as word lexicons, POS tags, sentiment lexicons, head-modifier relations, and sibling relations. Although we achieved 74.2% accuracy on the test set for 3-class node-level classification and 67.0% for 3-class sentence-level classification, our experimental results for 2-class classification are comparable to those of the state-of-the-art system using the same corpus.

A Study on Customer Optimized Classification System in eCRM (eCRM에서 고객 최적 분류 시스템에 관한 연구)

  • 이재훈;이성주
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2004.04a
    • /
    • pp.58-61
    • /
    • 2004
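  • Customer relationship management (CRM), one of the customer-centered marketing techniques that companies have adopted in recent years, has moved online with the growth of the Internet and has developed in various ways. The most prominent problem is how to automate customer classification in an objective way. This paper designs and builds a system that further processes the information obtained from customer propensity analysis and personalization to optimize the formation of customer groups, and uses those groups to classify customers optimally.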


Comparison Between Optimal Features of Korean and Chinese for Text Classification (한중 자동 문서분류를 위한 최적 자질어 비교)

  • Ren, Mei-Ying;Kang, Sinjae
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.25 no.4
    • /
    • pp.386-391
    • /
    • 2015
  • This paper proposes optimal features for text classification based on the linguistic characteristics of Korean and Chinese. The experiments sought to determine which features perform best among n-grams, which are known to be language independent, morphemes, which are language dependent, and feature sets that combine n-grams and morphemes. An SVM classifier and Internet news articles were used for text classification. As a result, the bi-gram was the best feature for Korean text categorization, with the highest F1-measure of 87.07%, and for Chinese document classification the combined feature set 'uni-gram+noun+verb+adjective+idiom' showed the best performance, with the highest F1-measure of 82.79%.
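
As a rough illustration of the language-independent n-gram features mentioned above, the sketch below counts character n-grams in a text. It assumes character-level n-grams with whitespace removed, which the abstract does not specify, and the function name `char_ngrams` is hypothetical.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams in text, ignoring whitespace.

    Assumption: n-grams are taken over characters and may span word
    boundaries; the paper may define them differently.
    """
    chars = [c for c in text if not c.isspace()]
    return Counter(
        "".join(chars[i:i + n]) for i in range(len(chars) - n + 1)
    )
```

Calling `char_ngrams(document, 2)` gives the bi-gram counts that a classifier such as an SVM could take as input after vectorization. Because no morphological analyzer is involved, the same function applies to Korean and Chinese text alike.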

Study of Music Classification Optimized Environment and Atmosphere for Intelligent Musical Fountain System (지능형 음악분수 시스템을 위한 환경 및 분위기에 최적화된 음악분류에 관한 연구)

  • Park, Jun-Heong;Park, Seung-Min;Lee, Young-Hwan;Ko, Kwang-Eun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.21 no.2
    • /
    • pp.218-223
    • /
    • 2011
  • Various studies are under way on classifying music by genre. Because sound professionals define the criteria for categorizing music differently from one another, such classification rarely yields a clear result. When a new genre appears, the categorization criteria must be laboriously revised. Therefore, we classify music by emotional adjectives rather than by genre. In a previous study, we classified music by light and shade. In this paper, we propose a music classification system based on emotional adjectives that is suited to searching by atmosphere, with three classification criteria: light and shade (from the previous study), intense and placid, and grandeur and trivial. Variance Considered Machines, an improved algorithm based on the Support Vector Machine, was used as the classification algorithm, and it achieved 85% classification accuracy in an experiment classifying 525 songs.

Selection Method of Fuzzy Partitions in Fuzzy Rule-Based Classification Systems (퍼지 규칙기반 분류시스템에서 퍼지 분할의 선택방법)

  • Son, Chang-S.;Chung, Hwan-M.;Kwon, Soon-H.
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.18 no.3
    • /
    • pp.360-366
    • /
    • 2008
  • The initial fuzzy partitions in fuzzy rule-based classification systems are determined by considering the domain region of each attribute in the given data, and the optimal classification boundaries within the fuzzy partitions can be discovered by tuning their parameters with learning processes such as neural networks and genetic algorithms. In this paper, we propose a method for selecting fuzzy partitions based on statistical information, which maximizes pattern classification performance without learning processes. The statistical information is used to extract, from the numerical data, the uncertainty regions in each input attribute (i.e., the regions in which the classification boundaries of a pattern classification problem are determined). Moreover, we discuss methods for extracting the candidate rules associated with the partition intervals generated from the statistical information and for minimizing the coupling problem between the candidate rules. To show the effectiveness of the proposed method, we compared its classification accuracy with that of conventional methods on the IRIS and New Thyroid Cancer data. The experimental results confirm that the proposed method, which considers only the statistical information of the numerical patterns, provides classification accuracy equal to or better than that of the conventional methods.

Improvement Method of Classification Rate in ML Antivirus systems using Kaggle Datasets (캐글 데이터셋을 이용한 머신러닝 악성코드 분류시스템에서 분류정확도 향상방법)

  • Kim, Kyungshin
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2019.07a
    • /
    • pp.49-52
    • /
    • 2019
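  • Most machine learning malware classification systems measure classification accuracy using the Kaggle dataset of 10,868 samples. The virus bytecode in this dataset contains an excessive number of so-called undefined fields. For certain labels in the Kaggle dataset, undefined fields account for more than 75% of the content. In such cases, how the undefined fields are handled has the greatest impact on system performance. This study presents methods for handling the undefined fields in the Kaggle dataset and examines the resulting classification accuracy. The validity of the proposed approach is demonstrated by measuring the accuracy of various handling methods.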


The Development of Urban Metro Maintenance Facility System Using Construction Classification System Management (공종분류체계를 활용한 도시철도 시설물 유지관리시스템 개발)

  • Hyun, Ji-Hun;Yang, Byong-Soo;Moon, Sung-Woo
    • Korean Journal of Construction Engineering and Management
    • /
    • v.13 no.4
    • /
    • pp.69-77
    • /
    • 2012
  • Construction data should be managed from a life-cycle perspective. The current PMIS (Project Management Information System) usually focuses on the construction operation stage and does not consider the use of construction data in the maintenance of constructed facilities. This paper interfaces construction data with maintenance data so that construction data can be used effectively across the life cycle. To achieve this objective, a maintenance breakdown structure is established and connected to the work breakdown structure. Connecting the two breakdown structures provides structured use of construction data for efficient maintenance work. A prototype suggests that the interface between the maintenance and work breakdown structures can help provide construction and maintenance data more efficiently for maintenance activities.

Multiple SVM Classifier for Pattern Classification in Data Mining (데이터 마이닝에서 패턴 분류를 위한 다중 SVM 분류기)

  • Kim Man-Sun;Lee Sang-Yong
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.15 no.3
    • /
    • pp.289-293
    • /
    • 2005
  • Pattern classification extracts various types of pattern information that describe real-world objects and decides their class. The top priority of pattern classification technology is to improve classification performance, and researchers have tried many approaches to this end over the last 40 years. Methods used in pattern classification include the base classifier based on probabilistic inference over patterns, decision trees, distance-function-based methods, neural networks, and clustering, but they are not efficient at analyzing large amounts of multi-dimensional data. Research is therefore active on multiple classifier systems, which improve classification performance by combining a number of mutually complementary classifiers. The present study identifies problems in previous research on multiple SVM classifiers and proposes BORSE, a model that uses a 1:M policy to extend SVM to a multi-class classifier, treats each SVM output as a signal with a non-linear pattern, trains a neural network on that pattern, and combines the final classification results.
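
For orientation, the sketch below shows the plain baseline that a 1:M scheme starts from, assuming "1:M" means one-vs-rest: one binary scorer per class, with the highest score winning. It is not BORSE, which per the abstract replaces this argmax step with a neural network trained on the SVM outputs. The linear scorers stand in for trained SVM decision functions, and all names and weights here are hypothetical.

```python
def linear_scorer(weights, bias):
    """Return a function computing a signed score w.x + b (stand-in for an SVM)."""
    return lambda x: sum(w * xi for w, xi in zip(weights, x)) + bias


def one_vs_rest_predict(scorers, x):
    """Pick the label whose binary scorer gives the highest score for x."""
    return max(scorers, key=lambda label: scorers[label](x))


# Hypothetical scorers for three classes over 2-dimensional inputs.
scorers = {
    "A": linear_scorer([1.0, 0.0], 0.0),
    "B": linear_scorer([0.0, 1.0], 0.0),
    "C": linear_scorer([-1.0, -1.0], 0.0),
}
```

The argmax rule treats the per-class scores as directly comparable, which is not guaranteed for independently trained SVMs. Learning the combination step instead, as the abstract describes, is one way to address that.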