• Title/Summary/Keyword: 기술 분류 (technology classification)

Search Result 6,587, Processing Time 0.037 seconds

A Question Type Classifier based on a Support Vector Machine for a Korean Question-Answering System (한국어 질의응답시스템을 위한 지지 벡터기계 기반의 질의유형분류기)

  • 김학수;안영훈;서정연
    • Journal of KIISE:Software and Applications / v.30 no.5_6 / pp.466-475 / 2003
  • To build an efficient Question-Answering (QA) system, a question type classifier is needed. It classifies users' queries into predefined categories regardless of the surface form of a question. In this paper, we propose a question type classifier using a Support Vector Machine (SVM). The classifier first extracts features such as lexical forms, parts of speech, and semantic markers from a user's question. The system uses the $\chi^2$ statistic to select important features, and the selected features are represented as a vector. Finally, an SVM categorizes questions into predefined categories according to the extracted features. In the experiment, the proposed system achieved 86.4% accuracy. The system classifies question types precisely without using any rules such as lexico-syntactic patterns. Therefore, the system is robust and easily portable to other domains.
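The chi-square feature selection step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `chi_square` and `select_features`, the "PERSON" category, and all counts are hypothetical, and the formula is the standard chi-square statistic for a 2x2 feature/category contingency table.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 feature/category contingency table.

    n11: questions in the category that contain the feature
    n10: questions outside the category that contain the feature
    n01: questions in the category that lack the feature
    n00: questions outside the category that lack the feature
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0  # a degenerate table carries no information
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_features(counts, k):
    """Return the k features with the highest chi-square scores."""
    scores = {feature: chi_square(*table) for feature, table in counts.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical counts for a "PERSON" question category (assumption).
counts = {
    "who": (40, 5, 10, 45),
    "the": (25, 25, 25, 25),
    "when": (2, 30, 48, 20),
}
top = select_features(counts, 2)
```

A feature that is spread evenly across categories, such as "the" in this toy data, scores zero and is dropped, while features strongly associated with the category (positively or negatively) are kept for the vector representation.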

Improving the Performance of SVM Text Categorization with Inter-document Similarities (문헌간 유사도를 이용한 SVM 분류기의 문헌분류성능 향상에 관한 연구)

  • Lee, Jae-Yun
    • Journal of the Korean Society for Information Management / v.22 no.3 s.57 / pp.261-287 / 2005
  • The purpose of this paper is to explore ways to improve the performance of an SVM (Support Vector Machine) text classifier using inter-document similarities. SVMs are powerful machine learning systems that are considered the state-of-the-art technique for automatic document classification. This paper suggests an approach to SVM text categorization based on feature representation with document vectors. In this approach, document vectors are used as features instead of index terms, and vector similarities are used as feature values instead of term weights. Experiments show that an SVM classifier with document vector features can improve document classification performance. For run-time efficiency, two methods are developed: one selects document vector features, and the other uses category centroid vector features instead. Experiments on these two methods show that a small set of vector features can outperform conventional methods that use index term features.
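The idea of using vector similarities as feature values can be sketched as follows. This is a hedged illustration of the category-centroid variant only, assuming cosine similarity as the similarity measure (the abstract does not name one); the toy term vectors, category names, and helper functions are hypothetical, and the SVM training step itself is omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a category's training document vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def similarity_features(doc, reference_vectors):
    """Represent a document by its similarity to each reference vector."""
    return [cosine(doc, ref) for ref in reference_vectors]


# Hypothetical term-frequency vectors for two categories (assumption).
train = {
    "sports": [[2, 0, 1], [3, 0, 0]],
    "finance": [[0, 2, 1], [0, 3, 1]],
}
centroids = [centroid(vectors) for vectors in train.values()]
features = similarity_features([1, 0, 0], centroids)
```

With centroids as reference vectors, the feature vector has one dimension per category rather than one per index term, which is where the run-time saving described in the abstract would come from.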

Comparison and Analysis of Subject Classification for Domestic Research Data (국내 학술논문 주제 분류 알고리즘 비교 및 분석)

  • Choi, Wonjun;Sul, Jaewook;Jeong, Heeseok;Yoon, Hwamook
    • The Journal of the Korea Contents Association / v.18 no.8 / pp.178-186 / 2018
  • Article-level subject classification is essential for delivering scholarly information services. To date, however, subject classification has mostly been journal-based, and few services classify subjects at the article level. For domestic academic papers in particular, subject classification can be even more valuable, because it lets a service cover a wider area and offer results restricted to a chosen range. However, classifying subjects by field requires the work of experts in many fields, and various verification methods are needed to increase accuracy. In this paper, we classify topics using unsupervised learning algorithms, which look for structure when the correct answers are unknown, and compare the results of the subject classification algorithms using coherence and perplexity. The unsupervised learning algorithms compared are the well-known Hierarchical Dirichlet Process (HDP), Latent Dirichlet Allocation (LDA), and Latent Semantic Indexing (LSI).
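Perplexity, one of the two comparison measures named in this abstract, can be sketched as follows. This is a minimal sketch using the standard definition (the exponential of the negative mean log-likelihood per token); the function name `perplexity` and the probability values are hypothetical, and in practice the per-token probabilities would come from a fitted topic model.

```python
import math


def perplexity(token_probs):
    """Perplexity of held-out tokens given the model's probability for each.

    Defined as exp(-(1/N) * sum(log p_i)); lower values mean the model
    predicts the held-out text better.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    log_likelihood = sum(math.log(p) for p in token_probs)
    return math.exp(-log_likelihood / len(token_probs))


# Hypothetical per-token probabilities from two models (assumption).
model_a = perplexity([0.25] * 8)
model_b = perplexity([0.5, 0.125])
```

A model that assigns probability 0.25 to every token has perplexity 4, which matches the intuition of choosing uniformly among four equally likely words.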

A Machine Learning Approach for Knowledge Base Construction Incorporating GIS Data for Land Cover Classification of Landsat ETM+ Image (지식 기반 시스템에서 GIS 자료를 활용하기 위한 기계 학습 기법에 관한 연구 - Landsat ETM+ 영상의 토지 피복 분류를 사례로)

  • Kim, Hwa-Hwan;Ku, Cha-Yang
    • Journal of the Korean Geographical Society / v.43 no.5 / pp.761-774 / 2008
  • The integration of GIS data and human expert knowledge into digital image processing has long been acknowledged as necessary for improving remote sensing image analysis. We propose an inductive machine learning algorithm for GIS data integration and a rule-based classification method for land cover classification. The proposed method is tested on a land cover classification of a Landsat ETM+ multispectral image with GIS data layers including elevation, aspect, slope, distance to water bodies, distance to the road network, and population density. Decision trees and production rules for land cover classification are generated by the C5.0 inductive machine learning algorithm from 350 stratified random point samples. The production rules are used for land cover classification in combination with unsupervised ISODATA classification. The results show that GIS data layers such as elevation, distance to water bodies, and population density can be effectively integrated into rule-based image classification. The intuitive production rules generated by inductive machine learning are easy to understand. The proposed method demonstrates how various GIS data layers can be integrated with remotely sensed imagery in a framework of knowledge base construction to improve land cover classification.
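How an inductive decision tree learner might pick a threshold on a GIS layer such as elevation can be sketched as follows. This is an illustration only: it uses the gain-ratio criterion known from C4.5, the predecessor of C5.0, and whether it matches the configuration used in the paper is an assumption. The sample elevations, land cover labels, and function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of splitting samples at `value <= threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # the split separates nothing
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    gain = entropy(labels) - remainder
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain / split_info


# Hypothetical point samples: elevation in metres and land cover (assumption).
elevation = [10, 20, 300, 400]
cover = ["urban", "urban", "forest", "forest"]
best = gain_ratio(elevation, cover, threshold=100)
poor = gain_ratio(elevation, cover, threshold=15)
```

A threshold that separates the classes cleanly yields a production rule of the form "if elevation > 100 then forest", which is the kind of readable rule the abstract refers to.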

A Study on the Implementation of SQL Primitives for Decision Tree Classification (판단 트리 분류를 위한 SQL 기초 기능의 구현에 관한 연구)

  • An, Hyoung Geun;Koh, Jae Jin
    • KIPS Transactions on Software and Data Engineering / v.2 no.12 / pp.855-864 / 2013
  • Decision tree classification is one of the important problems in data mining, and data mining has become an important task in large database technology. Efforts to couple data mining systems with database systems have therefore led to the development of database primitives that support data mining functions such as decision tree classification. These primitives are special database operations that support the SQL implementation of decision tree classification algorithms, and they have become component modules of database systems for implementing specific algorithms. There are two aspects to developing database primitives that support data mining functions. The first is to identify, through analysis, the common database primitives that support data mining functions. The other is to provide an extension mechanism for implementing these primitives as an interface of the database system. Deciding which primitives should be stored in the DBMS is one of the difficult problems in data mining. To address this problem, this paper describes database primitives that construct and apply optimized decision tree classifiers. We then identify operations that are useful for various classification algorithms and discuss the implementation of these primitives on a commercial DBMS. We implement these primitives on a commercial DBMS and present experimental results comparing their performance.
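One operation that decision tree induction commonly pushes into SQL is the per-attribute class count, computed with a GROUP BY query. The sketch below illustrates that general idea with Python's built-in `sqlite3` module; it is not the paper's primitive set, and the table `train`, its columns, the sample rows, and the helper `class_counts` are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE train (outlook TEXT, windy TEXT, play TEXT)")
conn.executemany(
    "INSERT INTO train VALUES (?, ?, ?)",
    [
        ("sunny", "no", "no"),
        ("sunny", "yes", "no"),
        ("rain", "no", "yes"),
        ("rain", "yes", "no"),
        ("overcast", "no", "yes"),
    ],
)

ATTRIBUTES = {"outlook", "windy"}  # columns that may be used as split attributes


def class_counts(conn, attribute):
    """Count rows per (attribute value, class label) pair inside the database."""
    if attribute not in ATTRIBUTES:
        # Column names cannot be bound as parameters, so check them explicitly.
        raise ValueError(f"unknown attribute: {attribute}")
    sql = (
        f"SELECT {attribute}, play, COUNT(*) FROM train "
        f"GROUP BY {attribute}, play ORDER BY {attribute}, play"
    )
    return conn.execute(sql).fetchall()


counts = class_counts(conn, "outlook")
```

The resulting count table is all a tree builder needs to score a candidate split, so the training rows never have to leave the database.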

Component Analysis for Constructing an Emotion Ontology (감정 온톨로지의 구축을 위한 구성요소 분석)

  • Yoon, Ae-Sun;Kwon, Hyuk-Chul
    • Korean Journal of Cognitive Science / v.21 no.1 / pp.157-175 / 2010
  • Understanding a dialogue participant's emotion is as important as decoding the explicit message in human communication. It is well known that non-verbal elements are more suitable than verbal elements for conveying a speaker's emotions. Written texts, however, contain a variety of linguistic units that express emotions. This study aims to analyze the components needed to construct an emotion ontology, which has numerous applications in Human Language Technology. The majority of previous work in text-based emotion processing focused on classifying emotions, constructing a dictionary describing emotions, and retrieving those lexical items in texts through keyword spotting and/or syntactic parsing. The emotions retrieved or computed by that process did not show good accuracy. Thus, a more sophisticated component analysis is proposed, and linguistic factors are introduced in this study. (1) Five linguistic types of emotion expression are differentiated in terms of target (verbal/non-verbal) and method (expressive/descriptive/iconic). The correlations among them, as well as their correlation with the non-verbal expressive type, are also determined. This characteristic is expected to make our ontology more adaptable to multi-modal environments. (2) As emotion-related components, this study proposes 24 emotion types, a 5-scale intensity (-2 to +2), and a 3-scale polarity (positive/negative/neutral), which can describe a variety of emotions in more detail and in a standardized way. (3) We introduce components related to verbal expression, such as 'experiencer', 'description target', 'description method', and 'linguistic features', which can appropriately classify and tag verbal expressions of emotion. (4) By adopting the linguistic tag sets proposed by ISO and TEI and providing a mapping table between our classification of emotions and Plutchik's, our ontology can be easily employed for multilingual processing.


Field and remote acquisition of hyperspectral information for classification of riverside area materials (현장 및 원격 초분광 정보 계측을 통한 하천 수변공간 재료 구분)

  • Shin, Jaehyun;Seong, Hoje;Rhee, Dong Sop
    • Journal of Korea Water Resources Association / v.54 no.12 / pp.1265-1274 / 2021
  • The hyperspectral characteristics of materials near the South Han River were analyzed using riverside measurements from drone-mounted hyperspectral sensors. The spectral reflectance of each riverside material, including grass, concrete, and soil, was compared and analyzed. To verify the drone-mounted hyperspectral measurements, a ground spectrometer was deployed for field measurements and comparison. The comparison showed that each riverside material had its own hyperspectral band characteristics and that the field measurements were similar to the remote sensing data. To classify the riverside area, the K-means clustering method and the SVM classification method were used. The supervised SVM method classified the riverside area more accurately than the unsupervised K-means method. Using the classification and clustering methods, the inherent spectral characteristics of each material were identified and used to classify the riverside materials in hyperspectral images from drones.
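The unsupervised K-means step named in this abstract can be sketched on toy spectra as follows. This is a minimal sketch of the standard algorithm, not the authors' processing chain: the three-band reflectance values, the choice of initial centers, and the function names are hypothetical, and real hyperspectral pixels would have far more bands.

```python
def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centers]
    return dists.index(min(dists))


def kmeans(points, centers, iterations=10):
    """Plain K-means: alternate between assignment and center update."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else list(c)
            for cl, c in zip(clusters, centers)
        ]
    return centers, [nearest(p, centers) for p in points]


# Hypothetical reflectance in three bands for four pixels (assumption):
# the first two rise sharply in the last band, the last two are flat.
spectra = [
    [0.05, 0.08, 0.45],
    [0.06, 0.09, 0.50],
    [0.30, 0.32, 0.35],
    [0.28, 0.31, 0.33],
]
centers, labels = kmeans(spectra, centers=[spectra[0], spectra[2]])
```

K-means only groups pixels with similar spectra; unlike the supervised SVM, it does not know which group is grass or concrete, so the clusters still have to be named afterwards.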

Practical Approach for Quantitative and Qualitative Analyses of Marine Ciliate Plankton (해양 섬모충플랑크톤 정량과 정성분석의 현실적 접근)

  • KIM, YOUNG OK;KIM, SUN YOUNG;CHOI, JUNGMIN;KIM, JAESEONG
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.26 no.3 / pp.248-262 / 2021
  • Marine planktonic ciliates include two major groups, loricated tintinnids and naked oligotrichs. The study of marine ciliate plankton in Korea began with taxonomic work on tintinnids based on the morphology of the lorica, a vase-shaped shell. Despite polymorphism in the lorica, it is used as a key characteristic in identifying tintinnid species. Oligotrichs, however, have been studied only recently in Korea because of the challenges of observing ciliary arrangements and the need for advances in cell staining techniques. Species diversity and phylogenetic classification of the ciliates have been informed by recent advances in morphological and molecular analyses. Illustrations of the planktonic ciliates of Korea have been published on the basis of taxonomic data on tintinnids and oligotrichs. Planktonic ciliates, which act as major consumers of pico- and nanoplankton as well as prey of mesozooplankton, have been monitored through spatial and temporal investigations in Korean coastal waters. A practical approach addressing the limitations and potential of marine ciliate studies in Korea is proposed here to improve the data quality of planktonic ciliates, providing an enhanced basis for quality control of ciliate monitoring.

Decision-making Framework for Risk-based Site Management and Use of Risk Mitigation Measures (위해성기반 오염부지관리를 위한 의사결정체계 및 이를 위한 위해저감기술의 활용)

  • Chung, Hyeonyong;Kim, Sang Hyun;Lee, Hosub;Nam, Kyoungphile
    • Journal of Soil and Groundwater Environment / v.25 no.3 / pp.32-42 / 2020
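  • As the approach to contaminated site management has shifted from media-oriented to receptor-oriented, risk assessment has been introduced in Korea, but the framework and related technologies needed to make full use of it at contaminated sites have not yet been properly established. In particular, technical and social discussion and consensus are lacking on the remediation and management of contaminated sites that are classified, for various reasons, as difficult to remediate, and on the risk mitigation measures that can be applied to such sites. In this study, we propose a decision-making framework for risk-based contaminated site management, which is needed for Korea's soil environment policy, so far focused only on the remediation of contaminated soil, to move toward protecting the receptors connected to a contaminated site. We also propose a way to survey and classify the risk mitigation measures needed for such a management framework to be applied properly in the field, and to evaluate their usability and applicability according to the mode of risk mitigation.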

A study on the development of instream habitat creation technique (하도 내 생물서식처 조성기술개발에 관한 연구)

  • Kim, Si-Nae;Lee, Dong-Jun;Ahn, Hong-Kyu
    • Proceedings of the Korea Water Resources Association Conference / 2011.05a / pp.117-121 / 2011
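  • Nature-oriented river restoration projects in Korea have so far targeted artificially channelized rivers and have centered on improving the physical environment to restore lost naturalness, focusing on morphological restoration such as securing the stability of low-water revetments and increasing vegetation cover. Because these projects gave little consideration to the ecological characteristics of rivers, they were not effective in enabling rivers to function as habitats. This study therefore aimed to develop river restoration technology that is centered on living organisms and in harmony with nature. To do so, it analyzed the interrelationship between the habitats formed by the physical characteristics of a river and the organisms that respond to them, developed a technique that can provide suitable habitat conditions, selected species whose populations are declining sharply because of environmental degradation in the river basin as restoration target species, and verified the technique through pilot application in the field. A representative stream was selected to stand for small and medium-sized streams in Korea, habitat types were classified, and the physical, chemical, and ecological characteristics of each habitat were analyzed. On this basis, a technique for creating open-type in-channel wetlands that serve as spawning grounds and habitats for bitterling (Acheilognathinae) species was developed and applied on a pilot basis.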
