Search | Korea Science

A Study on automatic assignment of descriptors using machine learning (기계학습을 통한 디스크립터 자동부여에 관한 연구)

Kim, Pan-Jun
- Journal of the Korean Society for Information Management / v.23 no.1 s.59 / pp.279-299 / 2006
This study applies several machine learning approaches to the automatic assignment of descriptors to journal articles. After selecting core journals in the field of information science and building a test collection from their articles of the past 11 years, the effects of feature selection and training-set size were examined. Regarding feature selection, Support Vector Machines (SVM) performed best when trained after reducing the feature set using the $\chi^2$ statistic (CHI) and criteria that prefer high-frequency features (COS, GSS, JAC). The size of the training set significantly influenced the performance of SVM and Voted Perceptron (VTP) but had little effect on Naive Bayes (NB).
https://doi.org/10.3743/KOSIM.2006.23.1.279
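
The $\chi^2$ (CHI) feature selection mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the usual 2x2 term/category contingency table and documents represented as sets of terms; the function names `chi_square` and `select_features` and the toy data are hypothetical and are not taken from the paper.

```python
def chi_square(a, b, c, d):
    """Chi-square score of a term for a category from a 2x2 table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the term carries no information
    return n * (a * d - c * b) ** 2 / denom


def select_features(docs, labels, category, k):
    """Return the k terms with the highest chi-square score for `category`.

    docs: list of sets of terms; labels: list of category names (same length).
    Ties are broken alphabetically so the result is deterministic.
    """
    vocab = set().union(*docs)
    scores = {}
    for term in vocab:
        a = b = c = d = 0
        for terms, label in zip(docs, labels):
            has_term = term in terms
            in_category = label == category
            if has_term and in_category:
                a += 1
            elif has_term:
                b += 1
            elif in_category:
                c += 1
            else:
                d += 1
        scores[term] = chi_square(a, b, c, d)
    ranked = sorted(vocab, key=lambda t: (-scores[t], t))
    return ranked[:k]


# Toy collection (invented for illustration only).
docs = [{"svm", "kernel"}, {"svm", "margin"},
        {"index", "thesaurus"}, {"index", "catalog"}]
labels = ["ml", "ml", "lib", "lib"]
top_terms = select_features(docs, labels, "ml", 2)
```

The reduced term list would then define the feature space for whichever classifier is trained; the classifier itself is omitted here.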

Research of Synthetic Resonance Characteristics for Electrohydraulic Thrust Vector Control Actuation System (전기-유압식 추력벡터제어 구동장치시스템의 합성공진 특성 연구)

Min, Byeong-Joo;Choi, Hyung-Don;Kang, E-Sok
- Aerospace Engineering and Technology / v.7 no.1 / pp.151-160 / 2008
This paper describes the results of analyzing the synthetic resonance characteristics of an electrohydraulic thrust vector control actuation system. Synthetic resonance is induced when the position servo actuation system is integrated with the flexible mounting structure of the launch vehicle. A new resonance mode is synthesized from the combination of the hydraulic resonance of the electrohydraulic position servo system under an inertial load and the structural resonance of the flexible mounting structure. This synthetic resonance can degrade control system stability because it is fed back and amplified by the control system. An exact nonlinear analysis model of this phenomenon is developed to predict it and to design a control algorithm that improves these characteristics. A DPF (Dynamic Pressure Feedback) control algorithm has been designed, and it shows excellent resonance suppression capability.

A Topic Classification System in cQA Services Based on Semi-Automatic Learning Using Wikipedia (위키피디아를 이용한 반자동 학습 기반의 cQA 서비스 주제 분류 시스템)

Kim, Taehyun
- Annual Conference on Human and Language Technology / 2015.10a / pp.139-141 / 2015
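This paper introduces a system that classifies the topics of user questions in community-based question answering (cQA) services. Such services can cover a wide range of topics depending on the domain, and statistical classification methods are now widely used to classify the topics of user questions. Classifying user questions with a statistical method requires a large training corpus suited to the topics, and building such a corpus by hand takes considerable time and cost. To address this problem, this paper proposes a method that builds the training corpus semi-automatically by classifying Wikipedia documents by topic with a Supervised K-means Clustering technique. A support vector machine is then trained on the resulting corpus to classify the topics of user questions. Although Wikipedia documents and user questions come from different domains, the proposed system classified the topics of user questions with an accuracy of 77.33%.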

Improving the Performance of SVM Text Categorization with Inter-document Similarities (문헌간 유사도를 이용한 SVM 분류기의 문헌분류성능 향상에 관한 연구)

Lee, Jae-Yun
- Journal of the Korean Society for Information Management / v.22 no.3 s.57 / pp.261-287 / 2005
The purpose of this paper is to explore ways to improve the performance of an SVM (Support Vector Machine) text classifier using inter-document similarities. SVMs are powerful machine learning systems that are considered the state-of-the-art technique for automatic document classification. This paper suggests an SVM-based text categorization approach in which features are represented with document vectors. In this approach, document vectors are used as features instead of index terms, and vector similarities are used as feature values instead of term weights. Experiments show that an SVM classifier with document vector features can improve document classification performance. For run-time efficiency, two methods are developed: one selects document vector features, and the other uses category centroid vector features instead. Experiments on these two methods show that a small set of vector features can outperform conventional methods that use index term features.
https://doi.org/10.3743/KOSIM.2005.22.3.261
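
The idea in the abstract above, using similarities to reference vectors as feature values, can be sketched as follows. This is an illustration under assumptions: cosine similarity is assumed as the similarity measure, documents are sparse term-weight dictionaries, and the names `cosine`, `centroid`, and `similarity_features` are hypothetical rather than the paper's.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Average a non-empty list of sparse vectors (a category centroid)."""
    total = Counter()
    for v in vectors:
        total.update(v)  # adds the weights term by term
    n = len(vectors)
    return {t: w / n for t, w in total.items()}


def similarity_features(doc, references):
    """Represent `doc` by its similarity to each reference vector.

    `references` may hold selected training documents or category
    centroids, mirroring the two variants the abstract describes.
    """
    return [cosine(doc, r) for r in references]


# Toy vectors (invented for illustration only).
doc_a = {"svm": 1, "kernel": 1}
doc_b = {"svm": 1, "margin": 1}
doc_c = {"index": 2}
ml_centroid = centroid([doc_a, doc_b])
features = similarity_features(doc_a, [ml_centroid, doc_c])
```

Each document then has one feature per reference vector, so with category centroids the feature count equals the number of categories, which is where the run-time saving would come from.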

Performance Comparison of Machine Learning Algorithms for Malware Detection (악성코드 탐지를 위한 기계학습 알고리즘의 성능 비교)

Lee, Hyun-Jong;Heo, Jae Hyeok;Hwang, Doosung
- Proceedings of the Korean Society of Computer Information Conference / 2018.01a / pp.143-146 / 2018
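Signature-based malware detection relies on the unique hash values of malicious files or on patterned attack rules, so it is weak at detecting malware variants. Malware detection based on machine learning is regarded as a way to overcome this weakness. This paper extracts n-gram and API features through static analysis, builds feature vectors from them, and compares the generalization performance of XGBoost, the k-nearest neighbors algorithm, support vector machines, a neural network algorithm, and a deep learning algorithm. In the experiments, XGBoost achieved the best generalization performance at 99%, and the k-nearest neighbors algorithm required the least training time. In terms of generalization performance and time complexity, XGBoost outperformed the other algorithms compared.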
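
As a rough illustration of turning static n-gram counts into a fixed-length feature vector, the sketch below assumes byte-level n-grams and a pre-chosen vocabulary. The abstract does not say which unit the n-grams are built over, so this choice, the function names `byte_ngrams` and `to_vector`, and the sample bytes are all assumptions.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams in a binary blob."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def to_vector(counts: Counter, vocabulary: list) -> list:
    """Map n-gram counts to relative frequencies over a fixed vocabulary."""
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts.get(gram, 0) / total for gram in vocabulary]


# Toy input (invented for illustration only).
sample = b"\x90\x90\xc3"
counts = byte_ngrams(sample, 2)
vector = to_vector(counts, [b"\x90\x90", b"\xff\xff"])
```

Vectors of this kind, possibly concatenated with API-call indicators, are what a classifier would be trained on; the classifiers are not reproduced here.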

Recognition of Superimposed Patterns with Selective Attention based on SVM (SVM기반의 선택적 주의집중을 이용한 중첩 패턴 인식)

Bae, Kyu-Chan;Park, Hyung-Min;Oh, Sang-Hoon;Choi, Youg-Sun;Lee, Soo-Young
- Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.5 s.305 / pp.123-136 / 2005
We propose a recognition system for superimposed patterns based on a selective attention model and an SVM, which performs better than an artificial neural network. The proposed selective attention model includes an attention layer placed before the SVM that modifies the SVM's input parameters and also acts as a selective filter. The idea behind the selective attention model is to find a stopping criterion for training and to define a confidence measure for the outcome of selective attention. A support vector represents the sample vectors surrounding it. The support vector closest to the initial input vector under consideration is chosen, and the minimal Euclidean distance between the input vector modified by selective attention and the chosen support vector defines the stopping criterion. It is difficult to define a confidence measure for selective attention with the common selective attention model. A new way of defining the confidence measure can be set under the constraint that each modified input pixel does not cross the boundary of the original input pixel, which increases the range of applicable information. This method uses the following information: the Euclidean distance between an input pattern and the modified pattern, the output of the SVM, and the output of the hidden neuron for the support vector closest to the initial input pattern. For the recognition experiment, 45 different combinations of USPS digit data are used. Recognition performance is better when selective attention is applied together with the SVM than with the SVM alone. The proposed selective attention also performs better than common selective attention.

Text Categorization Based on Terminology and Information Extraction (전문용어 및 정보추출에 기반한 문서분류시스템)

Lee, Kyung-Soon;Choi, Key-Sun
- Annual Conference on Human and Language Technology / 1999.10e / pp.79-84 / 1999
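In this study, a document classification system represents features using domain information obtained from domain-specific terminology dictionaries and entity information obtained through entity information extraction. To supplement this knowledge, we also propose a method that statistically recognizes category-specific terminology and represents it as features. When many of the terms appearing in a document belong to a particular specialized domain, the document is likely to belong to that domain. In addition, recognizing through information extraction which entity a term denotes, and representing the document accordingly, better reflects the meaning the document carries. For terms whose domain or entity information is unknown, the domain is recognized automatically from the training documents, supplementing the knowledge used in document representation. Using document features represented on the basis of domains, entity information, and category terminology, binary classification was performed for each category with a document classifier based on support vector machine learning. The proposed feature representation performs better than term-based feature representation.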

A Korean Emotion Features Extraction Method and Their Availability Evaluation for Sentiment Classification (감정 분류를 위한 한국어 감정 자질 추출 기법과 감정 자질의 유용성 평가)

Hwang, Jae-Won;Ko, Young-Joong
- Korean Journal of Cognitive Science / v.19 no.4 / pp.499-517 / 2008
In this paper, we propose an effective emotion feature extraction method for Korean and evaluate the usefulness of the extracted features in sentiment classification. Korean emotion features are expanded from several representative emotion words, and they play an important role in building an effective sentiment classification system. First, synonym information from an English thesaurus is used to extract effective emotion features, and the extracted English emotion features are then translated into Korean. To evaluate the extracted Korean emotion features, we represent each document using them and classify it with an SVM (Support Vector Machine). In the experiments, the sentiment classification system using the extracted Korean emotion features improved performance by 14.1% over the system using content-word-based features, which are generally used in common text classification systems.

An Experimental Study on the Relation Extraction from Biomedical Abstracts using Machine Learning (기계 학습을 이용한 바이오 분야 학술 문헌에서의 관계 추출에 대한 실험적 연구)

Choi, Sung-Pil
- Journal of the Korean Society for Library and Information Science / v.50 no.2 / pp.309-336 / 2016
This paper introduces a relation extraction system that identifies and classifies semantic relations between biomedical entities in scientific texts using machine learning methods such as Support Vector Machines (SVM). The suggested system includes many useful functions that extract various linguistic features from sentences containing a pair of biomedical entities and apply them to training relation extraction models in order to maximize their performance. Three globally representative collections in biomedical domains were used in the experiments, which demonstrate the system's superiority in various biomedical domains. As a result, the intensive experimental study conducted in this paper is likely to provide meaningful foundations for research on bio-text analysis based on machine learning.
https://doi.org/10.4275/KSLIS.2016.50.2.309

Tor Network Website Fingerprinting Using Statistical-Based Feature and Ensemble Learning of Traffic Data (트래픽 데이터의 통계적 기반 특징과 앙상블 학습을 이용한 토르 네트워크 웹사이트 핑거프린팅)

Kim, Junho;Kim, Wongyum;Hwang, Doosung
- KIPS Transactions on Software and Data Engineering / v.9 no.6 / pp.187-194 / 2020
This paper proposes a website fingerprinting method using ensemble learning over a Tor network, which guarantees client anonymity and protects personal information. We construct a training problem for website fingerprinting from the traffic packets collected in the Tor network, and compare the performance of website fingerprinting systems using tree-based ensemble models. A training feature vector is prepared from the general information, burst, cell sequence length, and cell order extracted from the traffic sequence, and the features of each website are represented with a fixed length. For experimental evaluation, we define four learning problems (Wang14, BW, CW_T, CW_H) according to the use of website fingerprinting, and compare the performance with a support vector machine model using CUMUL feature vectors. In the experimental evaluation, the proposed statistical-based training feature representation is superior to the CUMUL feature representation except for the BW case.
https://doi.org/10.3745/KTSDE.2020.9.6.187

Search Result 62, Processing Time 0.022 seconds
