Search | Korea Science

Korean Parsing using Machine Learning Techniques (기계학습 기법을 이용한 한국어 구문분석)

Lee, Yong-Hun;Lee, Jong-Hyeok
- Proceedings of the Korean Information Science Society Conference / 2008.06c / pp.285-288 / 2008
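Recent parsing research has seen a steady increase in statistical models, driven by improved computing performance, the growing availability of large parsed corpora, and the development of robust machine learning techniques. This paper proposes Korean parsing methods based on two existing machine learning techniques, the ME (Maximum Entropy) model and the SVM (Support Vector Machine) model. In experiments on the Korean Information Base (KIBS) parsed corpus, the SVM-based Korean parser achieved a dependency attachment accuracy of 87.46%, up to 1.84% higher than an existing probability-based statistical Korean parser. Further improvement is expected from using a wider range of features that reflect linguistic knowledge.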

An Automatic Spam e-mail Filter System Using χ² Statistics and Support Vector Machines (카이 제곱 통계량과 지지벡터기계를 이용한 자동 스팸 메일 분류기)

Lee, Songwook
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2009.05a / pp.592-595 / 2009
We propose an automatic spam mail classifier for e-mail data using Support Vector Machines (SVM). We use the lexical form of each word and its part-of-speech (POS) tag as features. We select useful features with $\chi^2$ statistics and represent each selected feature by its term frequency (TF) and inverse document frequency (IDF) values. After the SVM is trained on these features, it classifies each e-mail as spam or not spam. In our experiment, we achieved 82.7% accuracy on e-mail data collected from a web mail system.
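The feature selection step in this abstract can be sketched with the standard $\chi^2$ statistic for a 2x2 term/class contingency table. This is a minimal illustration, not the authors' implementation: the function names, the `top_k` cutoff, and the counts below are all hypothetical.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: spam mails containing the term
    n10: non-spam mails containing the term
    n01: spam mails without the term
    n00: non-spam mails without the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        return 0.0  # degenerate table: the term carries no information
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_features(term_counts, top_k):
    """Return the top_k terms ranked by descending chi-square score."""
    scores = {term: chi_square(*counts) for term, counts in term_counts.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical counts (n11, n10, n01, n00) over 100 mails, for illustration only.
counts = {
    "free": (40, 5, 10, 45),
    "meeting": (5, 30, 45, 20),
    "the": (25, 25, 25, 25),
}
selected = select_features(counts, top_k=2)
```

A term that is spread evenly over both classes, such as "the" in the made-up counts, scores zero and is dropped, while terms skewed toward either class are kept.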

An analysis of Speech Acts for Korean Using Support Vector Machines (지지벡터기계(Support Vector Machines)를 이용한 한국어 화행분석)

En Jongmin;Lee Songwook;Seo Jungyun
- The KIPS Transactions:PartB / v.12B no.3 s.99 / pp.365-368 / 2005
We propose a speech act analysis method for Korean dialogue using Support Vector Machines (SVM). As sentence features we use the lexical form of each word, its part-of-speech (POS) tag, and bigrams of POS tags; as context features we use the context of the previous utterance. We select informative features with chi-square statistics. After training on the selected features, the SVM classifiers determine the speech act of each utterance. In our experiment, we achieved an overall accuracy of 90.54% on a dialogue corpus from the hotel reservation domain.
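The feature set described here (word forms, POS tags, POS-tag bigrams, plus context from the previous utterance) might be extracted along the following lines. This is only a sketch: the feature-string prefixes, the tag names, and the use of the previous speech act as the context feature are assumptions, since the abstract does not specify them.

```python
def utterance_features(tagged_utterance, prev_speech_act=None):
    """Build a set of string features for one utterance.

    tagged_utterance: list of (word, pos_tag) pairs.
    prev_speech_act: label of the previous utterance (assumed context feature).
    """
    tags = [pos for _, pos in tagged_utterance]
    features = set()
    features.update(f"w={word}" for word, _ in tagged_utterance)       # lexical forms
    features.update(f"p={pos}" for pos in tags)                        # POS tags
    features.update(f"b={a}_{b}" for a, b in zip(tags, tags[1:]))      # POS bigrams
    if prev_speech_act is not None:
        features.add(f"prev={prev_speech_act}")                        # context feature
    return features


# Hypothetical tagged utterance and label, in English for readability.
tagged = [("I", "PRON"), ("want", "VERB"), ("a", "DET"), ("room", "NOUN")]
feats = utterance_features(tagged, prev_speech_act="ask-ref")
```

Each feature string would then be mapped to one dimension of the SVM input vector after chi-square selection.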
https://doi.org/10.3745/KIPSTB.2005.12B.3.365

Design Neural Machine Translation Model Combining External Symbolic Knowledge (심볼릭 지식 정보를 결합한 뉴럴기계번역 모델 설계)

Eo, Sugyeong;Park, Chanjun;Lim, Heuiseok
- Annual Conference on Human and Language Technology / 2020.10a / pp.529-534 / 2020
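Neural machine translation (NMT) refers to systems that use deep learning to translate a source-language sentence into a target-language sentence. Using end-to-end learning, NMT has surpassed the performance of earlier machine translation approaches and has become the dominant approach. Despite this progress, translating entities and terminological expressions remains an open problem. Entities and technical terms consist mostly of nouns, and because nouns in a sentence play important roles such as subject and object, translating them accurately can improve translation quality for the whole sentence. This paper therefore proposes a neural-symbolic approach that uses a knowledge graph to combine symbolic knowledge with NMT. We also design an architecture that allows prior methods that improved NMT performance with knowledge graphs to be applied to Korean-English machine translation.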

Machine-Learning Based Biomedical Term Recognition (기계학습에 기반한 생의학분야 전문용어의 자동인식)

Oh Jong-Hoon;Choi Key-Sun
- Journal of KIISE:Software and Applications / v.33 no.8 / pp.718-729 / 2006
There has been increasing interest in automatic term recognition (ATR), which recognizes technical terms in domain-specific texts. ATR is composed of 'term extraction', which extracts candidate technical terms, and 'term selection', which decides whether the terms in the list produced by 'term extraction' are technical terms or not. 'Term selection' ranks the term list according to features of technical terms and finds the boundary between technical terms and general terms. Previous work used only statistical features of terms for 'term selection', but statistical features alone are limited in how effectively they can select technical terms from a term list. The objective of this paper is to find effective features for 'term selection' by considering various aspects of technical terms. To solve the ranking problem, we derive various features of technical terms and combine them using machine-learning algorithms. To solve the boundary-finding problem, we define it as a binary classification problem that classifies each term in a term list as a technical term or a general term. Experiments show that our method achieves 78%-86% precision and 87%-90% recall in boundary finding, and 89%-92% 11-point precision in ranking. Moreover, our method outperforms previous work by up to about 26%.

Development of Street Crossing Assistive Embedded System for the Visually-Impaired Using Machine Learning Algorithm (머신러닝을 이용한 시각장애인 도로 횡단 보조 임베디드 시스템 개발)

Oh, SeonTaek;Jeong, Kidong;Kim, Homin;Kim, Young-Keun
- Journal of the HCI Society of Korea / v.14 no.2 / pp.41-47 / 2019
In this study, a smart assistive device is designed to recognize pedestrian signals and provide audio instructions so that visually impaired people can cross streets safely. Walking alone is one of the biggest challenges for the visually impaired and lowers their quality of life. The proposed device has a camera attached to a pair of glasses; it detects traffic lights, recognizes pedestrian signals in real time using a machine learning algorithm on a GPU board, and provides audio instructions to the user. For portability, the device is designed to be compact and light while retaining sufficient battery life. The embedded processor is wired to the small camera mounted on the glasses. In addition, a bone-conduction speaker is installed on the inner side of the temple of the glasses so that audio instructions can be given without blocking external sounds, for safety. The performance of the proposed device was validated experimentally: it showed 87.0% recall and 100% precision for detecting the pedestrian green light, and 94.4% recall and 97.1% precision for detecting the pedestrian red light.

Machine Learning-based Multiple Fault Localization with Bayesian Probability (베이지안 확률을 적용한 기계학습 기반 다중 결함 위치 식별 기법)

Song, Jihyoun;Kim, Jeongho;Lee, Eunseok
- Proceedings of the Korean Society of Computer Information Conference / 2017.01a / pp.151-154 / 2017
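Debugging, the task of removing defects during software development, first requires finding the exact location of the fault. This takes a great deal of time, and fault localization techniques have been introduced to shorten it. Among them is prior work based on artificial neural networks, which learns from program coverage information to analyze rules. Building on that work, this paper proposes a technique that additionally captures relationships between statements and uses them as training data. The relationship between statements is expressed as the proportion, among the test cases that execute a given statement, of test cases that also execute each of the other statements. Bayesian probability, a conditional probability, is used to compute this proportion. Depending on the statement relationships obtained in this way, the weights that determine suspiciousness in the neural network are learned differently from the existing technique. This difference adjusts the suspiciousness of statements and, as a result, improves the accuracy of multiple fault localization. In experiments, the proposed technique improved accuracy by an average of 39.8% over Tarantula and by an average of 60.5% over the existing back-propagation neural network (BPNN) based technique.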
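The statement-relationship measure in this abstract is a conditional probability over test-case coverage: of the test cases that execute statement A, what fraction also execute statement B. A minimal sketch follows, assuming a 0/1 coverage matrix with one row per test case and one column per statement; the function name and the data are hypothetical, and how the paper feeds these values into the neural network is not shown.

```python
def conditional_coverage(coverage, a, b):
    """Estimate P(statement b executed | statement a executed).

    coverage: list of rows, one per test case; row[i] is 1 if the test
              executes statement i, else 0.
    """
    tests_hitting_a = [row for row in coverage if row[a]]
    if not tests_hitting_a:
        return 0.0  # statement a is never executed, so the ratio is undefined
    both = sum(1 for row in tests_hitting_a if row[b])
    return both / len(tests_hitting_a)


# Hypothetical coverage of three statements by four test cases.
coverage = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
    [0, 1, 0],
]
p_s1_given_s0 = conditional_coverage(coverage, a=0, b=1)
p_s0_given_s1 = conditional_coverage(coverage, a=1, b=0)
```

The measure is asymmetric: in the made-up data every test that runs statement 0 also runs statement 1, but only half of the tests that run statement 1 also run statement 0.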

An Accelerated Iterative Method for the Dynamic Analysis of Multibody Systems (반복 계산법 및 계산 가속기법에 의한 다물체 동역학 해법)

이기수;임철호
- Transactions of the Korean Society of Mechanical Engineers / v.16 no.5 / pp.899-909 / 1992
An iterative solution technique is presented to analyze the dynamic systems of rigid bodies subjected to kinematic constraints. Lagrange multipliers associated with the constraints are iteratively computed by monotonically reducing an appropriately defined constraint error vector, and the resulting equation of motion is solved by a well-established ODE technique. Constraints on the velocity and acceleration as well as the position are made to be satisfied at joints at each time step. Time integration is efficiently performed because decomposition or orthonormalization of the large matrix is not required at all. An acceleration technique is suggested for the faster convergence of the iterative scheme.
https://doi.org/10.22634/KSME.1992.16.5.899

Inverse Dynamic Analysis of Spatial Mechanical Systems with Euler Parameters (Euler 매개변수를 이용한 3차원 기계시스템의 역동력학 해석)

심정수;이종원;유영면
- Transactions of the Korean Society of Mechanical Engineers / v.9 no.5 / pp.683-690 / 1985
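This paper presents an inverse dynamic analysis of constrained spatial (three-dimensional) mechanical systems using Euler parameters as rotational coordinates. The nonlinear holonomic constraint equations and the equations of motion are expressed in Cartesian generalized coordinates; the generalized coordinates of each rigid body consist of three coordinates for displacement and four Euler parameters for rotation. The Lagrange multiplier technique is used to combine the constraint equations with the differential equations of motion into the equations of motion for the whole system. The position, velocity, and acceleration of each rigid body at a given time are obtained through kinematic analysis and substituted into the system equations of motion to compute the Lagrange multipliers, from which the joint torques required to drive a six-degree-of-freedom robot mechanism as desired were calculated; the computed results were verified to be accurate. The study shows that no singular cases arise when Euler parameters are used as rotational coordinates, and that the method can be widely applied to developing general-purpose computer programs for inverse dynamic analysis.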
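As background for the four rotational coordinates mentioned in this abstract, Euler parameters form a unit quaternion $(e_0, e_1, e_2, e_3)$ with $e_0^2 + e_1^2 + e_2^2 + e_3^2 = 1$, and the rotation matrix is a polynomial in them, with no trigonometric singularity of the kind three-angle sets have. The sketch below uses the standard textbook conversion; the function names are illustrative and nothing here is taken from the paper's program.

```python
import math


def from_axis_angle(axis, angle):
    """Euler parameters for a rotation of `angle` radians about a unit axis."""
    ux, uy, uz = axis
    s = math.sin(angle / 2.0)
    return (math.cos(angle / 2.0), ux * s, uy * s, uz * s)


def rotation_matrix(e):
    """3x3 rotation matrix from Euler parameters (e0, e1, e2, e3)."""
    # Re-normalize so the unit-norm constraint holds despite round-off.
    norm = math.sqrt(sum(c * c for c in e))
    e0, e1, e2, e3 = (c / norm for c in e)
    return [
        [1 - 2 * (e2 * e2 + e3 * e3), 2 * (e1 * e2 - e0 * e3), 2 * (e1 * e3 + e0 * e2)],
        [2 * (e1 * e2 + e0 * e3), 1 - 2 * (e1 * e1 + e3 * e3), 2 * (e2 * e3 - e0 * e1)],
        [2 * (e1 * e3 - e0 * e2), 2 * (e2 * e3 + e0 * e1), 1 - 2 * (e1 * e1 + e2 * e2)],
    ]


# A 90-degree rotation about the z axis maps the x axis onto the y axis.
R = rotation_matrix(from_axis_angle((0.0, 0.0, 1.0), math.pi / 2))
```

In a constrained multibody formulation the unit-norm condition is typically imposed as one extra constraint equation per body, which is why four parameters still describe only three rotational degrees of freedom.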
https://doi.org/10.22634/KSME.1985.9.5.683

Machine Learning-based Stroke Risk Prediction using Public Big Data (공공빅데이터를 활용한 기계학습 기반 뇌졸중 위험도 예측)

Jeong, Sunwoo;Lee, Minji;Yoo, Sunyong
- Journal of Advanced Navigation Technology / v.25 no.1 / pp.96-101 / 2021
This paper presents a machine learning model that predicts stroke risk in atrial fibrillation patients using public big data. As training data, 68 independent variables, including demographic, medical history, and health examination variables, were collected from the Korean National Health Insurance Service. To predict stroke incidence in patients with atrial fibrillation, we applied a deep neural network. We first verified the performance of conventional statistical models (CHADS2, CHA2DS2-VASc). We then compared the proposed model with the statistical models under various hyperparameter settings. Accuracy and area under the receiver operating characteristic curve (AUROC) were the main indicators used for performance evaluation. The model using batch normalization showed the highest performance and outperformed the statistical models.
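For context on the baseline named above, CHA2DS2-VASc is a point score rather than a trained model. The sketch below follows the standard published point weights; the function signature and the example patients are hypothetical, and how the paper derives these inputs from the insurance data is not described in the abstract.

```python
def cha2ds2_vasc(age, female, chf, hypertension, diabetes,
                 stroke_or_tia, vascular_disease):
    """CHA2DS2-VASc stroke risk score (0 to 9) for atrial fibrillation."""
    score = 0
    score += 1 if chf else 0               # C: congestive heart failure
    score += 1 if hypertension else 0      # H: hypertension
    if age >= 75:                          # A2: age 75 or older
        score += 2
    elif age >= 65:                        # A: age 65 to 74
        score += 1
    score += 1 if diabetes else 0          # D: diabetes mellitus
    score += 2 if stroke_or_tia else 0     # S2: prior stroke, TIA, or thromboembolism
    score += 1 if vascular_disease else 0  # V: vascular disease
    score += 1 if female else 0            # Sc: sex category (female)
    return score


# Two made-up patients, for illustration only.
score_a = cha2ds2_vasc(age=70, female=True, chf=False, hypertension=True,
                       diabetes=False, stroke_or_tia=False, vascular_disease=False)
score_b = cha2ds2_vasc(age=80, female=False, chf=False, hypertension=False,
                       diabetes=True, stroke_or_tia=True, vascular_disease=False)
```

Because the score takes only a handful of integer values, comparing it with a neural network by AUROC amounts to asking how well each ranks patients who went on to have a stroke above those who did not.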
https://doi.org/10.12673/jant.2021.25.1.96
