Search | Korea Science

A Study on the Storage Requirement and Incremental Learning of the k-NN Classifier (K_NN 분류기의 메모리 사용과 점진적 학습에 대한 연구)

이형일;윤충화
- The Journal of Information Technology / v.1 no.1 / pp.65-84 / 1998
MBR (Memory-Based Reasoning) is a supervised learning method that classifies an input pattern using its distances to the trained patterns, and is also called a distance-based learning algorithm. MBR is based on the k-NN classifier, in which learning is performed by simply storing training patterns in memory without any further processing. This paper proposes a new learning algorithm that is more efficient than the traditional k-NN classifier and has incremental learning capability. Furthermore, the proposed algorithm is insensitive to noisy patterns and guarantees more efficient memory usage.
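As a rough illustration of the baseline this abstract describes, the sketch below shows a plain k-NN classifier in which "learning" is only storage. It is not the algorithm proposed in the paper; the function name `knn_classify`, the Euclidean metric, and the toy data are assumptions made for the example.

```python
import math
from collections import Counter

def knn_classify(stored, query, k=3):
    """Classify `query` by majority vote among its k nearest stored patterns.

    stored: list of (feature_vector, label) pairs kept verbatim in memory.
    The Euclidean distance is an assumption; the abstract names no metric.
    """
    # Sort the stored patterns by distance to the query.
    by_distance = sorted(stored, key=lambda p: math.dist(p[0], query))
    # Majority vote among the labels of the k nearest patterns.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Training is just storing patterns, so adding a pattern later is an append.
memory = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"),
          ((1.0, 1.0), "b"), ((0.9, 1.1), "b")]
print(knn_classify(memory, (0.2, 0.1), k=3))
```

Because every training pattern is kept, both memory use and classification time grow with the training set, which is the cost the paper sets out to reduce.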

An Incremental Multi Partition Averaging Algorithm Based on Memory Based Reasoning (메모리 기반 추론 기법에 기반한 점진적 다분할평균 알고리즘)

Yih, Hyeong-Il
- Journal of IKEEE / v.12 no.1 / pp.65-74 / 2008
One of the popular methods used for pattern classification is the MBR (Memory-Based Reasoning) algorithm. Since it simply computes distances between a test pattern and the training patterns or hyperplanes stored in memory, and then assigns the class of the nearest training pattern, it is notorious for its memory usage and cannot learn additional information from new data. To overcome these problems, we propose an incremental learning algorithm (iMPA). iMPA divides the entire pattern space into a fixed number of partitions and generates representatives from each partition. It can also learn additional information from new data without requiring access to the original training data. The proposed method is shown to exhibit performance comparable to k-NN with far fewer patterns, and better results than the EACH system, which implements the NGE theory, on benchmark data sets from the UCI Machine Learning Repository.
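The general idea of dividing the pattern space into a fixed number of partitions and keeping one representative per partition can be sketched as follows. This is not the iMPA algorithm itself: the abstract does not say how representatives are generated, so the equal-width grid, the per-class averaging, the function name `partition_representatives`, and the assumption that features lie in [lo, hi] are all illustrative choices.

```python
from collections import defaultdict

def partition_representatives(patterns, bins=2, lo=0.0, hi=1.0):
    """Average the patterns of each class inside each grid cell.

    patterns: list of (feature_vector, label); features assumed in [lo, hi].
    Returns a list of (mean_vector, label) representatives.
    """
    width = (hi - lo) / bins
    groups = defaultdict(list)
    for vec, label in patterns:
        # Cell index along each axis; clamp so x == hi falls in the last cell.
        cell = tuple(min(int((x - lo) / width), bins - 1) for x in vec)
        groups[(cell, label)].append(vec)
    reps = []
    for (_, label), vecs in groups.items():
        # Component-wise mean of all patterns in this (cell, class) group.
        mean = tuple(sum(col) / len(vecs) for col in zip(*vecs))
        reps.append((mean, label))
    return reps

data = [((0.1, 0.1), "a"), ((0.3, 0.1), "a"),
        ((0.9, 0.8), "b"), ((0.7, 0.8), "b")]
reps = partition_representatives(data, bins=2)
print(len(data), "patterns reduced to", len(reps), "representatives")
```

A nearest-neighbor classifier can then search the representatives instead of the full training set, which is where the memory saving would come from in a scheme of this kind.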

A Representative Pattern Generation Algorithm Based on Evaluation And Selection (평가와 선택기법에 기반한 대표패턴 생성 알고리즘)

Yih, Hyeong-Il
- Journal of the Korea Society of Computer and Information / v.14 no.3 / pp.139-147 / 2009
Memory-based reasoning simply stores training patterns, or representative patterns derived from them, in memory and classifies a test pattern by computing its distance to the stored patterns. Because it either stores all training patterns in memory or replaces them with representative patterns, it requires more memory than other machine learning techniques. Moreover, as the number of stored training patterns increases, the time required for classification grows considerably. In this paper, we propose the EAS (Evaluation And Selection) algorithm to minimize memory usage and improve classification performance. After partitioning the training space, the algorithm evaluates each partition using the MDL and PM methods. The partition with the best evaluation result is turned into a representative pattern. The remaining partitions are partitioned again, and the evaluation is repeated. We verify the performance of the proposed algorithm using benchmark data sets from the UCI Machine Learning Repository.
https://doi.org/10.9708/jksci.2009.14.3.139

Distributed In-Memory Caching Method for ML Workload in Kubernetes (쿠버네티스에서 ML 워크로드를 위한 분산 인-메모리 캐싱 방법)

Dong-Hyeon Youn;Seokil Song
- Journal of Platform Technology / v.11 no.4 / pp.71-79 / 2023
In this paper, we analyze the characteristics of machine learning workloads and, based on them, propose a distributed in-memory caching technique to improve their performance. The core of a machine learning workload is model training, which is a computationally intensive task. Running machine learning workloads in a Kubernetes-based cloud environment, in which the computing framework and storage are separated, allows resources to be allocated effectively, but delays can occur because I/O must be performed over the network. The proposed distributed in-memory caching technique targets workloads running in such an environment. In particular, we propose a new method of precaching the data required for machine learning workloads into the distributed in-memory cache by taking into account Kubeflow Pipelines, a Kubernetes-based machine learning pipeline management tool.

A Memory-based Learning using Repetitive Fixed Partitioning Averaging (반복적 고정분할 평균기법을 이용한 메모리기반 학습기법)

Yih, Hyeong-Il
- Journal of Korea Multimedia Society / v.10 no.11 / pp.1516-1522 / 2007
We previously proposed the FPA (Fixed Partition Averaging) method to improve the storage requirement and classification rate of Memory-Based Reasoning. The algorithm worked reasonably well in many areas, but it led to overhead in memory usage and lengthy computation in multi-class areas. We propose a Repetitive FPA algorithm that repetitively partitions the pattern space in multi-class areas. The proposed method is shown to exhibit performance comparable to k-NN with far fewer patterns, and better results than the EACH system, which implements the NGE theory.

Expanding Rule Using Recursive Partition Averaging (RPA 기법을 이용한 규칙의 확장)

Han Jin-Chul;Kim Sang-ki;Yoon Chung-Hwa
- Proceedings of the Korea Information Processing Society Conference / 2004.11a / pp.489-492 / 2004
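Memory-based learning techniques, which are used to classify unknown patterns, show satisfactory classification performance. However, because they classify a pattern simply on the basis of its distance to the examples stored in memory, they cannot explain the process by which the pattern is classified. In this paper, we propose a rule extraction algorithm that uses the RPA (Recursive Partition Averaging) technique to explain the classification process, together with an algorithm that expands the conditions of the rules in order to improve generalization performance.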

Korean Sentence Boundary Detection Using Memory-based Machine Learning (메모리 기반의 기계 학습을 이용한 한국어 문장 경계 인식)

Han Kun-Heui;Lim Heui-Seok
- The Journal of the Korea Contents Association / v.4 no.4 / pp.133-139 / 2004
This paper proposes a Korean sentence boundary detection system that employs the k-nearest neighbor algorithm. We propose three scoring functions for classifying sentence boundaries and perform a comparative analysis. We use domain-independent linguistic features in order to make the system general and robust. The proposed system was trained and evaluated on two corpora: the ETRI corpus and the KAIST corpus. In the experiments, the proposed system shows about 98.82% precision and 99.09% recall even though it was trained on a relatively small corpus.

A Web-based Learning Misconception Diagnosis System Using Fuzzy Theory (퍼지 이론을 이용한 웹기반 학습오인 진단 시스템)

백현기;이현노;고영춘;하태현
- Proceedings of the Korea Society of Information Technology Applications Conference / 2004.06a / pp.15-24 / 2004
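This paper presents a learning misconception diagnosis system that can diagnose the misconceptions arising in students' understanding of English concepts related to the verb "be". In this system, a fuzzy cognitive map represents the preconceptions and misconceptions that students hold about English as causal relationships, and the causes of the misconceptions are diagnosed through a fuzzy associative memory that can store the causal relationships among concepts. This study provides a new method that can overcome the limitations of existing rule-based expert systems for diagnosing learning misconceptions, and it can be applied as a misconception diagnosis system for assessing learners in various areas of education.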

An Incremental Rule Extraction Algorithm Based on Recursive Partition Averaging (재귀적 분할 평균에 기반한 점진적 규칙 추출 알고리즘)

Han, Jin-Chul;Kim, Sang-Kwi;Yoon, Chung-Hwa
- Journal of KIISE:Software and Applications / v.34 no.1 / pp.11-17 / 2007
One of the popular methods used for pattern classification is the MBR (Memory-Based Reasoning) algorithm. Since it simply computes distances between a test pattern and the training patterns or hyperplanes stored in memory, and then assigns the class of the nearest training pattern, it cannot explain how the classification result is obtained. To overcome this problem, we propose an incremental learning algorithm based on RPA (Recursive Partition Averaging) that extracts IF-THEN rules describing the regularities inherent in the training patterns. However, rules generated by RPA eventually overfit, because they depend too strongly on the details of the given training patterns. RPA also produces more rules than necessary, due to over-partitioning of the pattern space. Consequently, we present IREA (Incremental Rule Extraction Algorithm), which overcomes the overfitting problem by removing useless conditions from rules and reduces the number of rules at the same time. We verify the performance of the proposed algorithm using benchmark data sets from the UCI Machine Learning Repository.

A New Memory-based Learning using Dynamic Partition Averaging (동적 분할 평균을 이용한 새로운 메모리 기반 학습기법)

Yih, Hyeong-Il
- Journal of the Korean Institute of Intelligent Systems / v.18 no.4 / pp.456-462 / 2008
Classification assigns a new data item to one of a set of given classes and is one of the most widely used data mining techniques. Memory-Based Reasoning (MBR) is a reasoning method for classification problems. MBR simply keeps many patterns, represented in their original feature-vector form, in memory without rules for reasoning, and uses a distance function to classify a test pattern. As the number of training patterns grows, MBR needs more memory and more computation for reasoning. NGE, FPA, and RPA are well-known MBR algorithms that have been shown to perform satisfactorily, but they have serious problems with memory usage and lengthy computation. In this paper, we propose the DPA (Dynamic Partition Averaging) algorithm. It chooses partition points by calculating the Gini index over the entire pattern space, and partitions the entire pattern space dynamically. If all patterns in a partition belong to a single class, it generates a representative pattern from that partition; otherwise, it repeatedly partitions that partition by the same method. The proposed method is shown to exhibit performance comparable to k-NN with far fewer patterns, and better results than the EACH system, which implements the NGE theory, as well as FPA and RPA.
https://doi.org/10.5391/JKIIS.2008.18.4.456
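To make the role of the Gini index concrete, the sketch below picks a partition point on a single feature by minimizing the weighted Gini impurity of the two resulting sides. It uses the standard impurity definition from decision-tree learning; whether DPA computes or applies the index in exactly this way is an assumption, and the names `gini` and `best_split` are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best cut on one feature."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate two equal values
        # Candidate cut halfway between two neighbouring values.
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        # Impurity of both sides, weighted by how many patterns each holds.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

threshold, score = best_split([0.1, 0.2, 0.8, 0.9], ["a", "a", "b", "b"])
print(threshold, score)
```

A weighted impurity of zero means each side holds a single class, which corresponds to the stopping condition in the abstract: such a partition yields a representative pattern, while an impure one is split again.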
