Search | Korea Science

Comparison Between Optimal Features of Korean and Chinese for Text Classification (한중 자동 문서분류를 위한 최적 자질어 비교)

Ren, Mei-Ying;Kang, Sinjae
- Journal of the Korean Institute of Intelligent Systems / v.25 no.4 / pp.386-391 / 2015
This paper proposes optimal features for text classification based on the linguistic characteristics of Korean and Chinese. The experiments sought to determine which feature performs best among n-grams, which are known to be language independent, morphemes, which are language dependent, and combined feature sets consisting of n-grams and morphemes. An SVM classifier and Internet news articles were used for text classification. For Korean text categorization, bi-grams were the best feature, with the highest F1-measure of 87.07%. For Chinese document classification, the combined feature set 'uni-gram+noun+verb+adjective+idiom' performed best, with the highest F1-measure of 82.79%.
https://doi.org/10.5391/JKIIS.2015.25.4.386
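As a rough illustration of the n-gram features compared in the abstract above, the following sketch counts character uni-grams and bi-grams. It is not the authors' pipeline: the helper name `char_ngrams` is hypothetical, and treating each character as one unit and dropping whitespace are assumptions made for this example.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count contiguous character n-grams in text.

    Assumption for this sketch: whitespace is removed first, so n-grams
    may span word boundaries.
    """
    chars = [c for c in text if not c.isspace()]
    return Counter("".join(chars[i:i + n]) for i in range(len(chars) - n + 1))


# Bi-grams for a short Korean phrase and uni-grams for a short Chinese one.
korean_bigrams = char_ngrams("문서 분류", 2)
chinese_unigrams = char_ngrams("中文", 1)
```

The resulting counts could serve as a sparse feature vector for a classifier such as an SVM; morpheme features would additionally need a language-specific analyzer, which is outside this sketch.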

Image Sequence Compression based on Adaptive Classification of Interframe Difference Image Blocks (프레임간 차영상 블록의 적응분류에 의한 영상시퀀스 압축)

Ahn, Chul-Joon;Kong, Seong-Gon
- Journal of the Korean Institute of Intelligent Systems / v.8 no.6 / pp.122-128 / 1998
This paper presents compression of image sequences based on the classification of interframe difference image blocks. The classification process consists of image activity classification and energy distribution classification. In the activity classification, interframe difference image blocks are classified into activity blocks and non-activity blocks using edge detection. In the distribution classification, activity blocks are further classified into vertical blocks, horizontal blocks, and small activity blocks using AC energy distribution features. The RBFN, trained with numerical classification results, successfully classifies difference image blocks according to image details. Image sequence compression based on classifying interframe difference image blocks with the RBFN shows better compression results and less training time than the classical sorting method and the MLP network.

A Study on Clustering Algorithm Using Design Pattern Structure (디자인 패턴 구조를 이용한 클러스터링에 관한 연구)

한정수;김귀정
- The Journal of the Korea Contents Association / v.2 no.1 / pp.68-76 / 2002
Clustering is a representative method of component classification. However, previous clustering methods that use cohesion and coupling are not effective for design patterns, because a design pattern consists of relations between classes. In this paper, we classify design patterns using the characteristics of pattern structure. Classification by clustering showed higher accuracy than classification by facet. It is therefore effective to classify design patterns with clustering algorithms, which are an automatic classification method. When searching for design patterns, this classification makes it possible to compare and analyze similar patterns, because similar patterns are stored in the same category. The repository can also be managed efficiently, because link information between patterns is stored and used.

An Example-based Korean Standard Industrial and Occupational Code Classification (예제기반 한국어 표준 산업/직업 코드 분류)

Lim Heui-Seok
- Journal of the Korea Academia-Industrial cooperation Society / v.7 no.4 / pp.594-601 / 2006
Assigning occupational and industrial codes is a major operation in the census surveys of the Korean statistics bureau. The coding has been done manually, which is labor and cost intensive and often produces inconsistent results. This paper proposes an automatic coding system based on example-based learning. After applying manually built rules, the system converts natural language input into the corresponding numeric codes using a code generation system trained by example-based learning. In experiments with training data of 400,000 records and 260 manual rules, the proposed system achieved about 76.69% and 99.68% accuracy for occupational code classification and industrial code classification, respectively.

ILD Vehicle Classification Algorithm using Neural Networks (신경망을 이용한 루프검지기 차종분류 알고리즘)

Ki Yong-Kul;Baik Doo-Kwon
- Journal of KIISE:Software and Applications / v.33 no.5 / pp.489-498 / 2006
In this paper, we propose a vehicle classification algorithm based on pattern recognition. At present, the inductive loop detector (ILD) is rarely used for vehicle classification because of its low accuracy. To improve the accuracy, we propose a new algorithm for loop detectors that uses neural networks. The inputs to the neural network are the variation rate of frequency and the occupancy time, and the output is the vehicle class. The algorithm was assessed at test sites, where the recognition rate was 91.3 percent. The results verify that the proposed algorithm improves vehicle classification accuracy compared with the conventional loop-detector-based method.

Improving Multinomial Naive Bayes Text Classifier (다항시행접근 단순 베이지안 문서분류기의 개선)

김상범;임해창
- Journal of KIISE:Software and Applications / v.30 no.3_4 / pp.259-267 / 2003
Though naive Bayes text classifiers are widely used because of their simplicity, techniques for improving their performance have rarely been studied. In this paper, we propose and evaluate some general and effective techniques for improving the performance of the naive Bayes text classifier. We suggest document-model-based parameter estimation and document length normalization to alleviate the problems in the traditional multinomial approach to text classification. In addition, a mutual-information-weighted naive Bayes text classifier is proposed to increase the effect of highly informative words. Our techniques are evaluated on the Reuters-21578 and 20 Newsgroups collections, and significant improvements are obtained over the existing multinomial naive Bayes approach.
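For orientation, the traditional multinomial naive Bayes baseline that this abstract sets out to improve can be sketched as follows. This is the textbook formulation with add-alpha smoothing, not the authors' improved classifier: the function names `train_mnb` and `predict_mnb`, the smoothing value, and the toy data are all assumptions for illustration, and none of the paper's techniques (document-model-based estimation, length normalization, mutual-information weighting) is implemented here.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels, alpha=1.0):
    """Fit a multinomial naive Bayes model on tokenized documents.

    Returns {label: (log_prior, {word: log_likelihood})}, using add-alpha
    (Laplace) smoothing over the training vocabulary.
    """
    vocab = {w for d in docs for w in d}
    class_sizes = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)
    model = {}
    for label, n_docs in class_sizes.items():
        total = sum(word_counts[label].values())
        denom = total + alpha * len(vocab)
        log_like = {w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab}
        model[label] = (math.log(n_docs / len(docs)), log_like)
    return model


def predict_mnb(model, doc):
    """Return the label with the highest log posterior; unseen words are skipped."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(log_like[w] for w in doc if w in log_like)
    return max(model, key=score)


# Toy data, invented for this example only.
train_docs = [["ball", "game"], ["game", "team"], ["vote", "law"], ["law", "court"]]
train_labels = ["sports", "sports", "politics", "politics"]
model = train_mnb(train_docs, train_labels)
```

Because each word occurrence adds one log-likelihood term, longer documents produce more extreme scores in this baseline, which is the kind of issue that length normalization is meant to address.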

Classification Techniques for XML Document Using Text Mining (텍스트 마이닝을 이용한 XML 문서 분류 기술)

Kim Cheon-Shik;Hong You-Sik
- Journal of the Korea Society of Computer and Information / v.11 no.2 s.40 / pp.15-23 / 2006
Millions of documents are already on the Internet, and new documents are being created all the time. Classifying these documents by the most suitable means is therefore an important problem for document management and querying. However, most users rely on keyword-based document classification. This method does not classify documents efficiently, and it is weak at categorizing documents by their meaning. Manual classification by a person can be very accurate and is often required. Therefore, in this paper, we classify documents using a neural network algorithm and the C4.5 algorithm. We used resume data formatted in XML for a document classification experiment. The results showed excellent potential for document categorization, and we expect the approach to be applicable to various document classification problems.

Feature Selection for Traffic Classification in SDN (SDN환경에서 트래픽 분류를 위한 특징 선택 기법)

Lim, Hwan-Hee;Kim, Dong-Hyun;Lee, Byung-Jun;Kim, Kyung-tae;Youn, Hee-Yong
- Proceedings of the Korean Society of Computer Information Conference / 2018.07a / pp.43-44 / 2018
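This paper proposes a feature selection technique for traffic classification in an SDN environment. Recently, a wide variety of smartphone applications and IoT devices have emerged, and these devices and applications generate enormous amounts of traffic. Such traffic not only lowers transmission speed but also makes it difficult to guarantee high Quality of Service (QoS). To remedy these problems, Software Defined Networking (SDN) is developing rapidly. This paper classifies the traffic of various applications and IoT devices in an SDN environment and proposes a traffic feature selection technique to improve classification accuracy and speed. Performing feature selection before traffic classification increases classification accuracy and reduces classification time, and by guaranteeing high QoS it reduces the load of existing network traffic.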

Hierarchical Text Categorization using Support Vector Machine (지지 벡터 기계를 이용한 계층적 문서 분류)

Yoon, Yong-Wook;Lee, Chang-Ki;Lee, Gary Geun-Bae
- Annual Conference on Human and Language Technology / 2003.10d / pp.7-13 / 2003
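As the volume of documents created and delivered over the Internet grows rapidly, automatic document classification is urgently needed to ease access to information. The SVM (Support Vector Machine) is a technique that has recently been widely used for document classification and performs well compared with other classifiers. However, SVMs have so far been applied effectively mainly to non-hierarchical (flat) classification. In contrast, this paper shows that, rather than flat classification that outputs the final class in a single step, a hierarchical classification technique that groups sets of classes with similar properties into a hierarchical structure is easier for people to understand, more convenient to use, and more effective. Through experiments, we develop an effective SVM classifier for hierarchical classification and confirm that it can achieve better classification performance than flat classification.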

Automatic e-mail Hierarchy Classification using Dynamic Category Hierarchy and Principal Component Analysis (PCA와 동적 분류체계를 사용한 자동 이메일 계층 분류)

Park, Sun
- Journal of Advanced Navigation Technology / v.13 no.3 / pp.419-425 / 2009
The volume of incoming e-mail is increasing rapidly due to the wide use of the Internet, so it is increasingly necessary to classify incoming e-mails efficiently and accurately. Current e-mail classification techniques focus on two-way classification, filtering spam from normal mail mainly with Bayesian and rule-based methods. Clustering has been used for multi-way classification of e-mails, but it suffers from low classification accuracy and produces no category labels. Classification methods, in turn, require the user to train the classifier and set the category labels. In this paper, we propose a novel multi-way e-mail hierarchy classification method that uses PCA for automatic category generation and a dynamic category hierarchy for high classification accuracy. It classifies a huge amount of incoming e-mail automatically, efficiently, and accurately.

