Search | Korea Science

Finding the Optimal Data Classification Method Using LDA and QDA Discriminant Analysis

Kim, SeungJae;Kim, SungHwan
- Journal of Integrative Natural Science / v.13 no.4 / pp.132-140 / 2020
With the recent introduction of artificial intelligence (AI) technology, both the use of data and the volume of newly generated data are increasing rapidly. To obtain meaningful results from such data, the first step is to classify the data well. However, if only a single machine learning classification technique is applied, the analysis may suffer from overfitting. To reduce the problems caused by misclassification, such as overfitting, it is necessary to apply several classification techniques and derive an optimal classification by comparing their results. Interpreting the data with only one classification technique leads to poor reasoning and poor predictions. This study seeks a method for optimally classifying data, as a step preceding data analysis, by looking at the data from various perspectives and applying various linear and nonlinear classification techniques such as LDA and QDA. To obtain reliable and refined statistics from big data analysis, it is necessary to analyze the meaning of each variable and the correlations between the variables. If the data are classified in a way that does not match the hypothesis being tested, the results will be unreliable even if the analysis itself is performed well. In other words, before big data analysis, it is necessary to ensure that the data are well classified to suit the purpose of the analysis. This step must be performed before any result is drawn from the data, and it may serve as a method of optimal data classification.
https://doi.org/10.13160/ricns.2020.13.4.132
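
The LDA/QDA comparison described in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' procedure: the toy samples, the function names, and the choice of a prior-weighted pooled variance for LDA are all assumptions made for illustration. LDA shares one variance across classes (a linear boundary), while QDA keeps a separate variance per class (a quadratic boundary).

```python
import math

def fit(samples):
    """Estimate (mean, variance, prior) per class from {label: [values]}."""
    n_total = sum(len(values) for values in samples.values())
    stats = {}
    for label, values in samples.items():
        m = sum(values) / len(values)
        var = sum((x - m) ** 2 for x in values) / len(values)
        stats[label] = (m, var, len(values) / n_total)
    return stats

def log_score(x, m, var, prior):
    """Log of the Gaussian density times the prior, up to a shared constant."""
    return -0.5 * math.log(var) - (x - m) ** 2 / (2 * var) + math.log(prior)

def predict_qda(stats, x):
    # QDA: every class keeps its own variance.
    return max(stats, key=lambda c: log_score(x, *stats[c]))

def predict_lda(stats, x):
    # LDA: all classes share one pooled variance (prior-weighted average).
    shared = sum(prior * var for _, var, prior in stats.values())
    return max(stats, key=lambda c: log_score(x, stats[c][0], shared, stats[c][2]))

# Hypothetical toy data: class "a" is tightly clustered, class "b" is spread out.
toy = {
    "a": [-0.5, -0.2, 0.0, 0.2, 0.5],
    "b": [-3.0, -1.0, 1.0, 3.0, 5.0],
}
stats = fit(toy)
for x in (-3.0, 0.1):
    print(x, "LDA:", predict_lda(stats, x), "QDA:", predict_qda(stats, x))
```

On this toy data the two methods disagree at x = -3.0: LDA assigns the point to the class with the nearer mean, while QDA assigns it to the wide class because the narrow class is very unlikely to produce a value that far from its mean. This is the kind of disagreement that makes comparing several classifiers informative.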

On the Use of Modified Adaptive Nearest Neighbors for Classification (수정된 적응 최근접 방법을 활용한 판별분류방법에 대한 연구)

Maeng, Jin-Woo;Bang, Sung-Wan;Jhun, Myoung-Shic
- The Korean Journal of Applied Statistics / v.23 no.6 / pp.1093-1102 / 2010
Even though k-Nearest Neighbors Classification (KNNC) is one of the most popular non-parametric classification methods, it does not consider the local features and class information of each observation. To overcome these limitations, several methods have been developed, such as Adaptive Nearest Neighbors Classification (ANNC) and Modified k-Nearest Neighbors Classification (MKNNC). In this paper, we propose Modified Adaptive Nearest Neighbors Classification (MANNC), which combines the advantages of both ANNC and MKNNC. Through a real data analysis and a simulation study, we show that the proposed MANNC outperforms the other methods in terms of classification accuracy.
https://doi.org/10.5351/KJAS.2010.23.6.1093
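
For orientation, the baseline that this abstract starts from, plain k-nearest neighbors classification, can be sketched as below. This is only the standard KNNC with a majority vote; it does not implement ANNC, MKNNC, or the proposed MANNC, whose details are not given in the abstract. The training points and the function name `knn_predict` are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (point, label) pairs; points are equal-length tuples.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical two-class training set.
train = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"),
]
print(knn_predict(train, (0.5, 0.5)))
print(knn_predict(train, (5.5, 5.5)))
```

Every neighbor gets an equal vote here regardless of its distance or of how its own neighborhood looks, which is the limitation the abstract refers to.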

Classification of Power Quality Disturbances Using Feature Vector Combination and Neural Networks (특징벡터 결합과 신경회로망을 이용한 전력외란 식별)

Nam, Sang-Won
- Proceedings of the KIEE Conference / 1997.11a / pp.671-674 / 1997
The objective of this paper is to present a new feature-vector extraction method for the automatic detection and classification of power quality (PQ) disturbances, in which the FFT, the DWT (Discrete Wavelet Transform), and Fisher's criterion are used to extract an appropriate feature vector. The proposed classifier consists of three parts: (i) automatic detection of PQ disturbances, where the wavelet transform and a signal power estimation method are used to detect each disturbance; (ii) feature vector extraction from the detected disturbance; and (iii) automatic classification, where a Multi-Layer Perceptron (MLP) is used to classify each disturbance from the corresponding extracted feature vector. To demonstrate the performance and applicability of the proposed classification algorithm, test results obtained by analyzing 10 classes of power quality disturbances are also provided.
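
As an illustration of how Fisher's criterion can rank candidate features, the sketch below scores a single scalar feature for two classes as the squared difference of class means divided by the sum of class variances. The abstract does not say how the criterion is applied to the 10-class problem, so the two-class form, the sample values, and the name `fisher_score` are assumptions.

```python
from statistics import mean, pvariance

def fisher_score(class_a, class_b):
    """Two-class Fisher criterion for one scalar feature.

    Larger values mean the class means are far apart relative to the
    spread inside each class, i.e. the feature separates the classes well.
    """
    between = (mean(class_a) - mean(class_b)) ** 2
    within = pvariance(class_a) + pvariance(class_b)
    return between / within

# Hypothetical feature values for two disturbance classes.
feature_1 = ([1.0, 1.2, 0.8], [3.0, 3.2, 2.8])   # well separated
feature_2 = ([1.0, 3.0, 2.0], [2.0, 4.0, 3.0])   # heavily overlapping

score_1 = fisher_score(*feature_1)
score_2 = fisher_score(*feature_2)
print(score_1 > score_2)
```

Features with the highest scores would be kept in the feature vector and the rest discarded.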

Morphological Characterization and Classification of Anuran Tadpoles in Korea

Park, Dae-Sik;Cheong, Seo-Kwan;Sung, Ha-Cheol
- Journal of Ecology and Environment / v.29 no.5 / pp.425-432 / 2006
The tadpoles of 12 Korean anuran species, including Bombina orientalis, Bufo gargarizans, B. stejnegeri, Hyla japonica, Kaloula borealis, Rana dybowskii, R. huanrenensis, R. coreana, R. nigromaculata, R. chosenica, R. rugosa, and R. catesbeiana, were classified based on their morphological characteristics. We collected eggs or tadpoles of the 12 species from the Gangwon, Incheon, Chungcheong, and Gyeonggi regions during the 2005 and 2006 breeding seasons. When the tadpoles reached Gosner developmental stages 27 to 37, we described the morphological characteristics of the tadpoles of each species and measured physical parameters such as total length, body length, and body mass. We then chose 12 morphological characteristics, such as eye location, caudal musculature pattern, spiracle location, oral disc morphology, and labial tooth row formula, to identify each species and to serve as classification keys. In this paper, we present classification keys, morphological characteristics, and drawings for the tadpoles of the 12 anuran species.
https://doi.org/10.5141/JEFB.2006.29.5.425

A Study on Automatic Classification of Newspaper Articles Based on Unsupervised Learning by Departments (비지도학습 기반의 행정부서별 신문기사 자동분류 연구)

Kim, Hyun-Jong;Ryu, Seung-Eui;Lee, Chul-Ho;Nam, Kwang Woo
- Journal of the Korea Academia-Industrial cooperation Society / v.21 no.9 / pp.345-351 / 2020
Administrative agencies today are paying keen attention to big data analysis to improve their policy responsiveness. Of all the big data, news articles can be used to understand public opinion regarding policy and policy issues. The amount of news output has increased rapidly because of the emergence of new online media outlets, which calls for the use of automated bots or automatic document classification tools. There are, however, limits to the automatic collection of news articles related to specific agencies or departments based on the existing news article categories and keyword search queries. Thus, this paper proposes a method to process articles using classification glossaries that take into account each agency's different work features. To this end, classification glossaries were developed by extracting the work features of different departments using Word2Vec and topic modeling techniques from news articles related to different agencies. As a result, the automatic classification of newspaper articles for each department yielded approximately 71% accuracy. This study is meaningful in making academic and practical contributions because it presents a method of extracting the work features for each department, and it is an unsupervised learning-based automatic classification method for automatically classifying news articles relevant to each agency.
https://doi.org/10.5762/KAIS.2020.21.9.345

How is SWIR useful to discrimination and a classification of forest types?

Murakami, Takuhiko
- Proceedings of the KSRS Conference / 2003.11a / pp.760-762 / 2003
This study confirmed the usefulness of short-wavelength infrared (SWIR) in the discrimination and classification of evergreen forest types. A forested area near Hisayama and Sasaguri in Fukuoka Prefecture, Japan, served as the study area. Warm-temperate forest vegetation dominates the study site vegetation. Coniferous plantation forest, natural broad-leaved forest, and bamboo forest were analyzed using LANDSAT5/TM and SPOT4/HRVIR remote sensing data. Samples were extracted for the three forest types, and reflectance factors were compared for each band. Kappa coefficients of various band combinations were also compared by classification accuracy. For the LANDSAT5/TM data observed in April, October, and November, Bands 5 and 7 showed significant differences between bamboo, broad-leaved, and coniferous forests. The same significant difference was not recognized in the visible or near-infrared regions. Classification accuracy, determined by supervised classification, indicated distinct improvements in band combinations with SWIR, as compared to those without SWIR. Similar results were found for both LANDSAT5/TM and SPOT4/HRVIR data. This study identified obvious advantages in using SWIR data in forest-type discrimination and classification.
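
The kappa coefficient used in this abstract to compare band combinations is commonly computed from a confusion matrix of reference versus predicted classes. The sketch below implements the standard Cohen's kappa; the confusion matrix is a made-up example for three forest types and does not reproduce any figure from the study.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: reference, cols: predicted)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / total ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical counts for coniferous, broad-leaved, and bamboo forest samples.
confusion = [
    [40, 5, 5],
    [4, 38, 8],
    [6, 7, 37],
]
print(round(cohen_kappa(confusion), 2))
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so a band combination with a higher kappa classifies the forest types more reliably.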

Classification of Arrhythmia Based on Discrete Wavelet Transform and Rough Set Theory

Kim, M.J.;J.-S. Han;Park, K.H.;W.C. Bang;Z. Zenn Bien
- Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 2001.10a / pp.28.5-28 / 2001
This paper investigates a method for classifying electrocardiogram (ECG) signals into different disease categories. The features used for classification are the coefficients of the discrete wavelet transform (DWT) of the ECG signals. The coefficients are calculated with the Haar wavelet, and the DWT yields 64 coefficients. Each coefficient carries morphological information, and the coefficients may be good features when conventional time-domain features are not available. Since not all of them are meaningful, the set of coefficients needs to be reduced. The distribution of each coefficient can serve as a rule for classifying the ECG signal. The optimally reduced feature set is obtained by the fuzzy c-means algorithm and rough set theory. First, each coefficient is clustered by the fuzzy c-means algorithm and the clustered ...
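
The Haar DWT step can be sketched as follows. This is a generic full-depth orthonormal Haar decomposition, not the authors' code. It assumes the input window has 64 samples (a power of two), which would give the 64 coefficients the abstract mentions; the window length and the name `haar_dwt` are assumptions.

```python
import math

def haar_dwt(signal):
    """Full multi-level orthonormal Haar DWT.

    Returns [final approximation, coarsest details, ..., finest details],
    with the same number of coefficients as input samples.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        details = detail + details  # coarser levels are placed in front
    return approx + details

# A small worked case, then a 64-sample window as assumed above.
print(haar_dwt([4.0, 2.0, 6.0, 8.0]))
print(len(haar_dwt([float(i % 8) for i in range(64)])))
```

Because the transform is orthonormal, the sum of squared coefficients equals the sum of squared samples, so the later feature-reduction step discards information only by dropping coefficients.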

Object Detection from High Resolution Satellite Image by Using Genetic Algorithms

Hosomura Tsukasa
- Proceedings of the KSRS Conference / 2005.10a / pp.123-125 / 2005
Many researchers have worked to improve the classification accuracy of satellite images. Most studies have used the optical spectral information of each pixel for image classification. When this approach is applied to high-resolution satellite images, the number of classes increases. This is especially noticeable for houses, because roofs come in a wide variety of colors. Even if classification is carried out for many classes, the roof color of each house is not needed. In most cases, what is needed is whether an object is a house or not. In this study, we propose a method for detecting objects using Genetic Algorithms (GA). Aircraft were selected as the target object, since they are easy to detect at an airport. An aircraft image was used as the template. The object image was taken from QuickBird imagery. The target image includes an aircraft and Haneda Airport. Each chromosome has four or five parameters, consisting of template number, position (x, y), rotation angle, and enlargement rate. Good results were obtained in the experiment.

Decomposition of category mixture in a pixel and its application for supervised image classification

Matsumoto, Masao;Arai, Kohei;Ishimatsu, Takakazu
- Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 1992.10b / pp.514-519 / 1992
To accurately retrieve the proportion of each category within mixed pixels (mixels) of remotely sensed imagery, a maximum likelihood estimation method for category proportions is proposed. In this method, the observed multispectral vector is treated as a random variable, under the approximation that the supervised (training) data for each category follow a normal distribution. The results show that the method retrieves accurate category proportions within mixels. An index that estimates the degree of error in each category is also proposed. As one application of proportion estimation, an image classification method based on category proportion estimation is proposed. In this method, all pixels in a remotely sensed image are assumed to be mixels and are assigned to the most dominant category. Among the mixels there are unreliable pixels that should be labeled as unclassified. To identify them, two criteria, chi-square and AIC, are proposed as fitness tests of the pure-pixel hypothesis. Experimental results with a simulated dataset show the usefulness of the proposed classification criterion compared with the conventional maximum likelihood criterion, and the applicability of the fitness tests based on chi-square and AIC.

An Analysis of Service Classification Systems Provided by Major Korean Search Portals (주요 포털들의 서비스 분류체계 비교 분석)

Park, So-Yeon
- Journal of the Korean Society for Library and Information Science / v.44 no.2 / pp.241-262 / 2010
This study aims to perform an evaluation of classification systems provided by major Korean search portals, Naver, Nate, Daum, and Yahoo-Korea. These classification systems are evaluated in terms of the consistency of classification system, logicality of classification system, ease of interface, clarity of category names, order of category and site listing, and hierarchical structure. The results of this study show that each search portal provides separate classification systems for their services. These results imply that it is crucial for search portals to implement a common classification system and a common interface for their services. This study could contribute to the development and improvement of portals' classification systems.
https://doi.org/10.4275/KSLIS.2010.44.2.241
