• Title/Abstract/Keywords: Dimensionality Reduction

201 search results (processing time: 0.028 s)

An Assessment of a Random Forest Classifier for a Crop Classification Using Airborne Hyperspectral Imagery

  • Jeon, Woohyun;Kim, Yongil
    • 대한원격탐사학회지 (Korean Journal of Remote Sensing) / Vol. 34, No. 1 / pp. 141-150 / 2018
  • Crop type classification is essential for supporting agricultural decisions and resource monitoring. Remote sensing techniques, especially those using hyperspectral imagery, have been effective in agricultural applications. Hyperspectral imagery acquires contiguous, narrow spectral bands over a wide range. However, the large dimensionality results in unreliable classifier estimates and a high computational burden. Therefore, reducing the dimensionality of hyperspectral imagery is necessary. In this study, the Random Forest (RF) classifier was used for both dimensionality reduction and classification. RF, an ensemble-learning algorithm based on the Classification and Regression Tree (CART), has gained attention due to its high classification accuracy and fast processing speed. The RF performance for crop classification with airborne hyperspectral imagery was assessed. The study area was the cultivated area in Chogye-myeon, Habcheon-gun, Gyeongsangnam-do, South Korea, where the main crops are garlic, onion, and wheat. Parameter optimization was conducted to maximize the classification accuracy. Then, dimensionality reduction was conducted based on RF variable importance. The results show that using the selected bands yields excellent classification accuracy without using the whole dataset. Moreover, the majority of the selected bands are concentrated in the visible (VIS) region, especially the region related to chlorophyll content. Therefore, it can be inferred that the phenological status after the mature stage influences red-edge spectral reflectance.

BCI에서 기계 학습을 위한 간질 뇌파 특징 선택을 통한 차원 감소 방법 분석 (Analysis of Dimensionality Reduction Methods Through Epileptic EEG Feature Selection for Machine Learning in BCI)

  • 양통;;임창균
    • 한국전자통신학회논문지 (The Journal of the Korea Institute of Electronic Communication Sciences) / Vol. 13, No. 6 / pp. 1333-1342 / 2018
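  • Electroencephalography (EEG) has so far been the most important and convenient method for diagnosing and treating epilepsy. However, epileptic EEG signals are difficult to identify because their waveform characteristics are very weak and non-stationary and the background noise is strong. This paper analyzes the effect of dimensionality reduction through feature selection on the classification of epileptic EEG. We used principal component analysis (PCA), kernel principal component analysis (KPCA), and linear discriminant analysis (LDA) for dimensionality reduction. To analyze the performance of these methods, we evaluated them with Support Vector Machine (SVM), Logistic Regression (LR), K-Nearest Neighbor (K-NN), Decision Tree (DT), and Random Forest (RF) classifiers. In the experiments, PCA achieved 75% accuracy with SVM, LR, and K-NN. KPCA achieved 85% with SVM and K-NN, and LDA achieved 100% accuracy with K-NN. Dimensionality reduction with LDA therefore gave the best classification results for epileptic EEG signals.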

기계학습 기반 랜섬웨어 공격 탐지를 위한 효과적인 특성 추출기법 비교분석 (Comparative Analysis of Dimensionality Reduction Techniques for Advanced Ransomware Detection with Machine Learning)

  • 김한석;이수진
    • 융합보안논문지 (Convergence Security Journal) / Vol. 23, No. 1 / pp. 117-123 / 2023
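  • To detect increasingly sophisticated ransomware attacks with machine learning-based models, the classification model must be trained on data with high-dimensional features, and in that case the "curse of dimensionality" is likely to occur. Dimensionality reduction of the features must therefore be performed first, to avoid the curse of dimensionality while improving the accuracy and execution speed of the learning model. In this paper, we performed ransomware classification on two datasets with extremely different feature dimensions, applying three machine learning models and two feature extraction techniques. The experiments show that in binary classification, dimensionality reduction had no significant effect on performance, and the same held for multi-class classification when the feature dimension of the dataset was small. However, when multi-class classification was attempted on training data with high-dimensional features, LDA (Linear Discriminant Analysis) showed excellent performance.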

Identification of epistasis in ischemic stroke using multifactor dimensionality reduction and entropy decomposition

  • Park, Jung-Dae;Kim, Youn-Young;Lee, Chae-Young
    • BMB Reports / Vol. 42, No. 9 / pp. 617-622 / 2009
  • We investigated the genetic associations of ischemic stroke by identifying epistasis of its heterogeneous subtypes such as small vessel occlusion (SVO) and large artery atherosclerosis (LAA). Epistasis was analyzed with 24 genes in 207 controls and 271 patients (SVO = 110, LAA = 95) using multifactor dimensionality reduction and entropy decomposition. The multifactor dimensionality reduction analysis with any of the 1- to 4-locus models showed no significant association with LAA (P > 0.05). The analysis of SVO, however, revealed a significant association in the best 3-locus model with P10L of TGF-$\beta$1, C1013T of SPP1, and R485K of F5 (testing balanced accuracy = 63.17%, P < 0.05). Subsequent entropy analysis also revealed that such heterogeneity was present: quite a large entropy was estimated among the 3 loci for SVO (5.43%), but only a relatively small entropy was estimated for LAA (1.81%). This suggests that the synergistic epistasis model might contribute specifically to the pathogenesis of SVO, which implies a different etiopathogenesis of the ischemic stroke subtypes.

Effective Dimensionality Reduction of Payload-Based Anomaly Detection in TMAD Model for HTTP Payload

  • Kakavand, Mohsen;Mustapha, Norwati;Mustapha, Aida;Abdullah, Mohd Taufik
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 10, No. 8 / pp. 3884-3910 / 2016
  • An Intrusion Detection System (IDS) generally processes a large amount of data that is highly redundant and irrelevant. This trait causes slow training and assessment procedures, high resource consumption, and a poor detection rate. Due to their expensive computational requirements during both training and detection, IDSs are mostly ineffective for real-time anomaly detection. This paper proposes a dimensionality reduction technique based on Principal Component Analysis (PCA) that is able to enhance the performance of IDSs up to constant time O(1). Furthermore, the present study offers a feature selection approach for identifying major components in real time. The PCA algorithm transforms high-dimensional feature vectors into a low-dimensional feature space, which is used to determine the optimum volume of factors. The proposed approach was assessed using the HTTP packet payloads of the ISCX 2012 IDS and DARPA 1999 datasets. The experimental outcome demonstrated that the proposed anomaly detection achieved promising results, with a 97% detection rate and a 1.2% false positive rate for the ISCX 2012 dataset and a 100% detection rate and a 0.06% false positive rate for the DARPA 1999 dataset. The proposed anomaly detection also achieved comparable performance in terms of computational complexity when compared to three state-of-the-art anomaly detection systems.
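The PCA step mentioned in this abstract, mapping high-dimensional feature vectors to a low-dimensional space, can be illustrated with a minimal sketch. This is generic PCA using power iteration on the covariance matrix, not the TMAD model itself; the function names and the toy data are assumptions made for illustration only.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (column means, unit vector of the first principal component).

    Generic PCA via power iteration; assumes the all-ones start vector is
    not orthogonal to the leading component and that the data are not constant.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalise to unit length.
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v

def project(rows, means, v):
    """Project each centred row onto the component: d features -> 1 score."""
    return [sum((r[j] - means[j]) * v[j] for j in range(len(v))) for r in rows]

# Hypothetical toy data: two perfectly correlated features (y = 2x).
rows = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
means, v = first_principal_component(rows)
scores = project(rows, means, v)
```

For the toy data, the component points along the direction (1, 2), so a single score per row preserves all of the variance.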

Gene-Gene Interaction Analysis for the Accelerated Failure Time Model Using a Unified Model-Based Multifactor Dimensionality Reduction Method

  • Lee, Seungyeoun;Son, Donghee;Yu, Wenbao;Park, Taesung
    • Genomics & Informatics / Vol. 14, No. 4 / pp. 166-172 / 2016
  • Although a large number of genetic variants have been identified as associated with common diseases through genome-wide association studies, there still exist limitations in explaining the missing heritability. One approach to solving this missing heritability problem is to investigate gene-gene interactions, rather than taking a single-locus approach. For gene-gene interaction analysis, the multifactor dimensionality reduction (MDR) method has been widely applied, since the constructive induction algorithm of MDR efficiently reduces high-order dimensions into one dimension by classifying multi-level genotypes into high- and low-risk groups. The MDR method has been extended to various phenotypes and has been improved to provide a significance test for gene-gene interactions. In this paper, we propose a simple method, called accelerated failure time (AFT) UM-MDR, in which the idea of a unified model-based MDR is extended to the survival phenotype by incorporating AFT-MDR into the classification step. The proposed AFT UM-MDR method is compared with AFT-MDR through simulation studies, and a short discussion is given.
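The constructive induction step described in this abstract, pooling multi-locus genotypes into high- and low-risk groups, can be sketched for a plain case-control setting. This illustrates basic MDR pooling only, not the AFT UM-MDR method; the function names, the threshold (the overall case:control ratio), and the toy data are assumptions.

```python
from collections import defaultdict

def mdr_pool(genotypes, status):
    """Label each multi-locus genotype cell high-risk (1) or low-risk (0).

    genotypes: one tuple of genotype codes (0/1/2) per subject.
    status:    0 for control, 1 for case, same length as genotypes.
    Assumes at least one control, so the overall ratio is defined.
    """
    n_cases = sum(status)
    n_controls = len(status) - n_cases
    threshold = n_cases / n_controls  # assumed cut-off: overall case:control ratio

    counts = defaultdict(lambda: [0, 0])  # cell -> [controls, cases]
    for cell, y in zip(genotypes, status):
        counts[cell][y] += 1

    risk = {}
    for cell, (controls, cases) in counts.items():
        # Compare as a product so cells without controls need no division.
        risk[cell] = 1 if cases >= threshold * controls else 0
    return risk

def reduce_to_one_dimension(genotypes, risk):
    """Replace each subject's multi-locus genotype by its pooled risk label."""
    return [risk[cell] for cell in genotypes]

# Hypothetical toy data: two loci, eight subjects.
genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 1), (1, 1), (1, 1), (2, 0)]
status = [0, 0, 1, 0, 1, 1, 0, 1]
risk = mdr_pool(genotypes, status)
reduced = reduce_to_one_dimension(genotypes, risk)
```

The two-locus genotype of each subject is replaced by a single binary attribute, which is the "reduction to one dimension" the abstract refers to.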

CART 알고리즘을 활용한 확장된 다중인자 차원축소방법의 검정력 평가 (Power of Expanded Multifactor Dimensionality Reduction with CART Algorithm)

  • 이제영;이종형;이호근
    • Communications for Statistical Applications and Methods / Vol. 17, No. 5 / pp. 667-678 / 2010
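  • The multifactor dimensionality reduction (MDR) method, proposed for analyzing human gene interactions, cannot be applied to continuous data. An expanded MDR method using the CART algorithm was proposed to address this limitation, but its power has not been established. In this study, we therefore evaluate the power of the expanded MDR method with the CART algorithm through simulation and, based on the confirmed power, apply it to real Hanwoo (Korean cattle) data to identify the superior gene combinations that influence the economic traits of Hanwoo.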

연속형자료의 유전자 상호작용 규명을 위한 SVM MDR과 D-MDR의 방법 비교 (A Comparison Study on SVM MDR and D-MDR for Detecting Gene-Gene Interaction in Continuous Data)

  • 이종형;이제영
    • Communications for Statistical Applications and Methods / Vol. 18, No. 4 / pp. 413-422 / 2011
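  • In genetics, the nonparametric Multifactor Dimensionality Reduction (MDR) method was proposed for identifying gene-gene interaction effects and remains in use today. MDR is suited to binary data and has the drawback that it cannot be applied to continuous data. To overcome this limitation, methods such as Dummy MDR (D-MDR) and SVM-based MDR (SVM MDR) have been proposed. In this paper, we compare the SVM MDR and D-MDR methods, both applicable to continuous data, and apply both to real Hanwoo (Korean cattle) data. Based on the results of each method, we identify the gene interaction combinations that influence the overall economic traits of Hanwoo. Finally, by comparing the strengths and weaknesses of the existing SVM MDR and D-MDR methods, we suggest directions for future research.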

Multifactor-Dimensionality Reduction in the Presence of Missing Observations

  • Chung, Yu-Jin;Lee, Seung-Yeoun;Park, Tae-Sung
    • 한국통계학회:학술대회논문집 (Korean Statistical Society: Conference Proceedings) / Proceedings of the Korean Statistical Society 2005 Autumn Conference / pp. 31-36 / 2005
  • The identification and characterization of susceptibility genes for common complex multifactorial diseases is a challenging task, in which the effect of a single genetic variation is likely to depend on other genetic variations (gene-gene interaction) and environmental factors (gene-environment interaction). To address this issue, multifactor dimensionality reduction (MDR) has been proposed and implemented by Ritchie et al. (2001), Moore et al. (2002), Hahn et al. (2003) and Ritchie et al. (2003). With MDR, multilocus genotypes effectively reduce the dimension of genotype predictors from n to one, which improves the identification of polymorphism combinations associated with disease risk. However, MDR cannot handle missing observations appropriately, because a missing observation is treated as an additional genotype category. This approach may suffer from a sparseness problem: when high-order interactions are considered, an additional missing category makes the contingency table cells more sparse. We propose a new MDR approach with minimum loss of sample size by considering missing data over all possible multifactor classes. We evaluate the proposed MDR by using prediction errors and cross-validation consistency.
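The sparseness problem noted in this abstract can be made concrete with a small counting sketch. It assumes three genotype levels per locus (as for a biallelic marker) and a hypothetical sample of 400 subjects; the function names are illustrative and do not come from the paper.

```python
def cells_per_table(n_loci, levels=3, missing_as_category=False):
    """Number of cells in the multi-locus contingency table."""
    per_locus = levels + 1 if missing_as_category else levels
    return per_locus ** n_loci

def mean_subjects_per_cell(n_subjects, n_loci, missing_as_category=False):
    """Average number of subjects per cell if spread evenly over the table."""
    return n_subjects / cells_per_table(
        n_loci, missing_as_category=missing_as_category)

# Hypothetical example: 400 subjects, interaction orders 1 to 4.
for k in range(1, 5):
    without = mean_subjects_per_cell(400, k)
    with_missing = mean_subjects_per_cell(400, k, missing_as_category=True)
    print(k, cells_per_table(k), cells_per_table(k, missing_as_category=True),
          round(without, 2), round(with_missing, 2))
```

The cell count grows as 3^k without a missing category and as 4^k with one, so the average number of subjects per cell falls much faster at higher interaction orders when missing values form their own category.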


차원 감소 기법을 이용한 전자 상거래 추천 시스템 (Development of a Recommender System for E-Commerce Sites Using a Dimensionality Reduction Technique)

  • 김용수;염봉진
    • 대한산업공학회지 (Journal of the Korean Institute of Industrial Engineers) / Vol. 36, No. 3 / pp. 193-202 / 2010
  • The recommender system is a typical software solution for personalized services which are now popular in e-commerce sites. Most of the existing recommender systems are based on customers' explicit rating data on items (e.g., ratings on movies), and it is only recently that recommender systems based on implicit ratings have been proposed as a better alternative. Implicit ratings of a customer on those items that are clicked but not purchased can be inferred from the customer's navigational and behavioral patterns. In this article, a dimensionality reduction (DR) technique is newly applied to the implicit rating-based recommender system, and its effectiveness is assessed using an experimental e-commerce site. The experimental results indicate that the performance of the proposed approach is superior or at least similar to the conventional collaborative filtering (CF)-based approach unless the number of recommended products is 'large.' In addition, the proposed approach requires less memory space and is computationally more efficient.