• Title/Abstract/Keywords: random rotation ensemble

Search results: 6 (processing time: 0.016 s)

Ensemble model through mixed projections useful for big data analytics

  • 박혜준;김현중;이영섭
    • The Korean Journal of Applied Statistics / Vol. 37, No. 5 / pp. 691-702 / 2024
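  • This paper proposes mixed projection forest (MPF), a new classification ensemble method that is useful for big data analytics. When training the individual classifiers in the ensemble, MPF uses rotation matrices obtained by combining data projection techniques such as principal component analysis (PCA) and canonical linear discriminant analysis (CDA). This lets each classifier use oblique hyperplanes, which improves its accuracy. In addition, random partitioning of the variable set yields a variety of rotation matrices, which increases the diversity among the individual classifiers. This approach ultimately improves classification performance, making it highly effective for big data analytics that require precision. The paper compares the performance of MPF with that of traditional classification ensemble models on 30 real and simulated datasets. The results confirm that MPF is highly competitive in terms of both classification performance and classifier diversity.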
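The abstract above does not spell out how MPF builds its rotation matrices, so the following is only a minimal sketch of the general idea it describes: randomly partition the variables, run a projection on each subset, and assemble the results into one rotation matrix. It assumes NumPy and uses PCA alone (the CDA component is omitted). The function name `random_partition_rotation` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def random_partition_rotation(X, n_subsets, rng):
    """Sketch: rotation matrix from PCA on a random partition of the features.

    Hypothetical helper, not the MPF algorithm itself (PCA only, no CDA).
    """
    n_features = X.shape[1]
    # Randomly split the feature indices into disjoint subsets.
    subsets = np.array_split(rng.permutation(n_features), n_subsets)
    R = np.zeros((n_features, n_features))
    for idx in subsets:
        # Center the selected columns and get their principal axes via SVD.
        Xc = X[:, idx] - X[:, idx].mean(axis=0)
        _, _, vt = np.linalg.svd(Xc, full_matrices=True)
        # Write the principal axes into the block belonging to these features.
        R[np.ix_(idx, idx)] = vt.T
    return R

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))          # toy data: 50 samples, 6 variables
R = random_partition_rotation(X, n_subsets=3, rng=rng)
X_rot = X @ R                         # rotated data for one base classifier
```

Because each block is orthogonal, the assembled matrix is orthogonal as well. Drawing a new random partition for every base classifier gives each one a different rotation, which is the source of diversity the abstract refers to.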

Ensemble Learning for Underwater Target Classification

  • 석종원
    • Journal of Korea Multimedia Society / Vol. 18, No. 11 / pp. 1261-1267 / 2015
  • The problem of underwater target detection and classification has attracted substantial attention and has been studied by many researchers for both military and non-military purposes. The problem is complicated by various environmental conditions. In this paper, we study classifier ensemble methods for active sonar target classification to improve classification performance. In general, classifier ensemble methods are useful for classifiers whose variances are relatively large, such as decision trees and neural networks. Bagging, Random selection samples, Random subspace, and Rotation forest are selected as the classifier ensemble methods. Using the four ensemble methods, each based on 31 neural network classifiers, classification tests were carried out and the performances were compared.
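To make two of the methods named above concrete, here is a minimal sketch in standard-library Python of how Bagging and Random subspace typically prepare the training input for each ensemble member. Bagging draws samples with replacement, and Random subspace draws a subset of features. The abstract does not say how the 31 members' outputs are combined, so the majority vote shown here is an assumption, and the sample count, feature count, and all function names are hypothetical.

```python
import random
from collections import Counter

def bootstrap_indices(n_samples, rng):
    # Bagging: draw n_samples row indices with replacement.
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def random_subspace(n_features, k, rng):
    # Random subspace: pick k distinct feature indices.
    return sorted(rng.sample(range(n_features), k))

def majority_vote(predictions):
    # Assumed combiner: return the most frequent predicted label.
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)
n_members = 31  # ensemble size taken from the abstract

# One training plan per ensemble member (toy sizes: 100 samples, 20 features).
bagging_plan = [bootstrap_indices(100, rng) for _ in range(n_members)]
subspace_plan = [random_subspace(20, 10, rng) for _ in range(n_members)]

# Combining three hypothetical member predictions for one sonar return.
label = majority_vote(["target", "clutter", "target"])
```

An odd ensemble size such as 31 avoids ties when a majority vote is taken over two classes.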

Learning to Prevent Inactive Student of Indonesia Open University

  • Tama, Bayu Adhi
    • Journal of Information Processing Systems / Vol. 11, No. 2 / pp. 165-172 / 2015
  • The inactive student rate is becoming a major problem in most open universities worldwide. In Indonesia, roughly 36% of students were found to be inactive in 2005. Data mining has been successfully employed to solve problems in many domains, including education. We propose a method for preventing student inactivity by mining knowledge from student record systems with several state-of-the-art ensemble methods, such as Bagging, AdaBoost, Random Subspace, Random Forest, and Rotation Forest. The most influential attributes affecting student inactivity, as well as demographic attributes (marital status and employment), were successfully identified. The complexity and accuracy of the classification techniques were also compared, and the experimental results show that Rotation Forest, with a decision tree as the base classifier, achieves the best performance compared to the other classifiers.

A Comparative Study of Phishing Websites Classification Based on Classifier Ensemble

  • Tama, Bayu Adhi;Rhee, Kyung-Hyune
    • Journal of Korea Multimedia Society / Vol. 21, No. 5 / pp. 617-625 / 2018
  • Phishing websites have become a crucial concern in cyber security applications. Phishing is performed by fraudulently deceiving users with the aim of obtaining their sensitive information, such as bank account information, credit card details, usernames, and passwords. The threat has led to huge losses for online retailers, e-business platforms, and financial institutions, to name but a few. One way to build an anti-phishing detection mechanism is to construct a classification algorithm based on machine learning techniques. The objective of this paper is to compare different classifier ensemble approaches, i.e., random forest, rotation forest, gradient boosted machine, and extreme gradient boosting, against single classifiers, i.e., decision tree, classification and regression tree, and credal decision tree, in the case of website phishing. Area under the ROC curve (AUC) is employed as the performance metric, whilst statistical tests are used as a baseline indicator for evaluating significance among classifiers. The paper contributes to the existing literature by providing a benchmark of classifier ensembles for web phishing detection.
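As an illustration of the AUC metric mentioned above, the sketch below computes it directly from its pairwise definition: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half. The function name `roc_auc`, the labels, and the scores are made up for illustration and are not data from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: 1 = phishing, 0 = legitimate; scores are classifier outputs.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
auc = roc_auc(labels, scores)  # 5 of the 6 pairs are ranked correctly
```

This quadratic-time form is only meant to show what the number measures. On larger data a rank-based computation, or a library routine, would normally be used instead.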

Automated Phase Identification in Shingle Installation Operation Using Machine Learning

  • Dutta, Amrita;Breloff, Scott P.;Dai, Fei;Sinsel, Erik W.;Warren, Christopher M.;Wu, John Z.
    • International Conference Proceedings / The 9th International Conference on Construction Engineering and Project Management / pp. 728-735 / 2022
  • Roofers are exposed to an increased risk of knee musculoskeletal disorders (MSDs) at different phases of a sloped shingle installation task. As different phases are associated with different risk levels, this study explored the application of machine learning for automated classification of seven phases in a shingle installation task using knee kinematics and roof slope information. An optical motion capture system was used to collect knee kinematics data from nine subjects who mimicked shingle installation on a slope-adjustable wooden platform. Four features were used in building a phase classification model: three knee joint rotation angles (i.e., flexion, abduction-adduction, and internal-external rotation) of the subjects, and the roof slope at which they operated. Three ensemble machine learning algorithms (i.e., random forests, decision trees, and k-nearest neighbors) were used for training and prediction. The simulations indicate that the k-nearest neighbor classifier provided the best performance, with an overall accuracy of 92.62%, demonstrating the considerable potential of machine learning methods in detecting shingle installation phases from workers' knee joint rotation and roof slope information. This knowledge, with further investigation, may facilitate knee MSD risk identification among roofers and intervention development.

A Comparative Study of Phishing Websites Classification Based on Classifier Ensembles

  • Tama, Bayu Adhi;Rhee, Kyung-Hyune
    • Journal of Multimedia Information System / Vol. 5, No. 2 / pp. 99-104 / 2018
  • Phishing websites have become a crucial concern in cyber security applications. Phishing is performed by fraudulently deceiving users with the aim of obtaining their sensitive information, such as bank account information, credit card details, usernames, and passwords. The threat has led to huge losses for online retailers, e-business platforms, and financial institutions, to name but a few. One way to build an anti-phishing detection mechanism is to construct a classification algorithm based on machine learning techniques. The objective of this paper is to compare different classifier ensemble approaches, i.e., random forest, rotation forest, gradient boosted machine, and extreme gradient boosting, against single classifiers, i.e., decision tree, classification and regression tree, and credal decision tree, in the case of website phishing. Area under the ROC curve (AUC) is employed as the performance metric, whilst statistical tests are used as a baseline indicator for evaluating significance among classifiers. The paper contributes to the existing literature by providing a benchmark of classifier ensembles for web phishing detection.