• Title/Summary/Keyword: tree classification method


Application of Random Forest Algorithm for the Decision Support System of Medical Diagnosis with the Selection of Significant Clinical Test (의료진단 및 중요 검사 항목 결정 지원 시스템을 위한 랜덤 포레스트 알고리즘 적용)

  • Yun, Tae-Gyun;Yi, Gwan-Su
    • The Transactions of The Korean Institute of Electrical Engineers / v.57 no.6 / pp.1058-1062 / 2008
  • In a clinical decision support system (CDSS), unlike a rule-based expert method, an appropriate data-driven machine learning method can readily provide information on individual features (clinical tests) for disease classification. However, currently developed methods focus on improving classification accuracy for diagnosis. By analyzing feature importance in classification, one may infer novel clinical test sets that strongly differentiate specific diseases or disease states. Against this background, we introduce a novel CDSS that integrates a classifier and a feature selection module. The random forest algorithm is applied as both the classifier and the feature importance measure. The system selects the significant clinical tests that discriminate the diseases by examining the classification error during backward elimination of features. The superior performance of the random forest algorithm in clinical classification was assessed against an artificial neural network and a decision tree algorithm using the breast cancer, diabetes, and heart disease data in the UCI Machine Learning Repository. Tests on the same data sets show that the proposed system can successfully select the significant clinical test set for each disease.
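
The backward-elimination step described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a leave-one-out nearest-centroid classifier stands in for the random forest, the feature to drop is chosen by the error obtained without it rather than by a random forest importance measure, and every name (`loo_error`, `backward_elimination`, `ROWS`, `LABELS`) and all toy values are hypothetical.

```python
from statistics import mean

def loo_error(rows, labels, features):
    """Leave-one-out error of a nearest-centroid classifier on the given features."""
    classes = sorted(set(labels))
    mistakes = 0
    for i, (row, label) in enumerate(zip(rows, labels)):
        # Build class centroids from every sample except the held-out one.
        centroids = {}
        for cls in classes:
            members = [r for j, (r, l) in enumerate(zip(rows, labels))
                       if j != i and l == cls]
            if members:
                centroids[cls] = [mean(m[f] for m in members) for f in features]
        point = [row[f] for f in features]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        mistakes += predicted != label
    return mistakes / len(rows)

def backward_elimination(rows, labels, n_features):
    """Drop one feature per round and remember the subset with the lowest error."""
    current = list(range(n_features))
    best_subset, best_error = current[:], loo_error(rows, labels, current)
    while len(current) > 1:
        # Try removing each remaining feature; keep the removal that hurts least.
        candidates = [[f for f in current if f != drop] for drop in current]
        current = min(candidates, key=lambda subset: loo_error(rows, labels, subset))
        error = loo_error(rows, labels, current)
        if error <= best_error:  # on ties, prefer the smaller subset
            best_subset, best_error = current[:], error
    return best_subset, best_error

# Toy data (invented): feature 0 separates the classes, features 1 and 2 are noise.
ROWS = [
    [1.0, 50, 0.3], [1.2, 10, 0.7], [0.9, 90, 0.5], [1.1, 30, 0.1],
    [3.0, 20, 0.4], [3.2, 80, 0.2], [2.9, 40, 0.6], [3.1, 60, 0.8],
]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]

SELECTED, ERROR = backward_elimination(ROWS, LABELS, 3)
print(SELECTED, ERROR)
```

Tracking the error at each elimination round is what lets such a system report a small set of discriminating tests instead of only a final accuracy figure.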

A Study on Building Structures and Processes for Intelligent Web Document Classification (지능적인 웹문서 분류를 위한 구조 및 프로세스 설계 연구)

  • Jang, Young-Cheol
    • Journal of Digital Convergence / v.6 no.4 / pp.177-183 / 2008
  • This paper offers a solution based on intelligent document classification for building a user-centric information retrieval system that accepts user-centric linguistic expressions. To this end, it proposes structures for expressing user intention and a fine-grained document classification process that uses EBL, similarity, a knowledge base, and user intention. To avoid the need for large amounts of exact semantic information, a hybrid process is designed that integrates keyword, thesaurus, probability, and user intention information. A user intention tree hierarchy is built, and a method for extracting group intention between keywords and user intentions is proposed. These structures and processes are implemented in the HDCI (Hybrid Document Classification with Intention) system. HDCI consists of two stages: analyzing user intention and classifying web documents. The classifying stage is composed of a knowledge base process, a similarity process, and a hybrid coordinating process. With the help of the user-intention structures and the hybrid coordinating process, HDCI can efficiently categorize web documents according to a user's complex linguistic expressions with little prior information.


Deep Learning Based Tree Recognition rate improving Method for Elementary and Middle School Learning

  • Choi, Jung-Eun;Yong, Hwan-Seung
    • Journal of the Korea Society of Computer and Information / v.24 no.12 / pp.9-16 / 2019
  • The goal of this study is to propose an efficient model for recognizing and classifying tree images, and to measure its accuracy, so that it can be applied to smart devices used in class. Between the 2009 and 2015 textbook revisions, plant recognition using smart devices was added as a learning objective in the fourth-grade elementary school science textbook. In this study, we compared tree recognition rates before and after retraining a pre-trained Inception V3 model provided by Google. A tree can be distinguished by several features, including shape, bark, leaves, flowers, and fruit, all of which can affect the recognition rate. Furthermore, when a tree sheds its leaves in winter, identifying its type becomes difficult because only the bark and at most a few leaves remain. We therefore present an effective tree classification model that combines images by tree type and combines models according to the accuracy for each tree type. We hope that this model will be applied to smart devices used in educational settings.

Missing Value Imputation Method Using CART : For Marital Status in the Population and Housing Census (CART를 활용한 결측값 대체방법 : 인구주택총조사 혼인상태 항목을 중심으로)

  • 김영원;이주원
    • Survey Research / v.4 no.2 / pp.1-21 / 2003
  • We propose imputation strategies for marital status in the 2000 Population and Housing Census of Korea to illustrate effective missing value imputation methods for social surveys. Marital status, which has a relatively high non-response rate in the Census, is used to develop effective imputation procedures. The Classification and Regression Tree (CART) is employed both to construct the imputation cells for hot-deck imputation and to predict missing values with a model-based approach. We compare two imputation methods: CART model-based imputation and sequential hot-deck imputation based on CART. We also check whether separate models for each region improve the results. The results suggest that the proposed hot-deck imputation based on CART is very efficient and can be strongly recommended, and that separate modeling for each region is not necessary.
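
Sequential hot-deck imputation within imputation cells can be sketched as below. The sketch assumes the cells have already been formed. In the paper they come from CART; here a hypothetical `cell_of` function that groups records by age band and sex stands in for the tree's leaves. The record fields and values are invented for illustration.

```python
def sequential_hot_deck(records, target, cell_of):
    """Fill missing `target` values with the most recent donor from the same cell."""
    last_donor = {}  # cell -> most recently observed value of `target`
    imputed = []
    for record in records:  # records are processed in file order
        cell = cell_of(record)
        filled = dict(record)
        if filled[target] is None:
            # Use the latest respondent in this cell; stays None if none seen yet.
            filled[target] = last_donor.get(cell)
        else:
            last_donor[cell] = filled[target]
        imputed.append(filled)
    return imputed

def cell_of(record):
    """Hypothetical imputation cell: (ten-year age band, sex)."""
    return (record["age"] // 10 * 10, record["sex"])

# Invented records; None marks a non-response on marital status.
RECORDS = [
    {"age": 34, "sex": "F", "marital": "married"},
    {"age": 22, "sex": "M", "marital": "single"},
    {"age": 38, "sex": "F", "marital": None},
    {"age": 25, "sex": "M", "marital": None},
    {"age": 61, "sex": "F", "marital": None},
]

RESULT = sequential_hot_deck(RECORDS, "marital", cell_of)
print([r["marital"] for r in RESULT])
```

A record whose cell has no earlier respondent is left missing in this sketch; a production procedure would need a fallback rule for that case.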


An Analysis of Nursing Needs for Hospitalized Cancer Patients;Using Data Mining Techniques (데이터 마이닝을 이용한 입원 암 환자 간호 중증도 예측모델 구축)

  • Park, Sun-A
    • Asian Oncology Nursing / v.5 no.1 / pp.3-10 / 2005
  • Background: Nurses now make up one third of all hospital human resources, so efficient management of nursing manpower is becoming more important. Although it is clear that nursing workload requirement analysis and patient severity classification must come first for efficient allocation of the nursing workforce, these processes have been conducted manually with ad hoc rules. Purposes: This study aimed to build a prediction model for patient classification according to nursing need. We sought an easier and faster method of classifying patients that can support efficient management of nursing manpower. Methods: The nursing patient classification data of cancer patients hospitalized in one of the biggest cancer centers in Korea from January 1 to December 31, 2003 were assessed by trained nurses. This study developed a prediction model and analyzed nursing needs using data mining techniques. Patients were classified by three different data mining techniques (logistic regression, decision tree, and neural network), and the results were assessed. Results: The data set was created from 165,073 records in the classification database of 2,228 patients. The main explanatory variables for the three techniques were as follows. 1) Logistic regression: age, month, and section. 2) Decision tree: section, month, age, and tumor. 3) Neural network: section, diagnosis, age, sex, metastasis, hospital days, and month. Among the three techniques, the neural network showed the best predictive power in ROC curve verification. The patient classification prediction model developed with the neural network based on nursing needs achieved a prediction accuracy of 84.06%. Conclusion: A patient classification prediction model was developed and tested in this study using real patient data. The result can be used for more accurate calculation of required nursing staff and effective use of the labor force.


Automatic Recognition of Digital Modulation Types using Wavelet Transformation (웨이브릿 변환을 이용한 디지털 변조타입 자동 인식)

  • Park, Cheol-Sun;Nah, Sun-Phil;Yang, Jong-Won;Choi, Jun-Ho
    • Journal of the Institute of Electronics Engineers of Korea TC / v.45 no.4 / pp.22-30 / 2008
  • In this paper, we present a modulation classification method using the wavelet transform (WT) that can classify incident digital signals without a priori information. The key features should be sensitive to modulation type and insensitive to SNR variation. Four key features for modulation recognition are selected from the WT coefficients, which are insensitive to changes in noise. Numerical simulations classifying 8 digital modulation types with these features are performed. Three types of modulation classifier (DTC, MDC, and SVMC) are simulated to investigate classification accuracy and execution time for designing the modulation classification module of a software radio. The simulation results indicate that MDC and DTC had the best execution time, and that MDC and SVMC showed good classification performance.

Proposal of e-Book Classification Method using DRFP-Tree (DRFP-Tree를 이용한 e-Book분류방법 제안)

  • Kim, Jong Yeup;Cho, Kyung Soo;Kim, Ung-mo
    • Annual Conference of KIPS / 2010.11a / pp.6-9 / 2010
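  • Since Amazon.com released its dedicated e-Book reader, the Kindle, in the United States in 2007, not only major players such as Sony and the large bookseller Barnes&Noble but also many small and mid-sized companies have entered the e-Book market. Recently, with Apple having launched the iPad and entered the e-Book market, Google has also announced that it will enter the market after June, making competition even fiercer. With the rapid spread of e-Books, the need for automatic book classification is growing at organizations that manage such vast collections. Existing document classification methods have mostly been manual, or have treated a document as a set of text (words) and applied machine learning methods either as-is or with slight modifications. This proposal presents a method that uses the DRFP-Tree structure from the data mining field to store the patterns of sentences in an e-Book and uses these patterns to classify e-Books.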

Korean Traditional Music Genre Classification Using Sample and MIDI Phrases

  • Lee, JongSeol;Lee, MyeongChun;Jang, Dalwon;Yoon, Kyoungro
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.4 / pp.1869-1886 / 2018
  • This paper proposes a MIDI- and audio-based music genre classification method for Korean traditional music. There are many traditional instruments in Korea, and most of the traditional songs played on them have similar patterns and rhythms. Although music information processing tasks such as music genre classification and audio melody extraction have been studied, most studies have focused on pop, jazz, rock, and other universal genres. There are few studies on Korean traditional music because of the lack of datasets. This paper analyzes raw audio and MIDI phrases in Korean traditional music performed on Korean traditional musical instruments. The samples and MIDI phrases classified by our classification system will be used to construct a database or to implement our Kontakt-based instrument library. Thus, we can construct a management system for a Korean traditional music library using this classification system. Appropriate feature sets for raw audio and MIDI phrases are proposed, and the classification results, based on machine learning algorithms such as support vector machine, multi-layer perceptron, decision tree, and random forest, are outlined in this paper.

Feature-Oriented Adaptive Motion Analysis For Recognizing Facial Expression (특징점 기반의 적응적 얼굴 움직임 분석을 통한 표정 인식)

  • Noh, Sung-Kyu;Park, Han-Hoon;Shin, Hong-Chang;Jin, Yoon-Jong;Park, Jong-Il
    • 한국HCI학회:학술대회논문집 / 2007.02a / pp.667-674 / 2007
  • Facial expressions provide significant clues about a person's emotional state; however, recognizing them effectively and reliably has always been a great challenge for machines. In this paper, we report a method of feature-based adaptive motion energy analysis for recognizing facial expressions. Our method optimizes the information gain heuristics of the ID3 tree and introduces new approaches to (1) facial feature representation, (2) facial feature extraction, and (3) facial feature classification. We use a minimal set of reasonable facial features, suggested by the information gain heuristics of the ID3 tree, to represent the geometric face model. Feature extraction proceeds as follows. Features are first detected and then carefully "selected." Feature "selection" means differentiating features with high variability from those with low variability, in order to estimate each feature's motion pattern effectively. For each facial feature, motion analysis is performed adaptively; that is, each feature's motion pattern (from the neutral face to the expressed face) is estimated based on its variability. After feature extraction, the facial expression is classified using the ID3 tree (which is built from the 1728 possible facial expressions) and the test images from the JAFFE database. The proposed method overcomes the problems raised by previous methods. First, it is simple but effective: it estimates the expressive facial features effectively and reliably by differentiating features with high variability from those with low variability. Second, it is fast because it avoids complicated or time-consuming computations and instead exploits the motion energy values of a few selected expressive features (acquired from an intensity-based threshold). Lastly, it gives reliable recognition, with an overall recognition rate of 77%. The effectiveness of the proposed method is demonstrated by the experimental results.
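
The information gain heuristic that ID3 uses to choose a splitting feature can be sketched as follows. This is a generic illustration of the standard ID3 criterion, not the optimized variant the abstract reports; the feature names (`brow`, `mouth`), labels, and toy rows are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the samples on `feature`."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature], []).append(label)
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder

def best_feature(rows, labels, features):
    """Feature ID3 would split on first: the one with the highest information gain."""
    return max(features, key=lambda f: information_gain(rows, labels, f))

# Invented toy data: discretized facial feature states and expression labels.
ROWS = [
    {"brow": "raised", "mouth": "open"},
    {"brow": "raised", "mouth": "closed"},
    {"brow": "neutral", "mouth": "open"},
    {"brow": "neutral", "mouth": "closed"},
]
LABELS = ["surprise", "surprise", "neutral", "neutral"]

print(best_feature(ROWS, LABELS, ["brow", "mouth"]))
```

In this toy set the brow state determines the label completely, so its gain equals the full label entropy, while the mouth state carries no information about the label.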


A Study on Improving the predict accuracy rate of Hybrid Model Technique Using Error Pattern Modeling : Using Logistic Regression and Discriminant Analysis

  • Cho, Yong-Jun;Hur, Joon
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.269-278 / 2006
  • This paper presents a new hybrid data mining technique that uses error pattern modeling to improve classification accuracy. The proposed method improves classification accuracy by combining two different supervised learning methods. The main algorithm models the error pattern between the two supervised learning methods (e.g., neural networks, decision trees, logistic regression). The proposed modeling method was applied to a simulation of 10,000 data sets generated from normal and exponential random distributions. The simulation results show that the proposed method outperforms existing methods such as logistic regression and discriminant analysis.
