• Title/Summary/Keyword: tree-based classification

Search Result 494, Processing Time 0.026 seconds

Classification of Brain Magnetic Resonance Images using 2 Level Decision Tree Learning (2 단계 결정트리 학습을 이용한 뇌 자기공명영상 분류)

  • Kim, Hyung-Il;Kim, Yong-Uk
    • Journal of KIISE:Software and Applications
    • /
    • v.34 no.1
    • /
    • pp.18-29
    • /
    • 2007
  • In this paper we present a system that classifies brain MR images by using 2 level decision tree learning. There are two kinds of information that can be obtained from images. One is the low-level features such as size, color, texture, and contour that can be acquired directly from the raw images, and the other is the high-level features such as existence of certain object, spatial relations between different parts that must be obtained through the interpretation of segmented images. Learning and classification should be performed based on the high-level features to classify images according to their semantic meaning. The proposed system applies decision tree learning to each level separately, and the high-level features are synthesized from the results of low-level classification. The experimental results with a set of brain MR images with tumor are discussed. Several experimental results that show the effectiveness of the proposed system are also presented.

The Training Data Generation and a Technique of Phylogenetic Tree Generation using Decision Tree (트레이닝 데이터 생성과 의사 결정 트리를 이용한 계통수 생성 방법)

  • Chae, Deok-Jin;Sin, Ye-Ho;Cheon, Tae-Yeong;Go, Heung-Seon;Ryu, Geun-Ho;Hwang, Bu-Hyeon
    • The KIPS Transactions:PartD
    • /
    • v.10D no.6
    • /
    • pp.897-906
    • /
    • 2003
  • The traditional animal phylogenetic tree is to align the body structure of the animal phylums from simple to complex based on the initial development character. Currently, molecular systematics research based on the molecular, it is on the fly, is again estimating prior trend and show the new genealogy and interest of the evolution. In this paper, we generate the training set which is obtained from a DNA sequence ans apply to the classification. We made use of the mitochondrial DNA for the experiment, and then proved the accuracy using the MEGA program which is anaysis program, it is used in the biology field. Although the result of the mining has to proved through biological experiment, it can provede the methodology for the efficient classify and can reduce the time and effort to the experiment.

Core Keywords Extraction forEvaluating Online Consumer Reviews Using a Decision Tree: Focusing on Star Ratings and Helpfulness Votes (의사결정나무를 활용한 온라인 소비자 리뷰 평가에 영향을 주는 핵심 키워드 도출 연구: 별점과 좋아요를 중심으로)

  • Min, Kyeong Su;Yoo, Dong Hee
    • The Journal of Information Systems
    • /
    • v.32 no.3
    • /
    • pp.133-150
    • /
    • 2023
  • Purpose This study aims to develop classification models using a decision tree algorithm to identify core keywords and rules influencing online consumer review evaluations for the robot vacuum cleaner on Amazon.com. The difference from previous studies is that we analyze core keywords that affect the evaluation results by dividing the subjects that evaluate online consumer reviews into self-evaluation (star ratings) and peer evaluation (helpfulness votes). We investigate whether the core keywords influencing star ratings and helpfulness votes vary across different products and whether there is a similarity in the core keywords related to star ratings or helpfulness votes across all products. Design/methodology/approach We used random under-sampling to balance the dataset. We progressively removed independent variables based on decreasing importance through backwards elimination to evaluate the classification model's performance. As a result, we identified classification models that best predict star ratings and helpfulness votes for each product's online consumer reviews. Findings We have identified that the core keywords influencing self-evaluation and peer evaluation vary across different products, and even for the same model or features, the core keywords are not consistent. Therefore, companies' producers and marketing managers need to analyze the core keywords of each product to highlight the advantages and prepare customized strategies that compensate for the shortcomings.

Missing Value Imputation Method Using CART : For Marital Status in the Population and Housing Census (CART를 활용한 결측값 대체방법 : 인구주택총조사 혼인상태 항목을 중심으로)

  • 김영원;이주원
    • Survey Research
    • /
    • v.4 no.2
    • /
    • pp.1-21
    • /
    • 2003
  • We proposed imputation strategies for marital status in the Population and Housing Census 2000 in Korea to illustrate the effective missing value imputation methods for social survey. The marital status which have relatively high non-response rates in the Census are considered to develope the effective missing value imputation procedures. The Classification and Regression Tree(CART)is employed to construct the imputation cells for hot-deck imputation, as well as to predict the missing value by model-based approach. We compare to imputation methods which include the CART model-based imputation and the sequential hot-deck imputation based on CART. Also we check whether different modeling for each region provides the more improved results. The results suggest that the proposed hot-deck imputation based on CART is very efficient and strongly recommendable. And the results show that different modeling for each region is not necessary.

  • PDF

Opinion Extraction based on Syntactic Pieces

  • Aoki, Suguru;Yamamoto, Kazuhide
    • Proceedings of the Korean Society for Language and Information Conference
    • /
    • 2007.11a
    • /
    • pp.76-85
    • /
    • 2007
  • This paper addresses a task of opinion extraction from given documents and its positive/negative classification. We propose a sentence classification method using a notion of syntactic piece. Syntactic piece is a minimum unit of structure, and is used as an alternative processing unit of n-gram and whole tree structure. We compute its semantic orientation, and classify opinion sentences into positive or negative. We have conducted an experiment on more than 5000 opinion sentences of multiple domains, and have proven that our approach attains high performance at 91% precision.

  • PDF

Optimum Range Cutting for Packet Classification (최적화된 영역 분할을 이용한 패킷 분류 알고리즘)

  • Kim, Hyeong-Gee;Park, Kyong-Hye;Lim, Hye-Sook
    • Journal of KIISE:Information Networking
    • /
    • v.35 no.6
    • /
    • pp.497-509
    • /
    • 2008
  • Various algorithms and architectures for efficient packet classification have been widely studied. Packet classification algorithms based on a decision tree structure such as HiCuts and HyperCuts are known to be the best by exploiting the geometrical representation of rules in a classifier. However, the algorithms are not practical since they involve complicated heuristics in selecting a dimension of cuts and determining the number of cuts at each node of the decision tree. Moreover, the cutting is not efficient enough since the cutting is based on regular interval which is not related to the actual range that each rule covers. In this paper, we proposed a new efficient packet classification algorithm using a range cutting. The proposed algorithm primarily finds out the ranges that each rule covers in 2-dimensional prefix plane and performs cutting according to the ranges. Hence, the proposed algorithm constructs a very efficient decision tree. The cutting applied to each node of the decision tree is optimal and deterministic not involving the complicated heuristics. Simulation results for rule sets generated using class-bench databases show that the proposed algorithm has better performance in average search speed and consumes up to 3-300 times less memory space compared with previous cutting algorithms.

On the Tree Model grown by one-sided purity (단측 순수성에 의한 나무모형의 성장에 대하여)

  • 김용대;최대우
    • Journal of Intelligence and Information Systems
    • /
    • v.7 no.1
    • /
    • pp.17-25
    • /
    • 2001
  • Tree model is the most popular classification algorithm in data mining due to easy interpretation of the result. In CART(Breiman et al., 1984) and C4.5(Quinlan, 1993) which are representative of tree algorithms, the split fur classification proceeds to attain the homogeneous terminal nodes with respect to the composition of levels in target variable. But, fur instance, in the chum prediction modeling fur CRM(Customer Relationship management), the rate of churn is generally very low although we are interested in mining the churners. Thus it is difficult to get accurate prediction modes using tree model based on the traditional split rule, such as mini or deviance. Buja and Lee(1999) introduced a new split rule, one-sided purity for classifying minor interesting group. In this paper, we compared one-sided purity with traditional split rule, deviance analyzing churning vs. non-churning data of ISP company. Also reviewing the result of tree model based on one-sided purity with some simulated data, we discussed problems and researchable topics.

  • PDF

Two-dimensional Binary Search Tree for Packet Classification at Internet Routers (인터넷 라우터에서의 패킷 분류를 위한 2차원 이진 검색 트리)

  • Lee, Goeun;Lim, Hyesook
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.52 no.6
    • /
    • pp.21-31
    • /
    • 2015
  • The Internet users want to get real-time services for various multi-media applications. Network traffic rate has been rapidly increased, and data amounts that the Internet has to carry have been exponentially increased. A packet is the basic unit in transferring data at the Internet, and packet classification is one of the most challenging functionalities that routers should perform at wire-speed. Among various known packet classification algorithms, area-based quad-trie (AQT) algorithm is one of the efficient algorithms which can lookup five header fields simultaneously. As a representative space decomposition algorithm, the AQT requires a small amount of memory in storing classification rules, but it does not provide high-speed classification performance. In this paper, we propose a new packet classification algorithm by applying a binary search for the codewords of the AQT to overcome the issue of the AQT. Throughout simulation, it is shown that the proposed algorithm provides a better performance than the AQT in the number of rule comparisons with each input packet.

Local Feature Based Facial Expression Recognition Using Adaptive Decision Tree (적응형 결정 트리를 이용한 국소 특징 기반 표정 인식)

  • Oh, Jihun;Ban, Yuseok;Lee, Injae;Ahn, Chunghyun;Lee, Sangyoun
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.39A no.2
    • /
    • pp.92-99
    • /
    • 2014
  • This paper proposes the method of facial expression recognition based on decision tree structure. In the image of facial expression, ASM(Active Shape Model) and LBP(Local Binary Pattern) make the local features of a facial expressions extracted. The discriminant features gotten from local features make the two facial expressions of all combination classified. Through the sum of true related to classification, the combination of facial expression and local region are decided. The integration of branch classifications generates decision tree. The facial expression recognition based on decision tree shows better recognition performance than the method which doesn't use that.

Korean Traditional Music Genre Classification Using Sample and MIDI Phrases

  • Lee, JongSeol;Lee, MyeongChun;Jang, Dalwon;Yoon, Kyoungro
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.4
    • /
    • pp.1869-1886
    • /
    • 2018
  • This paper proposes a MIDI- and audio-based music genre classification method for Korean traditional music. There are many traditional instruments in Korea, and most of the traditional songs played using the instruments have similar patterns and rhythms. Although music information processing such as music genre classification and audio melody extraction have been studied, most studies have focused on pop, jazz, rock, and other universal genres. There are few studies on Korean traditional music because of the lack of datasets. This paper analyzes raw audio and MIDI phrases in Korean traditional music, performed using Korean traditional musical instruments. The classified samples and MIDI, based on our classification system, will be used to construct a database or to implement our Kontakt-based instrument library. Thus, we can construct a management system for a Korean traditional music library using this classification system. Appropriate feature sets for raw audio and MIDI phrases are proposed and the classification results-based on machine learning algorithms such as support vector machine, multi-layer perception, decision tree, and random forest-are outlined in this paper.