• Title/Summary/Keyword: Tree Segmentation


A methodology for Internet Customer segmentation using Decision Trees

  • Cho, Y.B.;Kim, S.H.
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2003.05a
    • /
    • pp.206-213
    • /
    • 2003
  • Applying existing decision tree algorithms to Internet retail customer classification tends to produce a bushy tree because the source data are imprecise. Even exhaustive analysis may not guarantee business effectiveness, although its results are derived from fully detailed segments. It is therefore necessary to determine an appropriate number of segments at a certain level of abstraction. In this study, we developed a stopping rule that considers the total amount of information gained while generating a rule tree. In addition to growing the tree forward from the root to intermediate nodes at a certain level of abstraction, the decision tree is examined by a backtracking pruning method that uses misclassification loss information.

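The abstract does not state the stopping rule itself. As a rough illustration only, the sketch below stops splitting once the information gained over the whole tree reaches a fixed share of the root entropy. The threshold `target_ratio`, the function names, the breadth-first growth order, and the toy data are all assumptions, not details from the paper, and the backtracking pruning step is not covered.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def grow_segments(rows, labels, features, target_ratio=0.8):
    """Split breadth-first until the tree has gained target_ratio of the root entropy.

    target_ratio is a hypothetical threshold, not a value from the paper.
    Returns (segments, total_gain); each segment is a (rows, labels) pair.
    """
    n_total = len(labels)
    root_entropy = entropy(labels)
    total_gain = 0.0
    queue = [(rows, labels, list(features))]
    finished = []
    while queue and total_gain < target_ratio * root_entropy:
        node_rows, node_labels, node_features = queue.pop(0)
        if not node_features:
            finished.append((node_rows, node_labels))
            continue
        best = max(node_features,
                   key=lambda f: information_gain(node_rows, node_labels, f))
        gain = information_gain(node_rows, node_labels, best)
        if gain <= 0:
            finished.append((node_rows, node_labels))
            continue
        # Weight the local gain by the share of all records in this node.
        total_gain += gain * len(node_labels) / n_total
        children = {}
        for row, label in zip(node_rows, node_labels):
            child = children.setdefault(row[best], ([], []))
            child[0].append(row)
            child[1].append(label)
        remaining = [f for f in node_features if f != best]
        for child_rows, child_labels in children.values():
            queue.append((child_rows, child_labels, remaining))
    # Nodes still queued when the rule fires become final segments.
    finished.extend((r, l) for r, l, _ in queue)
    return finished, total_gain

# Toy data (made up): feature 0 separates the classes, feature 1 does not.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["yes", "yes", "no", "no"]
segments, total_gain = grow_segments(rows, labels, features=[0, 1])
```

On this toy input the first split already explains all of the root entropy, so growth stops with two segments instead of four. A lower `target_ratio` gives coarser segments and a higher one gives a bushier tree.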

A Study of Disambiguation Method To Improve The Syntactic Analysis System (구문 분석의 결과로 나타나는 구조의 모호성을 해결하기 위한 방법 연구)

  • Park, Yong Uk
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.16 no.4
    • /
    • pp.2764-2769
    • /
    • 2015
  • In this paper, we present a Korean syntactic analysis system that can generate all possible syntactic trees for a given sentence. As a result, the number of syntactic trees produced by the system can grow exponentially. To solve this problem, we propose a segmentation method and the maximum connected unit within a segmentation. A maximum connected unit is a combined unit that contains all morphemes in a segmentation. Depending on the input sentence, a segmentation may contain one or more maximum connected units. For the experiment, we randomly extracted 516 sentences from a Korean middle school textbook. The method reduced the number of syntactic trees by about 28%.

Automatic Segmentation of Retinal Blood Vessels Based on Improved Multiscale Line Detection

  • Hou, Yanli
    • Journal of Computing Science and Engineering
    • /
    • v.8 no.2
    • /
    • pp.119-128
    • /
    • 2014
  • The appearance of retinal blood vessels is an important diagnostic indicator of serious diseases, such as hypertension, diabetes, cardiovascular disease, and stroke. Automatic segmentation of the retinal vasculature is a primary step towards automatic assessment of the retinal blood vessel features. This paper presents an automated method for the enhancement and segmentation of blood vessels in fundus images. To decrease the influence of the optic disk, and emphasize the vessels for each retinal image, a multidirectional morphological top-hat transform with rotating structuring elements is first applied to the background homogenized retinal image. Then, an improved multiscale line detector is presented to produce a vessel response image, and yield the retinal blood vessel tree for each retinal image. Since different line detectors at varying scales have different line responses in the multiscale detector, the line detectors with longer length produce more vessel responses than the ones with shorter length; the improved multiscale detector combines all the responses at different scales by setting different weights for each scale. The methodology is evaluated on two publicly available databases, DRIVE and STARE. Experimental results demonstrate an excellent performance that approximates the average accuracy of a human observer. Moreover, the method is simple, fast, and robust to noise, so it is suitable for being integrated into a computer-assisted diagnostic system for ophthalmic disorders.
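The per-scale weighting mentioned in the abstract can be pictured with a minimal sketch. It assumes the line-detector responses have already been computed for each scale and simply takes their weighted average pixel by pixel. The function name `combine_scales`, the normalization by the weight sum, and the sample numbers are assumptions, since the abstract does not give the actual weighting formula.

```python
def combine_scales(responses, weights):
    """Weighted average of per-scale response images.

    responses: one image per scale, each a list of rows of floats.
    weights:   one non-negative weight per scale (hypothetical values).
    """
    if len(responses) != len(weights):
        raise ValueError("need exactly one weight per scale")
    total = sum(weights)
    height, width = len(responses[0]), len(responses[0][0])
    return [[sum(w * r[y][x] for w, r in zip(weights, responses)) / total
             for x in range(width)]
            for y in range(height)]

# Two made-up 1x2 response images, a short scale and a long scale.
short_scale = [[1.0, 0.0]]
long_scale = [[3.0, 2.0]]
combined = combine_scales([short_scale, long_scale], weights=[1, 3])
```

With weights 1 and 3, the longer scale contributes three quarters of each combined pixel value.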

Machine Learning Based Automatic Categorization Model for Text Lines in Invoice Documents

  • Shin, Hyun-Kyung
    • Journal of Korea Multimedia Society
    • /
    • v.13 no.12
    • /
    • pp.1786-1797
    • /
    • 2010
  • Automatic understanding of the contents of a document image is a very hard problem because it involves mathematically challenging problems that originate mainly from the over-determined system induced by the document segmentation process. In both academia and industry, there have been continual and varied efforts to improve the core parts of content retrieval technologies by separating out segmentation-related issues using semi-structured documents, e.g., invoices. In this paper, we propose classification models for text lines in invoice documents, in which text lines are grouped into five categories according to their contents: purchase order header, invoice header, summary header, surcharge header, and purchase items. Our investigation concentrated on the performance of machine learning based models, divided into linear-discriminant-analysis (LDA) and non-LDA (logic based) groups. In the LDA group, naïve Bayesian, k-nearest neighbor, and SVM were used; in the non-LDA group, decision tree, random forest, and boost were used. We describe the details of feature vector construction and the selection of the model and parameters, including training and validation. We also present experimental comparisons of training and classification error levels for the models employed.

A Study on the Pattern Recognition of Korean Characters by Syntactic Method (Syntactic법에 의한 한글의 패턴 인식에 관한 연구)

  • ;安居院猛
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.14 no.5
    • /
    • pp.15-21
    • /
    • 1977
  • The syntactic pattern recognition system for Korean characters is composed of three main functional parts: preprocessing, graph representation, and segmentation. In the preprocessing routine, the input pattern is thinned using Hilditch's thinning algorithm. Graph representation detects a number of nodes over the input pattern and codes the branches between nodes by 8 directional components. Next, the segmentation routine, implemented by top-down nondeterministic parsing under the control of a tree grammar, identifies parts of the graph-represented pattern as basic components of Korean characters. The authors have confirmed through recognition simulations on a digital computer that this system is effective for recognizing Korean characters.

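To make the 8-directional coding of branches concrete, here is a minimal sketch that encodes a thinned branch as a Freeman-style chain code. The abstract does not say which numbering the authors used, so the direction table (0 = east, counter-clockwise, y increasing upward), the function name, and the sample branch are assumptions.

```python
# Assumed numbering: 0 = east, then counter-clockwise in 45-degree steps,
# with y increasing upward. The paper's own convention is not stated.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}

def chain_code(points):
    """Encode consecutive 8-connected (x, y) points as a list of direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"points {(x0, y0)} and {(x1, y1)} are not 8-connected")
        codes.append(DIRECTIONS[step])
    return codes

# A made-up branch between two graph nodes.
branch = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2)]
codes = chain_code(branch)  # [0, 1, 2, 4]
```

A string of such codes per branch is the kind of symbolic description that a grammar-driven parser could then match against character components.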

A Study on Customer Segmentation of the Home Study Company using Decision Tree (의사결정나무를 이용한 방문학습지사의 고객세분화에 관한 연구)

  • Seo Kwang-Kyu;Oh Yeun-Joo;Han Young-Kyu;Shim Hyun-Jeong
    • Proceedings of the KAIS Fall Conference
    • /
    • 2004.11a
    • /
    • pp.316-319
    • /
    • 2004
  • Due to keen competition, companies segment their customers and try to serve specifically targeted customers with differentiated methods. Accordingly, data mining techniques have attracted attention as an effective way to extract useful information. This paper explores customer segmentation of a home study company using data mining. In particular, it focuses on churn to competitors in the recent home study market and aims to understand the characteristics of the customer group expected to churn when competing companies run aggressive sales promotions. In addition, this paper aims to find the factors that influence customer defection and to prepare a practical marketing strategy for retaining existing customers. We study churn in the home study market and build a decision tree model to predict and select valuable customers. Finally, this paper presents how the results can be incorporated and measured as part of an overall marketing campaign process.


Unit Nonresponse Weighting Adjustment Using Regression Tree (회귀나무를 이용한 무응답 가중치 조정)

  • Kim, Se-Mi;Lee, Seok-Hun
    • Proceedings of the Korean Association for Survey Research Conference
    • /
    • 2005.12a
    • /
    • pp.169-183
    • /
    • 2005
  • This paper considers the formation of nonresponse weighting adjustment cells for handling unit nonresponse in sample surveys. We propose a multivariate regression tree method for segmentation that uses the variable of interest and the estimated response probability simultaneously to construct effective nonresponse adjustment cells. Two cases are considered: one uses only response data and the other uses both response and nonresponse data. The two cases are compared in terms of bias.

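Once adjustment cells exist, the usual weighting-class adjustment inflates each respondent's weight by the ratio of the cell's total sampled weight to its responding weight. The sketch below shows only that step. It assumes the cell labels are already given (in the paper they would come from the regression tree's leaves), and the field names and sample values are hypothetical.

```python
from collections import defaultdict

def adjust_weights(units):
    """Return {unit index: adjusted weight} for the responding units.

    units: list of dicts with keys 'cell', 'weight', 'responded'
           (hypothetical field names). Cells with no respondents are
           not handled in this sketch.
    """
    sampled = defaultdict(float)     # total design weight per cell
    responding = defaultdict(float)  # design weight of respondents per cell
    for unit in units:
        sampled[unit["cell"]] += unit["weight"]
        if unit["responded"]:
            responding[unit["cell"]] += unit["weight"]
    adjusted = {}
    for index, unit in enumerate(units):
        if unit["responded"]:
            factor = sampled[unit["cell"]] / responding[unit["cell"]]
            adjusted[index] = unit["weight"] * factor
    return adjusted

# Made-up sample: two cells, one nonrespondent in each.
units = [
    {"cell": "A", "weight": 10.0, "responded": True},
    {"cell": "A", "weight": 10.0, "responded": False},
    {"cell": "B", "weight": 5.0, "responded": True},
    {"cell": "B", "weight": 5.0, "responded": True},
    {"cell": "B", "weight": 10.0, "responded": False},
]
adjusted = adjust_weights(units)
```

The adjusted respondent weights sum to the total weight of the full sample within each cell, which is why the choice of cells drives the remaining bias.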

Development Changes of Cambial Initials and Their Derivative Cells in the Trunk of Diospyros kaki THUNB. and Firmiana simplex W.F. WIGHT in Relation to Girth Increase (감나무와 벽오동 수간의 둘레증가에 따른 형성층 원시세포와 그 유도세포의 발생학적 변화)

  • 한경식
    • Journal of Plant Biology
    • /
    • v.34 no.3
    • /
    • pp.191-199
    • /
    • 1991
  • This study was conducted to investigate the developmental changes of cambial initials and their derivatives in relation to the growing girth of the tree in Diospyros kaki and Firmiana simplex. In D. kaki and F. simplex, which have typical storeyed cambium, the increase in the girth of the cambium generally occurred by radial anticlinal division, although occasionally the increase was accompanied by pseudotransverse division. The length of fusiform initials, vessel members, and sieve tube members remained relatively constant throughout secondary growth, but that of fibers showed a general tendency to increase with the growing girth of the tree. During the girth increase of the tree, the height and number of rays remained constant in D. kaki, whereas in F. simplex the height of rays markedly decreased while the number of rays per unit area more or less increased. The secondary rays originated from the segmentation or division of the side or end of fusiform initials.


Detection of Individual Tree Stands by a Fusion of a Multispectral High-resolution Satellite Image and Laser Scanning Data

  • Teraoka, Masaki;Setojima, Masahiro;Imai, Yasuteru;Yasuoka, Yoshifumi
    • Proceedings of the KSRS Conference
    • /
    • 2003.11a
    • /
    • pp.1042-1044
    • /
    • 2003
  • A methodology that integrates a similar-color circle search of the spectral data with segmentation of the height data is developed. The method is then applied to study areas, and the results from IKONOS, LIDAR, and data fusion are verified against the ground truth and examined in terms of accuracy. Results show that data fusion improves accuracy by about 15% in most of the study areas. The methodology for detecting individual tree stands by data fusion is explored, and the utility of the combined use of spectral and height information is demonstrated.


Estimation of Above-Ground Biomass of a Tropical Forest in Northern Borneo Using High-resolution Satellite Image

  • Phua, Mui-How;Ling, Zia-Yiing;Wong, Wilson;Korom, Alexius;Ahmad, Berhaman;Besar, Normah A.;Tsuyuki, Satoshi;Ioki, Keiko;Hoshimoto, Keigo;Hirata, Yasumasa;Saito, Hideki;Takao, Gen
    • Journal of Forest and Environmental Science
    • /
    • v.30 no.2
    • /
    • pp.233-242
    • /
    • 2014
  • Estimating above-ground biomass is important in establishing an applicable methodology for the Measurement, Reporting and Verification (MRV) system for Reducing Emissions from Deforestation and Forest Degradation-Plus (REDD+). We developed an estimation model of diameter at breast height (DBH) from an IKONOS-2 image that led to above-ground biomass (AGB) estimation. The IKONOS image was preprocessed with dark object subtraction and topographic effect correction prior to watershed segmentation for tree crown delineation. Compared to the field observation, the overall segmentation accuracy was 64%. Crown detection percent had a strong negative correlation with tree density. In addition, satellite-based crown area had the highest correlation with the field-measured DBH. We then developed the DBH allometric model, which explained 74% of the data variance. On average, the estimated DBH was very similar to the measured DBH, and the same held for AGB. Overall, this method can potentially be applied to estimate AGB over a relatively large and remote tropical forest in Northern Borneo.